Builds seem to be hanging in some macOS configurations #12718

@baronfel

@captainsafia reported, and I have been able to reproduce, hangs when doing builds that require multiple nodes on some macOS configurations.

The test repro is:

> dotnet new webapi
> dotnet publish -t PublishContainer

(that is, some work that internally spawns requests to be serviced on other worker nodes)

The initial scenario reported was:

  • .NET 10 RC1, macOS latest 16.x

I was able to reproduce on:

  • .NET 10 RC2, macOS 26.0.1

What seems to be happening is that the central node tries to spawn worker node(s) but that spawning fails:

 (TID 1) 638977796157357110 +     5.141ms: Command line parameters: System.String[]
.NET TP Worker (TID 4) 638977796211706080 +  4735.016ms: Starting to acquire 1 new or existing node(s) to establish nodes from ID 2 to 2...
.NET TP Worker (TID 4) 638977796211958380 +     25.23ms: Building handshake for node type NodeReuse, Arm64, (version 1): options 16777352.
.NET TP Worker (TID 4) 638977796211980280 +      2.19ms: Handshake salt is 
.NET TP Worker (TID 4) 638977796211996300 +     1.602ms: Tools directory root is /usr/local/share/dotnet/sdk/10.0.100-rc.2.25502.107
.NET TP Worker (TID 4) 638977796212089490 +     9.319ms: Attempting to connect to 6 existing processes 'dotnet'...
.NET TP Worker (TID 4) 638977796212129870 +     4.038ms: Trying to connect to existing process dotnet with id 31712 to establish node 2...
.NET TP Worker (TID 4) 638977796212219450 +     8.958ms: Attempting connect to PID 31712 with pipe /tmp/MSBuild31712 with timeout 0 ms
.NET TP Worker (TID 4) 638977796212275430 +     5.598ms: Failed to connect to pipe /tmp/MSBuild31712. The operation has timed out.
.NET TP Worker (TID 4) 638977796212313560 +     3.813ms: Trying to connect to existing process dotnet with id 31764 to establish node 2...
.NET TP Worker (TID 4) 638977796212367420 +     5.386ms: Attempting connect to PID 31764 with pipe /tmp/MSBuild31764 with timeout 0 ms
.NET TP Worker (TID 4) 638977796212415270 +     4.785ms: Failed to connect to pipe /tmp/MSBuild31764. The operation has timed out.
.NET TP Worker (TID 4) 638977796212445920 +     3.065ms: Trying to connect to existing process dotnet with id 33655 to establish node 2...
.NET TP Worker (TID 4) 638977796212468010 +     2.209ms: Attempting connect to PID 33655 with pipe /tmp/MSBuild33655 with timeout 0 ms
.NET TP Worker (TID 4) 638977796212526250 +     5.824ms: Failed to connect to pipe /tmp/MSBuild33655. The operation has timed out.
.NET TP Worker (TID 4) 638977796212552950 +      2.67ms: Trying to connect to existing process dotnet with id 33734 to establish node 2...
.NET TP Worker (TID 4) 638977796212615290 +     6.234ms: Attempting connect to PID 33734 with pipe /tmp/MSBuild33734 with timeout 0 ms
.NET TP Worker (TID 4) 638977796212639190 +      2.39ms: Failed to connect to pipe /tmp/MSBuild33734. The operation has timed out.
.NET TP Worker (TID 4) 638977796212682320 +     4.313ms: Trying to connect to existing process dotnet with id 33829 to establish node 2...
.NET TP Worker (TID 4) 638977796212725860 +     4.354ms: Attempting connect to PID 33829 with pipe /tmp/MSBuild33829 with timeout 0 ms
.NET TP Worker (TID 4) 638977796212751660 +      2.58ms: Failed to connect to pipe /tmp/MSBuild33829. The operation has timed out.
.NET TP Worker (TID 4) 638977796212808460 +      5.68ms: Trying to connect to existing process dotnet with id 58121 to establish node 2...
.NET TP Worker (TID 4) 638977796212835980 +     2.752ms: Could not connect to existing process, now creating a process...
.NET TP Worker (TID 4) 638977796212864790 +     2.881ms: Launching node from /usr/local/share/dotnet/sdk/10.0.100-rc.2.25502.107/MSBuild.dll
.NET TP Worker (TID 4) 638977796213119480 +    25.469ms: Successfully launched /usr/local/share/dotnet/dotnet node with PID 58132
.NET TP Worker (TID 4) 638977796213142470 +     2.299ms: Attempting connect to PID 58132 with pipe /tmp/MSBuild58132 with timeout 30000 ms
.NET TP Worker (TID 4) 638977796513234050 + 30009.158ms: Failed to connect to pipe /tmp/MSBuild58132. The operation has timed out.
.NET TP Worker (TID 4) 638977796513371850 +     13.78ms: Could not connect to node with PID 58132; it has exited with exit code 1. This can indicate a crash at startup
.NET TP Worker (TID 4) 638977796513435710 +     6.386ms: Launching node from /usr/local/share/dotnet/sdk/10.0.100-rc.2.25502.107/MSBuild.dll
.NET TP Worker (TID 4) 638977796514130080 +    69.437ms: Successfully launched /usr/local/share/dotnet/dotnet node with PID 58160
.NET TP Worker (TID 4) 638977796514186090 +     5.601ms: Attempting connect to PID 58160 with pipe /tmp/MSBuild58160 with timeout 30000 ms
.NET TP Worker (TID 4) 638977796814211940 + 30002.586ms: Failed to connect to pipe /tmp/MSBuild58160. The operation has timed out.
.NET TP Worker (TID 4) 638977796814286920 +     7.498ms: Could not connect to node with PID 58160; it has exited with exit code 1. This can indicate a crash at startup
.NET TP Worker (TID 4) 638977796814419910 +    13.299ms: Launching node from /usr/local/share/dotnet/sdk/10.0.100-rc.2.25502.107/MSBuild.dll
.NET TP Worker (TID 4) 638977796814931700 +    51.179ms: Successfully launched /usr/local/share/dotnet/dotnet node with PID 58173
.NET TP Worker (TID 4) 638977796814971160 +     3.946ms: Attempting connect to PID 58173 with pipe /tmp/MSBuild58173 with timeout 30000 ms
.NET TP Worker (TID 4) 638977797115072790 + 30010.164ms: Failed to connect to pipe /tmp/MSBuild58173. The operation has timed out.
.NET TP Worker (TID 4) 638977797115158920 +     8.613ms: Could not connect to node with PID 58173; it has exited with exit code 1. This can indicate a crash at startup
.NET TP Worker (TID 4) 638977797115256530 +     9.761ms: Launching node from /usr/local/share/dotnet/sdk/10.0.100-rc.2.25502.107/MSBuild.dll
.NET TP Worker (TID 4) 638977797116053630 +     79.71ms: Successfully launched /usr/local/share/dotnet/dotnet node with PID 58179
.NET TP Worker (TID 4) 638977797116094390 +     4.076ms: Attempting connect to PID 58179 with pipe /tmp/MSBuild58179 with timeout 30000 ms
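
The trace above can be condensed into one record per freshly launched node, which makes the repeating launch, 30-second wait, exit-code-1 cycle easier to see. The following is a minimal sketch, assuming the line format shown in this trace; the names `LaunchAttempt`, `summarize_launches` and `ticks_to_datetime` are hypothetical, and reading the leading integer as .NET `DateTime` ticks (100 ns units since 0001-01-01, timezone unknown) is an assumption.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional

# A subset of lines copied from the trace in this issue.
SAMPLE_TRACE = """\
.NET TP Worker (TID 4) 638977796212725860 +     4.354ms: Attempting connect to PID 33829 with pipe /tmp/MSBuild33829 with timeout 0 ms
.NET TP Worker (TID 4) 638977796212751660 +      2.58ms: Failed to connect to pipe /tmp/MSBuild33829. The operation has timed out.
.NET TP Worker (TID 4) 638977796213119480 +    25.469ms: Successfully launched /usr/local/share/dotnet/dotnet node with PID 58132
.NET TP Worker (TID 4) 638977796513234050 + 30009.158ms: Failed to connect to pipe /tmp/MSBuild58132. The operation has timed out.
.NET TP Worker (TID 4) 638977796513371850 +     13.78ms: Could not connect to node with PID 58132; it has exited with exit code 1. This can indicate a crash at startup
.NET TP Worker (TID 4) 638977796514130080 +    69.437ms: Successfully launched /usr/local/share/dotnet/dotnet node with PID 58160
.NET TP Worker (TID 4) 638977796814211940 + 30002.586ms: Failed to connect to pipe /tmp/MSBuild58160. The operation has timed out.
.NET TP Worker (TID 4) 638977796814286920 +     7.498ms: Could not connect to node with PID 58160; it has exited with exit code 1. This can indicate a crash at startup
.NET TP Worker (TID 4) 638977797116053630 +     79.71ms: Successfully launched /usr/local/share/dotnet/dotnet node with PID 58179
"""

# "<thread> (TID n) <ticks> + <delta>ms: <message>"
LINE_RE = re.compile(r"\(TID \d+\)\s+(?P<ticks>\d+)\s+\+\s*(?P<delta>[\d.]+)ms: (?P<msg>.*)$")
LAUNCHED_RE = re.compile(r"Successfully launched \S+ node with PID (\d+)")
TIMEOUT_RE = re.compile(r"Failed to connect to pipe /tmp/MSBuild(\d+)\. The operation has timed out")
EXITED_RE = re.compile(r"Could not connect to node with PID (\d+); it has exited with exit code (-?\d+)")


@dataclass
class LaunchAttempt:
    pid: int
    launched_at: datetime
    waited_ms: Optional[float] = None   # time spent before the pipe connect timed out
    exit_code: Optional[int] = None     # None if the trace ends before an outcome is logged


def ticks_to_datetime(ticks: int) -> datetime:
    """Convert assumed .NET ticks to a naive datetime (timezone not known from the trace)."""
    return datetime(1, 1, 1) + timedelta(microseconds=ticks // 10)


def summarize_launches(trace: str) -> List[LaunchAttempt]:
    attempts: Dict[int, LaunchAttempt] = {}
    for line in trace.splitlines():
        parsed = LINE_RE.search(line)
        if parsed is None:
            continue
        msg = parsed["msg"]
        if (hit := LAUNCHED_RE.search(msg)):
            pid = int(hit[1])
            attempts[pid] = LaunchAttempt(pid, ticks_to_datetime(int(parsed["ticks"])))
        elif (hit := TIMEOUT_RE.search(msg)):
            # Timeouts for pre-existing processes (never launched here) are ignored.
            attempt = attempts.get(int(hit[1]))
            if attempt is not None:
                attempt.waited_ms = float(parsed["delta"])
        elif (hit := EXITED_RE.search(msg)):
            attempt = attempts.get(int(hit[1]))
            if attempt is not None:
                attempt.exit_code = int(hit[2])
    return list(attempts.values())


if __name__ == "__main__":
    for attempt in summarize_launches(SAMPLE_TRACE):
        print(attempt)
```

On the sample lines, each launched node that has an outcome shows a wait of roughly 30000 ms followed by exit code 1, while the last launch has no outcome because the trace ends while the connect is still pending.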

Workarounds include forcing single-node mode and/or otherwise removing parallelization.
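
One possible reading of "single-node mode" is capping MSBuild at one node with its `-maxCpuCount` switch (`-m:1`). The sketch below only builds the repro command with that switch appended; whether this is the exact workaround used here is an assumption, and the helper name `build_publish_command` is hypothetical.

```python
import shlex

# The repro command from this issue, split into arguments.
BASE_COMMAND = ["dotnet", "publish", "-t", "PublishContainer"]


def build_publish_command(single_node: bool = True) -> list:
    """Return the repro command, optionally limited to a single MSBuild node."""
    command = list(BASE_COMMAND)
    if single_node:
        # -m:1 is the short form of MSBuild's -maxCpuCount:1 (assumed workaround).
        command.append("-m:1")
    return command


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without the SDK installed.
    print(shlex.join(build_publish_command()))
```

Passing the resulting list to `subprocess.run` would execute it; that step is left out so the sketch has no dependency on a local .NET SDK.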

The handshake salt appears to be empty in the trace, which is concerning, and the command-line arguments are logged as System.String[] rather than their actual values, which makes debugging difficult.
