Platform: Mono (Mono JIT compiler version 5.10.1.47)
Expected behaviour
Process runner should capture exit codes and SIGCHLD signals in highly parallel scenarios on all platforms.
Actual behaviour
The ProcessRunner times out when running external executables in highly parallel scenarios. The sub-processes themselves seem to complete successfully but are nevertheless reported as timed out. This behaviour only happens when AP.exe is run on the Mono platform.
Steps to reproduce the behaviour
Any other details
This problem has already been fixed. This issue is being filed to document the fix.
It appears the problem is some kind of resource exhaustion that causes the SIGCHLD signal to be swallowed. As a result, Mono never unblocks from the .Wait call, and the sub-process (whether it exited successfully, exited with an error, or legitimately timed out) is always reported as timed out. This is a known issue with the Mono platform:
A series of experimental patches was made (essentially guessing at how to bypass the bug, because simulating resource exhaustion was hard). An experimental analysis that always failed was run repeatedly, using incremental patches and builds, on a 48-core machine. Eventually, the patches resulted in a build that worked twice in a row, and I determined the problem was fixed.
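For readers unfamiliar with this failure mode, one general way to guard against a lost exit notification is to query the child process directly at short intervals instead of relying on a single blocking wait. The sketch below illustrates that idea only. It is written in Python for brevity, it is not the patch applied to this project (which is .NET code running on Mono), and the function name `run_with_polling` and its parameters are hypothetical.

```python
import subprocess
import sys
import time


def run_with_polling(args, timeout_s, poll_interval_s=0.05):
    """Run a command and return (exit_code, timed_out).

    Hypothetical helper: rather than one blocking wait that depends on an
    exit notification arriving, the child's state is checked directly at
    short intervals until the deadline passes.
    """
    proc = subprocess.Popen(
        args, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    deadline = time.monotonic() + timeout_s
    while True:
        code = proc.poll()  # non-blocking check; None while still running
        if code is not None:
            return code, False
        if time.monotonic() >= deadline:
            break
        time.sleep(poll_interval_s)
    # Genuine timeout: stop the child, then reap it so no zombie is left.
    proc.kill()
    proc.wait()
    return None, True


if __name__ == "__main__":
    # A child that exits quickly with a non-zero code is reported as an
    # exit, not as a timeout.
    print(run_with_polling([sys.executable, "-c", "import sys; sys.exit(3)"], 10))
```

The trade-off of polling is a small delay (up to one poll interval) before an exit is noticed, in exchange for not depending on a notification that may never arrive.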
The logs from the processes:
bhutan_test.zip
And the patches that fixed this issue: