ProcessRunner timeouts in highly parallel Mono invocations #163

Closed
atruskie opened this issue May 8, 2018 · 0 comments

atruskie commented May 8, 2018

Expected behaviour

The ProcessRunner should capture exit codes and SIGCHLD signals in highly parallel scenarios on all platforms.

Actual behaviour

The ProcessRunner times out when running external executables in highly parallel scenarios. The sub-processes themselves appear to complete successfully, but the ProcessRunner nevertheless reports a timeout. This behaviour only occurs when AP.exe is run on the Mono platform.

Steps to reproduce the behaviour

Any other details

This problem has already been fixed. This issue is being filed to document the fix.

It appears the problem is some kind of resource exhaustion that causes the SIGCHLD signal to be swallowed. As a result, Mono never unblocks from the .Wait call, and the sub-process (whether it exited successfully, exited with an error, or legitimately timed out) is always treated as having timed out. This is a known issue with the Mono platform:

A series of experimental patches was made (essentially guessing at how to bypass the bug, because simulating resource exhaustion was hard). An analysis that always failed was run repeatedly, with incremental patches and builds, on a 48-core machine. Eventually, the patches produced a build that worked twice in a row, at which point I considered the problem fixed.
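
For anyone trying to reproduce this class of problem outside AP.exe, here is a minimal sketch of a parallel stress harness. It is written in Python purely for illustration and is not the code or the patches used for this fix. The names `run_with_polling` and `stress` are hypothetical. The harness polls each child's status rather than blocking on a single wait call, which is one common way to avoid depending on an exit notification that may never arrive; whether the actual patches took that approach is not stated here.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def run_with_polling(args, timeout_s, poll_interval_s=0.05):
    """Run a command and return (exit_code, timed_out).

    Hypothetical helper: the child's status is polled in a loop, so a
    missed exit notification cannot leave the caller blocked forever.
    """
    proc = subprocess.Popen(args)
    deadline = time.monotonic() + timeout_s
    while True:
        code = proc.poll()  # non-blocking status check
        if code is not None:
            return code, False
        if time.monotonic() >= deadline:
            break
        time.sleep(poll_interval_s)
    # The deadline passed and the child is still running: kill and reap it.
    proc.kill()
    proc.wait()
    return None, True


def stress(args, total_runs=64, workers=16, timeout_s=60.0):
    """Launch the same command many times in parallel and tally outcomes."""
    def one(_):
        return run_with_polling(args, timeout_s)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(one, range(total_runs)))

    timeouts = sum(1 for _, timed_out in results if timed_out)
    failures = sum(
        1 for code, timed_out in results if not timed_out and code != 0
    )
    return {"runs": total_runs, "timeouts": timeouts, "failures": failures}
```

Calling `stress([sys.executable, "-c", "pass"])` runs a trivial child 64 times across 16 worker threads. A non-zero `timeouts` count for a command that is known to exit quickly would point at the waiting mechanism rather than at the sub-process itself.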

The logs from the processes:
bhutan_test.zip

And the patches that fixed this issue:
