Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

waitForProcess race condition #46

Closed
joeyh opened this issue Oct 28, 2015 · 5 comments · Fixed by #58
Closed

waitForProcess race condition #46

joeyh opened this issue Oct 28, 2015 · 5 comments · Fixed by #58

Comments

@joeyh
Copy link
Contributor

joeyh commented Oct 28, 2015

    -- don't hold the MVar while we call c_waitForProcess...
    -- (XXX but there's a small race window here during which another
    -- thread could close the handle or call waitForProcess)

This is a small race condition indeed, but today I managed to stumble over it, on a Linux system. I have a situation where 2 threads both need to wait for the same process to exit, so both call waitForProcess. One of the threads would then fairly reliably crash with "waitForProcess: does not exist (No child processes)"

Of course, I can work around it in my code by adding my own locking so only 1 thread calls waitForProcess on a process at a time. It could be similarly fixed in System.Process.. add another MVar to ProcessHandle, and make waitForProcess use it to prevent concurrent calls to c_waitForProcess

@snoyberg
Copy link
Collaborator

snoyberg commented Nov 1, 2015

I worked around this limitation in streaming-commons. I'm not sure if
there's any advantage to the current structuring of the code, are you aware
of any downsides to changing it? And are you interested in sending a PR?

On Wed, Oct 28, 2015, 12:41 PM Joey Hess notifications@github.com wrote:

-- don't hold the MVar while we call c_waitForProcess...
-- (XXX but there's a small race window here during which another
-- thread could close the handle or call waitForProcess)

This is a small race condition indeed, but today I managed to stumble over
it, on a Linux system. I have a situation where 2 threads both need to wait
for the same process to exit, so both call waitForProcess. One of the
threads would then fairly reliably crash with "waitForProcess: does not
exist (No child processes)"

Of course, I can work around it in my code by adding my own locking so
only 1 thread calls waitForProcess on a process at a time. It could be
similarly fixed in System.Process.. add another MVar to ProcessHandle, and
make waitForProcess use it to prevent concurrent calls to c_waitForProcess


Reply to this email directly or view it on GitHub
#46.

charles-cooper added a commit to charles-cooper/process that referenced this issue Apr 2, 2016
@charles-cooper
Copy link
Contributor

I ran into the same issue today, on a Linux machine (RTS -N, kernel 4.4.0). I also ended up working around it by catching the exception, which looks like the approach you both have taken.

I wrote up a fix along the lines of @joeyh's suggestion at charles-cooper@fa0d45b; I won't have time to test it for a couple days but @snoyberg perhaps you could take a look and see if I'm heading in the right direction?

@snoyberg
Copy link
Collaborator

snoyberg commented Apr 2, 2016

It seems reasonable. My concerns would be (1) a negative performance hit and (2) breaking backwards compatibility. However, I lean towards doing this, as the bug here is significant enough. I'll look into this some time next week.

@charles-cooper
Copy link
Contributor

Upon further inspection it seems the bug is due not to a race condition but failure to interpret the return code of waitpid. From the documentation (https://www.mkssoftware.com/docs/man3/waitpid.3.asp):

If more than one thread is suspended in waitpid() awaiting termination of the same process, exactly one thread returns the process status at the time of the target child process termination. The other threads return -1, with errno set to ECHILD.

Handling the case accordingly (charles-cooper@e70eb47) seems to fix the bug.

charles-cooper added a commit to charles-cooper/process that referenced this issue Apr 2, 2016
charles-cooper added a commit to charles-cooper/process that referenced this issue Apr 3, 2016
Previously an exception was being thrown when multiple threads were
blocking on waitForProcess due to inconsistent handling of the return
code of `waitpid`:

"If more than one thread is suspended in waitpid() awaiting termination
of the same process, exactly one thread returns the process status at
the time of the target child process termination. The other threads
return -1, with errno set to ECHILD."

`getProcessExitCode` was handling the ECHILD case by returning 1, but
`waitForProcess` was returning (-1) in all cases. For consistency this
commit follows the approach in getProcessExitCode, returning 1 to the
caller of c_waitForProcess if errno is ECHILD, thus avoiding throwing
an exception in the calling code.

Fixes haskell#46
@charles-cooper charles-cooper mentioned this issue Apr 3, 2016
@ndmitchell
Copy link
Contributor

I've been observing a similar bug on Windows for a while, and only today did I manage to track it down. If I call waitForProcess in the form:

forkIO $ void $ waitForProcess $ process ghci
void $ waitForProcess $ process ghci

Then I get an error about 5% of the time. A most annoying bug, and one I would encourage something be done to fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants