Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix deadlock in Hbc::SystemCommand #21665

Merged
merged 2 commits into from
Jun 4, 2016
Merged

Fix deadlock in Hbc::SystemCommand #21665

merged 2 commits into from
Jun 4, 2016

Conversation

claui
Copy link
Contributor

@claui claui commented Jun 3, 2016

Summary

The current implementation of Hbc::SystemCommand uses the Open3 library in a way that causes deadlocks. In particular, a deadlock occurs whenever STDERR grows beyond a certain size (> 5000 lines) on the default OS X system Ruby.

This PR fixes the issue; it also provides failing tests to make the deadlock reproducible.

@caskroom/maintainers Please review! Thanks in advance 😊

Analysis

Apparently, the bug was well concealed; it was brought to my attention only today when user @winkelsdorf filed issue #18638 (thanks again!). The deadlock is triggered when brew cask install parallels-desktop --force is run with the app already installed; this triggers a /bin/chmod -R -- u+rwx '/Applications/Parallels Desktop.app', producing an extremely long result on standard error, which in turn triggers the bug.

Deadlock

The Ruby documentation, section Open3 summarizes quite well why the deadlock occurs:

You should be careful to avoid deadlocks. Since pipes are fixed length buffer, ::popen3(“prog”) {|i, o, e, t| o.read } deadlocks if the program generates many output on stderr. You should be [sic] read stdout and stderr simultaneously (using thread or IO.select). However if you don’t need stderr output, ::popen2 can be used. If merged stdout and stderr output is not a problem, you can use ::popen2e. If you really needs stdout and stderr output as separate strings, you can consider ::capture3.

Root cause

The offending code snippet from system_command.rb is:

raw_stdin, raw_stdout, raw_stderr, raw_wait_thr = Open3.popen3(*to_exec)



while line = raw_stdout.gets # <-- deadlock occurs here
  processed_stdout << line
  ohai line.chomp if options[:print_stdout]
end

while line = raw_stderr.gets
  processed_stderr << line
  ohai line.chomp if options[:print_stderr]
end

The deadlock occurs during the first while loop when raw_stderr happens to grow beyond a certain size.

In that particular case, both raw_stdout and raw_stderr will block until someone reads from raw_stderr. In the code above, this condition is not met until the second while loop.

In other words: things go south because the first while loop waits on raw_stdout, which waits on raw_stderr to be consumed, which in turn waits for the while loop to finish; a deadlock.

Triggering the failure

When Parallels Desktop.app is installed, the bundle is owned by root and read-only to everyone else. When brew cask install --force is subsequently called, HBC needs to remove the existing app. It is actually quite smart in handling this; as a first gentle attempt before escalating to sudo, it tries to make the app world-writeable:

/bin/chmod -R -- u+rwx '/Applications/Parallels Desktop.app'

In general, this particular chmod is expected to fail; it is only a first attempt to avoid sudo whenever possible. In our example however, the failing chmod causes a lot of noise on STDERR (> 5000 lines) which triggers the deadlock in Hbc::SystemCommand.

The fix

Following the suggestion from the Ruby documentation, I have replaced the hard-coded order of streams in Hbc::SystemCommand by polling IO.select, which guarantees deadlock-free stream selection.

To clean up the code a bit, I also extracted a few methods and ivars while at it.

Testing

Failing specs

I have included a spec which should detect, and fail on, this particular deadlock when run with the default OS X system Ruby. For details, see spec/cask/system_command_spec.rb.

The new specs are in a separate commit within this PR. They be run against either the old and new implementation.

Additional test dependencies

To make this case testable, and allow for the spec to be more expressive, I have added two new test dependencies: rspec-its and rspec-wait. Both gems are released under the MIT license (1 2). I strongly believe that the former package is going to benefit our specs greatly; the latter is needed for deadlock detecting. In my opinion, either package is well worth being included in our testing toolchain.

Running the tests

To install the added test dependencies, you might want need to run bundle install once.

The quickest way to run the specs is:

bundle exec rake test:rspec

If you want to watch the spec fail, simply git reset one commit back in history, and execute the specs there.

claui added 2 commits June 3, 2016 21:54
The current implementation of `Hbc::SystemCommand` uses the `Open3` library in a way that causes a deadlock in cases where `STDERR` grows beyond a certain size.

This commit fixes the issue by following the practice suggested in the Ruby documentation, i. e. choosing a stream selection method that avoids deadlocks.
@claui claui added bug Issue describing a reproducible bug. awaiting maintainer feedback Issue needs response from a maintainer. core Issue with Homebrew itself rather than with a specific cask. labels Jun 3, 2016
@winkelsdorf
Copy link
Contributor

Great finding, and if I might add - one of the best, quickest and most substantial bug analysis I've seen on GH in a long time!

Great work, thanks for spending your time on this! 💯

@jawshooah
Copy link
Contributor

Awesome to have you here, @claui. LGTM 👍

@vitorgalvao
Copy link
Member

Amazing detective work, here, and so detailed! Big thumbs up to this.

@claui
Copy link
Contributor Author

claui commented Jun 3, 2016

Thanks so much @vitorgalvao!

Do we tolerate the MIT thing as well? I’ve seen comments of yours suggesting between the lines that this project is not exactly in love with that license. 😄

@vitorgalvao
Copy link
Member

Do we tolerate the MIT thing as well? I’ve seen comments of yours suggesting between the lines that this project is not exactly in love with that license.

I’m guessing you’re referring to #14877. To clarify, I’m not at all opposed to using or including projects licensed under the MIT or other permissive licenses, I just find them a bit silly since their terms sound specific but are actually vague (“At which point is the code so modified that it is no longer the original?”) and they’re so close to the public domain they could very well just be public domain and benefit everyone on a greater scale.

I’d also like to be clear those views were my own and not necessarily the views of the project as a whole (after all, the issue was open as a question). That said, every other maintainer supported the notion, so in that sense I guess those are indeed the views of the project.

But just to reiterate, no problem at all in using other people’s less permissive stuff, I’d just like for our stuff to be as permissive as possible.

@jawshooah jawshooah merged commit d10d2c8 into Homebrew:master Jun 4, 2016
@claui claui deleted the fix-system-command-deadlock branch June 4, 2016 01:06
@claui claui removed the awaiting maintainer feedback Issue needs response from a maintainer. label Jun 4, 2016
chizmw pushed a commit to chizmw/homebrew-cask that referenced this pull request Jun 15, 2016
* Write failing tests for issue Homebrew#18638

* Fix `Hbc::SystemCommand` to avoid deadlocks; ref Homebrew#18638

The current implementation of `Hbc::SystemCommand` uses the `Open3` library in a way that causes a deadlock in cases where `STDERR` grows beyond a certain size.

This commit fixes the issue by following the practice suggested in the Ruby documentation, i. e. choosing a stream selection method that avoids deadlocks.
@miccal miccal removed bug Issue describing a reproducible bug. core Issue with Homebrew itself rather than with a specific cask. labels Dec 23, 2016
@Homebrew Homebrew locked and limited conversation to collaborators May 8, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants