Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid calling closed? outside of synchronize block #534

Merged
merged 2 commits into from
Apr 15, 2024

Conversation

djmb
Copy link
Contributor

@djmb djmb commented Apr 4, 2024

closed? calls process on the connection which is not safe because we have not synchronised the connection pool. Another thread might concurrently checkout the connection and start sending commands as well.

`closed?` calls `process` on the connection which is not safe because
we have not synchronised the connection pool. Another thread might
concurrently checkout the connection and start sending commands as well.
@djmb
Copy link
Contributor Author

djmb commented Apr 4, 2024

When using SSHKit with a 100+ hosts, we sometimes get IOErrors like this:

[ERROR (IOError): Exception while executing on host foo: closed stream]

Debugging showed that a packet was being sent to the remote server twice, which was causing it to close the connection. This was caused by two threads using the connection concurrently - the eviction thread and one of the SSHKit parallel runner threads.

The duplicate packets came from here in net-ssh, where both threads call send before either calls output.consume!.

Looking at the eviction code, the call to closed?(first_conn) outside the synchronize block looked suspicious as it calls conn.process(0) which will send a test packet.

The check looks like a performance optimisation, so I think it should be safe to remove. We've not seen the errors again with this fix running.

@mattbrictson mattbrictson added the 🐛 Bug Fix Fixes a bug label Apr 12, 2024
Copy link
Member

@mattbrictson mattbrictson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix! Could you make the following change to address the rubocop failure in CI?

lib/sshkit/backends/connection_pool/cache.rb Outdated Show resolved Hide resolved
Co-authored-by: Matt Brictson <matt@123mail.org>
@djmb
Copy link
Contributor Author

djmb commented Apr 15, 2024

Thanks for the review @mattbrictson - I've committed your suggestion.

@mattbrictson mattbrictson merged commit 3242c55 into capistrano:master Apr 15, 2024
15 checks passed
djmb added a commit to basecamp/kamal that referenced this pull request May 20, 2024
This includes a fix for a bug in the eviction thread that could cause
this error:

```
[ERROR (IOError): Exception while executing on host foo: closed stream]
```

See capistrano/sshkit#534
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🐛 Bug Fix Fixes a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants