Fix reconnects when ping timers under/overshoot #289
This is good! It explains why reconnects are so frequent at scale despite the connection being actually alive (I was trying to debug that.)
Thanks for the feedback!
Config near ping option interval sounds right.
@dblock Can you take another look? I wasn't sure how much detail you wanted me to include in the readme, so I provided some coarse details and linked this PR.
Merged. I am going to cut a release with this.
This actually broke the integration tests with a slack token. Attempt at a fix in 5c61124. Locally run
Oops, sorry, I'll take care of this next time. Thanks for fixing it.
The ping timer is scheduled to fire exactly on the ping interval. Sometimes the timer fires slightly before the ping time has elapsed, so no ping is sent on that tick. If the next timer then overshoots, the ping is again not sent and the connection is restarted instead, because the time since the last message is > ping_time*2.
Scheduling the timer more frequently than the actual ping time fixes this, because the timer is then guaranteed to fire, and ping, at least once during each ping_time*2 interval.
Here is an example of what happens:
Please note that I haven't tested the Celluloid or EventMachine implementations, just the Async one.
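The under/overshoot behaviour described above can be reproduced with a small simulation. This is only a sketch under stated assumptions, not the client's actual code: `simulate_keep_alive`, the 30-second `ping_time`, and the timer fire times are all made up for illustration, and a ping is assumed to get an immediate reply that resets the last message time.

```ruby
# Hypothetical model of the keep-alive check (not the library's real code).
# fire_times: the moments (in seconds) at which the timer actually fires.
# ping_time:  the configured ping interval in seconds.
def simulate_keep_alive(fire_times, ping_time)
  last_message_at = 0.0
  fire_times.map do |now|
    elapsed = now - last_message_at
    if elapsed < ping_time
      # Timer fired too early (undershoot): nothing is sent on this tick.
      :idle
    elsif elapsed > ping_time * 2
      # Too long since the last message: the connection is restarted.
      last_message_at = now
      :reconnect
    else
      # Assumption: the ping is answered immediately, which counts as a message.
      last_message_at = now
      :ping
    end
  end
end

ping_time = 30

# Timer scheduled exactly on the ping interval, with a little jitter:
# the first tick undershoots, the second overshoots.
p simulate_keep_alive([29.9, 60.2], ping_time)
# => [:idle, :reconnect]

# Timer scheduled at half the ping interval, with the same kind of jitter.
p simulate_keep_alive([14.9, 30.1, 45.0, 60.2], ping_time)
# => [:idle, :ping, :idle, :ping]
```

With the timer on the exact interval, one early tick followed by one late tick is enough to trigger a reconnect on a healthy connection. With the shorter timer period, at least one tick always lands between ping_time and ping_time*2, so a ping goes out before the restart threshold is reached.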