Improve lighthouse's disconnect responsiveness #2146
Comments
Looks like this happens because a ping timeout is a
My similar experience (version: Lighthouse/v1.0.5-9ed65a6) is that very short connection outages, under 30s, often result in one or more missed attestations by Lighthouse, but not by Teku, on the same network. I have a faulty telephone line awaiting repair, and periodically the ADSL drops for about 30s before reconnecting. This afternoon I missed 4 attestations in a row on a LH validator but zero attestations on a Teku validator on the same network (different machine). If peers are not disconnected for ~10 minutes, then why would a 30s Internet outage result in 15 minutes' worth of missed attestations? One answer might be that the IP address changed, although I cannot confirm whether that is the case, nor would I expect it to happen every time the connection goes down for 30s.
## Issue Addressed
Fixes #2146
## Proposed Changes
Change ping timeout errors to return `LowToleranceErrors` so that we disconnect faster on internet failures/changes.
@pawanjay176 nice. We were not scoring the negotiation timeouts as I had expected. #2147 should make lighthouse significantly more responsive to disconnects.
Description
Users are reporting that Lighthouse takes 5 to 15 minutes to drop their peers once a connection drops or changes (#2123).
It is known that discv5 will take at least 5 minutes to update the ENR, but that is unrelated and addressed in a separate issue (#2131).
In current stable we ping every 30 seconds. It should take 2 failed pings to disconnect a peer and therefore I'd expect all peers to be dropped in under a minute. In #2132 I reduced the ping interval to 15 seconds so in the unstable branch I'd expect all peers to be dropped in under 30 seconds.
This doesn't seem to be the case, and we should investigate why peers are taking longer to drop and reconnect when connectivity drops.
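The expectation above can be written out as simple arithmetic. This is only a rough sketch of the reasoning in the description, not Lighthouse code: the function name `expected_drop_seconds` is hypothetical, and it assumes a peer is dropped after a fixed number of failed pings spaced one ping interval apart, ignoring any per-ping timeout.

```python
def expected_drop_seconds(ping_interval_s: int, failed_pings_to_disconnect: int = 2) -> int:
    """Rough upper bound on the time to drop an unreachable peer.

    Assumption (from the issue text): two failed pings lead to a disconnect.
    Hypothetical helper; it ignores the ping timeout itself.
    """
    return ping_interval_s * failed_pings_to_disconnect


# Ping intervals quoted in the issue: 30s on stable, 15s on unstable after #2132.
stable_bound = expected_drop_seconds(30)
unstable_bound = expected_drop_seconds(15)

print(f"stable:   peers expected to drop within {stable_bound}s")
print(f"unstable: peers expected to drop within {unstable_bound}s")
```

Under these assumptions the bounds are 60 seconds on stable and 30 seconds on unstable, well short of the 5 to 15 minutes users report, which is the gap this issue asks to investigate.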