feat: add connection monitor #2644
Adds a connection monitor that periodically ensures remote peers are still online and contactable by trying to send a single byte via the ping protocol, and sets the `.rtt` property of the connection to how long it took. If the ping protocol is not supported by the remote, it tries to infer the round trip time from how long it took to fail. If the remote is unresponsive or opening the stream fails for any other reason, the connection is aborted with the thrown error. It's possible to configure the ping interval, how long we wait before considering a peer to be inactive, and whether or not to close the connection on failure. Closes #2643
This was discussed on the maintainers call and there was a suggestion to make this a service that the user would have to configure. I'm having a hard time thinking of when you wouldn't want this, so it's implemented as part of libp2p and on by default. Happy to talk about this further before merge.
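As a rough illustration of the behaviour described above, the sketch below runs a check on an interval, stores the measured round trip time on success, and aborts the connection on failure when configured to do so. Every name here (`MonitoredConnection`, `MonitorOptions`, `MeasureRtt`, `checkAll`, `startMonitor`) is hypothetical and chosen for this example; it is not the code in this PR, and the ping itself, including its timeout, is assumed to live inside the injected `measure` function.

```typescript
// Hypothetical shapes for illustration only; these are not the libp2p types.
interface MonitoredConnection {
  rtt?: number
  abort: (err: Error) => void
}

interface MonitorOptions {
  enabled: boolean
  pingIntervalMs: number
  abortOnFailure: boolean
}

// Assumed to ping the remote and resolve with the round trip time in ms,
// or reject if the remote is unresponsive or the stream cannot be opened.
type MeasureRtt = (conn: MonitoredConnection) => Promise<number>

// Run one check over every connection.
async function checkAll (
  conns: MonitoredConnection[],
  measure: MeasureRtt,
  opts: MonitorOptions
): Promise<void> {
  await Promise.all(conns.map(async (conn) => {
    try {
      conn.rtt = await measure(conn)
    } catch (err) {
      if (opts.abortOnFailure) {
        // Abort with the error that caused the check to fail
        conn.abort(err instanceof Error ? err : new Error(String(err)))
      }
    }
  }))
}

// Start the periodic check; returns a function that stops it.
function startMonitor (
  getConns: () => MonitoredConnection[],
  measure: MeasureRtt,
  opts: MonitorOptions
): () => void {
  if (!opts.enabled) {
    return () => {}
  }

  const timer = setInterval(() => {
    void checkAll(getConns(), measure, opts)
  }, opts.pingIntervalMs)

  return () => { clearInterval(timer) }
}
```

Injecting `measure` keeps the scheduling and failure handling separate from the ping protocol, so the loop can be exercised with a fake in tests.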
LGTM. This will eliminate the complexity of managing this at the application layer, which is a major improvement.
Nice work on this, I left a suggestion and I also have a question.
LGTM. The only comment I have is about being a little more explicit with the assertions in the tests, instead of `.gte(0)` after a delay of 100ms.
I'd like to see this be disable-able. This is definitely something that most applications will want to use, but users should be able to turn this off and not pay for it.
As an example, in Lodestar, we use an application-specific ping protocol, with a response that means something quite different. We would like to disable this and handle pings / keep-alives in an application-specific manner.
  })
])

conn.rtt = Date.now() - start
why is this not divided by 2 too?
This one times how long it takes to send and receive one byte (i.e. one round trip), but only after the stream has been opened (i.e. after multistream select has finished).
The `ERR_UNSUPPORTED_PROTOCOL` error is thrown by multistream select, which takes two round trips, so that measurement needs dividing by two. The happy path here doesn't, because we reset the `start` variable before timing the single byte RTT.
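The arithmetic in that explanation can be sketched as a small pure function. This is only an illustration under stated assumptions: `PingTimings` and `estimateRtt` are hypothetical names, and the timestamps stand in for the `Date.now()` calls in the real code.

```typescript
// Hypothetical helper for illustration; not part of this PR.
interface PingTimings {
  // Taken before opening the stream
  start: number
  // Taken once multistream select has finished; left undefined when
  // negotiation failed with ERR_UNSUPPORTED_PROTOCOL
  negotiated?: number
  // Taken when the byte was echoed back, or when the error was thrown
  end: number
}

function estimateRtt (t: PingTimings): number {
  if (t.negotiated == null) {
    // Negotiation failed: the elapsed time covers the two round trips
    // taken by multistream select, so halve it
    return (t.end - t.start) / 2
  }

  // Happy path: timing effectively restarts after negotiation, so the
  // elapsed time is already a single round trip
  return t.end - t.negotiated
}
```

For example, a failed negotiation that took 80ms gives `estimateRtt({ start: 0, end: 80 })`, which is 40, while a successful ping with negotiation finishing at 80ms and the echo arriving at 120ms gives `estimateRtt({ start: 0, negotiated: 80, end: 120 })`, which is also 40.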
Co-authored-by: Chad Nehemiah <chad.nehemiah94@gmail.com>
Nice work
@wemeetagain a flag has been added:

```ts
const node = await createLibp2p({
  connectionMonitor: {
    enabled: false
  }
})
```
Adds a connection monitor that periodically ensures remote peers are still online and contactable by trying to send a single byte via the ping protocol, and sets the `.rtt` property of the connection to how long it took.
If the remote responds that the ping protocol is not supported, it tries to infer the round trip time by how long it took to fail.
If the remote is unresponsive or opening the stream fails for any other reason, the connection is aborted with the thrown error.
It's possible to configure the ping interval, how long we wait before considering a peer to be inactive and whether or not to close the connection on failure.
Closes #2643