Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

swarm: implement blackhole detection #2320

Merged
merged 2 commits into from
Jun 27, 2023

Conversation

sukunrt
Copy link
Member

@sukunrt sukunrt commented Jun 2, 2023

No description provided.

@marten-seemann marten-seemann mentioned this pull request Jun 2, 2023
27 tasks
@sukunrt sukunrt force-pushed the blackhole-detector branch 2 times, most recently from 98ce244 to 57de43c Compare June 5, 2023 06:53
@sukunrt sukunrt changed the title implement black hole detection and happy eyeballs swarm: implement blackhole detection and happy eyeballs dialing Jun 5, 2023
@sukunrt sukunrt marked this pull request as ready for review June 5, 2023 06:53
@sukunrt
Copy link
Member Author

sukunrt commented Jun 5, 2023

@marten-seemann
For blackhole detection if we want just a binary check like it’s a black hole if the last n consecutive calls are failures we can do away with the sliding window.

This doesn't implement happy eyeballs for TCP since the delay required for that to be effective is about 700ms. That feels too long to me. We can implement happy eyeballs for TCP separately where we introduce a mechanism to notify the worker loop that a connection has been established and security and muxer selection is in progress.

@marten-seemann marten-seemann mentioned this pull request Jun 3, 2023
29 tasks
@sukunrt sukunrt force-pushed the blackhole-detector branch 6 times, most recently from f893840 to 8b8e686 Compare June 6, 2023 06:10
p2p/net/swarm/black_hole_detector.go Outdated Show resolved Hide resolved
p2p/net/swarm/black_hole_detector.go Outdated Show resolved Hide resolved
p2p/net/swarm/black_hole_detector.go Outdated Show resolved Hide resolved
p2p/net/swarm/swarm_dial.go Outdated Show resolved Hide resolved
@marten-seemann
Copy link
Contributor

Should we make the changes to the dial prioritization logic in a follow-up PR? Would be nice to disentangle it.

@marten-seemann
Copy link
Contributor

We probably should also add some metrics to our swarm dashboard. What about this: for both IPv6 and UDP, we'd show a row containing

  • a timeline view that shows if the swarm thinks that we're behind a blackhole
  • the dial success rate
  • if behind a blackhole, the next time we'll probe

@sukunrt
Copy link
Member Author

sukunrt commented Jun 7, 2023

Completely agree with the metrics suggestion.

if behind a blackhole, the next time we'll probe

might be difficult since we probe based on number of requests(every nth request is allowed). It should be fine to make this number of requests till next probe.

@sukunrt
Copy link
Member Author

sukunrt commented Jun 7, 2023

This is what the dashboard looks like for me:

Screenshot 2023-06-07 at 6 47 40 PM

@sukunrt sukunrt changed the title swarm: implement blackhole detection and happy eyeballs dialing swarm: implement blackhole detection Jun 7, 2023
p2p/net/swarm/black_hole_detector.go Outdated Show resolved Hide resolved
p2p/net/swarm/black_hole_detector.go Outdated Show resolved Hide resolved
p2p/net/swarm/black_hole_detector.go Outdated Show resolved Hide resolved
p2p/net/swarm/black_hole_detector.go Outdated Show resolved Hide resolved
p2p/net/swarm/black_hole_detector.go Outdated Show resolved Hide resolved
p2p/net/swarm/black_hole_detector.go Show resolved Hide resolved
p2p/net/swarm/black_hole_detector.go Outdated Show resolved Hide resolved
@sukunrt
Copy link
Member Author

sukunrt commented Jun 19, 2023

@marten-seemann
An alternative is to allow 5-10 probes every n requests. If we do this we can move the black hole can dial check to swarm.filterKnownUndialables.
This is useful because eventually I'd like to get rid of the backoff check here too(opening a separate issue for that). https://github.com/libp2p/go-libp2p/blob/master/p2p/net/swarm/swarm_dial.go#L392

@sukunrt
Copy link
Member Author

sukunrt commented Jun 22, 2023

Updated the PR to do address filtering and probing for reevaluating black hole state at a peer level.

@sukunrt sukunrt force-pushed the blackhole-detector branch 4 times, most recently from 9d7ee69 to 555a36a Compare June 23, 2023 07:59
p2p/net/swarm/black_hole_detector.go Outdated Show resolved Hide resolved
p2p/net/swarm/swarm.go Show resolved Hide resolved
p2p/net/swarm/swarm_dial.go Show resolved Hide resolved
@@ -358,7 +358,8 @@ loop:
}

// it must be an error -- add backoff if applicable and dispatch
if res.Err != context.Canceled && !w.connected {
// ErrDialRefusedBlackHole shouldn't end up here, just a safety check
if res.Err != ErrDialRefusedBlackHole && res.Err != context.Canceled && !w.connected {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this go in error log? If we're not noisy about it, we'd miss this bug if we ever introduce it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this

p2p/net/swarm/black_hole_detector_test.go Outdated Show resolved Hide resolved
@marten-seemann marten-seemann merged commit 1e31d70 into libp2p:master Jun 27, 2023
@p-shahi
Copy link
Member

p-shahi commented Jun 28, 2023

Add the description in the release issue highlights while its fresh #2326?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants