Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ping unreachable node sometimes times out >1min #5799

Closed
alanshaw opened this issue Nov 27, 2018 · 6 comments
Closed

ping unreachable node sometimes times out >1min #5799

alanshaw opened this issue Nov 27, 2018 · 6 comments
Labels
kind/bug A bug in existing code (including security flaws)

Comments

@alanshaw
Copy link
Member

Sometimes this takes ~10s which is consistent with the code but it frequently takes >1 minute in CI:

screenshot 2018-11-27 at 13 37 41

@magik6k
Copy link
Member

magik6k commented Nov 27, 2018

The real timeout is numPings*pingTimeout -
https://github.com/ipfs/go-ipfs/blob/31099e882469fab12a98f2069e38c847f0b1188c/core/commands/ping.go#L106

What is -n set to in those tests?

@alanshaw
Copy link
Member Author

@alanshaw
Copy link
Member Author

I thought that because the test is for an unreachable node it would actually drop in here https://github.com/ipfs/go-ipfs/blob/31099e882469fab12a98f2069e38c847f0b1188c/core/commands/ping.go#L90 and so the timeout would be 10s - which is consistent with the what we're seeing some of the time...

alanshaw pushed a commit to ipfs-inactive/interface-js-ipfs-core that referenced this issue Nov 29, 2018
This takes around 15s locally and can sometimes take take significantly longer ipfs/kubo#5799

License: MIT
Signed-off-by: Alan Shaw <alan.shaw@protocol.ai>
@Stebalien
Copy link
Member

Triage: @alanshaw are you still seeing this?

and so the timeout would be 10s - which is consistent with the what we're seeing some of the time...

That's the timeout for connecting to the peer. There's another timeout below that's kPingTimeout * numPings. So this should total to kPingTimeout * (numPings+1) which should be 30s in your case.

So yeah, something is broken.

@Stebalien Stebalien added the kind/bug A bug in existing code (including security flaws) label Apr 29, 2019
@alanshaw
Copy link
Member Author

alanshaw commented Apr 30, 2019

I've not removed the additional time we've allocated for the test to pass ;)

Looking at some recent CI builds I can't see it happening.

IMHO we should close this as multiple releases have happened since reporting, we've switched CI and now use the "test" profile in tests - any one of those factors (or something completely different) could have fixed this.

I can open a new issue if I notice it again.

That's the timeout for connecting to the peer. There's another timeout below...

Just FYI, I was not expecting it to get any further since this is for pinging an unreachable node. The Peer ID will not be in the peer store.

@Stebalien
Copy link
Member

Yeah, I just want to make sure we don't have a weird deadlock.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug A bug in existing code (including security flaws)
Projects
None yet
Development

No branches or pull requests

3 participants