Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

identify: Fix flaky tests #1555

Merged
merged 7 commits into from
Jun 2, 2022
Merged

identify: Fix flaky tests #1555

merged 7 commits into from
Jun 2, 2022

Conversation

MarcoPolo
Copy link
Collaborator

Depends on the updated version of libp2p/go-libp2p-peerstore#199. Tests will fail until that gets updated.

@marten-seemann do you have any tips on avoiding the WaitForDisconnectNotification hack? We need someway of knowing when things got the disconnected notification so we know when we can actually probe the address book for these tests.

In draft until libp2p/go-libp2p-peerstore#199 gets merged.

@marten-seemann
Copy link
Contributor

@marten-seemann do you have any tips on avoiding the WaitForDisconnectNotification hack? We need someway of knowing when things got the disconnected notification so we know when we can actually probe the address book for these tests.

Haven't done a careful review yet, but 2 things come to mind:

  1. We had this long-standing PR to send swarm notifications synchronously: send notifications synchronously go-libp2p-swarm#295. We blocked this change on some (suspected) problems with bitswap, but we can merge this for the next release. Would that be helpful to you?
  2. Is using the PeerConnectednessChanged event an option?

@MarcoPolo
Copy link
Collaborator Author

MarcoPolo commented May 27, 2022

  1. We had this long-standing PR to send swarm notifications synchronously: send notifications synchronously go-libp2p-swarm#295. We blocked this change on some (suspected) problems with bitswap, but we can merge this for the next release. Would that be helpful to you?

This should obviate the need for this hack as long as identify doesn't handle this notification async

  1. Is using the PeerConnectednessChanged event an option?

No, we need to know when identify finished processing the disconnect notification

@MarcoPolo
Copy link
Collaborator Author

This is basically ready, but I'm going to hold for a bit to see where #1562 ends up. If that gets merged I can remove my hack :)

@MarcoPolo MarcoPolo force-pushed the marco/1523 branch 2 times, most recently from c354900 to 0ec6ab3 Compare June 1, 2022 19:13
@MarcoPolo MarcoPolo marked this pull request as ready for review June 1, 2022 19:15
@MarcoPolo MarcoPolo requested a review from marten-seemann June 1, 2022 19:16
@MarcoPolo
Copy link
Collaborator Author

With the sync notifs change #1562 I could get rid of the old hack. Thanks!

@MarcoPolo MarcoPolo linked an issue Jun 1, 2022 that may be closed by this pull request
@marten-seemann
Copy link
Contributor

There's a failing identify tests in this run: https://github.com/libp2p/go-libp2p/runs/6696246165?check_suite_focus=true.
Is that a separate issue, or should that one have been fixed by this PR?

@MarcoPolo
Copy link
Collaborator Author

There's a failing identify tests in this run: https://github.com/libp2p/go-libp2p/runs/6696246165?check_suite_focus=true. Is that a separate issue, or should that one have been fixed by this PR?

I don't think those are fixed, but I'll add some logging around them since it seems the only way they fail is if identifyConn fails.

@MarcoPolo
Copy link
Collaborator Author

I'm not really sure how something like https://github.com/libp2p/go-libp2p/runs/6696246165?check_suite_focus=true#step:9:923 fails since it presumably waited for identify to happen. And IdentifyConn blocks until identify finishes or errors. If it finished it should have the new addrs since that blocks. If it errored I don't really know why. So I added some logging to the tests and a warn if the identify conn results in an error.

@MarcoPolo
Copy link
Collaborator Author

@marten-seemann I'm trying to reproduce the issues here in this branch (by rerunning ci manually), but I think if this looks good we can also merge and fix up the other flakes in another PR. This PR should help debug those flakey tests with the new logging.

@MarcoPolo MarcoPolo merged commit e6379f5 into master Jun 2, 2022
MarcoPolo added a commit that referenced this pull request Jun 24, 2022
* Fix flaky timing dependent tests

* Update go-libp2p-peerstore dependency

* Register notifiee synchronously

* Only a single connection

* Remove WaitForDisconnectNotification hack since notifs are now synchronous

* Add debug logging to identify tests

* Close chan once
@MarcoPolo MarcoPolo mentioned this pull request Jul 7, 2022
41 tasks
@ajnavarro ajnavarro mentioned this pull request Aug 24, 2022
72 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

identify: flaky TestLargeIdentifyMessage
2 participants