Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

js/go interop tests failing #6389

Closed
Stebalien opened this issue May 30, 2019 · 10 comments
Closed

js/go interop tests failing #6389

Stebalien opened this issue May 30, 2019 · 10 comments
Labels
kind/bug A bug in existing code (including security flaws)

Comments

@Stebalien
Copy link
Member

Stebalien commented May 30, 2019

Version information:

ipfs version 0.4.21-rc3

Description:

The interop tests are currently failing to exchange files. This appears to be due to changes in libp2p (reverting bitswap doesn't help).

To reproduce:

  1. Download the latest RC.
  2. Clone github.com/ipfs/interop and run npm install.
  3. Run PFS_GO_EXEC=/path/to/go-ipfs npm test -- -t node -f test/exchange-files.js -b
@Stebalien Stebalien added the kind/bug A bug in existing code (including security flaws) label May 30, 2019
@Stebalien
Copy link
Member Author

Stebalien commented May 30, 2019

One issue appears to be related to multistream. In multiformats/go-multistream#32, go started pipelining protocol negotiation. Specifically, instead of:

  1. Client sends server /multistream/1.0.0
  2. Server sends client /multistream/1.0.0
  3. Client sends server the first protocol.
  4. Server sends client na or the protocol (if selected).

The protocol now looks like:

  1. Client sends server /multistream/1.0.0
  2. Client sends server the first protocol.
  3. Server sends client /multistream/1.0.0
  4. Server sends client na or the protocol (if selected).

That way, we can half to a full round trip (depending on the server's implementation).

However, this appears to be causing the connection to fail with:

Error: Dial is currently blacklisted for this peer

(an error from javascript).

But I can't reproduce this issue when I manually spin up a js-ipfs node and a go-ipfs node.

Also note, adding a small delay between sending the /multistream/1.0.0 header and the first protocol appears to fix the issue.

@Stebalien
Copy link
Member Author

Stebalien commented May 30, 2019

The other issue isn't related to multistream and I'm not sure what's causing it. If we revert the multistream change, the test proceeds past the setup step but still takes ages in the file transfer step. It looks like it eventually makes progress but it's really slow.

I've tried:

  • Reverting secio.
  • Reverting mplex.
  • Reverting multistream.
  • Reverting the identify changes.
  • Reverting msgio.

It looks like the first step is taking almost exactly 2 minutes. That looks like some kind of timeout.

@Stebalien
Copy link
Member Author

@dignifiedquire or @jacobheun, any idea what might be causing #6389 (comment)?

@jacobheun
Copy link
Contributor

I imagine js isn't handling the change in operations well. The multistream negotiation is likely erroring out the connection. We added some backoff logic to js if a connection fails the upgrade process, hence the blacklisting error message. I can look at adding a test to js-multiaddr to mimic this and see where it's falling down.

We should really start running libp2p rc releases against libp2p/interop. cc @raulk @vyzo @bigs

@Stebalien
Copy link
Member Author

@jacobheun I've filed multiformats/js-multistream-select#48.

@jacobheun
Copy link
Contributor

jacobheun commented May 30, 2019

Okay, so this actually isn't due to multistream at all.

Good news, the compatibility layer we added for js to work with go MDNS is functioning. The problem is that we're somehow discovering and dialing go before connections are being accepted on that connection. The TCP dial fails causing the connection to get blacklisted for a period, ultimately failing the tests. Disabling MDNS discovery in the interop mitigates the issue. Since these tests do direct dials we should probably disable MDNS, but we need to figure out why discovery is happening before the listeners are ready. Looking more into what's causing that.

@jacobheun
Copy link
Contributor

Go is apparently just advertising port 4001 even though the daemon ends up running on a random port.

@jacobheun
Copy link
Contributor

I'd recommend we disable MDNS in the tests for now and file an issue for fixing MDNS so this isn't blocking the release. In reality we should just move to the new MDNS spec for both implementations and not waste time on a fix for this. The default ipfs daemon runs on port 4001 so this shouldn't be an issue in most cases, and the interop fix was a recent update for js. I'll create a PR in interop to disable MDNS in the tests.

@Stebalien
Copy link
Member Author

Note: there does appear to be a regression in go here as go-ipfs 0.4.20 works fine. I've filed an MDNS bug (https://github.com/libp2p/go-libp2p/issues/new) but yeah, we should just implement the new spec (libp2p/go-libp2p#623).

@Stebalien
Copy link
Member Author

@jacobheun thanks for debugging this! You're absolutely right, disabling MDNS has fixed this. I'm guessing the multistream issue was simply due to the perf improvements.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug A bug in existing code (including security flaws)
Projects
None yet
Development

No branches or pull requests

2 participants