libp2p-next #5278

romanb · 2020-03-17T13:43:26Z

Supersedes #5066. This PR upgrades substrate to libp2p-0.17. The following are the main libp2p changes which are visible in substrate:

Adapt to "Multiple connections per peer" (libp2p/rust-libp2p#1440) and "Full support for multiple connections per peer in libp2p-swarm" (libp2p/rust-libp2p#1519).
Adapt to "Use upstream version of multihash instead of a fork" (libp2p/rust-libp2p#1472).
Adapt to "Feature-gate all dependencies" (libp2p/rust-libp2p#1467).

libp2p/rust-libp2p#1440 and libp2p/rust-libp2p#1519

Overview

As described in the libp2p PRs, the underlying changes are primarily in libp2p-core. Realistically, a second connection to the same peer only occurs if two peers connect to each other "at the same time". As a side-effect, existing connections are also no longer closed in favour of new ones, which addresses #4272.

Details

The enabled/disabled status is the same across all connections, as decided by the peerset manager.
send_packet and write_notification always send all data over the same connection to preserve the ordering provided by the transport, as long as that connection is open. If it closes, a second open connection may take over, if one exists, but that case should be no different from a single connection failing and being re-established in terms of potential reordering and dropped messages. Messages can be received on any connection, and thus two peers which happened to connect to each other simultaneously may each use a different connection for sending data.
The behaviour reports GenericProtoOut::CustomProtocolOpen when the first connection reports NotifsHandlerOut::Open.
The behaviour reports GenericProtoOut::CustomProtocolClosed when the last connection reports NotifsHandlerOut::Closed.

In this way, the number of actual established connections to the peer is an implementation detail of the network behaviours. As mentioned before, in practice and at the time of this writing, there may be at most two connections to a peer and only as a result of simultaneous dialing. The network service even configures a hard limit of 2 connections per peer. However, the implementation in principle accommodates any number of connections.
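The rules above can be illustrated with a small, self-contained sketch. It is not the code from behaviour.rs: `PeerConnections`, `ConnId`, `Event` and `send_target` are hypothetical names chosen for illustration, and the real behaviour tracks considerably more state. The sketch only shows the idea that the open event is reported for the first open connection, the closed event for the last one, and that sending sticks to one connection while it remains open.

```rust
/// Hypothetical connection identifier (a stand-in, not libp2p's own type).
pub type ConnId = u32;

/// Events reported upwards, loosely modelled on `GenericProtoOut`.
#[derive(Debug, PartialEq)]
pub enum Event {
    CustomProtocolOpen,
    CustomProtocolClosed,
}

/// Per-peer bookkeeping of connections whose handler reported "open",
/// kept in the order in which they opened.
#[derive(Default)]
pub struct PeerConnections {
    open: Vec<ConnId>,
}

impl PeerConnections {
    /// A connection handler reported "open".
    /// Only the first open connection produces an event.
    pub fn on_open(&mut self, id: ConnId) -> Option<Event> {
        if self.open.contains(&id) {
            return None;
        }
        self.open.push(id);
        if self.open.len() == 1 {
            Some(Event::CustomProtocolOpen)
        } else {
            None
        }
    }

    /// A connection handler reported "closed".
    /// Only the last open connection going away produces an event.
    pub fn on_closed(&mut self, id: ConnId) -> Option<Event> {
        let had_open = !self.open.is_empty();
        self.open.retain(|c| *c != id);
        if had_open && self.open.is_empty() {
            Some(Event::CustomProtocolClosed)
        } else {
            None
        }
    }

    /// The connection used for sending: the oldest open one, so that all
    /// data goes over the same connection for as long as it stays open.
    pub fn send_target(&self) -> Option<ConnId> {
        self.open.first().copied()
    }
}
```

With two simultaneous connections 1 and 2, `on_open(1)` yields the open event, `on_open(2)` yields nothing, and closing connection 1 silently moves `send_target()` to connection 2.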

Noteworthy

During intermediate testing with the (disabled by default) integration tests test_consensus, test_sync and test_connectivity, these tests failed very often when run in release mode. The common symptom was that the last node to start in a round of testing would often see no other peers (i.e. an empty DHT routing table) and thus make no progress while all the others kept running, causing the tests to time out waiting for the problematic peer to reach a certain state. The tests mainly use add_reserved_peer on the network to initialise the topology. However, add_reserved_peer ultimately results in a call to add_known_peer on the DiscoveryBehaviour, which did not actually add that address to the Kademlia routing table. It only added it to the user_defined peers, which are added to the Kademlia routing table only when passed in the constructor of the behaviour. I thus changed add_known_peer to also add the given address to the Kademlia routing table. That resolved the issues with these integration tests, and the test_connectivity test seems to run notably faster (release mode). My current guess is that the tests were so far unknowingly relying on a timing assumption w.r.t. the initial discovery / connection setup and DHT queries in order for all peers to find each other, in particular when simultaneous connection attempts are in play, as often happens in release mode. Ultimately, the change of letting add_known_peer add the given address to the Kademlia routing table may be a patch worth extracting separately, because it does look like an oversight to me.
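As a rough illustration of the fix described above, here is a minimal model of the behaviour after the change. It does not use the real libp2p or substrate types: `Discovery`, the `String`-based `PeerId` and `Multiaddr` aliases, the `routing_table` map and the `knows` helper are all stand-ins assumed for this sketch. The point is only that add_known_peer now updates the routing table as well, instead of leaving that to the constructor.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical stand-ins for the real peer id and address types.
pub type PeerId = String;
pub type Multiaddr = String;

/// Simplified model of a discovery behaviour with a Kademlia-like table.
#[derive(Default)]
pub struct Discovery {
    /// Peers explicitly supplied by the user.
    user_defined: Vec<(PeerId, Multiaddr)>,
    /// Stand-in for the Kademlia routing table.
    routing_table: HashMap<PeerId, HashSet<Multiaddr>>,
}

impl Discovery {
    /// Peers passed to the constructor end up in the routing table.
    pub fn new(user_defined: Vec<(PeerId, Multiaddr)>) -> Self {
        let mut discovery = Discovery::default();
        for (peer, addr) in user_defined {
            discovery.add_known_peer(peer, addr);
        }
        discovery
    }

    /// Records the peer as user-defined AND inserts the address into the
    /// routing table. In this model, the second step is the one that was
    /// missing for peers added after construction.
    pub fn add_known_peer(&mut self, peer: PeerId, addr: Multiaddr) {
        let already_listed = self
            .user_defined
            .iter()
            .any(|(p, a)| *p == peer && *a == addr);
        if !already_listed {
            self.user_defined.push((peer.clone(), addr.clone()));
        }
        self.routing_table.entry(peer).or_default().insert(addr);
    }

    /// Whether the routing table has at least one address for `peer`.
    pub fn knows(&self, peer: &str) -> bool {
        self.routing_table.contains_key(peer)
    }
}
```

In this model a peer added via `add_known_peer` after construction is immediately visible through `knows`, so a late-starting node does not depend on discovery timing to learn about its reserved peers.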

libp2p/rust-libp2p#1472

Insubstantial changes to adapt to the new APIs.

libp2p/rust-libp2p#1467

Insubstantial changes to add now necessary feature flags where there is a dependency on libp2p with default-features = false.

romanb · 2020-03-17T14:45:53Z

CC @tomaka, @twittner, @mxinden - Feel free to review at your leisure. This is obviously not mergeable until the next libp2p release but you may already want to skim the PR description.

client/network/src/debug_info.rs

client/network/src/protocol/generic_proto/behaviour.rs

tomaka · 2020-04-02T14:48:00Z

I opened paritytech/polkadot#963, which should fix the Polkadot compilation.

romanb · 2020-04-03T15:24:00Z

client/network/src/service.rs

+				struct SpawnImpl<F>(F);
+				impl<F: Fn(Pin<Box<dyn Future<Output = ()> + Send>>)> Executor for SpawnImpl<F> {
+					fn exec(&self, f: Pin<Box<dyn Future<Output = ()> + Send>>) {
+						(self.0)(f)
+					}


I copied this here from libp2p-core because I forgot to re-expose SwarmBuilder::executor_fn before 0.17 (there is only SwarmBuilder::executor), and I wanted to avoid another libp2p release just for that.
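For readers unfamiliar with the pattern, the following standalone sketch shows how such a newtype lets a plain closure act as an executor. The `Executor` trait here is a local stand-in with the same shape as the one used in the diff, not an import from libp2p, and `BoxedTask` and `spawn_n` are hypothetical helpers added only for the demonstration.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;

/// Alias for the boxed future type used in the diff above.
pub type BoxedTask = Pin<Box<dyn Future<Output = ()> + Send>>;

/// Local stand-in for the executor trait (assumed shape, defined here so
/// that the example compiles on its own).
pub trait Executor {
    fn exec(&self, future: BoxedTask);
}

/// Newtype that turns any `Fn(BoxedTask)` closure into an `Executor`.
pub struct SpawnImpl<F>(pub F);

impl<F: Fn(BoxedTask)> Executor for SpawnImpl<F> {
    fn exec(&self, f: BoxedTask) {
        // Forward the task to the wrapped closure.
        (self.0)(f)
    }
}

/// Hands `n` trivial tasks to a closure-backed executor and returns how
/// many tasks the closure received. The closure only counts the tasks; a
/// real one would pass them to a task spawner.
pub fn spawn_n(n: usize) -> usize {
    let received = Cell::new(0usize);
    let executor = SpawnImpl(|_task: BoxedTask| received.set(received.get() + 1));
    for _ in 0..n {
        executor.exec(Box::pin(std::future::ready(())));
    }
    received.get()
}
```

The newtype is needed because a trait cannot be implemented directly for every closure type from outside the crate that defines the trait; wrapping the closure in a local struct sidesteps that restriction.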

tomaka

All right, this looks good to me other than a couple of suggestions. As someone who has been battling through changes and bugs in generic_proto/behaviour.rs, this change is a bit scary, but I also can't spot anything wrong in the code.

Before merging, however, it would be great to deploy it on a validator node and check its logs and graphs.

client/network/src/service.rs

client/network/src/discovery.rs

client/network/src/protocol/generic_proto/handler/notif_out.rs

client/network/src/service.rs

client/network/src/protocol/light_client_handler.rs

client/network/src/protocol/generic_proto/behaviour.rs

tomaka · 2020-04-06T09:44:51Z

Would you mind merging master, so that we can test Polkadot with it?

romanb · 2020-04-06T11:16:29Z

> Would you mind merging master, so that we can test Polkadot with it?

Just did so. For what it is worth, I already ran polkadot with it a few weeks ago, just to check that #4272 is resolved with these changes, which seemed to be the case.

tomaka · 2020-04-06T16:16:12Z

Deployed both on my Google Cloud instance (with incoming connections) around 3pm and on kusama-validator-ew4-0 around 5pm.

There are a couple of warnings (notably about reserved nodes getting disconnected), but that was already the case before. In particular, comparing the logs on Kibana before and after this PR, the rate of all warnings seems exactly the same.
Notably there's no log message about a protocols handler enabled/disabled multiple times.

In Grafana everything seems normal as well. The number of GrandPa notifications emitted has increased, but that is expected due to #5520.

So everything looks good to me so far. I suggest taking another look tomorrow morning and, if nothing weird is happening, merging this.

tomaka · 2020-04-07T08:49:02Z

Well, the test is kind of inconclusive.
It seems that Parity's logs server has crashed during the night, and my Google Cloud node stopped shortly after I posted the previous message without me realizing 🤦‍♂️ (probably because I closed the SSH connection, forgetting that this would close the node (I'm not a devops)).

On Grafana, however, everything looks normal.

tomaka · 2020-04-07T09:51:57Z

I'd be in favour of merging.
@twittner?

tomaka · 2020-04-07T10:59:39Z

Needs a merge/rebase over master.

tomaka · 2020-04-07T14:55:44Z

I think the test is failing because of the unused import warning.

warning: unused import: `channel::mpsc`
  --> client/network/src/service.rs:39:27
   |
39 | use futures::{prelude::*, channel::mpsc};
   |                           ^^^^^^^^^^^^^
   |

romanb mentioned this pull request Mar 17, 2020

Adapt to rust-libp2p#1440. #5066

Closed

gavofyork added the A3-in_progress Pull request is in progress. No review needed at this stage. label Mar 18, 2020

romanb force-pushed the libp2p-next branch 2 times, most recently from 59f7df3 to e3d6406 Compare March 18, 2020 13:47

tomaka reviewed Mar 19, 2020

client/network/src/debug_info.rs

mxinden reviewed Mar 19, 2020

client/network/src/protocol/generic_proto/behaviour.rs

romanb mentioned this pull request Mar 26, 2020

Fix tried to send handshake twice #5413

Merged

paritytech-ci added the B2-breaksapi label Apr 2, 2020

romanb force-pushed the libp2p-next branch from d2d6818 to 309c75c Compare April 2, 2020 13:44

romanb removed the B2-breaksapi label Apr 2, 2020

Roman S. Borschel added 4 commits April 3, 2020 16:40

Adapt to rust-libp2p#1440.

6443e73

Further adapt to libp2p/master.

1d2335e

Update to libp2p-0.17

e28f642

Finishing touches.

31cdf4f

romanb force-pushed the libp2p-next branch from fd9134a to 31cdf4f Compare April 3, 2020 14:40

Remove stray TODO.

8d38405

romanb commented Apr 3, 2020

romanb marked this pull request as ready for review April 3, 2020 15:31

romanb added A0-please_review Pull request needs code review. and removed A3-in_progress Pull request is in progress. No review needed at this stage. labels Apr 3, 2020

tomaka approved these changes Apr 6, 2020

Roman S. Borschel added 2 commits April 6, 2020 12:53

Incorporate review feedback.

db8fcca

Merge branch 'master' into libp2p-next

0b0fd15

gnunicorn added this to the 2.0 milestone Apr 6, 2020

gnunicorn added the A1-onice label Apr 6, 2020

gnunicorn assigned tomaka Apr 6, 2020

gavofyork mentioned this pull request Apr 6, 2020

Prepare for Polkadot launch and Substrate 2.0 freeze #4961

Closed

24 tasks

tomaka added A8-mergeoncegreen and removed A0-please_review Pull request needs code review. A1-onice labels Apr 7, 2020

Merge branch 'master' into libp2p-next

d3d9f7c

Remove unused import.

713d0ae

tomaka merged commit 27d371e into paritytech:master Apr 8, 2020

romanb deleted the libp2p-next branch April 8, 2020 08:10

This was referenced Apr 9, 2020

Improve checking for peer id mismatch #5369

Closed

Local nodes keep disconnecting. #4272

Closed

romanb mentioned this pull request Apr 9, 2020

Do not prematurely emit CustomProtocolClosed on connection close. #5595

Merged

hackfisher mentioned this pull request Jun 2, 2020

Finality block stop proceeding on Crab 0.6.0 and Substrate 2.0 alpha.8 darwinia-network/darwinia#454

Closed
