propose: js active conn mgr #81

Closed
wants to merge 2 commits into from
Closed

Conversation

vasco-santos
Contributor

No description provided.

@github-actions github-actions bot requested review from rvagg and jacobheun March 16, 2021 09:58
_What must be true for this project to matter?_
<!--(bullet list)-->

- Web3 developers want to have a reliable pubsub topology out of the box without relying on star servers
Contributor

@vasco-santos @aschmahmann @jacobheun can you help flesh this one out a little more (here in comments is probably fine)?

What can we assert today about the demand for solid pubsub in the browser? Do we have good use-cases that we know users are attempting to rely on today? And what signals (maybe in the form of possible use-cases they are/have been attempting to use) should we be looking for when talking to users to back up this assumption?

Contributor

The flow here is that when I subscribe to a topic, libp2p should automatically find peers for me. In Go, this involves advertising and querying the DHT for topics I am subscribed to. In order for JS to effectively discover those peers, regardless of their implementation language, we need to advertise and discover topics on the same service (the public DHT). So the dependency here is being able to effectively query the DHT to discover and advertise our topics.

Re: the browser, pubsub is one of the more effective forms of browser-to-browser communication, as I can leverage indirect links to communicate, which is ideal given the connection limits of browsers. Usage here is mostly anecdotal at this point: there was a lot of interest during HackFS last year, and Vasco and I spent quite a bit of time supporting people there. Matrix would be a strong potential use case here, and it would be great to interview them for gaps in the stack.

I don't think this matters for the connection manager though. The key thing here is to be able to tag/weight connections to protect valuable ones, and ensure we're trimming stale/less useful connections:

  • Ability to prioritize connections to close Kademlia space peers to improve peers' ability to discover us
  • Ability to prioritize connections to pubsub peers with n+1 common topics to us to maintain valuable meshes
  • Ability to decay connections so that stale ones are pruned to focus resources on valuable connections
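A minimal sketch of the tag/weight, decay and trim idea from the list above. Every name here (`TaggedConnection`, `connectionScore`, `decayTags`, `selectForTrim`) is hypothetical and not an existing js-libp2p API; it only assumes that each connection carries named weights and an optional protection flag.

```typescript
// Hypothetical shape: not a js-libp2p type.
interface TaggedConnection {
  peer: string
  tags: Map<string, number> // tag name -> weight, e.g. 'kad-close' or 'pubsub'
  isProtected: boolean      // protected connections are never trimmed
}

// A connection's value is the sum of its tag weights.
function connectionScore(conn: TaggedConnection): number {
  let total = 0
  conn.tags.forEach((weight) => { total += weight })
  return total
}

// Multiply every weight by `factor` and drop tags that fall below `floor`,
// so connections that stop being useful lose their value over time.
function decayTags(conn: TaggedConnection, factor: number, floor: number): void {
  conn.tags.forEach((weight, tagName) => {
    const next = weight * factor
    if (next < floor) {
      conn.tags.delete(tagName)
    } else {
      conn.tags.set(tagName, next)
    }
  })
}

// Pick the peers to disconnect so that at most `maxConnections` remain:
// lowest score first, skipping protected connections.
function selectForTrim(conns: TaggedConnection[], maxConnections: number): string[] {
  const excess = conns.length - maxConnections
  if (excess <= 0) return []
  return conns
    .filter((c) => !c.isProtected)
    .sort((x, y) => connectionScore(x) - connectionScore(y))
    .slice(0, excess)
    .map((c) => c.peer)
}
```

With this shape, a subsystem such as pubsub or the DHT would only need to set or refresh a tag; the trimming decision stays in one place.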

Contributor Author

@jacobheun I think you are focused on the flows after we have the connections. The first step should be to have an active connection manager to replace our current autoDial approach.

The connection manager should actively:

  • trigger peer discovery mechanisms for peers of interest
    • if they run pubsub, trigger a discovery query for peers with same subscriptions and establish a connection
    • trigger queries to closest peers to establish connections
    • ...
  • on restart check PeerStore to establish a meaningful set of connections

Once the connection manager is able to actively establish important connections, rather than the current "connect to every discovered peer unless we already have too many connections" behaviour, the second part would be the trimming scope.

From the current solution, there is a lot of space for improvement with tremendous value for the users. Either evolving the current connection manager to the state of the go-implementation or implementing a fully fledged [Connection Manager v2](https://github.com/libp2p/specs/pull/161) (+ [more notes](https://github.com/libp2p/notes/issues/13)).

With the connection manager overhaul in JS we aim to:
- Find our closest peers on the network, and attempt to stay connected to n of them

Closest in terms of geography, ping latency or Kademlia XOR distance?

Contributor

Kademlia distance so that we increase the probability that peers searching for our PeerId are able to find us. In Go this is done as part of the DHT refresh interval, but for JS, since we have multiple discovery mechanisms (delegate routers) this happens up a layer. It's worth noting that this behavior does now exist in JS, but reliability is questionable.

The browser doesn't benefit from this, because it won't be able to connect to those peers in most cases. This is where having an active relay (dial the peer for me), or a rendezvous style service would be beneficial.
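To make "closest" concrete, here is a small sketch of ranking candidates by Kademlia XOR distance. It assumes the keys are already equal-length byte arrays in the DHT keyspace (in libp2p's DHT these are hashes of the peer IDs, which this sketch does not compute); `KadCandidate`, `xorDistance`, `compareDistance` and `closestPeers` are illustrative names only.

```typescript
// Hypothetical candidate record: `key` is the peer's position in the keyspace.
interface KadCandidate {
  peer: string
  key: Uint8Array
}

// XOR two equal-length keys byte by byte.
function xorDistance(a: Uint8Array, b: Uint8Array): Uint8Array {
  if (a.length !== b.length) throw new Error('keys must have equal length')
  const out = new Uint8Array(a.length)
  for (let i = 0; i < a.length; i++) {
    out[i] = a[i] ^ b[i]
  }
  return out
}

// Compare two distances as big-endian unsigned integers.
function compareDistance(x: Uint8Array, y: Uint8Array): number {
  for (let i = 0; i < x.length; i++) {
    if (x[i] !== y[i]) return x[i] < y[i] ? -1 : 1
  }
  return 0
}

// Return the `n` candidates whose keys are closest to `localKey`.
function closestPeers(localKey: Uint8Array, candidates: KadCandidate[], n: number): KadCandidate[] {
  return candidates
    .map((candidate) => ({ candidate, distance: xorDistance(localKey, candidate.key) }))
    .sort((p, q) => compareDistance(p.distance, q.distance))
    .slice(0, n)
    .map((entry) => entry.candidate)
}
```

Staying connected to the first `n` results is what raises the chance that a lookup for our PeerId ends at a peer that already knows us.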


_Why might this project be lower impact than expected? How could this project fail to complete, or fail to be successful?_

#### Alternatives
_How might this project’s intent be realized in other ways (other than this project proposal)? What other potential solutions can address the same need?_
Contributor

Alternatives is probably not the best place for this, but as prior art, here are some previous notes on a topology/mesh approach to connection management: libp2p/notes#13. Tagging is likely simpler.

Contributor Author

This is more a question of solution design right? The issue where we have been discussing the general solution for ConnectionManager already includes reference for it: libp2p/js-libp2p#744


- Web3 developers want to have a reliable pubsub topology out of the box without relying on star servers
- Web3 developers want to find and connect to other peers in a given scope
- Browser developers want to have their nodes reachable via more transports than `webrtcSTAR` out of the box
Contributor

I don't think this fits here, how does this apply to connection management?

Contributor Author

Not sure what I was trying to get from this, so I will remove it for now

<!--(bullet list)-->

- Web3 developers want to have a reliable pubsub topology out of the box without relying on star servers
- Web3 developers want to find and connect to other peers in a given scope
Contributor

Is this related to connection gating? This sounds like it's trying to account for discovery mechanisms which would be out of scope for this. Being able to restrict who I am connecting to/is connecting to me would be in this scope though. Although it could be done as a separate piece of work. Priority here being:

  1. Get me more connections to peers so I can effectively use the network (JS is currently weak here, especially in browser)
  2. Allow me to prioritize connections to optimize my resources (tagging)
  3. Allow me to restrict outbound/inbound connections based on definable criteria (custom connection gaters for allow/deny listing)
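For item 3, a gater can be little more than a predicate consulted before a connection is accepted or dialed. The sketch below is an assumption about how allow/deny listing might look; `GatePolicy` and `createGater` are made-up names and do not describe an existing js-libp2p interface.

```typescript
type GateDirection = 'inbound' | 'outbound'

// Hypothetical policy: `deny` always wins; a non-empty `allowOnly`
// turns the policy into a strict allow list.
interface GatePolicy {
  deny: Set<string>
  allowOnly: Set<string>
}

// Build a predicate that answers "may this connection proceed?".
function createGater(inbound: GatePolicy, outbound: GatePolicy) {
  return (peer: string, direction: GateDirection): boolean => {
    const policy = direction === 'inbound' ? inbound : outbound
    if (policy.deny.has(peer)) return false
    if (policy.allowOnly.size > 0) return policy.allowOnly.has(peer)
    return true
  }
}
```

Keeping inbound and outbound policies separate lets a node accept connections from anyone it has not denied while restricting its own dials to a known set.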

Contributor Author

related to: #81 (comment)

@mikeal mikeal added the impact:high Impact rating is 6 or above. label Mar 25, 2021
@vasco-santos vasco-santos force-pushed the vasco-santos/conn-mgr branch from 5990b76 to a8f7018 Compare March 30, 2021 09:20
Co-authored-by: Jacob Heun <jacobheun@gmail.com>
Co-authored-by: Max Inden <mail@max-inden.de>
@vasco-santos vasco-santos force-pushed the vasco-santos/conn-mgr branch from a8f7018 to fa120b1 Compare March 30, 2021 09:23

#### Future opportunities
<!--What future projects/opportunities could this project enable?-->
- Propose a spec for JS, that can also be implemented in Go
Contributor Author

Talking with @Stebalien last week, this also needs improvements in Go. Having a spec as a result of this solution design would be super helpful.

Member

@Stebalien left a comment

Unless I'm seriously misunderstanding this proposal, this isn't the right approach.

Basically, we don't want to be "well connected", that's not really a thing. Instead, each and every service will likely need to connect to some set of peers for some reason (gossipsub topic, etc.). Given that, what we really need is:

  1. A working rendezvous system (or a working DHT).
  2. Rendezvous aware services that can use rendezvous to find peers that provide the services/content they require.

One way to solve this is some form of service where you can say "I need to be connected to X-Y peers that provide X (content, pubsub topic, etc.)" and this service is responsible for forming and maintaining those connections. But, in my experience, it's rarely that simple. Usually, every service will need to manage its peers (e.g., because some peers might not actually offer the services they claim to offer, etc.).


With the existing protocols in libp2p, as well as IPFS subsystems built on top of libp2p, such as Pubsub and the DHT, the need for a connection manager overhaul becomes an important work stream, so that these protocols operate as users expect, i.e. as an out-of-the-box solution.

Currently, js nodes have a reactive connection manager that can be decoupled into two parts: establishing new connections and trimming connections. The former relies on an `autoDial` approach: each time a new node is discovered, it will try to establish a connection with that peer if it has fewer connections than its desired minimum. The latter consists of blindly trimming connections when reaching a configurable threshold.
Member

Why auto-dial? This seems weird.

In go-ipfs, the DHT will find new nodes when its routing table drops too low, but there's no reason to stay connected to some minimum number of nodes otherwise.

Contributor Author

IIRC, auto-dial appeared as a temporary solution to make things magically work when relying on webrtc-star. It is the current default in js-libp2p for proactively establishing connections.

This means, peers would discover other peers via the webrtc-star and would create a silo network where subsystems like pubsub would just work. Behind the scenes, what libp2p does is simply every time a new peer is discovered, it will attempt to dial it (if within the connMgr threshold).
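The reactive behaviour described above fits in a few lines. This is a simplified model, not the js-libp2p implementation: `createAutoDialer`, `DialFn` and the synchronous `dial` callback are assumptions made so the logic is easy to follow.

```typescript
// Hypothetical dial callback: returns true when the dial succeeds.
type DialFn = (peer: string) => boolean

// Reactive auto-dial: dial every discovered peer while we are below
// the desired minimum number of connections, with no notion of peer value.
function createAutoDialer(minConnections: number, dial: DialFn) {
  const connected = new Set<string>()
  return {
    connected,
    onPeerDiscovered(peer: string): boolean {
      if (connected.has(peer)) return false                 // already connected
      if (connected.size >= minConnections) return false    // threshold reached
      if (!dial(peer)) return false                         // dial failed
      connected.add(peer)
      return true
    }
  }
}
```

Nothing in this loop asks whether the discovered peer is useful (a relay, a pubsub peer on a shared topic, a close Kademlia neighbour), which is the gap the proposal is trying to close.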

> In go-ipfs, the DHT will find new nodes when its routing table drops too low, but there's no reason to stay connected to some minimum number of nodes otherwise.

It is a requirement to stay connected to a minimum number of nodes to have a reliable gossipsub overlay. In addition, I think Go also keeps connections with its closest peers. In JS, it is also essential to guarantee that a node is connected to a set of relays (especially in the browser) to be reachable via other nodes. Then there are dapp use cases, such as Slate, where it is desirable to be connected to other nodes in the dapp context.

From the current solution, there is a lot of space for improvement with tremendous value for the users. Either evolving the current connection manager to the state of the go-implementation or implementing a fully fledged [Connection Manager v2](https://github.com/libp2p/specs/pull/161) (+ [more notes](https://github.com/libp2p/notes/issues/13)).

With the connection manager overhaul in JS we aim to:
- Find our closest peers on the network, and attempt to stay connected to n of them
Member

Closest as in DHT, distance, local network?

Contributor Author

Kademlia distance, more in #81 (comment)


With the connection manager overhaul in JS we aim to:
- Find our closest peers on the network, and attempt to stay connected to n of them
- Finding, connecting to and protecting our gossipsub peers (same topics search)
Member

Does gossipsub have peer exchange yet?

Contributor Author

It does, but it is disabled by default. Anyway, the point here is mostly about the initial bootstrap of the pubsub overlay.

With the connection manager overhaul in JS we aim to:
- Find our closest peers on the network, and attempt to stay connected to n of them
- Finding, connecting to and protecting our gossipsub peers (same topics search)
- Finding and binding to relays with AutoRelay
Member

go-libp2p now has a list of known-good "relays". We've found that random relays on the network just don't behave well enough to be useful.

Contributor Author

We are not binding to relays automatically yet; we just have things in place if people want to enable it. However, I agree that this should probably be provided (but perhaps in the bootstrap list?).

- Find our closest peers on the network, and attempt to stay connected to n of them
- Finding, connecting to and protecting our gossipsub peers (same topics search)
- Finding and binding to relays with AutoRelay
- Finding and binding to application protocol peers (as needed via MulticodecTopology) -- We should clarify what libp2p will handle intrinsically and what users need to do
Member

Could you expand on this?

Contributor Author

In js-libp2p, we have introduced the concept of Topologies. It draws on some thoughts from libp2p/notes#13.

In summary what happens is:

  • Pubsub, DHT and any other libp2p protocol/subsystem can register themselves as a topology in libp2p registrar
  • Once a connection is established, identify protocol kicks in and running protocols of the peer are stored in the ProtoBook
  • When a peer is added to the ProtoBook, a topology callback is called and we verify whether that peer is running the topology's protocol. If so, the relevant subsystem (pubsub, for instance) is notified and can open a pubsub stream on that connection

If a dapp creates a protocol (let's consider the Slate example again), they will likely want a Slate topology, so that they are notified when a new peer running Slate appears and can establish an "application overlay". Another thing to consider here would be the concept of a MetadataTopology alongside MulticodecTopology, as we should be able to create topologies per metadata as well to enable some scenarios.

Note that the topology receives a minPeer, maxPeer but we are not using them yet. This is related to the goal of this proposal. The idea here is to inform libp2p of the requirements of the topology in terms of the number of peers needed to operate reliably. With this, libp2p can act to fulfil the needs of each topology and properly manage the connections that are needed.
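As a rough illustration of the flow above, the sketch below models a registrar that notifies matching topologies once a peer's protocols are known and reports which topologies are still below their minimum. It is a simplified model, not the js-libp2p registrar: `TopologySpec`, `createTopologyRegistry`, `onPeerProtocols` and `deficits` are hypothetical names, and the protocol strings in the usage note are only examples.

```typescript
// Hypothetical description of what a topology asks of libp2p.
interface TopologySpec {
  multicodec: string // protocol the topology cares about
  minPeers: number   // peers needed to operate reliably
  maxPeers: number   // upper bound the topology wants to track
}

function createTopologyRegistry() {
  const entries: Array<{ spec: TopologySpec, peers: Set<string> }> = []
  return {
    register(spec: TopologySpec): void {
      entries.push({ spec, peers: new Set<string>() })
    },
    // Called once identify has reported which protocols a connected peer runs.
    // Returns the multicodecs of the topologies that were notified.
    onPeerProtocols(peer: string, protocols: string[]): string[] {
      const notified: string[] = []
      entries.forEach((entry) => {
        const runsProtocol = protocols.indexOf(entry.spec.multicodec) !== -1
        if (runsProtocol && entry.peers.size < entry.spec.maxPeers) {
          entry.peers.add(peer)
          notified.push(entry.spec.multicodec)
        }
      })
      return notified
    },
    // Topologies still below minPeers: a signal to trigger discovery for them.
    deficits(): Array<{ multicodec: string, missing: number }> {
      return entries
        .map((entry) => ({
          multicodec: entry.spec.multicodec,
          missing: Math.max(0, entry.spec.minPeers - entry.peers.size)
        }))
        .filter((d) => d.missing > 0)
    }
  }
}
```

For example, after registering `{ multicodec: '/meshsub/1.1.0', minPeers: 2, maxPeers: 4 }` and seeing a single peer that runs it, `deficits()` would report one missing peer for that topology, which is the information an active connection manager could act on.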


From the current solution, there is a lot of space for improvement with tremendous value for the users. Either evolving the current connection manager to the state of the go-implementation or implementing a fully fledged [Connection Manager v2](https://github.com/libp2p/specs/pull/161) (+ [more notes](https://github.com/libp2p/notes/issues/13)).

With the connection manager overhaul in JS we aim to:
Member

I'd distinguish between connection management as in "managing/closing existing connections" and "peer/service discovery". This proposal currently discusses both but they're pretty separate issues.

Contributor Author

> Unless I'm seriously misunderstanding this proposal, this isn't the right approach.

Probably the misunderstanding with the proposal comes from this. You are probably right that the connection manager should be essentially responsible for "managing/closing existing connections". However, the role of peer/service discovery in my mind is as simple as discovering a given peer (with its announced multiaddrs) and storing it in the PeerStore. This also relates to how the peer discovery interface is defined (I think this is also the case in Go?).

After adding to the PeerStore, another libp2p "component" should act to figure out if we should try to connect to this peer or not, taking into account the topologies needs as well as the general needs of libp2p, such as relays...

> This proposal currently discusses both but they're pretty separate issues.

I think this is the gap here. I am seeing the connection manager to be responsible for two main things:

  • Proactively establishing connections
  • Trimming connections

Proactively establishing connections is divided into two other things:

  • Act on peer discovery
  • Trigger peer discovery per needs (pubsub topics, relays, etc)

So, we essentially need an entity that acts as an orchestrator, leveraging Peer Discovery + Peer Store to fulfil the needs of topologies + libp2p. I see some overlap between this entity and the connection manager, such as decisions on whether we should close a connection in favour of establishing one with a peer that has more value for the node's needs, as well as freeing connections that are not needed in the long run (like bootstrap nodes).

> One way to solve this is some form of service where you can say "I need to be connected to X-Y peers that provide X (content, pubsub topic, etc.)" and this service is responsible for forming and maintaining those connections. But, in my experience, it's rarely that simple. Usually, every service will need to manage its peers (e.g., because some peers might not actually offer the services they claim to offer, etc.).

Libp2p topology is probably the service we are talking about here. I agree that simply letting services be responsible for forming their own connections is the simpler option. However, it has implications that should be considered if we want to create a libp2p spec out of this. Off the top of my head: we would not make efficient decisions that favour connections offering us more (for example, choosing between a peer running pubsub and a peer running pubsub that is also part of the dapp context), a peer that is part of more than one topology would be counted several times, and we could not make better decisions when trimming connections.
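The double-counting concern can be shown with a small sketch. It assumes a central view of which peers each topology uses (`membership`, a hypothetical map from topology name to peer IDs); `topologyCoverage` and `rankByCoverage` are illustrative helpers, not existing APIs.

```typescript
// Count, for each peer, how many distinct topologies it serves.
// The number of keys in the result is the real number of connections,
// with a peer shared by several topologies counted once.
function topologyCoverage(membership: Map<string, string[]>): Map<string, number> {
  const coverage = new Map<string, number>()
  membership.forEach((peers) => {
    // A Set guards against a peer listed twice within one topology.
    new Set(peers).forEach((peer) => {
      coverage.set(peer, (coverage.get(peer) ?? 0) + 1)
    })
  })
  return coverage
}

// Most valuable peers first: those serving the most topologies.
// Ties are broken by peer ID so the order is deterministic.
function rankByCoverage(coverage: Map<string, number>): string[] {
  return Array.from(coverage.entries())
    .sort((x, y) => (y[1] - x[1]) || x[0].localeCompare(y[0]))
    .map((entry) => entry[0])
}
```

If each service managed its own connections, a peer serving both pubsub and a dapp protocol would appear twice in the totals and neither service would know it is the most valuable connection to keep; a shared view like this one makes both facts visible when trimming.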

There is the Connection Overhaul Issue libp2p/js-libp2p#744 with a lot of information on how this would work, as well as an initial draft on how each component would interact. It is a pretty extensive issue, but perhaps it is worth reading.

Let me know what you think. If we can find a better place for this logic than the connection manager, it might be good to separate proactive connections from trimming. But overall, I think there will be overlap in the decision-making logic.

@jacobheun
Contributor

Closing this as it is not on our immediate roadmap. We will reopen when this work intersects with our future timelines and priorities.

@jacobheun jacobheun closed this Apr 1, 2021