Add support for configuring priority peers (connection tagging) #369

jacobheun · 2019-06-10T15:30:11Z

This is a component of connection tagging. Libp2p should support configuring/tagging specific peers/multiaddrs as priority connections. The goal here is to have connections that the Connection Manager does not kill, and that we try to maintain connections too. If connections are killed, libp2p should attempt to automatically reconnect to them.

An example of this is an IPFS browser node setting a preload node as an important connection. Since the preload node acts as a proxy for serving all of its content, these connections are vital to maintain. If the connection is lost, the node can become effectively unusable.

This would also be important for private clusters that expose a single relay/proxy node. Maintaining those internal connections to the relay is critical for those peers.

Future iterations of this could involve a spec to have the nodes coordinate and agree on this connection keep alive behavior. This would allow both nodes to agree to maintain the connection and avoid hanging up on one another. It would also allow overtaxed nodes to decline the keep alive, proving the requesting node the opportunity to find other nodes with connection availability.

jacobheun · 2019-06-12T12:09:48Z

@vasco-santos @dirkmc I've written down some thoughts around this. It also made me think more about how service configuration works currently and how it can improve, but I've left that out of these notes. I'll look at writing more about that soon and posting a new issue.

Peer Management

Prioritizing Peers

Libp2p needs to be able to identify peers that it deems as priority connections, to enable nodes to maintain a connection to peers that are in a critical path for that node to operate. An example of this would be preload nodes for IPFS browser nodes, or signaling servers for webrtc transports. If the connection to these nodes ends, the node may no longer be able to effectively interact with the network, due to current limitations of distributed technologies.

Being able to prioritize peers also enables nodes in the network to create and more easily maintain overlay networks to specific peers. A potential example of this could be a webrtc overlay. Assuming nodes in the network supported a signaling spec, as webrtc nodes became aware of other nodes, they could create an overlay network with a subset of those nodes and signify them as priority peers, similar to how Gossipsub overlays are constructed. This could potentially improve the ability of nodes to query unconnected nodes, without relying on peers being initially connected to the same signaling server.

Ideally, both peers would agree to this priority connection and avoid disconnecting from one another. If only one peer marks the other as a priority peer, this can lead to disconnects and immediate redials to that peer, which would be unnecessarily taxing for both nodes. This could be especially aggravating for the receiving node if they are at their high watermark for connections.

Configuration

As the Peer Store (PeerBook in JS) is the central location of Peer data, it makes sense for it to house the metadata marking the Peer as priority. It may be useful to prioritize specific multiaddrs instead of the peers themselves, but as multiaddrs can change over time, via protocols like AutoNAT and AutoRelay, an initial implementation of just tagging the Peer should be more dependable.

Similar to how Bootstrap peers are currently configured today, priority peers would be configured via their multiaddr. Peers that were previously in the Bootstrap list would be removed from there, and added to the priority configuration, as those peers will aslo have connections established.

Configuration Options

Here are some potential configuration options

Via the config:

new Libp2p({
  ...,
  config: {
    peers: {
      '/ip4/xxx.xx.xx.x/tcp/4001/QmPreload': {
        tags: ['Priority']
      }
    }
  }
})

Via methods

const libp2p = new Libp2p({ ... })
libp2p.peerBook.tagPeer('/ip4/xxx.xx.xx.x/tcp/4001/QmPreload', libp2p.peerBook.TAGS.Priority)

Updates

Update PeerBook to support adding tags to peers
Update Connection Manager to check for priority tags before disconnecting, or to exclude those peers entirely from tracking
Update libp2p to dial the priority peers on startup
Update libp2p to listen to the disconnect for those peers and reconnect should it happen
- The reconnect should have a small random backoff built in, to avoid mass redials if many peers are disconnected from the same priority peer

Additional Thoughts

It may be valuable to make this a standalone service module that takes a libp2p instance. This would avoid the need to add specific functionality for this to libp2p itself, and would make it easier for other developers to build similar modules that leverage tagging.

const PriorityPeerService = require('libp2p-priority-peer-service')
const libp2p = new Libp2p({ 
  modules: {
    services: [ PriorityPeerService ]
  }
})
libp2p.peerBook.tagPeer('/ip4/xxx.xx.xx.x/tcp/4001/QmPreload', PriorityPeerService.TAGS.Priority)

dirkmc · 2019-06-12T13:34:29Z

This sounds like a good improvement to connection management 👍

Ideally connections would be prioritized by the service the remote peer provides, and there would be a mechanism to discover which peers provide a particular service, so that the configuration doesn't need to be hard-coded (eg discovery could be through bootstrap nodes or a rendezvous service).

A less flexible but simpler approach would be to configure a number of candidate peers that provide a particular service, as a means of providing some redundancy and load balancing.

Do we want to maintain permanent connections to bootstrap nodes? It may be more fault tolerant and put less stress on those nodes to prioritize them as "Hi-Lo" - high when there are few connected peers and low once a reliable mesh of connections has formed with other peers.

jacobheun · 2019-06-12T13:53:53Z

Do we want to maintain permanent connections to bootstrap nodes? It may be more fault tolerant and put less stress on those nodes to prioritize them as "Hi-Lo" - high when there are few connected peers and low once a reliable mesh of connections has formed with other peers.

Yeah, we really don't want to stay connected to them unless we're below our min peers watermark, as they're primarily just an entry point workaround to join the network. We currently have this behavior for all discovered peers with auto dial. If we're above the min peers watermark we stop auto dialing, but below it we do.

I think the progression of a node would ideally look something like:

Bootstrap to the network (hopefully migrating to being more distributed in the future)
Actively discover peers that use our protocols (via rendezvous or other means)
Continue until we have at least minPeers connections to those peers (other nodes would be excluded from this count, such as the bootstrap nodes)
Create n+1 overlay networks with those peers, depending on the needs and quantities of those protocols
Prioritize all overlay network connections
Switch to passive discovery? (we probably don't need to keep crawling at this point)

Different node types may end up needing to have different behavior for this, but I think it accounts for a typical node.

I think in general if we fall below the min peers watermark, we should be going through our list of known peers to connect and actively find more peers, so bootstrap nodes wouldn't really need to be tagged. We really don't need the bootstrap module at all, as we should just be pulling from our Peer Store when we have too few connected peers. Instead of configuring the list of them when creating libp2p, we could/should just be adding them to the Peer Store.

vasco-santos · 2019-06-18T10:29:47Z

@jacobheun thanks for putting this together!

I like your proposal and I think that this is definitely the way to go 🚀

I would introduce a keepAlive tag for keeping the connection, and the priority would be used for the priority to start a connection and for pruning. Let me know what you think?

This suggestion also makes me think if that should be an array of tags, or properties. I was more thinking on tags for visualization and debugging purposes, but we can also use them in this use case.

Also agree with the "Hi-Lo" reasoning for the bootstrap peers!

achingbrain · 2022-06-21T17:06:28Z

I'm looking at implementing this piece of functionality, "keep-alive" sounds good, but I think "priority" needs a bit more to it, in terms of connection pruning at least.

If you have two components that mark the same peer as "priority" but then one finds a better peer and cleans up after itself, removing the "priority" tag, the other component will get rug-pulled.

Instead tags could be specific to the tagger, for example "preload" for IPFS, "dht-peer" for libp2p-kad-dht, "topic-mesh-node" for libp2p-gossipsub, etc, to ensure stable connections to high value peers.

So if we have tags with a name and a value, then for connection pruning we might just sum up the value of all tags a peer has, use that to order the connections, and prune the low value connections first.

Some tags like "keep-alive" might have "special" meanings like "best-effort reconnect after disconnect".

libp2p.peerBook.tagPeer(PeerId('QmPreload'), 'keep-alive', {
    value: 100,
    ttl: 60000 // optional, expire tag in 1m
}) => Promise<void>

libp2p.peerBook.removeTag(PeerId('QmPreload'), 'keep-alive') => Promise<void>

libp2p.peerBook.getTags(PeerId('QmPreload')) => Promise<[{
  name: 'keep-alive',
  value: 100
}]>

These could get configured at startup:

new Libp2p({
  // ...
  peerStore: {
    peers: {
      'QmPreload': {
        tags: {
          'keep-alive': { value: 100 },
          'preload', { value: 50 }
        }
      }
    }
  }
})

We might configure bootstrap nodes as keep-alive with a ttl for the first 10 minutes of running a node, for example, after that they become eligible for pruning if we have hit our max connections (e.g. they've done their job).

New connections might be protected for a few minutes so they can't get culled before identify has completed and any interested topologies have tagged the peer connections as valuable.

Allow tagging peers to better prioritise which connections to kill when hitting limits. Also for keeping "priority" connections alive. Refs: libp2p/js-libp2p#369

Allows tagging peers to mark some important or ones we should keep connections open to, etc. Depends on: - [ ] libp2p/js-libp2p-interfaces#255 Refs: libp2p/js-libp2p#369

Allow tagging peers to better prioritise which connections to kill when hitting limits. Also for keeping "priority" connections alive. Refs: libp2p/js-libp2p#369

Allows tagging peers to mark some important or ones we should keep connections open to, etc. Refs: libp2p/js-libp2p#369

BigLep · 2022-09-10T00:30:02Z

@achingbrain : I know there has been work here since your last comment. A few things:

Where did we land?
What is remaining on the original issue?
Does a js-libp2p node now have ALLOW-list functionality so it can ignore the world except for some designated peers? Basically can this be used as a an eclipse attack prevention mechanism like go-libp2p added here: Defend against eclipse attacks with ALLOW-list support go-libp2p-resource-manager#29

BigLep · 2022-09-13T15:46:56Z

2022-09-13 triage conversation: we need to summarize where we got to and discuss what can now be done. We believe we provided the functionality originally outlined and now it's about leveraging it i other areas like gossipsub. That will likely translate to a new issue in gossipsub.

achingbrain · 2022-12-06T18:11:07Z

The final piece here is for interested modules to tag peers they need to keep connections open to.

@libp2p/bootstrap fix!: only discover bootstrap peers once and tag them on discovery js-libp2p-bootstrap#142
@libp2p/kad-dht feat: tag kad-close peers js-libp2p-kad-dht#375
@chainsafe/libp2p-gossipsub Tag mesh peers on graft, remove tags on prune ChainSafe/js-libp2p-gossipsub#380

dapplion · 2023-01-18T02:23:56Z

I have some comments to tagging with an integer score:

For existing PRs the chosen values appear quasi random, since it's very hard to quantify this concept
The value of tags between protocols is arbitrary, currently every protocol having a factor of 1. For eth2 gossipsub, a peer may be grafted 74 times, giving it a tag value of 7400 completely over-running the scores of every other protocol.

This tagging system appears to me as a complicated scoring scheme that has not been properly researched and could have unintended practical and security considerations.

From reading this post a couple times seems that must goals could be achieved without an integer tag value, and instead just expressing a keep alive status like

enum KeepAlive {
    /// If nothing new happens, the connection can be closed at the given `Instant`.
    Until(Instant),
    /// Keep the connection alive.
    Yes,
    /// Close the connection if needed.
    No,
}

Even then, all this decisions are extremely opinionated where you are forcing specific paradigms to libp2p consumers.

Currently lodestar is fighting libp2p features more than necessary due to their opinionated nature. For example, the connection manager should never be deciding what peers to disconnect on the first place, but instead just enforce some limits set by the user. i.e. maxConnections should not mean "disconnect a random peer if above this limit", but instead mean "prevent new connections from being created if at that limit.

If some consumer like IPFS browser wants the default peer manager strategy, then it should buy into it instead of being there by default. A peer manager can be plugged into libp2p easily using the existing APIs. Then multiple peer manager strategies can be developed and shipped as modular components

achingbrain · 2024-04-24T17:37:50Z

Closing as this is now complete.

maxConnections should not mean "disconnect a random peer if above this limit", but instead mean "prevent new connections from being created if at that limit.

It's worth noting that maxConnections does indeed prevent new incoming connections from being created (exceptions are made for whitelisted peers/networks). libp2p doesn't disconnect random peers, the whole point here is to be able to apply heuristics so that the disconnected peers are not randomly chosen.

At any rate, the feature has been implemented in a way that consumers such as Lodestar can opt-out and maintain their own peer ranking system separate to the libp2p connection manager.

jacobheun added kind/enhancement A net-new feature or improvement to an existing feature exp/expert Having worked on the specific codebase is important status/ready Ready to be worked P1 High: Likely tackled by core team if no one steps up labels Jun 10, 2019

jacobheun self-assigned this Jun 10, 2019

jacobheun mentioned this issue Jul 22, 2019

Peer Friends ipfs/js-ipfs#2288

Closed

3 tasks

alanshaw mentioned this issue Oct 22, 2019

refactor: enable DHT by default ipfs/js-ipfs#1994

Closed

10 tasks

vasco-santos mentioned this issue Dec 30, 2020

Connection Manager Overhaul #744

Closed

vishalchangrani mentioned this issue Jun 14, 2021

fix for 5575 - occasional stream reset onflow/flow-go#832

Merged

achingbrain added a commit to libp2p/js-libp2p-interfaces that referenced this issue Jun 22, 2022

feat: add peer tagging

95ce5df

Allow tagging peers to better prioritise which connections to kill when hitting limits. Also for keeping "priority" connections alive. Refs: libp2p/js-libp2p#369

achingbrain added a commit to libp2p/js-libp2p-interfaces that referenced this issue Jun 22, 2022

feat: add peer tagging

9384468

Allow tagging peers to better prioritise which connections to kill when hitting limits. Also for keeping "priority" connections alive. Refs: libp2p/js-libp2p#369

achingbrain mentioned this issue Jun 22, 2022

feat!: add peer tagging libp2p/js-libp2p-interfaces#255

Merged

achingbrain mentioned this issue Jun 22, 2022

feat: add peer tagging libp2p/js-libp2p-peer-store#12

Merged

1 task

achingbrain added a commit to libp2p/js-libp2p-peer-store that referenced this issue Jun 24, 2022

feat: add peer tagging (#12)

c360e41

Allows tagging peers to mark some important or ones we should keep connections open to, etc. Refs: libp2p/js-libp2p#369

BigLep assigned achingbrain and unassigned jacobheun Sep 13, 2022

tinytb moved this to In Progress in js-libp2p Oct 11, 2022

tinytb added this to js-libp2p Oct 11, 2022

achingbrain mentioned this issue Dec 6, 2022

Tag mesh peers on graft, remove tags on prune ChainSafe/js-libp2p-gossipsub#380

Closed

maschad mentioned this issue Jan 10, 2023

Tag relevant peer ChainSafe/lodestar#4997

Merged

p-shahi removed this from js-libp2p Jan 10, 2023

maschad mentioned this issue Jan 23, 2023

feat: graft/prune events and mesh peer tagging ChainSafe/js-libp2p-gossipsub#383

Merged

achingbrain closed this as completed Apr 24, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support for configuring priority peers (connection tagging) #369

Add support for configuring priority peers (connection tagging) #369

jacobheun commented Jun 10, 2019

jacobheun commented Jun 12, 2019

dirkmc commented Jun 12, 2019

jacobheun commented Jun 12, 2019

vasco-santos commented Jun 18, 2019

achingbrain commented Jun 21, 2022

BigLep commented Sep 10, 2022

BigLep commented Sep 13, 2022

achingbrain commented Dec 6, 2022 •

edited

Loading

dapplion commented Jan 18, 2023 •

edited

Loading

achingbrain commented Apr 24, 2024

Add support for configuring priority peers (connection tagging) #369

Add support for configuring priority peers (connection tagging) #369

Comments

jacobheun commented Jun 10, 2019

jacobheun commented Jun 12, 2019

Peer Management

Prioritizing Peers

Configuration

Configuration Options

Updates

Additional Thoughts

dirkmc commented Jun 12, 2019

jacobheun commented Jun 12, 2019

vasco-santos commented Jun 18, 2019

achingbrain commented Jun 21, 2022

BigLep commented Sep 10, 2022

BigLep commented Sep 13, 2022

achingbrain commented Dec 6, 2022 • edited Loading

dapplion commented Jan 18, 2023 • edited Loading

achingbrain commented Apr 24, 2024

achingbrain commented Dec 6, 2022 •

edited

Loading

dapplion commented Jan 18, 2023 •

edited

Loading