fix: protect useful peers in low buckets #634
Conversation
Force-pushed from 8f6ac08 to 7c89d12.
query.go (outdated)
// Restrict to buckets 0, 1 (75% of requests, max 40 peers), so we don't
// protect _too_ many peers.
commonPrefixLen := kb.CommonPrefixLen(q.dht.selfKey, kb.ConvertPeerID(p))
if commonPrefixLen < 2 {
nit: magic value.
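For readers puzzling over the numbers in the comment above, here is a self-contained sketch (not the library code: in go-libp2p-kbucket, ConvertPeerID hashes the peer ID with SHA-256 and CommonPrefixLen operates on the hashed keys) of how the CPL is computed, and why buckets 0 and 1 cover roughly 75% of uniformly random lookup targets while holding at most 2 × K = 40 peers with the default bucket size K = 20. The peer IDs are made up for illustration.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"math/bits"
)

// commonPrefixLen counts the shared leading bits of two 256-bit Kademlia keys.
// A peer whose key shares c leading bits with ours lands in bucket c
// (capped at the last bucket).
func commonPrefixLen(a, b [32]byte) int {
	for i := range a {
		if x := a[i] ^ b[i]; x != 0 {
			return i*8 + bits.LeadingZeros8(x)
		}
	}
	return len(a) * 8
}

func main() {
	self := sha256.Sum256([]byte("QmSelf-hypothetical"))
	other := sha256.Sum256([]byte("QmOther-hypothetical"))
	fmt.Println("cpl:", commonPrefixLen(self, other))

	// A uniformly random target has CPL exactly c with probability 2^-(c+1),
	// so buckets 0 and 1 together absorb 1/2 + 1/4 = 75% of lookups.
	frac := 0.5 + 0.25
	fmt.Println(frac, "of requests; at most", 2*20, "peers protected (K=20)")
}
```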
dht.go (outdated)
commonPrefixLen := kb.CommonPrefixLen(self, kb.ConvertPeerID(p))
cmgr.TagPeer(p, "kbucket", BaseConnMgrScore+commonPrefixLen)
commonPrefixLen := kb.CommonPrefixLen(dht.selfKey, kb.ConvertPeerID(p))
cmgr.TagPeer(p, kbucketTag, BaseConnMgrScore+commonPrefixLen)
Do we give a higher score to peers with a higher CPL so that the connection manager doesn't select them as candidates if it has a choice?
Basically, we will protect the "furthest peers" but try to give a good score to the closer peers so that we de-prioritise them for pruning.
If yes, can we please document this?
Yeah, basically. It's fuzzy and not the best solution, but it ensures that we:
- Never cut connections to the most useful peers.
- Try to keep connections to our nearest neighbors.
I'll add some documentation.
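A toy illustration of the effect described above (this is not the real connection manager, whose trimming also weighs other tags and grace periods, and BaseConnMgrScore's actual value is assumed here): scoring peers as BaseConnMgrScore + CPL means that when connections must be trimmed, low-CPL (distant, easy-to-replace) peers sort first as pruning candidates, while our nearest neighbors are kept longest.

```go
package main

import (
	"fmt"
	"sort"
)

const baseConnMgrScore = 5 // assumed value, for illustration only

type peer struct {
	id    string
	score int
}

// pruneOrder sorts peers ascending by score: the order in which a simple
// connection manager would consider them for disconnection.
func pruneOrder(peers []peer) []peer {
	out := append([]peer(nil), peers...)
	sort.Slice(out, func(i, j int) bool { return out[i].score < out[j].score })
	return out
}

func main() {
	// Score each peer as baseConnMgrScore + CPL, as in the diff above.
	peers := []peer{
		{"near-peer-cpl15", baseConnMgrScore + 15},
		{"far-peer-cpl0", baseConnMgrScore + 0},
		{"mid-peer-cpl4", baseConnMgrScore + 4},
	}
	for _, p := range pruneOrder(peers) {
		fmt.Println(p.id, p.score)
	}
	// far-peer-cpl0 prints first (trimmed first); near-peer-cpl15 survives longest.
}
```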
Q for @aarshkshah1992 and @aschmahmann:
For context, I haven't seen this make any difference on a bootstrapper. We're still churning through nodes in the first two buckets every few hours. Honestly, I think this is just because we effectively pick peers at random. We need a better peer-picking algorithm to replace old peers with better peers, ideally ones that seem to be sticking around.

I take this back. After a day to stabilize, it looks like the churn is dying down a bit.
cmgr.Protect(p, dhtUsefulTag)
} else {
	cmgr.TagPeer(p, dhtUsefulTag, usefulConnMgrScore)
}
@Stebalien The TagPeer call is idempotent and overwrites the existing score for that tag.
How about adding to the current score here so we "reward" peers that are useful to us often?
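A sketch of the semantic difference being proposed, using a toy tag map rather than the real ConnManager (go-libp2p's ConnManager interface also exposes an UpsertTag method that takes an update function, which would fit the additive variant; whether to use it is exactly what's being debated here):

```go
package main

import "fmt"

// tagStore mimics a connection manager's per-peer tag scores.
type tagStore map[string]int

// tagPeer overwrites the score for a tag (TagPeer-like: idempotent).
func (t tagStore) tagPeer(tag string, score int) { t[tag] = score }

// upsertTag adds to the existing score, rewarding repeated usefulness.
func (t tagStore) upsertTag(tag string, delta int) { t[tag] += delta }

func main() {
	overwrite, additive := tagStore{}, tagStore{}
	for i := 0; i < 3; i++ { // the peer proves useful three times
		overwrite.tagPeer("dht-useful", 5)
		additive.upsertTag("dht-useful", 5)
	}
	fmt.Println(overwrite["dht-useful"], additive["dht-useful"]) // 5 15
}
```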
We could probably do that if we used decaying tags, as pubsub is planning to (libp2p/go-libp2p-pubsub#306). However, I don't think it's currently worth the complexity, especially if we take into account the various ways the connection manager can be gamed to promote one peer over another.
I don't want to give them too much weight, I just want to make sure that they're marked useful.
How about protecting useful peers, or maybe ALL peers, above a certain CPL threshold? I am not sure of the benefit of protecting useful peers in the low buckets, as it's easy to discover those. On the other hand, if I get a peer with a CPL of 16, I'd protect it with my life.

I am not sure this is a great idea, because it's "easy" to discover peers for the low buckets (there are many of them), and this can create more problems than the benefit it gives us when the connection manager is constantly hitting its limits.
@aarshkshah1992 this PR is a stab at improving #566. The major point here is that since it's valid to have peers in our routing table that we're not connected to, we only really need to keep connections alive to peers we are using frequently.

While it initially seems like protecting peers with high CPLs is in your best interest, it's not actually that helpful, since you query those peers a very small number of times per refresh interval (i.e. once or less).
Yes, they are easy to discover and replace. However, we use them frequently (i.e. 50% of the time a user does a query it will use the peers in bucket 0), and therefore being able to reuse an existing connection instead of forming a new libp2p connection (TCP, multiplexer, encryption, ...) could save us a bunch of round trips and essentially shave a hop off of our lookups.

Before we dissociated connection state from routing-table membership, we needed to ensure we were connected to our closest peers to avoid harming the network (i.e. peers far from us are useful to us, but peers close to us are useful for the network). However, we are no longer under that obligation.

Note: if we decide to sync state between nearby DHT peers, then maintaining active connections to our closest peers will become important again.
I want to be careful about protecting too many peers. That will completely exclude these peers from the connection manager.
That way, we never disconnect from the core of the network. We don't protect our closest peers as:
- We'll reconnect to them frequently when we query for ourself.
- We use the core nodes most frequently.
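The resulting policy can be restated as a short sketch. The names (usefulConnMgrScore, the "dht-useful" tag) and the cpl < 2 threshold are taken from the diff hunks in this PR; the value of usefulConnMgrScore and the surrounding wiring are invented for illustration:

```go
package main

import "fmt"

const (
	usefulConnMgrScore = 20 // assumed value, for illustration only
	protectedCPLMax    = 2  // buckets 0 and 1, per the diff
)

type action string

// markUseful captures the PR's rule: a useful peer in the two lowest buckets
// gets hard protection (exempt from trimming entirely); a useful peer in a
// higher bucket only gets a score bump, so the connection manager may still
// prune it under pressure.
func markUseful(cpl int) action {
	if cpl < protectedCPLMax {
		return "Protect(dht-useful)"
	}
	return action(fmt.Sprintf("TagPeer(dht-useful, %d)", usefulConnMgrScore))
}

func main() {
	for _, cpl := range []int{0, 1, 2, 15} {
		fmt.Printf("cpl=%d -> %s\n", cpl, markUseful(cpl))
	}
}
```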
And tag useful peers in higher buckets
Force-pushed from a79d0dd to 6c1d38b.
Bump @aarshkshah1992.
Note: Protected peers do not count towards the limit. They're simply exempt.
Let me try to reason through this, as it's all fuzzy in my head:

This is good. However, we do not give any protection to our closest peers (even though we do give them a high score). If an attacker wanted to eclipse a peer from the network, it could still flush that peer out of the routing tables of its closest peers. I think we should offer protection to useful peers above a certain CPL threshold (or come up with another algorithm to protect those closest to us). I agree with @aschmahmann that this isn't of much help to us query-wise, but if everyone does this, it is mutually beneficial in terms of NOT getting eclipsed.
If peer X is in bucket 15, how is the attacker going to flush it out, and if it was, how would having the connection be protected prevent it from being flushed out? In my understanding, the only way for the peer to get flushed out of the routing table is if it's deemed "no longer useful", which requires a malicious peer to make its way into our high-CPL bucket and to respond much faster than other peers (e.g. lower latency, more "valuable" connections so that the good connections are pruned, which makes them slower, etc.).

However, since this attack is predicated on the attacker being able to generate high-CPL keys, they could just generate even higher-CPL keys and take over our high-CPL buckets that way (although it may be more expensive, since they have to respond to queries occasionally). Additionally, even with protecting the connections, they will die/churn over time, allowing the attacker to slowly work their way into the table.

It's only K extra peers to protect, so in some sense why not do it, but I don't think this buys us much from an eclipse-protection standpoint.
Create a lot of connections to the peer -> make its connection manager disconnect non-protected peers, which means peers with CPL > 1 -> this will hurt the usefulness of those peers, as they won't be able to reply fast enough because of having to pay the connection cost. At the same time, generate peer IDs with those higher CPLs and connect to the peer so that you become useful for those CPLs.

But you are right, this is expensive compared to attacking the lower CPLs because of the compute power involved. And, to eclipse a peer, you'd have to run this attack on all its closest peers. We can think about this/do this later.
We can still eclipse a peer by flushing it out from the routing tables of its closest peers. But that's harder to do because:
i. You'd have to generate peer IDs that hash to higher CPLs.
ii. You'd have to do this for all of the peer's closest peers.
We can worry about this later.
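To put a rough number on "harder": against a fixed target key, a freshly generated random peer ID shares at least c leading bits with probability 2^-c, so an attacker needs on the order of 2^c key generations per planted peer. A back-of-the-envelope sketch (ignoring key-generation cost and the need to actually answer queries):

```go
package main

import "fmt"

// expectedAttempts returns the expected number of uniformly random keys an
// attacker must generate before one shares at least c leading bits with a
// fixed target: each attempt succeeds with probability 2^-c.
func expectedAttempts(c uint) uint64 {
	return 1 << c
}

func main() {
	for _, c := range []uint{1, 8, 16, 24} {
		fmt.Printf("CPL >= %2d: ~%d random peer IDs\n", c, expectedAttempts(c))
	}
	// Eclipsing a peer also means repeating this against every one of its
	// closest peers' routing tables, multiplying the work further.
}
```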
Note: you could eclipse a peer that way, but it would be harder to split the peer off from the network.