
Multiple DHTs #780

Closed · jacobheun opened this issue Jan 24, 2020 · 4 comments
Labels: Epic, kind/enhancement (a net-new feature or improvement to an existing feature)

Comments

@jacobheun (Contributor)

Design notes

We need to explore how the DHT fixes interact with the old DHT code and whether we want to fork the DHT. If we do fork the DHT, we have a few options:

  • Only support the new DHT. This will split the network.
  • Join both DHTs.
  • Join the old DHT as a client and the new DHT as a server.

If we decide to run two DHTs, we have a few options when querying.

  • Query both DHTs in parallel.
  • Query the new DHT first and assume that it will complete quickly.
  • Start querying the first DHT and then start querying the second after a delay.

The deciding factor will likely be limitations around how many parallel dials we can have. That is, querying both DHTs in parallel might slow down querying the new DHT because dials to the new DHT may have to wait on dials to the old DHT.

The same set of options apply to publishing content.

BONUS: We've frequently discussed running multiple DHTs (one for content routing, one for peer routing, one for IPNS, etc.). If we start down the path of running multiple DHTs, it's a short hop to one-dht-per-type.

  • We should consider this if we have time.
  • We can probably ensure that our DHTs have similar routing tables (i.e., that they need connections to the same peers) by bootstrapping the least common DHT first, then the next most common, and so on. (This assumes that most members of the least common DHT are also members of the next most common DHT.)
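The bootstrap ordering in the last bullet amounts to sorting DHTs by estimated membership, ascending. A minimal sketch, where `dhtInfo` and the member counts are hypothetical:

```go
package main

import (
	"fmt"
	"sort"
)

// dhtInfo pairs a (hypothetical) per-type DHT with an estimate of how
// many network members participate in it.
type dhtInfo struct {
	name    string
	members int
}

// bootstrapOrder sorts DHTs from least common to most common, the order
// suggested above: connections made while bootstrapping a rarer DHT can
// then be reused when bootstrapping the more common ones.
func bootstrapOrder(dhts []dhtInfo) []string {
	sort.Slice(dhts, func(i, j int) bool { return dhts[i].members < dhts[j].members })
	order := make([]string, len(dhts))
	for i, d := range dhts {
		order[i] = d.name
	}
	return order
}

func main() {
	dhts := []dhtInfo{
		{"content-routing", 9000},
		{"ipns", 3000},
		{"peer-routing", 9500},
	}
	fmt.Println(bootstrapOrder(dhts)) // prints [ipns content-routing peer-routing]
}
```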

Testing mechanics

We'll need to test each set of options and see:

  1. How each affects query time.
  2. How many extra dials we perform by running multiple DHTs.
  3. How the network performs if many nodes are still running the old DHT and we stick with a single DHT.

Success Criteria

  • We are able to launch go-ipfs 0.5.0 with a fast DHT from day 0 instead of having to wait for the network to upgrade.
@jacobheun jacobheun added the kind/enhancement A net-new feature or improvement to an existing feature label Jan 24, 2020
@jacobheun jacobheun added this to the Working Kademlia milestone Jan 24, 2020
@jacobheun jacobheun added the Epic label Jan 24, 2020
@Stebalien (Member) commented Feb 25, 2020

This has been resolved (we think). The plan is to run one DHT and have new nodes listen on the old protocol but only route requests to nodes in the new DHT.

How does this work? If at least 20% of the network is running the latest DHT version (we can boost this by adding nodes to the network at launch), then 99% of the time the closest 20 nodes (old & new) to a target key will include a new node. That means that in 99% of cases, a put/get from an old node will find a good node.

Note 1: It's actually even better than this. As more of the network upgrades, an old node's query becomes more likely to be funneled into the new DHT and stay there, because new nodes will never return peers from the old DHT.

Note 2: Old nodes backtrack indefinitely when finding content so they're pretty much guaranteed to find content published by new nodes regardless (because they'll keep trying until they do).

Note 3: If 44% of the network upgrades, we can get to a 99.999% success rate.
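The percentages above follow from a simple independence model: if an upgraded-node fraction f is spread uniformly through the keyspace, the chance that none of the closest k peers is upgraded is (1-f)^k. A quick check of the 20% and 44% figures (the model is mine; the numbers are from the comment):

```go
package main

import (
	"fmt"
	"math"
)

// pNewNodeInClosest returns the probability that at least one of the k
// closest peers to a key runs the new DHT, assuming upgraded nodes are
// an independent fraction `frac` of the network.
func pNewNodeInClosest(frac float64, k int) float64 {
	return 1 - math.Pow(1-frac, float64(k))
}

func main() {
	// 20% upgraded, closest 20 peers: 1 - 0.8^20 ≈ 0.988, i.e. ~99%.
	fmt.Printf("20%% upgraded: %.3f\n", pNewNodeInClosest(0.20, 20))
	// 44% upgraded: 1 - 0.56^20 ≈ 0.99999, the five-nines figure.
	fmt.Printf("44%% upgraded: %.5f\n", pNewNodeInClosest(0.44, 20))
}
```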

TODO:

  • Consider adding backtracking (on find) to new nodes.
    • nah
  • Make sure boosters scale.
    • Share a single datastore.
    • Share a single provider-record GC process. We're currently working around a thundering-herd issue by randomizing timers, but we should just have one provider-record GC process for the entire booster node.

@aarshkshah1992 (Contributor)

@Stebalien @aschmahmann

Note 2: Old nodes backtrack indefinitely when finding content so they're pretty much guaranteed to find content published by new nodes regardless (because they'll keep trying until they do).

What's the backtracking mechanism being described here, and why do we need it? Could you elaborate a bit?

@Stebalien (Member)

What's the backtracking mechanism being described here, and why do we need it?

Working through this, I think I may be incorrect. Backtracking is asking further-away nodes on the same query. This can be useful for:

  • Putting a record further away from a target node to "spread the load" (coral does this).
  • Routing around dead-ends. Sometimes, further nodes will have information about closer peers (e.g., if your DHT is crappy).

However, I'm not actually sure it will help with drift. With drift, we don't want to re-ask peers that we already contacted on the query; we want to ask peers directly adjacent to the closest 20 peers. To do that, we'd probably have to ask the highest and lowest peers in the K bucket for their neighbors and expand the search that way.
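That expansion idea, asking only the two extremes of the closest set for their neighbors, can be sketched abstractly. Everything here is hypothetical: peers are reduced to integer distances and `neighbors` stands in for a FIND_NODE-style RPC; this is not go-libp2p-kad-dht's API.

```go
package main

import (
	"fmt"
	"sort"
)

// expandClosest takes the current closest set (sorted by distance to the
// target key) and a lookup function, asks only the two extremes for their
// neighbors, and merges the answers back in, deduplicated and re-sorted.
func expandClosest(closest []int, neighbors func(int) []int) []int {
	if len(closest) == 0 {
		return closest
	}
	seen := map[int]bool{}
	for _, p := range closest {
		seen[p] = true
	}
	// Only the highest and lowest peers are queried, per the comment.
	for _, extreme := range []int{closest[0], closest[len(closest)-1]} {
		for _, n := range neighbors(extreme) {
			seen[n] = true
		}
	}
	out := make([]int, 0, len(seen))
	for p := range seen {
		out = append(out, p)
	}
	sort.Ints(out)
	return out
}

func main() {
	closest := []int{5, 10, 20}
	// Fake adjacency: each peer knows its immediate neighbors.
	neighbors := func(p int) []int { return []int{p - 1, p + 1} }
	fmt.Println(expandClosest(closest, neighbors)) // prints [4 5 6 10 19 20 21]
}
```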

Really, I don't think this is worth it. But, if we magically have time, we might want to consider it.

@aschmahmann (Collaborator) commented Mar 10, 2020

Closed by libp2p/go-libp2p-kad-dht#479
