Reprovide Sweep #824
+1 to reproviding in a sweep. A number of the comments below are somewhat kubo-centric, although that's mostly to match comments raised in the original post.
I think you are misunderstanding what Bitswap broadcast is for and how this could impact it. The main things it provides are:
This proposal only reduces 1 of the 4 ways Bitswap broadcast is used. It's still a good proposal to do, but landing it alone will not make it feasible to remove Bitswap broadcast in kubo. Some notes from work on the FullRT client, which already implements this sweeping behavior (albeit by taking shortcuts, like doing the providing externally and doing all the reprovides together, mostly because this proposal, while more correct, needed more time than was available then).
Thank you @aschmahmann for the valuable feedback! I agree concerning the Bitswap broadcast. I think once better Content Routers (DHT, IPNI) are widely used, the broadcast can be drastically reduced, even if it is still lightly used for reasons 1-3 you mentioned. Concerning 2., I plan on keeping track of the republish time of groups of Provider Records. If the node turns back on after a period offline and the time to reprovide some groups of Provider Records has already passed, the node will republish all Provider Records to catch up with the sweep.

I agree that having a (re)provider distinct from the client and server makes it more modular and is generally good practice. Its interface could contain at least the following functions: […] the Kademlia Identifiers associated with the Provider Records (essentially […]).

I am not very familiar with block/disk corruption/deletion, so suggestions are welcome. To address corruption and keep consistency between the blocks kubo is providing and the Provider Records being advertised, kubo could periodically call […]
Yeah, there are any number of ways to deal with corruption. For example, you could require a database with transactions you can commit/rollback, you could add that kind of functionality on top of an arbitrary kv store (https://github.com/ipfs/go-ipfs-pinner does this kind of thing), you could do corruption checks, etc. Mostly I just wanted to make sure it was on your radar so you don't end up with hard-to-diagnose problems down the road. To some extent, relying on a database to do the work for you is the easiest, but it requires all your downstream consumers to be willing to provide a database that does that, which they may or may not be willing to do. Could be worth pushing on if the other approaches feel too expensive in computation or development time. I'll leave my comments on changing the […]
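One of the corruption-check approaches mentioned above could look roughly like the following sketch: a reprovider periodically reconciles the keys actually present in the blockstore against the keys it believes it is advertising. All names here (`reconcile` and its inputs) are hypothetical illustrations, not kubo's actual API.

```go
package main

import "fmt"

// reconcile diffs the keys present in the blockstore against the keys the
// reprovider is advertising, and reports what needs repairing: keys stored
// locally but not advertised should start being provided, and keys advertised
// but missing locally (deleted or corrupted) should stop being provided.
func reconcile(stored, advertised map[string]bool) (toProvide, toStop []string) {
	for k := range stored {
		if !advertised[k] {
			toProvide = append(toProvide, k) // on disk but not advertised
		}
	}
	for k := range advertised {
		if !stored[k] {
			toStop = append(toStop, k) // advertised but no longer on disk
		}
	}
	return toProvide, toStop
}

func main() {
	stored := map[string]bool{"cid-a": true, "cid-b": true}
	advertised := map[string]bool{"cid-b": true, "cid-c": true}
	provide, stop := reconcile(stored, advertised)
	fmt.Println("start providing:", provide) // cid-a
	fmt.Println("stop providing:", stop)     // cid-c
}
```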
Context
Currently the Reprovide operation is triggered by Kubo for each Provider Record: Kubo periodically (every 22h) republishes Provider Records using the go-libp2p-kad-dht `Provide` method (go-libp2p-kad-dht/routing.go, line 373 at b95bba8).
The DHT `Provide` method consists of performing a lookup request to find the 20 closest peers to the CID, opening a connection to these peers, and allocating the Provider Record to them.
The problem
This means that for every Provider Record that a node is advertising, 1 lookup request needs to be performed every 22h, and 20 connections need to be opened. This may seem fine for small providers, but it is terrible for large Content Providers. The Reprovide operation is certainly the reason most large Content Providers don't use the DHT, and IPFS is forced to keep the infamous Bitswap broadcast. Improving the Reprovide operation would allow large Content Providers to advertise their content to the DHT. Once most of the content is published on the DHT, Bitswap broadcast can be significantly reduced. This is expected to significantly cut the cost of hosting content on IPFS, because peers in the network will no longer get spammed with requests for CIDs they don't host.
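The per-cycle load described above can be made concrete with a quick model (the function name and the example CID count are illustrative; the replication factor of 20 and the 22h interval are from the text above):

```go
package main

import "fmt"

const replicationFactor = 20 // records are stored on the 20 closest DHT servers

// naiveReprovideCost models the cost of the current per-record scheme over one
// 22h reprovide cycle: one DHT lookup and replicationFactor connections per
// Provider Record.
func naiveReprovideCost(numCIDs int) (lookups, connections int) {
	return numCIDs, numCIDs * replicationFactor
}

func main() {
	lookups, conns := naiveReprovideCost(100_000)
	fmt.Printf("per 22h cycle: %d lookups, %d connections\n", lookups, conns)
}
```

For a provider with 100K CIDs this is 100,000 lookups and 2,000,000 connection openings every cycle, which is the load the sweep aims to collapse.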
Solution overview
By the pigeonhole principle, if a Content Provider is providing $x$ CIDs, with $x \geq \frac{\#DhtServers}{repl}$, then multiple Provider Records are allocated on the same DHT Servers. The optimization consists of reproviding all Provider Records allocated to the same DHT Servers at once, sparing expensive DHT lookups and connection openings.

Without entering too much into details, all Provider Records are grouped by XOR proximity in the keyspace. All Provider Records in a group are allocated to the same set of DHT Servers. Periodically, the Content Provider sweeps the keyspace from left to right and reprovides the Provider Records corresponding to the visited keyspace region.
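The grouping step can be sketched as follows. Everything here is an illustrative simplification, not go-libp2p-kad-dht code: real region boundaries would adapt to the network size, whereas this toy version buckets Kademlia identifiers (SHA-256 of the key) by a fixed number of leading bits, so keys close in XOR distance land in the same bucket and can be reprovided together.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// groupByPrefix buckets keys by the leading prefixBits bits of their Kademlia
// identifier. Keys in the same bucket are close in XOR distance, so their
// Provider Records are held by (roughly) the same set of DHT servers and can
// be reprovided in one pass over that keyspace region.
func groupByPrefix(keys []string, prefixBits uint) map[byte][]string {
	groups := make(map[byte][]string)
	for _, k := range keys {
		id := sha256.Sum256([]byte(k)) // Kademlia identifier of the key
		bucket := id[0] >> (8 - prefixBits)
		groups[bucket] = append(groups[bucket], k)
	}
	return groups
}

func main() {
	keys := []string{"cid-1", "cid-2", "cid-3", "cid-4"}
	for bucket, members := range groupByPrefix(keys, 2) {
		fmt.Printf("region %02b: %v\n", bucket, members)
	}
}
```

A sweep then walks the buckets in order ("from left to right" in the keyspace), doing one lookup per region instead of one per key.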
For a Content Provider providing 100K CIDs and 25K DHT Servers, the expected improvement is ~80x. More details can be found in the WIP Notion document.
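Assuming the improvement is dominated by the number of DHT lookups, the ~80x figure can be sanity-checked: the sweep needs roughly one lookup per keyspace region, and there are about $\frac{\#DhtServers}{repl}$ regions, since each region spans the ~repl servers closest to it (the function name below is illustrative):

```go
package main

import "fmt"

// sweepSpeedup is a back-of-the-envelope check of the expected improvement:
// the naive scheme does one lookup per CID, while the sweep does roughly one
// lookup per keyspace region, with about numServers/repl regions in total.
func sweepSpeedup(numCIDs, numServers, repl int) int {
	regions := numServers / repl // 25_000 / 20 = 1_250 lookups per sweep
	return numCIDs / regions     // naive lookups / sweep lookups
}

func main() {
	// 100K CIDs, 25K DHT servers, replication factor 20 → 100_000 / 1_250 = 80
	fmt.Printf("expected improvement: ~%dx\n", sweepSpeedup(100_000, 25_000, 20))
}
```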
How to implement it
The Reprovide operation responsibility should be transferred from Kubo to the DHT implementation. This is generally desirable because different Content Routers may have different reprovide logic that kubo is unaware of, or cannot optimize for.
go-libp2p-kad-dht (and other Content Routers) should expose `StartProviding(CID)` and `StopProviding(CID)` methods instead of the `Provide(CID)` method. Kubo then only needs to tell the DHT which CIDs should be provided or not. A lot of refactoring needs to happen around go-libp2p-routing-helpers.