neutrino: add support for side-header loading on initial sync #70

Open
3 tasks
Roasbeef opened this issue Jul 10, 2018 · 6 comments

Comments

@Roasbeef
Member

One way to speed up the initial sync for neutrino is to package a set of headers (both regular and filter headers) alongside the application that embeds neutrino. This would allow one to ship a set of (possibly compressed) headers that are written to disk on start up, before we start to fetch headers from the network.

Steps To Completion

  • Modify the initial constructor to add a new set of functional options for side loaded headers.

  • On start up, before syncing, if this is the initial block download (IBD), we should read these headers and write them directly to disk. In the suggested model, we skip verification altogether, as it's assumed that these headers are being fetched from a trusted source.

  • As a bonus, we can also compress the set of headers, and decompress them within neutrino. This may be useful for contexts such as mobile applications, where reducing the size of the apk is desirable. An example of a header-specific compression scheme we may want to look at is: https://github.com/petertodd/rust-bitcoin-headers#how-it-works. There are likely some additional optimizations on top of this that we can explore and later implement.

@rawtxapp
Contributor

This could be useful for more than just the initial sync.

For example, on iOS, while Apple doesn't allow arbitrary code execution, it does allow background downloads. So what we could do is, once or twice a day, download the latest filters in the background and sync with them when the app is opened.

I think the filter fetching logic might need to be reimplemented in Swift, but it's feasible. Or if the user has a trusted neutrino node with some sort of HTTP endpoint serving filters, it could potentially just download those.

@Chinwendu20
Contributor

Hello, how about this for a header compression scheme:
https://github.com/willcl-ark/compressed-block-headers/blob/v1.0/compressed-block-headers.adoc

@guggero
Member

guggero commented Apr 13, 2023

That sounds like an interesting optimization! Though I'm not sure if anyone has ever actually implemented or used that proposed format. Also I'm not sure if removing the previous hash is a good idea in the context of Neutrino as that would make it impossible to verify the headers in parallel (as you would strictly need to calculate them one-by-one in order). But perhaps that's a small tradeoff for the win in compression that we get from it.
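To make the ordering concern concrete, here is a simplified Go sketch. It is not the format from the linked proposal: it only drops the 32-byte previous-block hash from every header after the first, and all function names are made up for illustration. It relies on two facts about Bitcoin headers: a serialized header is 80 bytes with the previous-block hash at bytes 4 to 35, and a block hash is the double SHA-256 of the serialized header.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"errors"
	"fmt"
)

const (
	headerSize   = 80 // serialized Bitcoin block header
	prevHashOff  = 4  // the previous-block hash follows the 4-byte version
	hashSize     = 32
	strippedSize = headerSize - hashSize
)

// blockHash returns the double SHA-256 of a serialized header.
func blockHash(h []byte) [hashSize]byte {
	first := sha256.Sum256(h)
	return sha256.Sum256(first[:])
}

// stripPrevHashes keeps the first header intact and removes the previous-block
// hash from every later one. It assumes every input header is 80 bytes long.
func stripPrevHashes(headers [][]byte) []byte {
	var out []byte
	for i, h := range headers {
		if i == 0 {
			out = append(out, h...)
			continue
		}
		out = append(out, h[:prevHashOff]...)
		out = append(out, h[prevHashOff+hashSize:]...)
	}
	return out
}

// restorePrevHashes rebuilds the full headers. Each header needs the hash of
// the one before it, so the loop cannot be split across workers.
func restorePrevHashes(data []byte) ([][]byte, error) {
	if len(data) < headerSize || (len(data)-headerSize)%strippedSize != 0 {
		return nil, errors.New("malformed stripped header stream")
	}
	headers := [][]byte{append([]byte(nil), data[:headerSize]...)}
	for rest := data[headerSize:]; len(rest) > 0; rest = rest[strippedSize:] {
		prev := blockHash(headers[len(headers)-1])
		h := make([]byte, 0, headerSize)
		h = append(h, rest[:prevHashOff]...)
		h = append(h, prev[:]...)
		h = append(h, rest[prevHashOff:strippedSize]...)
		headers = append(headers, h)
	}
	return headers, nil
}

// fakeChain builds n linked placeholder headers for the demo. They are not
// valid Bitcoin headers, only correctly chained 80-byte blobs.
func fakeChain(n int) [][]byte {
	chain := make([][]byte, n)
	var prev [hashSize]byte
	for i := range chain {
		h := make([]byte, headerSize)
		h[0] = byte(i + 1) // vary the version byte so the headers differ
		copy(h[prevHashOff:], prev[:])
		chain[i] = h
		prev = blockHash(h)
	}
	return chain
}

func main() {
	chain := fakeChain(3)
	packed := stripPrevHashes(chain)
	restored, err := restorePrevHashes(packed)
	if err != nil {
		panic(err)
	}
	for i := range chain {
		if !bytes.Equal(chain[i], restored[i]) {
			panic("round trip mismatch")
		}
	}
	fmt.Println("stripped bytes:", len(packed), "restored headers:", len(restored))
}
```

With the previous hash stored, each header's link can be checked independently of the others; once it is dropped, restoring header n requires header n-1 to be fully rebuilt first.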

BTW, I started with a small side project (https://github.com/guggero/cfilter-cdn) for creating block header, filter header and filter files that could be distributed over HTTP (with the idea to support fetching them from a Neutrino client over a CDN).
Maybe we could implement the optimization there to see how much of a difference it really makes.

The current file sizes for mainnet (~780k blocks) are:

  • block headers: 54MB
  • filter headers: 23MB
  • filters: 8.5GB

So I think it would make sense to side-load the block headers and filter headers but then actually fetch the filters that we need (deciding from the wallet's birthday).

@Chinwendu20
Contributor

I would like to get everyone's perspective on this:

If we want to preload cfheaders in neutrino's store, we would need a source that serves cfheaders.

But the thing is that if we have a source of cfheaders, we really cannot verify them as far as I know (if you think otherwise, please point me to a piece of code or doc).

What would be more useful is having a source of filters, so we can double-hash them according to BIP 157 and verify against checkpoints.

What do you all think: should we sideload cfheaders from a source of filters, which we would use to create the cfheaders that we store in neutrino?

@guggero
Member

guggero commented May 1, 2024

Yes, as far as I know we cannot validate the cfheaders by themselves. But what we can do is after preloading connect to a certain number of peers and compare our latest cfheader with theirs. If they match up, that's a good sign. And then whenever we fetch an actual filter from the network, we can validate its cfheader as well (previous cfheader hash plus the filter hash should equal current cfheader hash, or something like this, haven't looked it up in a while).
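For reference, BIP 157 defines the filter hash as the double SHA-256 of the serialized filter, and the filter header as the double SHA-256 of the filter hash concatenated with the previous filter header. A minimal Go sketch of both checks described above, using only the standard library, might look like this. The function names and the `quorum` parameter are hypothetical, not neutrino's API.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Hash is a 32-byte double SHA-256 digest.
type Hash [32]byte

func doubleSHA256(b []byte) Hash {
	first := sha256.Sum256(b)
	return Hash(sha256.Sum256(first[:]))
}

// nextFilterHeader computes doubleSHA256(filterHash || prevHeader), the
// filter header rule from BIP 157.
func nextFilterHeader(filterHash, prevHeader Hash) Hash {
	buf := make([]byte, 0, 64)
	buf = append(buf, filterHash[:]...)
	buf = append(buf, prevHeader[:]...)
	return doubleSHA256(buf)
}

// filterMatchesHeader reports whether a filter fetched from the network is
// consistent with the preloaded filter headers at heights n-1 and n.
func filterMatchesHeader(filter []byte, prevHeader, wantHeader Hash) bool {
	return nextFilterHeader(doubleSHA256(filter), prevHeader) == wantHeader
}

// peersAgree reports whether at least quorum peers report the same latest
// filter header as the preloaded one. The quorum value is an assumption.
func peersAgree(ours Hash, peerTips []Hash, quorum int) bool {
	matches := 0
	for _, tip := range peerTips {
		if tip == ours {
			matches++
		}
	}
	return matches >= quorum
}

func main() {
	var prev Hash                        // placeholder previous header
	filter := []byte{0x01, 0x02, 0x03}   // placeholder filter bytes
	header := nextFilterHeader(doubleSHA256(filter), prev)

	fmt.Println("filter consistent:", filterMatchesHeader(filter, prev, header))
	fmt.Println("peers agree:", peersAgree(header, []Hash{header, header, {}}, 2))
}
```

Note that the per-filter check only covers heights for which a filter is actually fetched; preloaded headers at other heights are still taken on trust.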

I don't think pre-loading filters makes sense, since they are the large part (you don't want your app to be 9 GB in size) and you often don't need all of them anyway (you'd only ever download those for the block since your wallet birthday block).

The whole point of pre-loading is to trade speed vs. trust. So you do have to trust the source of the pre-loaded files somewhat. But since everyone could run their own block-dn and create their own pre-load files (or fetch them over HTTPS instead of packing the files into an app), the trust assumptions should be somewhat mitigated.

@Chinwendu20
Contributor

Thanks for this

> Yes, as far as I know we cannot validate the cfheaders by themselves. But what we can do is after preloading connect to a certain number of peers and compare our latest cfheader with theirs.

I do not know if this would work, as I think the latest cfheader checking out as valid does not necessarily mean the rest of the preloaded cfheaders would be.

> I don't think pre-loading filters makes sense, since they are the large part (you don't want your app to be 9 GB in size) and you often don't need all of them anyway (you'd only ever download those for the block since your wallet birthday block).

Even with p2p sync, though, a call to fetch cfheaders returns filter hashes and not cfheaders, so we would just be mimicking the same behavior for the non-p2p case. Also, we would only use the filters to create cfheaders, verify them against checkpoints, and then not store the actual filters, just as is done with p2p sync.
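A rough Go sketch of that idea: hash each filter from the side-load source, chain the hashes into cfheaders using the BIP 157 rule (header = double SHA-256 of filter hash followed by previous header), compare the last derived header against a trusted checkpoint, and keep only the headers. The function names are hypothetical, and where the trusted checkpoint comes from (hard-coded values, or filter header checkpoints served by peers) is left as an assumption.

```go
package main

import (
	"crypto/sha256"
	"errors"
	"fmt"
)

func doubleSHA256(b []byte) [32]byte {
	first := sha256.Sum256(b)
	return sha256.Sum256(first[:])
}

// chainFilterHeaders derives one filter header per filter, starting from
// prevHeader. The filters are hashed and then discarded.
func chainFilterHeaders(prevHeader [32]byte, filters [][]byte) [][32]byte {
	headers := make([][32]byte, 0, len(filters))
	for _, f := range filters {
		filterHash := doubleSHA256(f)
		var buf [64]byte
		copy(buf[:32], filterHash[:])
		copy(buf[32:], prevHeader[:])
		prevHeader = doubleSHA256(buf[:])
		headers = append(headers, prevHeader)
	}
	return headers
}

// headersFromFilters returns the derived headers only if the last one equals
// the trusted checkpoint for that height.
func headersFromFilters(prevHeader [32]byte, filters [][]byte, checkpoint [32]byte) ([][32]byte, error) {
	if len(filters) == 0 {
		return nil, errors.New("no filters supplied")
	}
	headers := chainFilterHeaders(prevHeader, filters)
	if headers[len(headers)-1] != checkpoint {
		return nil, errors.New("derived filter header does not match checkpoint")
	}
	return headers, nil
}

func main() {
	var start [32]byte // placeholder for the header preceding the range
	honest := [][]byte{{0x01}, {0x02}, {0x03}}

	// For the demo, the checkpoint is taken from the honest chain itself.
	derived := chainFilterHeaders(start, honest)
	checkpoint := derived[len(derived)-1]

	_, err := headersFromFilters(start, honest, checkpoint)
	fmt.Println("honest source accepted:", err == nil)

	tampered := [][]byte{{0x01}, {0x09}, {0x03}}
	_, err = headersFromFilters(start, tampered, checkpoint)
	fmt.Println("tampered source rejected:", err != nil)
}
```

Because every header commits to all filters before it, a single matching checkpoint covers the whole range leading up to it, which a comparison of preloaded cfheaders alone cannot provide without the filter hashes.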

> The whole point of pre-loading is to trade speed vs. trust. So you do have to trust the source of the pre-loaded files somewhat. But since everyone could run their own block-dn and create their own pre-load files (or fetch them over HTTPS instead of packing the files into an app), the trust assumptions should be somewhat mitigated.

So we skip verifying entirely?
