Strict DFS traversal #359
Comments
I believe that section is doing DFS; see the bottom-most part of it. It collects the links, then walks them serially and resolves what's below each one. I believe https://github.com/web3-storage/freeway makes use of this same code to produce DFS order for us. I think they originally had some BFS code internally that they had to unwind, but from what I understand the unixfs internals aren't the problem area.
OK I really hope I'm not misunderstanding something.
The issue is, js-ipfs-unixfs/packages/ipfs-unixfs-exporter/src/resolvers/unixfs-v1/content/file.ts Line 78 in 8a271b2
Incremental verification fails on this CID that's fetched from lassie: bafybeigrbpmdsqaift2qwzy32bjyywkx6nzmn66pjeoaie6egpbktykc6e If I log every
If I use dagula, it prints the correct DFS order
The DAG looks like
You can see that js-ipfs-unixfs traverses
Does calling `blockstore.get` count as a traversal?
EDIT: Actually it's not totally BFS; it's just the first layer of children being visited before any of the sub-children.
Mmm, you're probably right about that! I'm not sure about the async iteration going on there, but it's probably doing all those blockstore fetches in parallel first. Maybe it's worth finding out how freeway is using this code to do it? Or perhaps it's just using js-unixfs for this side of it instead.
Or ... perhaps we haven't noticed traversal order problems with freeway because large file fetches are rarer; maybe I need to go back and look at the error logs. You have to have a pretty big file to end up with multiple layers.
Yep, I've been using this lib for a while without issue. It wasn't until I implemented incremental verification and tried rendering a 50MB image that I ran into this problem. I made a hasty fix here due to time constraints: filecoin-saturn@ee5a574
The traversal is DFS, that is, the leaf node data is emitted depth-first, but internally the exporter applies an optimisation to load all the children in the DAG as soon as they are encountered. The links are processed in order and as soon as they are ready, rather than waiting for the last to load before the first is processed. This speeds things up when the blockstore has to go to the network or uses some other slow retrieval method, as it has a head start for when you do actually need a sibling block, but it is also why the blockstore is called in an order that looks like BFS. A quick fix might be to expose a config option for the
@rvagg freeway does not use the unixfs exporter to create CARs with DFS block ordering.
OK, that makes sense. For context, Saturn retrieval clients expect CAR file blocks to be in DFS order, and this is implemented by having the traversal client (in this case js-ipfs-unixfs) ask the blockstore for blocks in the expected order. There isn't really an easy workaround in userland since the blockstore lacks any traversal context, so a fix on the library side would be appreciated 🙏.
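To make the behaviour discussed above concrete, here is a minimal sketch of a walker that emits leaves depth-first while requesting every sibling block up front. It is a toy model, not the exporter's code: the `dag` map, `RecordingBlockstore` and `eagerWalk` are hypothetical names invented for illustration, and "CIDs" are plain strings.

```typescript
type Cid = string

// Hypothetical toy DAG: a root with two intermediate nodes, each with two leaves.
const dag: Record<Cid, Cid[]> = {
  root: ['a', 'b'],
  a: ['a1', 'a2'],
  b: ['b1', 'b2'],
  a1: [], a2: [], b1: [], b2: []
}

// Stand-in blockstore that records the order in which blocks are requested.
class RecordingBlockstore {
  requested: Cid[] = []

  async get (cid: Cid): Promise<Cid[]> {
    this.requested.push(cid) // runs synchronously when get() is called
    return dag[cid] ?? []
  }
}

// Emits leaves depth-first, but starts loading all children of a node
// as soon as that node has been decoded.
async function * eagerWalk (
  cid: Cid,
  store: RecordingBlockstore,
  block?: Promise<Cid[]>
): AsyncGenerator<Cid> {
  const links = await (block ?? store.get(cid))
  if (links.length === 0) {
    yield cid
    return
  }
  // Kick off every sibling fetch before descending into the first child.
  const pending = links.map(link => ({ link, block: store.get(link) }))
  for (const child of pending) {
    yield * eagerWalk(child.link, store, child.block)
  }
}

async function run (): Promise<{ emitted: Cid[], requested: Cid[] }> {
  const store = new RecordingBlockstore()
  const emitted: Cid[] = []
  for await (const leaf of eagerWalk('root', store)) {
    emitted.push(leaf)
  }
  return { emitted, requested: store.requested }
}

// emitted:   a1, a2, b1, b2               (depth-first leaf order)
// requested: root, a, b, a1, a2, b1, b2   (siblings requested before sub-children)
```

In this model the data still comes out depth-first, but a blockstore that can only serve blocks in strict DFS order (root, a, a1, a2, b, b1, b2) would be asked for `b` too early.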
By default we attempt to load all siblings in a given layer of a DAG at once to allow slow/async loading routines extra time to fetch data before it is needed. Some blockstores (e.g. CAR files) require the exporter to only request the next sequential CID in a DAG. Add a `blockReadConcurrency` option (named similarly to the importer's `blockWriteConcurrency` option) to control this behaviour.
Fixes #359
Co-authored-by: Rod Vagg <rod@vagg.org>
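As a rough illustration of what a read-concurrency limit changes, the sketch below bounds how many sibling blocks a toy walker requests at once. Only the option name `blockReadConcurrency` comes from the commit message; the `dag` map, `walk` and `requestOrder` are hypothetical, and the windowed batching is a simplification that is not taken from the exporter's implementation.

```typescript
type Cid = string
type Get = (cid: Cid) => Promise<Cid[]>

// Hypothetical toy DAG: a root with two intermediate nodes, each with two leaves.
const dag: Record<Cid, Cid[]> = {
  root: ['a', 'b'],
  a: ['a1', 'a2'],
  b: ['b1', 'b2'],
  a1: [], a2: [], b1: [], b2: []
}

// Walks depth-first, requesting at most `blockReadConcurrency` siblings
// at a time (assumption: fixed windows rather than a sliding queue).
async function * walk (
  cid: Cid,
  get: Get,
  blockReadConcurrency: number,
  block?: Promise<Cid[]>
): AsyncGenerator<Cid> {
  if (!(blockReadConcurrency >= 1)) {
    throw new Error('blockReadConcurrency must be at least 1')
  }
  const links = await (block ?? get(cid))
  if (links.length === 0) {
    yield cid
    return
  }
  for (let start = 0; start < links.length; start += blockReadConcurrency) {
    // Request only the current window of siblings, then descend into each in order.
    const pending = links
      .slice(start, start + blockReadConcurrency)
      .map(link => ({ link, block: get(link) }))
    for (const child of pending) {
      yield * walk(child.link, get, blockReadConcurrency, child.block)
    }
  }
}

// Returns the order in which blocks were requested for a given concurrency.
async function requestOrder (blockReadConcurrency: number): Promise<Cid[]> {
  const requested: Cid[] = []
  const get: Get = async (cid) => {
    requested.push(cid)
    return dag[cid] ?? []
  }
  const leaves: Cid[] = []
  for await (const leaf of walk('root', get, blockReadConcurrency)) {
    leaves.push(leaf)
  }
  return requested
}

// requestOrder(1)        resolves to root, a, a1, a2, b, b1, b2  (strict DFS)
// requestOrder(Infinity) resolves to root, a, b, a1, a2, b1, b2  (all siblings up front)
```

With the real exporter, a sequential blockstore such as a CAR reader would presumably pass `blockReadConcurrency: 1` in the exporter options; check the package documentation for the exact default and semantics.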
The Trustless Gateway spec is being productionized by Saturn, Daghaus and possibly others. It's important for DAG traversal clients to be able to consume CARs with blocks in DFS order as this is optimal for streaming and incremental verification. While DFS isn't required by the spec, it seems to be the preferred traversal order as it's the only explicitly mentioned order, the other being "unknown".
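As a sketch of why DFS order suits incremental verification: a client that knows only the root CID can keep a stack of the CIDs it expects next and check each block as it arrives, without buffering the whole CAR. The `Block` shape and `verifyDfsOrder` below are hypothetical; a real verifier would hash the block bytes and compare against the CID rather than comparing strings, and this toy assumes no block is omitted as a duplicate.

```typescript
// Hypothetical decoded block: its identifier plus the links it contains.
interface Block {
  cid: string
  links: string[]
}

// Consumes blocks in arrival order and checks that they follow a
// depth-first traversal from `root`. Returns the number of verified blocks.
function verifyDfsOrder (root: string, blocks: Iterable<Block>): number {
  const expected: string[] = [root] // stack; the top is the next CID we expect
  let verified = 0
  for (const block of blocks) {
    const next = expected.pop()
    if (next === undefined) {
      throw new Error(`unexpected extra block ${block.cid}`)
    }
    if (block.cid !== next) {
      throw new Error(`expected ${next} but received ${block.cid}`)
    }
    // Push children in reverse so the first link ends up on top of the stack.
    expected.push(...[...block.links].reverse())
    verified++
  }
  if (expected.length > 0) {
    throw new Error(`stream ended with ${expected.length} block(s) still expected`)
  }
  return verified
}

const dfsStream: Block[] = [
  { cid: 'root', links: ['a', 'b'] },
  { cid: 'a', links: ['a1', 'a2'] },
  { cid: 'a1', links: [] },
  { cid: 'a2', links: [] },
  { cid: 'b', links: ['b1', 'b2'] },
  { cid: 'b1', links: [] },
  { cid: 'b2', links: [] }
]

console.log(verifyDfsOrder('root', dfsStream)) // 7
```

If `b` arrived right after `a` (the order produced when siblings are requested up front), the check would fail at that block, which matches the incremental verification failure reported in this issue.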
This is the section that performs BFS; there may be others, though.
js-ipfs-unixfs/packages/ipfs-unixfs-exporter/src/resolvers/unixfs-v1/content/file.ts
Lines 53 to 127 in 8a271b2