Make small retrieval 200x faster #7693
Conversation
Took a cursory look, and the approach generally LGTM. However, something to call out is that efficiency will be (a) DAG-shape dependent and (b) request-dependent.
Re: (a), a DAG composed of tiny blocks will lead to substantial read amplification and roundtripping, and I expect this approach to be inefficient there (how inefficient, compared with the baseline approach, will depend on (b)).
Re: (b), if the caller is fetching the root CID of the deal with the default selector, this equates to fetching the entire deal DAG. At that point it could be more efficient to copy the shard data and serve from local, like we were doing before.
@magik6k -- what do you think about introducing a "planner" component that decides which strategy to use based on (b)?
For (a), we should eventually store metadata about a DAG (something like a DESCRIBE DAG) as we receive it, along with some stats (min/max/avg node size, etc.). This would allow us to strategise around (a); a rough sketch of such a planner follows below.
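To illustrate (this is not part of the PR), a planner along those lines might look like the Go sketch below. `Strategy`, `DagStats`, `Plan`, and the thresholds are all hypothetical names and values, not lotus APIs:

```go
package retrieval

// Strategy selects how a retrieval request should be served.
type Strategy int

const (
	// ServeDirect streams blocks straight from the CAR via random access.
	ServeDirect Strategy = iota
	// CopyLocal copies the shard to a local scratch file first, as before.
	CopyLocal
)

// DagStats is the per-DAG metadata ("DESCRIBE DAG") captured at ingestion.
type DagStats struct {
	Blocks      int64
	MinNodeSize int64
	MaxNodeSize int64
	AvgNodeSize int64
}

// Plan picks a strategy from the DAG stats and the estimated fraction of
// the DAG the selector will traverse (1.0 for a full retrieval).
func Plan(stats DagStats, selectorCoverage float64) Strategy {
	// Near-full retrievals of tiny-block DAGs amplify reads and
	// roundtrips; copying once and serving locally can win there.
	const tinyBlock = 4 << 10 // 4 KiB; an assumed threshold
	if selectorCoverage > 0.5 && stats.AvgNodeSize < tinyBlock {
		return CopyLocal
	}
	return ServeDirect
}
```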
@raulk From my testing, when retrieving the whole piece, reads tend to be sequential, because graphsync requests data from the DAGStore in the same order it was written to the CAR file, and the pieceReader exploits this property. For bigger partial retrievals, most accesses should still be mostly sequential, though possibly with multiple "heads"; more data/experimentation is needed here. (Basically, in my testing I didn't come up with an example in which copying the data would be more efficient.)
Force-pushed from 6be6dc4 to 9110e6f.
@magik6k Oh yeah, that's a good point. Both the layout of data on disk and the retrieval order are depth-first. That fact, along with the "burn" leniency (sketched below), should solve both challenges in the majority of cases 👍 Maybe we can think of some selector/DAG combinations whose access patterns would differ, and test those?
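For readers unfamiliar with the "burn" leniency mentioned above, here is a hedged Go sketch of the idea — an io.ReaderAt over a sequential stream that discards ("burns") small forward gaps instead of reopening the underlying reader. The names (`burningReader`, `maxBurn`) and the threshold are illustrative; this is not the actual lotus pieceReader:

```go
package retrieval

import "io"

const maxBurn = 1 << 20 // assumed: tolerate up to 1 MiB of forward skip

// burningReader serves random reads over a sequential source, reopening
// only on backwards seeks or large forward jumps.
type burningReader struct {
	open   func(offset int64) (io.ReadCloser, error) // reopen at offset
	r      io.ReadCloser
	offset int64
}

func (b *burningReader) ReadAt(p []byte, off int64) (int, error) {
	gap := off - b.offset
	if b.r == nil || gap < 0 || gap > maxBurn {
		// Backwards seek or a large jump: reopen at the target offset.
		if b.r != nil {
			b.r.Close()
		}
		r, err := b.open(off)
		if err != nil {
			return 0, err
		}
		b.r, b.offset = r, off
	} else if gap > 0 {
		// Small forward gap: burn through it sequentially.
		if _, err := io.CopyN(io.Discard, b.r, gap); err != nil {
			return 0, err
		}
		b.offset = off
	}
	n, err := io.ReadFull(b.r, p)
	b.offset += int64(n)
	return n, err
}
```

Because both the on-disk layout and the traversal order are depth-first, gaps tend to be small and forward-only, so the expensive reopen path should rarely be taken.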
Codecov Report
@@            Coverage Diff             @@
##           master    #7693      +/-   ##
==========================================
- Coverage   39.56%   39.53%   -0.04%
==========================================
  Files         637      638       +1
  Lines       67924    67990      +66
==========================================
+ Hits        26872    26877       +5
- Misses      36446    36488      +42
- Partials     4606     4625      +19
Continue to review full report at Codecov.
One question, but in general LGTM 👍
Was this PR rolled back?
This PR adds direct random-access capability to the lotus DAGStore mount.
Right now, when any retrieval is performed, the DAGStore reads the entire piece and stores it in a temporary file to enable random access. This is acceptable when retrieving the entire piece, but extremely inefficient when retrieving only a small percentage of it.
By avoiding this copy step, small retrievals become sub-second in many cases, and time-to-first-byte is drastically reduced for all retrievals.
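To make the mechanism concrete, here is a minimal, hypothetical Go sketch of serving a piece without the copy step: a mount-style seekable reader backed directly by random-access reads into the unsealed piece data. The `pieceAccessor` interface and all names are assumptions for illustration, not the actual lotus types:

```go
package retrieval

import "io"

// pieceAccessor abstracts "read bytes [off, off+len(p)) of this piece",
// e.g. backed by the sector storage subsystem.
type pieceAccessor interface {
	io.ReaderAt
	Size() int64
}

// pieceReader adapts a pieceAccessor into the seekable stream a DAGStore
// mount hands out, without materializing a local copy of the piece.
type pieceReader struct {
	acc pieceAccessor
	off int64
}

func (r *pieceReader) Read(p []byte) (int, error) {
	n, err := r.acc.ReadAt(p, r.off)
	r.off += int64(n)
	return n, err
}

func (r *pieceReader) Seek(offset int64, whence int) (int64, error) {
	switch whence {
	case io.SeekStart:
		r.off = offset
	case io.SeekCurrent:
		r.off += offset
	case io.SeekEnd:
		r.off = r.acc.Size() + offset
	}
	return r.off, nil
}
```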
Example timings
Running current master, retrieving a few KB from a 16G piece:
Running this PR, retrieving a few KB from a different 16G piece (a different piece, to avoid any potential caching):