Propose traversal-based car creation #269

willscott · 2021-11-22T20:17:19Z

* Complete the file-based alternative: write without a precalculated size when a file is provided, then go back and fix up the data size in the header at the end.
* Include CARv1 stream support.
* Handle options set on the traversal (repeated blocks).
* Handle options set for the CAR (type of index, etc.).
* Round-trip test.

rvagg · 2021-11-23T08:32:00Z

v2/selective.go

+	"github.com/multiformats/go-varint"
+)
+
+// PrepareTraversal walks through the proposed dag traversal to learn it's total size in order to be able to


Lotus evolved away from needing this, so I don't think there's a consumer for prepare+dump anymore, it's just write now, thankfully. There used to be a need to get the size up-front to set up the commp calculation with padding, but that's not necessary anymore. So you could decomplicate this by removing this functionality if you like.

If we want to write out to a stream (e.g. the network), this 2x scan would still be more efficient than having to write it out to a file, touch up the header, and then send the whole thing as a second step, I think.

Oh, so this would be for a CARv2 format and you need the offset and all that? Because for a selector-based CARv1 you have the root already, or else you can't run the traversal.

Yeah, this is to get the data size field correct in the CARv2 header when streaming.

I can also make a CARv1-only stream version that does it in one pass. I suppose that's probably useful too.

* Provide a method for writing a carv1 to a writer stream.
* Return count of bytes written across various interfaces.

willscott · 2021-11-25T19:31:31Z

I think I have all the code here doing what I want. Starting in on tests.

This would be a great time for reviews if you want to request changes to naming.

warpfork · 2021-11-27T06:58:24Z

v2/internal/loader/writing_loader.go

+		DecoderChooser:     ls.DecoderChooser,
+		HasherChooser:      ls.HasherChooser,
+		StorageWriteOpener: ls.StorageWriteOpener,
+		StorageReadOpener: func(lc linking.LinkContext, l ipld.Link) (io.Reader, error) {


Okay, I'm not gonna try to steer here, this is more a question for me to understand how to serve this area better in the future: Did you go through a decision process on whether to take this approach of wrapping StorageReadOpener vs. writing some wrappers around the new storage.* APIs?

Is this wrapping strategy just familiar already?

Is this wrapping strategy preferable because it seems like the most broadly applicable place to intercept things?

Is this wrapping strategy preferable because it's clear how it composes with other StorageReadOpener magic, such as how graphsync uses it to hook block load events?

Is it because figuring out how to wrap the storage.* APIs is daunting, since it would've required thinking about all the places feature detection might try to kick in?

Other?

(These are not blocking questions; they just could be useful info for understanding the value of any future iterations on storage and link loading APIs and other possible event callbacks around them.)

I think it's that the object that is expected to be passed around is a LinkSystem. That LinkSystem will already have the storage.* object applied to it and turned into a StorageReadOpener, and in this library I won't know what the underlying readable storage was to be able to wrap it. If I assume the right thing to take in is a LinkSystem, which seems right because I need to care about things like 'should I hash reads', 'what reifiers are present', etc., then this is the place I can intercept block reads.

masih · 2021-11-29T12:42:54Z

v2/selective.go

+		return int64(n), err
+	}
+
+	h := NewHeader(tc.size)


Thinking out loud, in a separate PR we probably should change NewHeader to take options and encapsulate the gymnastics needed to set correct header values.
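
One possible shape for that follow-up, sketched with the functional-options pattern. This is purely hypothetical: `header`, `newHeader`, `withDataPadding`, and `withIndexPadding` are illustrative names, the struct is a simplified stand-in for the real header type, and `baseOffset` is passed in rather than assumed.

```go
package main

// header is a simplified, hypothetical stand-in for a CARv2 header.
type header struct {
	DataOffset  uint64
	DataSize    uint64
	IndexOffset uint64
}

// headerConfig collects the settings that influence derived header values.
type headerConfig struct {
	dataPadding  uint64
	indexPadding uint64
}

// headerOption adjusts the configuration used to build a header.
type headerOption func(*headerConfig)

// withDataPadding places n bytes of padding before the data payload.
func withDataPadding(n uint64) headerOption {
	return func(c *headerConfig) { c.dataPadding = n }
}

// withIndexPadding places n bytes of padding between the data and the index.
func withIndexPadding(n uint64) headerOption {
	return func(c *headerConfig) { c.indexPadding = n }
}

// newHeader derives all offsets in one place so callers do not have to
// patch individual fields after construction.
func newHeader(baseOffset, dataSize uint64, opts ...headerOption) header {
	var cfg headerConfig
	for _, opt := range opts {
		opt(&cfg)
	}
	h := header{DataSize: dataSize}
	h.DataOffset = baseOffset + cfg.dataPadding
	h.IndexOffset = h.DataOffset + dataSize + cfg.indexPadding
	return h
}
```

A call such as `newHeader(base, size, withDataPadding(5))` would then keep the offset arithmetic inside the constructor.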

Propose traversal-based car creation

7d4502e

willscott requested a review from masih November 23, 2021 08:23

rvagg reviewed Nov 23, 2021

willscott added 3 commits November 23, 2021 10:43

file mode - traverses once and fixes up carv2 header at the end

44245ff

add traversal budget option and respect set options on creation

642b02b

* Handle the various possible car writer options.

df2a186

* Provide a method for writing a carv1 to a writer stream.
* Return count of bytes written across various interfaces.

willscott force-pushed the primeselect branch from e6ef7e2 to df2a186 Compare November 25, 2021 19:30

willscott added 2 commits November 25, 2021 11:34

[fixup] options test

6d41ba2

Add round-trip test

ac5d450

willscott marked this pull request as ready for review November 27, 2021 02:01

warpfork reviewed Nov 27, 2021

cleaner loaders

f53e09c

masih reviewed Nov 29, 2021

willscott added 3 commits November 29, 2021 11:45

renames / comments per review

4076b7a

fix test

21907ee

additional testing as requested

0b392bc

willscott requested a review from masih November 29, 2021 23:20

masih approved these changes Nov 30, 2021

willscott merged commit c35591a into master Nov 30, 2021

willscott deleted the primeselect branch November 30, 2021 18:22

willscott mentioned this pull request Dec 6, 2021

Add weekly sync meeting notes 2021-11-22 ipld/team-mgmt#142

Merged

masih mentioned this pull request Dec 15, 2021

Implement a CARv2 SelectiveCARAPI when clients have upgraded to go-ipld-prime v0.9.0 #104

Closed
