-
Notifications
You must be signed in to change notification settings - Fork 108
CAR format spec doc #230
CAR format spec doc #230
Conversation
diagram file via jbenet
Closes #229
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless I'm misunderstanding how Filecoin intends to use CARs, the main issue we need to address is that CARs need to be, at a minimum, deterministic. Specifically, we need to specify exactly how to compute a "CAR of a DAG (selector?)" such that we always get the same result, byte by byte.
* The `version` is always a value of `1`. Future iterations of this specification may make use of `version` to introduce variations of the format. | ||
* The `roots` array must contain **zero or more** CIDs, each of which must be present somewhere in the remainder of the CAR. | ||
|
||
An empty `roots` array is valid, but indicates that the CAR does not necessarily contain blocks forming a coherent DAG. Such a CAR may, for instance, form part of a collection of archives that, when combined, contain an entire DAG. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the motivation for this? Ideally, the CAR header would be enough to reconstruct the CAR.
It would be nice to make this somewhat deterministic, at least. Something like: if some roots are listed, all roots must be listed.
* **Streaming**: the CAR format is ideal for dumping blocks via streaming reads as the Header can be loaded first and minimal state is required for ongoing parsing. | ||
* **Individual block reads**: as the CAR format contains no index information, reads require either a partial scan to discover the location of a required block or an external index must be maintained and referenced for a seek and partial read of that data. See below regarding indexing. | ||
* **DAG traversal**: without an external index, traversal of a DAG specified by a "root" CID is not possible without dumping all blocks into a more convenient data store or by partial scans to find each block as required, which will likely be too inefficient to be practical. | ||
* **Modification**: CARs may be appended after initial write as there is no constraint in the Header regarding total length. Care must be taken in appending if a CAR is intended to contain coherent DAG data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At a minimum, we need to define a canonical CAR variant such that a CAR of a DAG is deterministic.
(especially if we intend to use them in Filecoin sectors)
|
||
The roots specified by the Header of a CAR must appear somewhere in its Data section, however there is no requirement that the roots define entire DAGs, nor that all blocks in a CAR must be part of DAGs described by the root CIDs in the Header. Therefore, the roots must not be used alone to determine or differentiate the contents of a CAR. | ||
|
||
The CAR format contains no internal means, beyond the IPLD block formats and their CIDs, to verify or differentiate contents. Where such a requirement exists, this must be performed externally, such as creating a digest of the entire CAR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, CAR originally stood for for certified archive because it was supposed to be self-certified/validating.
|
||
### Indexing and Seeking Reads | ||
|
||
The CAR format contains no internal indexing, any indexing must be stored externally to a CAR. However, such indexing is possible and makes seeking reads practical. An index storing byte offset (of section start or block data start) and length (of section or block data), keyed by CID, will enable a single block read by seeking to the offset and reading the block data. The format of any such index is not specified here and is left up to CAR format parsing implementations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could also be stored in the header, couldn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, a version 2 should include indexing information, it would just need additional up-front processing to make a CAR with an index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't bump the version for that. Version bumps should be for layout changes.
But otherwise, yeah, we can deal with that later.
|
||
### Header | ||
|
||
The first bytes of the CAR format hold a varint, this unsigned integer specifies the number of bytes beyond the varint itself that contain the _Header_ block. This Header block is a byte array DAG-CBOR (CBOR with tag 42 for CIDs) encoded object holding the version number and array of roots. As an [IPLD Schema](../schemas/): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, we could just make this entire thing recursive. That is, the entire CAR could be:
[ varint | CID | BLOCK ] *
Where the first block is always the header object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would make a nice symmetry. Unfortunately the version information is in the header block and there's no CID before that block, so if we had a v2 that introduced a CID for the header block then our parsers would need to be able to figure out whether the bytes after the initial varint are a CID or the start of a CBOR block (map['version':x,...
). Is the difference between the start of a CID and the start of a CBOR map distinct enough for such branching?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thinking was to make that version 1. If we also required that the root block list all root blocks, the CAR would actually be a single DAG and would be really easy to validate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that idea. Reading the header would be more work (as you would also need to read its CID), but that should be negligible compared to making parsing easier as it's less of a special case.
Responding to most of @Stebalien's inline comments here because they revolve around this key point:
That would explain "certified". In https://github.com/ipfs/go-car/issues/19 we originally discussed not having any roots and allowing it to be a store of arbitrary blocks. The primary motivation was that we want to produce CARs that have a max size containing data for very large data structures (e.g. if we store very large DAGs in Filecoin we're limited by sector size). So, can we have a parent CAR that contains one or more roots and its DAG spans outside of the CAR to blocks contained "elsewhere" (the connection piece of this puzzle is yet to be determined). If deterministic is a strict requirement for CARs then I'll change a bunch of this spec to make that the case:
But that doesn't help with the size problem for very large DAGs. Or do we make multiple CAR modes? Where we could introduce a "certified" mode that has a fully self-contained DAG starting at its roots, and a "lax" mode that is just a format for storing arbitrary blocks. |
I don’t think it’s all that practical to ensure the format is deterministic without making huge sacrifices in other areas. For instance, how does the format ensure the blocks are in a particular order? We could add such a requirement, but then anyone who writes to the format won’t be able to stream to it, they’ll need to prepare all the data some other way and then write blocks in an order that ensured to be compliant with the spec. We need to reconcile the use cases we’re building this for before we make statements like “must” about features we don’t even have yet. There’s a lot of tradeoffs to be made here and the use case that has pushed us to even write up the spec and build another implementation isn’t visible to anyone but those of us currently working on it. |
Read:
That's an interesting question. One solution would be to use selectors. Looks like there was an attempt: https://github.com/ipfs/go-car/pull/18.
Being able to CAR a selector should help.
Being able to deterministically create a CAR from a CID (or a selector?) is a hard requirement for systems like Filecoin. Allowing non-canonical/non-deterministic CARs is probably fine.
Add blocks to the dag in-order, starting at the root, sorted in link order. As far as I know, we already have this feature. |
Note: if you need a "bag of blocks" format, it might make sense to create two formats:
|
Hey IPLDers — we had been assuming in Filecoin land that the CAR would tend toward “simple bag of blocks” and writing the CAR from a root + selector in a deterministic way would be an out of band concern for Filecoin to handle. But that assumes Filecoin does the CARification and the entry point for storing something is a cid + selector & blockstore to read from (which is where we’re headed) |
Part of the reason we were headed this way is selector traversal is itself not deterministic and we didn’t want to impose that requirement on selectors in general even if we were going provide a special case when we write car files. I guess though if none of this is in the CAR spec it better be in the Filecoin spec and currently it’s not :) |
How are we transferring the data? I assumed we'd be shipping the dag as a bunch of blocks (using, e.g., bitswap), then constructing the CAR on the remote (miner) side. If so, we need to deterministically construct it. |
@Stebalien your understanding of the transfer mechanics is correct and you are right that CARifying absolutely has to be deterministic for filecoin’s case. Our assumption was though because Filecoin would be doing the CARification itself that deterministic construction would be a Filecoin special case. For reference while https://github.com/ipfs/go-car/issues/18 exists my current assumption is we will move this code into Filecoin itself because we may want our writing process to store even more additional information (outside the file itself) like direct byte offsets to a given cid in the car file. |
Sure, but that’s not enforced by the format, which is what we’re talking about here. Yes, of course different applications will need to build these in a deterministic way, but it’s a stretch to say that it’s the role of the format to enforce that constraint, which it sounds like you aren’t actually recommending based on “Allowing non-canonical/non-deterministic CARs is probably fine.” The reason this is an important distinction for us is that we are currently writing tools to prepare data to be stored in Filecoin in .car files and there isn’t really a canonical representation of the data (local to the .car file) because the data is so large that a single file is sharded over multiple .car files, so each .car file doesn’t even have a local DAG to determine “ordering.” In order to understand the ordering you need data that is outside that .car file, which means we wouldn’t be able to build something that could validate any ordering at the format layer. |
@hannahhoward the spec needs to specify how we go from a CID to a CAR, as far as I know. Otherwise, we're going to have compatibility issues. |
I've just been saying that there must exist a canonical CAR format. That's the only must. Additionally, it would be really nice if CARs were completely self-certifying (as anything else is not a Certified ARchive by definition). |
@mikeal did you see my comment above describing the entry point for the filecoin system?
Are you expecting to provide car files directly? The current design of the code assumes data is not already put in CAR and filecoin has control over how that happens. |
@hannahhoward I think we might be talking about different car files being created for different purposes. We’re preparing large amounts of data to be stored in Filecoin. It’s not really practical for Filecoin to create those .car files, we have to build a lot of custom code in order to take a few PB of data and create the graphs and prepare the .car files, and we have to build all of this to run concurrently because even with 10K running in parallel it takes weeks to prepare the data. |
Off topic: Doesn't graphsync rely on deterministic selector evaluation order for efficient validation? On topic: How we go from a CID+Selector to a CAR is a critical part of the Filecoin spec. If a client and storage miner don't do the exact same thing, they won't be able to form a deal. If we just say "do whatever the code does" now, we'll have to turn that code into a spec as-is when we finish the spec. That's likely going to lead to a convoluted spec with a lot of edge cases. We can hand-wave network protocols and fix them later but hand-waving piece formation is going to be a problem later. |
|
||
### Header | ||
|
||
The first bytes of the CAR format hold a varint, this unsigned integer specifies the number of bytes beyond the varint itself that contain the _Header_ block. This Header block is a byte array DAG-CBOR (CBOR with tag 42 for CIDs) encoded object holding the version number and array of roots. As an [IPLD Schema](../schemas/): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that idea. Reading the header would be more work (as you would also need to read its CID), but that should be negligible compared to making parsing easier as it's less of a special case.
|
||
#### CID | ||
|
||
Following the Length, the CID of the block is included in raw byte form. A decoder reading a Section must decode the CID according to CID byte encoding rules, which don't provide a stable length. See https://github.com/multiformats/cid for details on the encoding of a CID. CIDv0 and CIDv1 are currently supported. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to support CIDv0 or do we support it because we can? I don't see a reason to use CIDv0 these days and I'd be happy to phase them out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to support CIDv0, we use it everywhere in IPFS. Long ago we could have probably introduced some kind of "use CIDv1 everywhere but always encode DagPB CIDs as CIDv0 when encoding a DagPB block" rule, but that ship has sailed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to create .car files you need some processing anyway. So doing a v0 to v1 conversion shouldn't be that hard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is that forming a block full of CIDv1 CIDs produces a different CID than a block formed using CIDv0 CIDs. I'm concerned about erasing the CIDv1/v0 information.
But... this does bring up a good point. We might reference the same block as CIDv1 and CIDv0. Given that, I think you're right. It might make sense to specify that CID "headers" (i.e., not internal CIDs), must be normalized to CIDv1.
There's a second (unlikely) issue where multiple blocks might be valid under different codecs. This is one of the reasons we're changing our blockstores to refer to blocks by multihash, instead of by CID.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CIDv0 is supported in current CAR code, but iirc the only public use of them is shipping FIL genesis blocks around so I guess it could be broken. Would need some consensus from folks responsible for go-car to make that happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have a CIDv1 requirement on the headers we should make sure to document it well and require that CAR readers will accept a CIDv0 as the read input and convert to CIDv1 when looking in the .car file. This implementation detail is likely to be overlooked if not well documented.
Apologies for the large comment but this is potentially controversial and I want to make sure I capture all of the details. I’d like to unwind/challenge the requirements being surfaced regarding determinism. To be clear, I am not questioning the need for determinism in many places where .car files are used, the need for determinism in these use cases is quite clear. What I don’t think is clear, and highly problematic, is trying to achieve and potentially ensure determinism at the .car file rather than by other means. In fact, I would argue that we already are achieving determinism by other means and that determinism is simply carrying over into the .car files we’re creating. Currently, the .car file library in Go ensures determinism at creation time by performing a deterministic read operation that writes the .car file in a predictable and consistent manner. It can only ensure determinism by performing a deterministic read-on-write and has no way to do this as a write-only operation. So, just to be clear, we do not currently have or rely upon a guarantee than any .car file we receive is deterministic. Instead, we rely on deterministic creation of .car files by different nodes. If a node were to generate a .car file in an indeterministic manner the operation it was trying to accomplish with other nodes would simply fail. Another way to describe this behavior that I think could bring us a lot more flexibility and better guarantees is that we have deterministic graph readers. Or, even better, what we have are deterministic selectors. We only have deterministic .car files to the extent that we have .car files created from deterministic selectors. Even outside of .car file usage, selectors already must be deterministic in order to work in a trust-less environment because the client receiving the results of a selector must be able to validate the data it is receiving as a stream without excessive buffering, which requires that the ordering be consistent between different implementations. I recall this being part of the selector/graphsync specification process. We’re about to do some more selector spec work in IPLD and we could easily expand and solidify this. Some problems with trying to place a deterministic requirement on .car files are:
By comparison, leaning on deterministic selectors has the following benefits:
Finally, it’s worth noting that this would require zero code changes. Like I said, this is already what we’re doing, this is really just about how we talk about determinism and where we try to expand and improve these guarantees in the future. In my view, it’s actually somewhat dangerous to say we have guarantees that we aren’t validating and could quite easily not be true. In the places I’m aware of in Filecoin where determinism is required for .car files it’s only required for specific operations that could be accomplished by other means and deals could easily be created for indeterministic .car files. This makes me worry about assumptions being made about determinism up the stack that aren’t necessarily true. But everywhere that we use a selector, we can ensure determinism, and if we shift our focus to “determinism via selectors” we won’t find later that we assumed there was a guarantee we don’t actually have. Tying this all back together, my ideal representation of a “deterministic .car file” is a (selector + car file) in some format. The requirements for determinism should migrate to selectors and we should ensure that current and future selector specifications leave no room for ambiguity on issues like ordering. If Filecoin, or anyone else, wants a deterministic guarantee they should specify both the selector and the .car file (or some other unique identifier for the .car file like CommP). Which, again, is what we’re already doing in practice. |
@mikeal that makes a large amount of sense to me |
I agree. The only requirement we have is that there exists a well-defined, deterministic way to go from a selector to a CAR. The CAR format itself doesn't have to require or guarantee any particular order of blocks. |
+1 to what @mikeal is saying. In terms of future work:
For example, I believe the current determinism to be:
While the above is currently the only traversal type used anywhere, it would be good to have it documented in the selector spec, and referenced in the two places it's vital:
|
I don't want to derail this conversation but I think it's worth noting so we don't leave this snippet as a hand-wave:
We dove into this discussion today in our meeting, https://github.com/ipld/team-mgmt/blob/8d612e3ce1222de9dd531ca14bdeeb662bd2a3a1/meeting-notes/2020/2020-01-20--ipld-sync.md#notes - map ordering will likely be defined in our strict DAG-* codecs (dag-cbor, dag-json) but will have to be exposed through the encoding in a way that selectors can make use of, which will be interesting in some languages where you can't control natural map ordering (Go for example, but in Go if you want ordering you need a side-channel anyway so this is all being baked into go-ipld-prime anyway). The best place for further discussion on this, for now, is probably in #227 where I've noted it (it's collecting some CBOR-specific loose vs strict thoughts). |
I've pushed an update that I hope puts this in a place where it can land, see last commit or read rendered @ https://github.com/ipld/specs/blob/rvagg/car/block-layer/content-addressable-archives.md Summary:
If you feel strongly that one of those unresolved items should be resolved then please speak up, otherwise let's treat this as needing final reviews. |
I feel strongly about it as I think restricting it to CIDv1 would make things easier and the spec more precise. I also haven't really seen a reason for not restricting it to CIDv1. Though if that again leads to discussions, it could be added in a subsequent PR. |
re CIDs I'm mostly concerned about it being a one-way operation. If you have a datastore with blocks in it and some are CIDv0, the round-trip through a CAR will mean they all get converted to CIDv1. If they are used as string keys then anything trying to reference a block will have to know that it has to up-convert to CIDv1 to fetch it. Is that going to work? tbh I'm not completely in sync with how this will work in practice and if it'll matter; maybe this up-conversion on look-up is being done actively now anyway? |
The actual block data won’t be modified so this should be fine. Keep in mind, this is essentially a decision the CAR format is making for internal representation and implementations can handle translation of older (or perhaps some day newer) CID versions. Requiring CIDv1 adds some extra work to CAR implementors but once that work is done there’s no externally visible transformation because:
This means that when you read block data that references v0 links you can just pass them right off to the CAR file and it’ll pull them out properly. References inside the block to v0 are fine and will remain unchanged. Any external string references to v0 are also fine because they will be converted on The only caveat is that if you parse the entire CAR file’s CID’s and do your own comparison you’ll need to do the conversion yourself as well. |
Right, my point was that I think existing datastores retain CID version during storage, there's no translation right now is there? For instance, if someone goes to ipfs.io/ipfs/Qm.... then under the hood is it doing a CIDv1 translation of that to look in datastore caches already (same on the client end when doing a DHT query)? If not, then it would seem like the CID bytes are being treated strictly as a key, rather than the content address that it is, so are we ready to be enforcing conversion in all of these other code edges? |
I believe implementing such dual-lookup this is the actual plan, but can not find the issue I've seen that... |
There are plans that the datastores use only the Multihash and not the full CID: ipfs/js-ipfs#2415. |
It’s my understanding that part of the transition existing data stores are making from v0 to v1 CID’s is moving the storage layer to use the multihash. We should consider doing the same in a future version of CAR (for block identifiers, not for root node identifiers) but that is probably too disruptive at the moment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought the "only use CIDv1" would be a quick, easy, non-controversial change that doesn't need much discussions. It turns out this is not true, hence I propose opening a separate issue for it and get this version of the spec merged ASAP.
I had a sync up with @Stebalien and @jbenet last Friday and came to a few conclusions.
|
I've done some minor cleanups in the latest commit. I've also changed the roots array from being, in essence, "zero or more, but we haven't strictly resolved that" to being "one or more, but we haven't strictly resolved that". And while I was at it, I did the same for the number of blocks, "one or more blocks" must be present. We're leaning a lot on an "unresolved issues" section here, so we'll want to focus on those things for an iteration of this spec. (@ribasushi also raised the possibility of identity CIDs with zero-length blocks, but I'll leave that out of the for now for further consideration). PLEASE REVIEW all, this is intended to reflect current state, a future PR can evolve to a version 2. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not that happy with shipping a half-baked format with a spec with things we are pretty sure we want to change in the future. But the spec itself is great, thanks @rvagg!
Sorry for the last minute addition. We should note that not all implementations properly support multiple roots (filecoin) and that it is highly recommended you only use a single root in this version of the spec. |
looks great, let’s get it merged. |
I've no idea why this PR doesn't show any merge. But in case someone is wondering, this got merged with commit 71ef2bd. |
Builds on @mishmosh's work in #229 for a complete spec.
Best reviewed as rendered MD: https://github.com/ipld/specs/blob/rvagg/car/block-layer/content-addressable-archives.md
I'm confident in the technical details, but when reviewing I'd like you to consider the following items that I've codified:
{version:1,roots:[]}
. We could rule it out./R @mishmosh @whyrusleeping @dignifiedquire @Kubuxu @Stebalien