Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: disallow CAR of single block with links #344

Merged
merged 5 commits into from
Aug 12, 2021

Conversation

alanshaw
Copy link
Member

If a CAR file is sent to us that has a single block (the root block) and that block has links, then it is an invalid partial DAG.

* @param {CarReader} reader
* @param {CarStat} stat
*/
async function validateCar (reader, stat) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the case that leads to the creation of such CAR files? I am not sure if this is the best place for such validation?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

go-ipfs can create them where it's using a url-store and the url can no longer be reached. it's an edge case that we hit pretty hard, but is otherwise pretty uncommon.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I see, so it is a valid Car file by the end of the day. Probably better to have a better name for this like isCompatibleCar

Copy link
Contributor

@olizilla olizilla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we tweak the strategy here to be more of a best effort guard against weirdness? Not caused by this PR but currently we'll prevent CAR files created with codecs we other than dag-cbor/dag-pb, which we'd rather let through, and we pull the entire CAR into memory to do the checks.

We could use a CarBlockIterator to check that each block in a CAR is smaller than 1MiB, and count the blocks, but drop them as we go. If any are larger, we abort.

In the cases where there are only 1 block that is less than 1MiB, then it'd be much ligher work to read the whole thing again into memory with a CarReader, and try and decode it to see if it is incomplete, and abort if so. If we can't decode it, then ok, it's alienware, and who are we to judge it, on it goes!

if (chunkSize === 0) {
throw new Error('empty CAR')
}
const stat = await carStat(bytes)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CAR could be LARGE at this point. We could pass a clone of the request to stat and have it convert the request body into an AsyncIterable so that we can use CarBlockIterator.fromIterable inside stat, to avoid creating a second copy of the CAR in mem as bytes, until after we scan it to figure out if it is large or invalid or both.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's a good suggestion but please can we not perf creep any more and get this fix merged 🙏?

Copy link
Contributor

@olizilla olizilla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It GOOD!

@alanshaw alanshaw merged commit e7436da into main Aug 12, 2021
@alanshaw alanshaw deleted the fix/disallow-car-of-single-block-with-links branch August 12, 2021 18:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants