-
Notifications
You must be signed in to change notification settings - Fork 119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: disallow CAR of single block with links #344
Conversation
packages/api/src/car.js
Outdated
* @param {CarReader} reader | ||
* @param {CarStat} stat | ||
*/ | ||
async function validateCar (reader, stat) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the case that leads to the creation of such CAR files? I am not sure if this is the best place for such validation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
go-ipfs can create them where it's using a url-store and the url can no longer be reached. it's an edge case that we hit pretty hard, but is otherwise pretty uncommon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah I see, so it is a valid Car file by the end of the day. Probably better to have a better name for this like isCompatibleCar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we tweak the strategy here to be more of a best effort guard against weirdness? Not caused by this PR but currently we'll prevent CAR files created with codecs we other than dag-cbor/dag-pb, which we'd rather let through, and we pull the entire CAR into memory to do the checks.
We could use a CarBlockIterator to check that each block in a CAR is smaller than 1MiB, and count the blocks, but drop them as we go. If any are larger, we abort.
In the cases where there are only 1 block that is less than 1MiB, then it'd be much ligher work to read the whole thing again into memory with a CarReader, and try and decode it to see if it is incomplete, and abort if so. If we can't decode it, then ok, it's alienware, and who are we to judge it, on it goes!
if (chunkSize === 0) { | ||
throw new Error('empty CAR') | ||
} | ||
const stat = await carStat(bytes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CAR could be LARGE at this point. We could pass a clone of the request to stat and have it convert the request body into an AsyncIterable so that we can use CarBlockIterator.fromIterable inside stat, to avoid creating a second copy of the CAR in mem as bytes
, until after we scan it to figure out if it is large or invalid or both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's a good suggestion but please can we not perf creep any more and get this fix merged 🙏?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It GOOD!
If a CAR file is sent to us that has a single block (the root block) and that block has links, then it is an invalid partial DAG.