
nft.storage should make sure that blocks don't exceed 1MiB #637

Closed
Gozala opened this issue Oct 19, 2021 · 5 comments · Fixed by #899
Labels
kind/bug A bug in existing code (including security flaws)
P2 Medium: Good to have, but can wait until someone steps up

Comments

@Gozala
Contributor

Gozala commented Oct 19, 2021

Larger blocks are problematic because IPFS will accept them but then fail to provide them. The store API will gladly accept a giant JSON payload and produce a block that exceeds the 1MiB limit:

https://github.com/ipfs-shipyard/nft.storage/blob/f8b56d1577f5abc1fb640bcd19f8b3e4ade03e85/packages/api/src/routes/nfts-store.js#L53-L62
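For context, the route currently encodes whatever metadata it is given without checking the encoded size. A minimal guard, sketched below under the assumption that the metadata block is dag-cbor encoded (the function name and error message are illustrative only, not the actual route code), would be to measure the encoded bytes and reject anything over the limit:

```js
import * as dagCbor from '@ipld/dag-cbor'

const MAX_BLOCK_SIZE = 1024 * 1024 // 1 MiB block limit

// Encode the metadata and refuse to produce a block that peers
// would be unable to fetch from us.
function encodeWithLimit (metadata) {
  const bytes = dagCbor.encode(metadata)
  if (bytes.byteLength > MAX_BLOCK_SIZE) {
    throw new Error(
      `encoded metadata is ${bytes.byteLength} bytes, exceeding the 1MiB block limit`
    )
  }
  return bytes
}
```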

Solution

Our encoder should use a chunking strategy: factor out large fields (e.g. data URLs) into separate blocks, then validate that in the end every block fits within the limit. If this naive strategy fails, it could error out and ask the user to restructure the metadata instead.
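A minimal sketch of that chunking strategy, assuming dag-cbor for the root metadata block and raw blocks for externalised fields (the 256 KiB threshold and the `externalizeLargeFields` / `encodeMetadata` names are assumptions for illustration, not nft.storage code):

```js
import * as dagCbor from '@ipld/dag-cbor'
import * as raw from 'multiformats/codecs/raw'
import { sha256 } from 'multiformats/hashes/sha2'
import { CID } from 'multiformats/cid'

const MAX_BLOCK_SIZE = 1024 * 1024       // 1 MiB hard limit per block
const EXTERNALIZE_THRESHOLD = 256 * 1024 // factor out fields over 256 KiB (arbitrary)

// Walk the metadata, move oversized string/byte fields into their own
// raw blocks and link to them by CID from the parent object.
async function externalizeLargeFields (value, blocks) {
  if (typeof value === 'string' || value instanceof Uint8Array) {
    const bytes = typeof value === 'string' ? new TextEncoder().encode(value) : value
    if (bytes.byteLength > EXTERNALIZE_THRESHOLD) {
      const cid = CID.create(1, raw.code, await sha256.digest(bytes))
      blocks.push({ cid, bytes })
      return cid // the parent now stores a link instead of the field value
    }
    return value
  }
  if (value && typeof value === 'object' && !CID.asCID(value)) {
    const out = Array.isArray(value) ? [] : {}
    for (const [key, child] of Object.entries(value)) {
      out[key] = await externalizeLargeFields(child, blocks)
    }
    return out
  }
  return value
}

// Encode the (now shrunken) root object and validate that every block fits.
async function encodeMetadata (metadata) {
  const blocks = []
  const root = await externalizeLargeFields(metadata, blocks)
  const bytes = dagCbor.encode(root)
  const rootCid = CID.create(1, dagCbor.code, await sha256.digest(bytes))
  blocks.push({ cid: rootCid, bytes })
  if (blocks.some(b => b.bytes.byteLength > MAX_BLOCK_SIZE)) {
    throw new Error('metadata still exceeds the 1MiB block limit; please restructure it')
  }
  return { root: rootCid, blocks }
}
```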

@Gozala added the kind/bug and need/triage labels Oct 19, 2021
@Gozala
Contributor Author

Gozala commented Oct 19, 2021

@rvagg offered to help with this endeavor https://protocollabs.slack.com/archives/CDWAJ81FA/p1634617946388500

@rvagg

rvagg commented Oct 20, 2021

@Gozala can we bound the problem a bit for a short term solution and say that this is caused by large binary blobs within the object being processed? An object chunker could then just focus on finding binary blobs over a certain threshold and externalising them appropriately? Or are you seeing objects that are large because they contain a lot of fields and deep data and not necessarily binary? Or, maybe we're dealing with large string blob fields where the user is string encoding binary fields in some way, so we also need to be extracting those?
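For that last case, one rough way to handle it (the threshold and helper name are assumptions for illustration, not nft.storage code) is to recognise large `data:` URLs and decode them back to bytes before externalising them as raw blocks:

```js
const EXTERNALIZE_THRESHOLD = 256 * 1024 // arbitrary cut-off for externalising a field

// Return the decoded bytes if the value is a large data: URL, otherwise null.
function maybeDecodeDataURL (value) {
  if (typeof value !== 'string' || !value.startsWith('data:')) return null
  if (value.length < EXTERNALIZE_THRESHOLD) return null
  const comma = value.indexOf(',')
  if (comma === -1) return null
  const header = value.slice(0, comma)
  const body = value.slice(comma + 1)
  // base64 data URLs decode to binary; otherwise the payload is percent-encoded text
  return header.endsWith(';base64')
    ? Uint8Array.from(atob(body), c => c.charCodeAt(0))
    : new TextEncoder().encode(decodeURIComponent(body))
}
```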

@atopal added the P2 label and removed the need/triage label Oct 22, 2021
@atopal

atopal commented Oct 22, 2021

Needs investigation.

@JeffLowe
Contributor

related to #390 per Alan

@dchoi27
Contributor

dchoi27 commented Nov 12, 2021

Yeah, more precisely #645 will close this.

alanshaw pushed a commit that referenced this issue Jan 21, 2022
This uploads data sent to the `/upload` endpoint to S3 for disaster recovery. All data is converted to CAR (if not CAR already) before being uploaded to S3.

Note: does not back up pinned data or IPNFTs uploaded to the `/store` endpoint

resolves #390
resolves #637
resolves #355