[WIP] feat: support chunked add requests #1540

hugomrdias · 2018-09-03T16:26:08Z

This PR supports this ipfs-inactive/js-ipfs-http-client#851 in the ipfs-api

todo:

use file per chunk, resume should simpler with this
~~try not using files and add directly~~
maxBytes value ???

next(in a new PR):

resumable add
concurrent chunk request
delete zombie tmp files

These need more work so we don't add corrupted data.

check duplicates
check missing chunks
check chunk size matches content-range
in the last request right before adding check that full size matches the number of chunk in the folder

lidel

See my thoughts at ipfs-inactive/js-ipfs-http-client#851 (review) first, then the inline comment below.

lidel · 2018-09-03T23:46:35Z

src/http/api/routes/files.js


+const streams = []
+const filesDir = tempy.directory()


Just so that we don't forget: we should remove temporary files after successful upload, but also have a plan for what happens with temp files produced by failed/aborted ones. I don't think we can assume temp dir will be cleared by the OS.
Leaving them forever may be a self-administered DoS attack (eventually running out of disk space)

yes definitely should go like this

success adding ? delete file

errors during session ? define a way to remove only the last chunk from the file ( or have one file per chunk ) this would allow for resumable add's

on new session start async check the temp folder and delete all files older than 1 hour

also we need to handle open streams, too many can become a problem

lidel · 2018-09-09T21:37:47Z

src/http/api/routes/files.js

+      if (err) {
+        pushStream.push({
+          Message: err.toString(),
+          Code: 0,


FYI I've tracked down the source of values used in Code field in go-lang implementation (at go-ipfs-cmdkit/error.go#L9-L25).

Translating from iota notation, it is an enum:

ErrNormal // ErrorType = 0 ErrClient // ErrorType = 1 ErrImplementation // ErrorType = 2 ErrNotFound // ErrorType = 3 ErrFatal // ErrorType = 3

Not sure if that is a problem, probably hardcoding 0 here is fine.

lidel · 2018-09-09T22:08:29Z

src/http/api/routes/files.js

+      payload: {
+        parse: false,
+        output: 'stream',
+        maxBytes: 1000 * 1024 * 1024


How do we pick what is the default max chunk size? Eg. why 1GB and not 256MB ?

Is this going to be a hardcoded limit, or a configuration option that can be passed to js-ipfs constructor?

How do we pick what is the default max chunk size? Eg. why 1GB and not 256MB ?

we need this value to be high for the non chunked path

Is this going to be a hardcoded limit, or a configuration option that can be passed to js-ipfs constructor?

don't know. what you think ?

from the hapi documentation when output=stream, maxBytes doesn't matter but this doesnt seem to be the case. Needs further investigation, maybe it's a matter of upgrading hapi dunno

I can't think of anything right now.
Unless you see good use case for customizing it, lets go with hardcoded value for now.
Exposing it as a config property can be contributed later in separate PR.

alanshaw · 2018-09-13T21:25:39Z

@hugomrdias is this ready for review or does this need feedback for you to proceed?

If yes then please in future assign me as a reviewer and remove the "[WIP]" label from the title when it's ready so I don't skim down the PR list and miss it! Thank you ❤️

hugomrdias · 2018-09-13T21:30:43Z

Yes I'm gonna do that, just getting this ready for you and others. I have a couple more commits to push with tests and a big fix for a bug I found today.

alanshaw

This is a great start 😄 🚀 thanks for working on this - I'm excited to be able to upload big data to IPFS from the browser!

Just a note while I'm thinking about it - the API will need documentation in the README for this and for bonus points an example with a big file!?

There's some review comments for specific tests but this PR needs a suite of tests to ensure the chunking works correctly. Note it doesn't have to include super big files!

Some test cases I'm interested in:

Do the temp files get cleaned up on success/error or incomplete request (final chunk not sent)?
Do the chunking headers get validated properly?
Can I send a single chunk upload?
If I make the chunk size really small does it still work?
Does progress reporting still work?

alanshaw · 2018-09-27T10:38:49Z