Backwards compatibility of `ipfs.add(urlSource(url))` #3195

Gozala · 2020-07-24T16:42:21Z

This came up in #3081 (comment) and me and @achingbrain agreed to continue discussion on the separate channel (this issue)

Problem

ipfs.add is supposed to take a single file input (right now it will happily accept many, but that is going to change in the future).
ursSource returns multiple files (see: https://github.com/ipfs/js-ipfs-utils/blob/master/src/files/url-source.js#L5)

Options to address this:

Make ipfs.add support multiple files.
Make ursSource return singe FileObject.
Embrace native data types (E.g. DOM Response object)

My personal opinions are as follows

I really don't like the option 1. It would prevent us from taking advantage of add / addAll separation to reduce complexity.
Option 2. seems like a good compromise
- Looking into urlSource implementation, there is no reason it for it to return AsyncIterable<FileObject> instead it could return { path: string, * content() { yield * (await new Http().get(url, options)).iterator() } }.
- What's bad about urlSource is that it will cause buffering in browser because of streaming.
  - This could probably addressed by introducing another code path for browser which would return { path: sting, content: AsyncIterable<Blob> }
Option 3. Makes most sense to me, but it's a larger breaking change
- urlSource essentially returns an async task (not the result) that add will run and consume results
  - This seems wrong to me add is not a scheduler for running tasks, it should take results instead.
- Standard Response represents result of fetching remote data.
- Conveniently it allows consumer to choose betwen blob and stream consumption strategies.
- However it would require await fetch(url) on user end, but that makes sense (add isn't a task scheduler).
  - If we add should take on a task scheduler role, then it could accept standard Request instances.

With all that said I think best compromise would be to make ursSource return FileObject and I can take on that as a followup to #3081 that would enable tests that use urlSource now.

The text was updated successfully, but these errors were encountered:

lidel · 2020-07-27T15:50:23Z

Unsure what would be the generic use case for urlSource in browser context ("paste URL here to import it to IPFS" page for re-hosting stuff?), but for what its worth IPFS Companion would most likely switch to urlSource for importing images/videos/pages via context menu on pages. For this specific use case, buffering should not be a problem, because the data is already in the browser cache.

I agree the best compromise for now is Option 2.

Gozala · 2020-07-29T20:03:35Z

Unsure what would be the generic use case for urlSource in browser context ("paste URL here to import it to IPFS" page for re-hosting stuff?)

Not sure what you mean by generic use case, but https://github.com/inkandswitch/xcrpt add-on we did crawl the page to archive it into IPFS. However as I pointed out in the description (quoting below)

urlSource essentially returns an async task (not the result) that add will run and consume results

This seems wrong to me add is not a scheduler for running tasks, it should take results instead.

That makes urlSource(url) inadequate for that, because:

Some URLs are inaccessible

Some resources can't be fetched (e.g. images) due to of CORS headers, even though they are in the cache.
Some URLs may refer to not found resources (404).

To workaround some CSP issues you need to do some fetching in the add-on context as opposed to page context and pass things back and forth.

All this to suggest that ipfs.add / ipfs.addAll can't possibly recover from such errors, but user space code can. Which is why I find urlSource to be inadequate for most production use cases.

but for what its worth IPFS Companion would most likely switch to urlSource for importing images/videos/pages via context menu on pages. For this specific use case, buffering should not be a problem, because the data is already in the browser cache.

You will likely face problems outlined in the section above, which would make urlSource to be a poor choice.
If that is the case we should consider urlSource to use different code path in browser. Or else you'd fall of happy path as per Embracing web native FormData / File where possible #3029, because instead of passing blobs pointing to the in-memory resource you'd effectively be reading it via stream and then buffering it back into blob.

Changes the return type of this `urlSource` from `AsyncIterable<{ path: string, content: AsyncIterable<Uint8Array> }>` to `{ path: string, content: AsyncIterable<Uint8Array> }` removing one layer of asynchronicity and making return be a single file as opposed to collection of files that always happen to contain one file. fixes: ipfs/js-ipfs#3195 Co-authored-by: achingbrain <alex@achingbrain.net>

Gozala added the need/triage Needs initial labeling and prioritization label Jul 24, 2020

Gozala mentioned this issue Jul 24, 2020

feat: share IPFS node between browser tabs #3081

Merged

6 tasks

Gozala self-assigned this Jul 28, 2020

terichadbourne mentioned this issue Jul 29, 2020

Update Regular Files API to reflect new add / addAll distinctions ProtoSchool/protoschool.github.io#498

Open

Gozala mentioned this issue Jul 29, 2020

feat: make urlSource compatible with new ipfs.add ipfs/js-ipfs-utils#53

Merged

lidel mentioned this issue Aug 6, 2020

refactor: the async await extravaganza ipfs/ipfs-companion#811

Merged

14 tasks

jacobheun added status/in-progress In progress and removed need/triage Needs initial labeling and prioritization labels Aug 20, 2020

achingbrain closed this as completed in ipfs/js-ipfs-utils#53 Jan 15, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Backwards compatibility of `ipfs.add(urlSource(url))` #3195

Backwards compatibility of `ipfs.add(urlSource(url))` #3195

Gozala commented Jul 24, 2020

lidel commented Jul 27, 2020 •

edited

Loading

Gozala commented Jul 29, 2020

Backwards compatibility of ipfs.add(urlSource(url)) #3195

Backwards compatibility of ipfs.add(urlSource(url)) #3195

Comments

Gozala commented Jul 24, 2020

Problem

Options to address this:

lidel commented Jul 27, 2020 • edited Loading

Gozala commented Jul 29, 2020

Backwards compatibility of `ipfs.add(urlSource(url))` #3195

Backwards compatibility of `ipfs.add(urlSource(url))` #3195

lidel commented Jul 27, 2020 •

edited

Loading