Skip to content
This repository has been archived by the owner on Mar 10, 2020. It is now read-only.

refactor: async iterables #1183

Merged
merged 2 commits into from
Jan 23, 2020
Merged

refactor: async iterables #1183

merged 2 commits into from
Jan 23, 2020

Conversation

alanshaw
Copy link
Contributor

@alanshaw alanshaw commented Nov 27, 2019

TLDR;

  • Remove Node.js streams and pull streams
  • Remove callbacks
  • Remove peer-info and peer-id

Now that internals are all async/await/iterables and fetch the next step is to bubble up that goodness to the core API, removing the multiple stream APIs and removing callback support.

I'm also proposing removing peer-info and peer-id, since these drastically increase the bundle size by pulling in libp2p-crypto, for which 99% of it's capability is unused. In place of peer-id we return a CID, which can easily be converted to a PeerId instance via:

const peerId = PeerId.createFromCID(peerCid)

In place of peer-info we return an object { id: CID, addrs: Multiaddr[] }, which can easily be converted to a PeerInfo like:

const peerInfo = new PeerInfo(PeerId.createFromCID(info.id))
info.addrs.forEach(addr => peerInfo.multiaddrs.add(addr))

refs ipfs/js-ipfs#1670
refs ipfs/js-ipfs#2611
refs ipfs-inactive/interface-js-ipfs-core#394

TODO:

Depends on:

BREAKING CHANGE: Callbacks are no longer supported on any API methods. Please use a utility such as callbackify on API methods that return Promises to emulate previous behaviour.

BREAKING CHANGE: PeerId and PeerInfo classes are no longer statically exported from ipfs-http-client since they are no longer used internally.

BREAKING CHANGE: pin.add results now contain a cid property (a CID instance) instead of a string hash property.

BREAKING CHANGE: pin.ls now returns an async iterable.

BREAKING CHANGE: pin.ls results now contain a cid property (a CID instance) instead of a string hash property.

BREAKING CHANGE: pin.rm results now contain a cid property (a CID instance) instead of a string hash property.

BREAKING CHANGE: add now returns an async iterable.

BREAKING CHANGE: add results now contain a cid property (a CID instance) instead of a string hash property.

BREAKING CHANGE: addReadableStream, addPullStream have been removed.

BREAKING CHANGE: ls now returns an async iterable.

BREAKING CHANGE: ls results now contain a cid property (whose value is a CID instance) instead of a hash property.

BREAKING CHANGE: files.ls now returns an async iterable.

BREAKING CHANGE: files.readPullStream and files.readReadableStream have been removed.

BREAKING CHANGE: files.read now returns an async iterable.

BREAKING CHANGE: files.lsPullStream and files.lsReadableStream have been removed.

BREAKING CHANGE: files.ls now returns an async iterable.

BREAKING CHANGE: files.ls results now contain a cid property (whose value is a CID instance) instead of a hash property.

BREAKING CHANGE: files.ls no longer takes a long option (in core) - you will receive all data by default.

BREAKING CHANGE: files.stat result now contains a cid property (whose value is a CID instance) instead of a hash property.

BREAKING CHANGE: get now returns an async iterable. The content property value for objects yielded from the iterator is now an async iterable that yields BufferList objects.

BREAKING CHANGE: stats.bw now returns an async iterable.

BREAKING CHANGE: addFromStream has been removed. Use add instead.

BREAKING CHANGE: isIPFS is no longer exported from the client, please npm i is-ipfs or include the CDN script tag <script src="https://unpkg.com/is-ipfs/dist/index.min.js"></script> to use this utility in your applications.

BREAKING CHANGE: addFromFs has been removed. Please use the exported globSource utility and pass the result to add. See the glob source documentation for more details and an example.

BREAKING CHANGE: addFromURL has been removed. Please use the exported urlSource utility and pass the result to add. See the URL source documentation for more details and an example.

BREAKING CHANGE: name.resolve now returns an async iterable. It yields increasingly more accurate resolved values as they are discovered until the best value is selected from the quorum of 16. The "best" resolved value is the last item yielded from the iterator. If you are interested only in this best value you could use it-last to extract it like so:

const last = require('it-last')
await last(ipfs.name.resolve('/ipns/QmHash'))

BREAKING CHANGE: block.rm now returns an async iterable.

BREAKING CHANGE: block.rm now yields objects of { cid: CID, error: Error }.

BREAKING CHANGE: dht.findProvs, dht.provide, dht.put and dht.query now all return an async iterable.

BREAKING CHANGE: dht.findPeer, dht.findProvs, dht.provide, dht.put and dht.query now yield/return an object { id: CID, addrs: Multiaddr[] } instead of a PeerInfo instance(s).

BREAKING CHANGE: refs and refs.local now return an async iterable.

BREAKING CHANGE: object.data now returns an async iterable that yields Buffer objects.

BREAKING CHANGE: ping now returns an async iterable.

BREAKING CHANGE: repo.gc now returns an async iterable.

BREAKING CHANGE: swarm.peers now returns an array of objects with a peer property that is a CID, instead of a PeerId instance.

BREAKING CHANGE: swarm.addrs now returns an array of objects { id: CID, addrs: Multiaddr[] } instead of PeerInfo instances.

BREAKING CHANGE: block.stat result now contains a cid property (whose value is a CID instance) instead of a key property.

BREAKING CHANGE: bitswap.wantlist now returns an array of CID instances.

BREAKING CHANGE: bitswap.stat result has changed - wantlist and peers values are now an array of CID instances.

@alanshaw alanshaw mentioned this pull request Nov 27, 2019
5 tasks
@alanshaw alanshaw force-pushed the refactor/async-iterables2 branch 2 times, most recently from 80cda06 to a02bf45 Compare December 2, 2019 13:38
alanshaw pushed a commit to ipfs/js-ipfsd-ctl that referenced this pull request Dec 3, 2019
The interface tests in ipfs-inactive/js-ipfs-http-client#1183 have started throwing `UnhandledPromiseRejection`s for a connection refused error to `/api/v0/shutdown`. This is because `stop` was being called and then called again in `killProcess`.

With callbacks this error was smothered but it is now appearing with the promise-only API.
@alanshaw alanshaw marked this pull request as ready for review December 6, 2019 15:36
Copy link
Contributor

@hugomrdias hugomrdias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beautiful ❤️

src/name/resolve.js Outdated Show resolved Hide resolved
@alanshaw
Copy link
Contributor Author

I'm going to back out of the object.data returning an async iterable change - this doesn't make sense.

@alanshaw alanshaw requested a review from dirkmc December 12, 2019 15:32
src/files/ls.js Show resolved Hide resolved
Copy link
Contributor

@lidel lidel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alanshaw after sleeping on it I am bit worried about removing buffered versions of commands like cat.

Instead of

const data = await ipfs.cat(cid)

user now needs to ALWAYS wrap it with a helper:

const data = await concat(ipfs.cat(cid))

Something feels off here 🤔

I would expect methods like cat to be easy to use, buffered versions (as they are now) and provide separatecatStream for cases where streaming is needed.
Was this discussed before? Did I miss the train?

@@ -26,13 +25,22 @@ module.exports = configure(({ ky }) => {
})

for await (let message of ndjson(toIterable(res.body))) {
// 3 = QueryError
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to extract some of this boilerplate

  • for await (let message of ndjson(toIterable(res.body)))
  • if (message.Type === 3)
  • etc
    into a function / class?

}
})

function toCoreInterface (entry) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be worth extracting this into a shared function also

@alanshaw
Copy link
Contributor Author

@alanshaw after sleeping on it I am bit worried about removing buffered versions of commands like cat.

Instead of

const data = await ipfs.cat(cid)

user now needs to ALWAYS wrap it with a helper:

const data = await concat(ipfs.cat(cid))

Something feels off here 🤔

If they want to buffer data into memory then yes that's what they need to do. They definitely don't ALWAYS need to do this! I'd advise against doing so anyway unless you know how much data you're buffering and that it's coming from your local repo.

If we provide them, users will just use the non-streaming APIs and be frustrated when things take a long time to yield any results and they get no feedback. For big files, it's necessary to stream data and given the majority of people won't know the size of the file they're retrieving, and are trying to obtain it from a network of poorly connected peers we should provide an API that a) encourages them to do the right thing and not buffer content into memory regardless of the size of the file they're retrieving and b) reduces time to first byte, gives them visibility over progress and allows them to differentiate between the file being unavailable or it being large.

On a related point, there's likely to be built in helpers to async iterables soon https://github.com/tc39/proposal-iterator-helpers so you'd be able to await for collect or reduce.

@alanshaw alanshaw force-pushed the refactor/async-iterables2 branch 2 times, most recently from a83f101 to 2aaf632 Compare December 17, 2019 11:37
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants