Skip to content
This repository has been archived by the owner on Aug 11, 2021. It is now read-only.

Major API changes #185

Closed
wants to merge 5 commits into from
Closed

Major API changes #185

wants to merge 5 commits into from

Conversation

vmx
Copy link
Member

@vmx vmx commented Nov 22, 2018

The whole IPLD APIs get a review to make them more consistent and
easier to use.

Changes to compared to the current spec:

  • No more callbacks, everything is Promises.
  • get():
    • Is renamed to resolve() to indicate that it also takes a mandatory
      path argument.
    • Doesn’t have a cid in the return value anymore. That use case is
      covered by returning all objects that were traversed. Their value
      will always be the CID of the next one to traverse. So if you want
      to know the CID of the current IPLD Node, just look at the value
      of the previously returned IPLD Node.
    • Doesn’t return a single value, but an iterator.
    • localResolve option is removed. If you want to resolve a single IPLD
      Node only, just stop the iterator after the first item.
  • getStream() is removed without replacement. Use resolve() which uses
    async iterators instead.
  • getMany() is renamed to get():
    • takes an iterable of CIDs as parameter.
  • put():
  • .treeStream() is removed without replacement (the only current user seems
    to be the IPFS DAG API tree command and the ipld-explorer-cli). IPLD Nodes
    are just normal JavaScript objects, so you can call Object.keys()
    recursively on a IPLD Node to get all its resolvable paths. If you want to
    follow all the CIDs, write the tree traversal yourself. This could perhaps
    be an example bundled with js-ipld.
  • remove():
    • takes an iterable of CIDs as parameter.
  • .support.add() is renamed to .addFormat().
  • .support.rm() is renamed to .removeFormat().

Almost all open issues in regards to the IPLD API got adressed. One bigger
thing is the Patch API, which also isn’t part of the current specification.
Here’s an overview of the issues:

Closes #182.

Others that might be interested: @achingbrain @alanshaw @olizilla

@ghost ghost assigned vmx Nov 22, 2018
@ghost ghost added the status/in-progress In progress label Nov 22, 2018
The whole IPLD APIs get a review to make them more consistent and
easier to use.

Changes to compared to the current spec:

 - No more callbacks, everything is Promises.
 - `get()`:
   - Is renamed to `resolve()` to indicate that it also takes a mandatory
     path argument.
   - Doesn’t have a `cid` in the return value anymore. That use case is
     covered by returning all objects that were traversed. Their `value`
     will always be the CID of the next one to traverse. So if you want
     to know the CID of the current IPLD Node, just look at the `value`
     of the previously returned IPLD Node.
   - Doesn’t return a single value, but an iterator.
   - `localResolve` option is removed. If you want to resolve a single IPLD
      Node only, just stop the iterator after the first item.
 - `getStream()` is removed without replacement. Use `resolve()` which uses
    async iterators instead.
 - `getMany()` is renamed to `get()`:
   - takes an iterable of CIDs as parameter.
 - `put()`:
   - takes an iterable of CIDs as parameter.
   - Doesn’t take `cid` as an option anymore. The CID is always calculated
     (#175).
   - The options don’t take the `format` (which is a string), but the `codec`
     (which is a `multicodec`) (#175).
   - the `version` option always defaults to `1`.
 - `.treeStream()` is removed without replacement (the only current user seems
   to be the IPFS DAG API tree command and the ipld-explorer-cli). IPLD Nodes
   are just normal JavaScript objects, so you can call `Object.keys()`
   recursively on a IPLD Node to get all its resolvable paths. If you want to
   follow all the CIDs, write the tree traversal yourself. This could perhaps
   be an example bundled with js-ipld.
 - `remove()`:
   - takes an iterable of CIDs as parameter.
 - `.support.add()` is renamed to `.addFormat()`.
 - `.support.rm()` is renamed to `.removeFormat()`.

Almost all open issues in regards to the IPLD API got adressed. One bigger
thing is the Patch API, which also isn’t part of the current specification.
Here’s an overview of the issues:

 - #66
   - [ ] IPLD Resolver `patch` API: There is no patch API yet
 - #70
   - [x] lazu value lookups: Can be done as IPLD Formats is pure JavaScript
   - [x] get external / local paths only: @vmx doesn’t know what this is,
         considers it a “won’t fix”
   - [x] toJSON + fromJSON - roundtrip: A roundtrip is not a goal anymore
         => won’t fix
   - [ ] put + patch api #66: Patch API is still missing
   - [x] text summary: @vmx doesn’t see a strong use case for this => “won’t fix”
   - [x] globbing: Out of scope for js-ipld
 - #175
   - [x] Deprecate passing a CID to `ipld.put`?: Current spec doesn’t allow
         `put()` with a CID
 - #182
   - [x] API Review: Links to many other issues, I list the individual issues
         below
 - #183
   - [x] getting the merkle proof with resolver.resolve: `resolve()` returns
         all CIDs along the traversed path
 - #184
   - [ ] Pass down the `options` object to resolver.resolve(): This needs a
         custom resolver. @vmx isn’t sure if the js-ipld API should make this
         possible, or of it should just be easy enough to create your own
         resolver.
 - https://github.com/ipfs/interface-ipfs-core/issues/81
   - [x] The `dag` API - One API to manipulate all the IPLD Format objects:
         Interesting discussion, but not relevant anymore.
 - ipfs-inactive/interface-js-ipfs-core#121
   - [x] dag api: add {ls, tree}: `tree()` is not part of js-ipld anymore as
         the IPLD Nodes are just JavaScript objects which you can easily traverse
         if you like. Though `tree()`-like functionality might be added as an
         example or separate module.
 - ipfs-inactive/interface-js-ipfs-core#125
   - [x] `dag get --localResolve` vs `dag resolve`: This is solved by the new
         `resolve()` API
 - ipfs-inactive/interface-js-ipfs-core#137
   - [x] add `level` option to ipfs.dag.tree: `tree()` is not part of js-ipld
         anymore. It should be considered if `tree()` is implemented (probably
         as an example or separate module)

Closes #182.
@codecov
Copy link

codecov bot commented Nov 26, 2018

Codecov Report

Merging #185 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #185   +/-   ##
=======================================
  Coverage   93.43%   93.43%           
=======================================
  Files           1        1           
  Lines         198      198           
=======================================
  Hits          185      185           
  Misses         13       13

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9a12316...5139f53. Read the comment docs.

@alanshaw
Copy link
Member

alanshaw commented Nov 26, 2018

Could we please get worked examples to easier understand the proposed API? They should just be on the README.


Changes to compared to the current spec:

* No more callbacks, everything is Promises.

...and iterators ;)

* `get()`:
  
  * Doesn’t have a `cid` in the return value anymore. That use case is
    covered by returning all objects that were traversed. Their `value`
    will always be the CID of the next one to traverse. So if you want
    to know the CID of the current IPLD Node, just look at the `value`
    of the previously returned IPLD Node.

Is the TLDR that when I do an ipld.resolve(cid, path) the last item of the iterator is the equivalent of the old ipld.get(cid, path, cb)?

  * Doesn’t return a single value, but an iterator.
  * `localResolve` option is removed. If you want to resolve a single IPLD
    Node only, just stop the iterator after the first item.

I really like this idea!

* `getMany()` is renamed to `get()`:
  
  * takes an iterable of CIDs as parameter.

...and returns?

Does it have to accept an iterable? Can it also accept a single CID?

* `put()`:
  
  * The options don’t take the `format` (which is a string), but the `codec`
    (which is a `multicodec`) (#175).

What's the reasoning behind this? I think a string codec name is more user friendly. Perhaps it could be flexible and accept both? Also, I'm fine with format - we're selecting the IPLD format to use with this option. codec could be confused with the codec for the base encoding in the CID, the codec of the multihash algorithm or potentially something else.

It's also consistent with the current IPFS DAG API e.g. https://github.com/ipfs/interface-ipfs-core/blob/master/SPEC/DAG.md#dagput and the IPFS block API https://github.com/ipfs/interface-ipfs-core/blob/master/SPEC/BLOCK.md#blockput

* `remove()`:
  
  * takes an iterable of CIDs as parameter.

Does it have to accept an iterable? Can it also accept a single CID?

* `.support.add()` is renamed to `.addFormat()`.
* `.support.rm()` is renamed to `.removeFormat()`.

...and could you expand on what they accept?

@vmx
Copy link
Member Author

vmx commented Nov 26, 2018

Thanks for the feedback!

Is the TLDR that when I do an ipld.resolve(cid, path) the last item of the iterator is the equivalent of the old ipld.get(cid, path, cb)?

No. The last element will only contain value and remainderPath and not the CID. In order to get the CID, you'd need to look at the previous element.

...and returns?
...and could you expand on what they accept?

It's just an overview so you get an idea if you know the old API/want to upgrade. For details please see the changes on the README itself.

Does it have to accept an iterable? Can it also accept a single CID?

I'm unsure myself. On one hand I'd like to keep the API (and types) as small as possible, on the other hand it should still be nice to use. So I guess changing it to "single item or iterable" might be a good idea. Output from others would be great.

About taking the format as string or codec:

I'd prefer if all libraries would use the codec instead of a string internally. To pass the right value in you could then import the multicodec table and use something like multicodec.DAG_CBOR (or so). This is neat for static code analysis.

@achingbrain
Copy link
Member

Interesting, I think these changes are pretty good, though like @alanshaw I'd like to see some examples of old vs new api.

get():
Doesn’t have a cid in the return value anymore.
if you want to know the CID of the current Node, just look at the value of the previously Node

Is there ever a case where there wouldn't be a previous node?

@vmx
Copy link
Member Author

vmx commented Nov 26, 2018

get():
Doesn’t have a cid in the return value anymore.
if you want to know the CID of the current Node, just look at the value of the previously Node

Is there ever a case where there wouldn't be a previous node?

Yes there is. If you only return a single node. But in this case you already know that CID of that node as it is the one you passed into get().

@achingbrain
Copy link
Member

Then I'd really like to see some examples of how you see this working in practice 😉

Is it so bad to pass all of the context (e.g. including the CID of the result) to the result handler? That way the handler wouldn't need to know about anything further up the call chain so can limit its scope.

Most importantly the `tree()` method is added again. It's really needed
for backwards compatibility.
Loading a format is no longer callback based but is using an
async function.
`doctoc` was used to update the table of contents.
@achingbrain
Copy link
Member

I think this is all fine. The only extra thing I'd like is for resolve(cid, path) to pass the last resolved cid back along with the results so that the calling code doesn't have to keep track of it itself and so can be simplified.

@vmx
Copy link
Member Author

vmx commented Jan 28, 2019

The only extra thing I'd like is for resolve(cid, path) to pass the last resolved cid back along with the results so that the calling code doesn't have to keep track of it itself and so can be simplified.

I had this code, but then I wanted to keep things simple. I didn't really need it for any of the PRs. When it comes up and it helps simplifying code, I'm happy to add it in a later PR.

@achingbrain
Copy link
Member

Fair enough. We had that feature previously - it made js-ipfs-unixfs-exporter simpler but adding support for HAMT sharding means there isn't a 1-to-1 mapping between UnixFS paths and DAG paths any more so it's use was removed.

@achingbrain achingbrain self-requested a review January 28, 2019 15:18
@achingbrain
Copy link
Member

achingbrain commented Jan 29, 2019

If I get a thumbsup from you and @alanshaw that the API looks OK, I'll do a release.

@alanshaw do you have any thoughts on the API changes?

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Show resolved Hide resolved
README.md Show resolved Hide resolved
- `version` (`number`, default: 1): the CID version to use.
- `onlyHash` (`boolean`, default: false): if true the serialized form of the IPLD Node will not be passed to the underlying block store.

Returns an async iterator with the CIDs of the serialized IPLD Nodes.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not return the full block instances? It doesn’t really cost us anything.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to expose the serialized form of the nodes like that? I don't think it's needed from the IPFS point of view but are there use cases where this is desirable?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's needed from the IPFS point of view but are there use cases where this is desirable?

It's actually a good point. When using IPLD you shouldn't really care about how it is serialized. I'm always in favour having a minimal API :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly - it doesn't feel like an application-level concern.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would agree if the API only returned the final value it had resolved, but it doesn’t, it returns all the CID’s it passed through to resolve the path. It’s already exposing the serialization just by giving you the CID’s of the intermediate nodes and in that case, why not just give me the blocks if you already have them? It costs nothing on the consumer side to ignore the content and just use the CID and it saves a BlockService lookup if you do need the data.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me it's about abstractions. In js-ipld you only care about the the IPLD Data types. A CID is as low as you go, you don't care about how it is serialized, but only about that it got serialized.

IPLD Formats are the low level building block, which is about serialized data. You should use that level if you have needs to operate on that level.

I think not exposing the serialized data would make IPLD easier to use as it is one less concept to learn.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mikeal This discussion is the last outstanding one. Would you be OK with moving forward with the current state?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ya, it’s fine, go ahead and merge.

README.md Show resolved Hide resolved
The `format` is no longer part of the the options object, but a
separate parameter. The `version` option was renamed to `cidVersion`.
@vmx
Copy link
Member Author

vmx commented Feb 7, 2019

I've created a rebased version of this PR at #191. It is now a single commit with an update commit message. Once merged it will close a lot of issues. Hence I'll wait for the merge unti #190 is merged, so that people looking into the README or the closed issue can actually use the new code.

Thanks everyone for the review!

@vmx vmx closed this Feb 7, 2019
@ghost ghost removed the status/in-progress In progress label Feb 7, 2019
@vmx vmx mentioned this pull request Feb 7, 2019
15 tasks
@vmx vmx deleted the api-review branch February 7, 2019 13:47
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

API Review
4 participants