This repository has been archived by the owner on Feb 12, 2024. It is now read-only.

feat: update DAG API to match go-ipfs@0.10 changes #3917

Merged (20 commits) on Dec 3, 2021

rvagg
Member

@rvagg rvagg commented Oct 11, 2021

Current status:

  • Focuses mainly on dag.put since that's where the bulk of the changes are
    • Option names have changed
    • Behaviour has changed: the options are now strictly "input codec" and "store codec", where "codec" is an IPLD codec and no longer the special JSON form that "input-enc" accepted. In the HTTP and Core APIs you pass the JS object as the node to store, so "input codec" makes less sense there. However, if you pass "input codec" together with a Uint8Array, it will either decode the bytes for you or send them upstream to be decoded. That gives equivalent behaviour you can use where it is a worthwhile optimisation for your use case (perhaps you already have the encoded bytes and want them stored with a different codec from the one they are encoded with).
  • CLI has --output-codec for dag get which matches go-ipfs, roughly:
    • no newline character
    • uses the codec you requested, defaulting to dag-json
    • if you want to print a Bytes node and request an output codec of "raw" then --data-enc will still work, but otherwise that argument isn't used like it was before (and I'm pretty sure it's js-ipfs specific)
  • Added an equivalent of first half of sharness t0053 from go-ipfs that pushes the DAG API around a bit
  • Added dag-json as a first-class codec

Outstanding:

  • fix: update for v0.10 DAG API, demonstrate BLOCK vs DAG ipfs-examples/js-ipfs-examples#77
  • CIDv0 was removed from go-ipfs 0.10 DAG API, you only have the option to get CIDv1. I think we should follow suit here.
  • It was raised in "fix: update DAG PUT API s/format/storeCodec" (ipfs-examples/js-ipfs-custom-ipld-formats#1), but I think the DAG GET API shouldn't be using the BLOCK API (and neither should DAG PUT); it should be a pass-through to the server. If it's more efficient to use the BLOCK API then the user should use that directly. There are a couple of reasons:
    • The DAG API on the server should be responsible for codecs. go-ipfs has a codec plugin architecture for this, so if the client doesn't have the codec but the server does, you should still be able to get the data from the server
    • Using the BLOCK API means you need the entire block even if you only want a tiny bit of it - if you want the entire block, use the BLOCK API directly
    • The DAG API is responsible for navigating through the DAG. Using the BLOCK API as a substitute on the client means you need all of the blocks from the server before you can navigate to the final node. If you're 5 blocks deep, each block is 1 MB, and all you want is a single 1 KB string in the middle of the final node, then you've had to download 5 MB and decode every block, in sequence, just to reach the final node. The server is better placed to do this and hand you what you want.
    • It should be a matter of documentation: use the DAG API if you want to traverse into nodes, or want a convenient way to get a decoded JS object for a single block. If you want to grab whole blocks and you have the codecs, use the BLOCK API. If your use case involves dealing with raw bytes, and possibly doing codec work on the client side, use the BLOCK API. If you have a custom codec on the server and not on the client, use the DAG API.
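
To put rough numbers on the traversal point above, here is a minimal sketch that compares bytes transferred when the client walks the path block by block against letting the server resolve it. Everything in it is an assumption made for illustration: the in-memory `blocks` map, the `next` link field and both `resolve*` functions are hypothetical and are not js-ipfs or go-ipfs APIs. The sizes mirror the 5 x 1 MB example.

```javascript
// Illustrative only: an in-memory stand-in for a server's block store.
// None of these names are js-ipfs APIs.
const MB = 1024 * 1024
const KB = 1024

// A chain of five 1 MB blocks; the last one holds the 1 KB string we want.
const blocks = new Map([
  ['b1', { size: MB, node: { next: 'b2' } }],
  ['b2', { size: MB, node: { next: 'b3' } }],
  ['b3', { size: MB, node: { next: 'b4' } }],
  ['b4', { size: MB, node: { next: 'b5' } }],
  ['b5', { size: MB, node: { value: 'x'.repeat(KB) } }]
])

// BLOCK-style traversal on the client: every block on the path has to be
// downloaded and decoded before the next link can be followed.
function resolveOnClient (rootId) {
  let transferred = 0
  let id = rootId
  while (true) {
    const block = blocks.get(id)
    transferred += block.size
    if (block.node.next === undefined) {
      return { value: block.node.value, transferred }
    }
    id = block.node.next
  }
}

// DAG-style traversal on the server: the server follows the links and only
// the final value crosses the wire (ASCII string, so length equals bytes).
function resolveOnServer (rootId) {
  let id = rootId
  while (blocks.get(id).node.next !== undefined) {
    id = blocks.get(id).node.next
  }
  const value = blocks.get(id).node.value
  return { value, transferred: value.length }
}
```

Both functions return the same value; only the `transferred` count differs, which is the argument for keeping path resolution on the server side.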

This is a Draft PR because there's more to do. The initial, most important thing is solving #3914 so the DAG PUT API works against 0.10, which this does. I think it'll end up being a non-breaking change for users unless they're using the format option or expecting to get a v0 CID from a dag-pb (which go-ipfs doesn't do anymore).

As a fun side effect, this introduces a possible optimisation for users in some situations - if they have the already-encoded bytes of a block then they can use DAG PUT to send that in and bypass the encode/decode cycles on the client side and the server will store it as is if they provide inputCodec. Although the BLOCK API is probably more appropriate for that use-case.

To do:

  • Server PUT & GET changes
  • CLI PUT and GET changes
  • Consider what to do with the changed dag-pb & unixfs output and pathing semantics - at a minimum we might want to follow suit and ditch the custom dag-pb object shape output and use pure dag-pb output (yay)

feat!: match go-ipfs@0.10 dag put options and semantics

Fixes: #3914
Ref: https://github.com/ipfs/go-ipfs/blob/master/CHANGELOG.md#v0100-2021-09-30

--format and --input-enc have been replaced with --input-codec and
--store-codec and mean something a little different. You now supply
raw input and tell the server which --input-codec that data is in.
The server decodes it, then re-encodes it with --store-codec before
storing it and providing you with the CID.

We accept plain JavaScript objects to encode with --store-codec via
the API here, defaulting to dag-cbor, and send that to the server as
encoded bytes using that codec, to be stored using that codec.

If you supply an --input-codec then we assume you're supplying raw,
encoded bytes using that codec and we pass that directly on to the
server to handle.
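
The two paths described in the commit message can be sketched roughly as below. This is not the code in this PR: `prepareDagPut` is a hypothetical helper, the codec registry is a stand-in that uses JSON over UTF-8 for both entries purely to stay self-contained (real dag-cbor and dag-json encoders live in their own packages), and throwing on non-byte input when `inputCodec` is set is an assumption of the sketch.

```javascript
// Stand-in codec registry. JSON over UTF-8 is used for BOTH entries only so
// the sketch runs without dependencies; it is not real dag-cbor encoding.
const encoder = new TextEncoder()
const jsonLike = { encode: (node) => encoder.encode(JSON.stringify(node)) }
const codecs = { 'dag-cbor': jsonLike, 'dag-json': jsonLike }

// Hypothetical helper: works out what a client would send for a dag put.
function prepareDagPut (node, options = {}) {
  // storeCodec defaults to dag-cbor, as stated in the commit message.
  const storeCodec = options.storeCodec ?? 'dag-cbor'

  if (options.inputCodec !== undefined) {
    // The caller says the data is already encoded with inputCodec, so the
    // bytes are passed on untouched for the server to decode and re-encode.
    // Rejecting non-bytes here is an assumption of this sketch.
    if (!(node instanceof Uint8Array)) {
      throw new TypeError('inputCodec expects already-encoded bytes')
    }
    return { inputCodec: options.inputCodec, storeCodec, body: node }
  }

  const codec = codecs[storeCodec]
  if (codec === undefined) {
    throw new Error(`unknown codec: ${storeCodec}`)
  }
  // Plain JS object: encode locally with storeCodec, so the codec the
  // server receives as input is the same one it stores with.
  return { inputCodec: storeCodec, storeCodec, body: codec.encode(node) }
}
```

With no options, a plain object is encoded locally and both codecs resolve to dag-cbor. With `inputCodec` set and a `Uint8Array` as input, the same byte array is returned as the body.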

@rvagg
Member Author

rvagg commented Oct 15, 2021

One of the failing examples is js-ipfs-custom-ipld-formats, but there's a problem, which I outline @ ipfs-examples/js-ipfs-custom-ipld-formats#1
The example will work for in-process calls, where you insert the custom codec into the core, but it won't work when you use the DAG PUT API against a remote that doesn't have the codec. You need to use the BLOCK PUT API for that. With this PR, js-ipfs and go-ipfs are now consistent on this. The js-ipfs-custom-ipld-formats example won't work if your remote is go-ipfs; it only works with a js-ipfs remote because it shortcuts via the BLOCK API internally.

@rvagg rvagg marked this pull request as ready for review October 15, 2021 06:58
@lidel lidel marked this pull request as draft October 15, 2021 14:10
@lidel
Member

lidel commented Oct 15, 2021

@rvagg is the TODO list here still up to date?

@rvagg
Member Author

rvagg commented Oct 18, 2021

@lidel I've updated OP with the current status. I reckon this is good to merge as it is but there's some outstanding items to do and discuss that I've documented there.

@rvagg rvagg marked this pull request as ready for review October 18, 2021 10:02
@rvagg
Member Author

rvagg commented Oct 19, 2021

I don't understand the lint error on CI; I can't repro it locally.

@karim-agha

any updates on this? It's blocking us at the moment.

@lidel
Member

lidel commented Oct 25, 2021

I think CI is broken because npm run dep-check uses aegir, which uses a bleeding-edge version of dependency-check.
dependency-check@5.0.0-2 changed the default detective, but was released as a patch release, so it got pulled in automatically (?) 💀

@hugomrdias does the above make sense? Is this something we need to fix on aegir side?

ps. I wanted to confirm this breakage is caused by external dependency change and re-run CI on PR that was green, and it failed for a different reason. 💀 💀

@oed
Contributor

oed commented Nov 3, 2021

Any update on this? It's a blocker for us as well.

@BigLep
Contributor

BigLep commented Nov 12, 2021

@rvagg: are you able to get the build to pass? @achingbrain will take on the code review afterwards.

SgtPooki referenced this pull request in ipfs/js-kubo-rpc-client Aug 18, 2022
Co-authored-by: Marcin Rataj <lidel@lidel.org>
Co-authored-by: achingbrain <alex@achingbrain.net>

BREAKING CHANGE: `ipfs.dag.put` no longer accepts a `format` arg, it is now `storeCodec` and `inputCodec`. `'json'` has become `'dag-json'`, `'cbor'` has become `'dag-cbor'` and so on.
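
For callers affected by the rename, a small shim along these lines might ease migration. It is a sketch only: `migrateDagPutOptions` is a hypothetical helper and not part of js-ipfs, it covers just the two renames spelled out above, and any other `format` value is passed through unchanged because the full mapping behind "and so on" is not listed here.

```javascript
// Only the renames stated in the BREAKING CHANGE note are mapped.
const RENAMED_CODECS = new Map([
  ['json', 'dag-json'],
  ['cbor', 'dag-cbor']
])

// Hypothetical helper (not part of js-ipfs): turns pre-change dag.put
// options into the new shape by replacing `format` with `storeCodec`.
function migrateDagPutOptions (oldOptions = {}) {
  const { format, ...rest } = oldOptions
  if (format === undefined) {
    // Nothing to rename; return the remaining options as they were.
    return rest
  }
  // Unknown names are kept as-is rather than guessed at.
  return { ...rest, storeCodec: RENAMED_CODECS.get(format) ?? format }
}
```

A call site would then look like `ipfs.dag.put(node, migrateDagPutOptions(oldOptions))`, where `oldOptions` is whatever options object the caller used before the change.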
Successfully merging this pull request may close these issues.

http-client dag.put failed with invalid byte