ipfs dag diff #4801
Comments
Pinning services would really like to have this ability to compare DAGs behind two CIDs and get a list of block CIDs for additions and removals, as that would enable them to optimize the replace pin operation in their internal infra. Sounds like something we may want to look into after ipld-prime refactor is done. |
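As a rough illustration of the block-level output described above, the Go sketch below walks two DAGs and reports which block CIDs are reachable only from the new root (additions) and only from the old root (removals). The `dag` map type, the `reachable` and `diffBlocks` helpers, and the CID strings are all hypothetical stand-ins; real code would resolve links through a blockstore or DAG service instead of an in-memory map.

```go
package main

import (
	"fmt"
	"sort"
)

// dag is a hypothetical in-memory stand-in for a blockstore:
// it maps a block CID to the CIDs that block links to.
type dag map[string][]string

// reachable returns the set of block CIDs reachable from root,
// including root itself.
func reachable(d dag, root string) map[string]bool {
	seen := map[string]bool{}
	stack := []string{root}
	for len(stack) > 0 {
		c := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if seen[c] {
			continue
		}
		seen[c] = true
		stack = append(stack, d[c]...)
	}
	return seen
}

// diffBlocks lists blocks found only under newRoot (added) and
// only under oldRoot (removed), each sorted for stable output.
func diffBlocks(d dag, oldRoot, newRoot string) (added, removed []string) {
	oldSet, newSet := reachable(d, oldRoot), reachable(d, newRoot)
	for c := range newSet {
		if !oldSet[c] {
			added = append(added, c)
		}
	}
	for c := range oldSet {
		if !newSet[c] {
			removed = append(removed, c)
		}
	}
	sort.Strings(added)
	sort.Strings(removed)
	return added, removed
}

func main() {
	// Two roots that share block "y" (all names are made up).
	d := dag{
		"rootA": {"x", "y"},
		"rootB": {"y", "z"},
	}
	added, removed := diffBlocks(d, "rootA", "rootB")
	fmt.Println("added:", added)
	fmt.Println("removed:", removed)
}
```

This version visits every block under both roots. Because equal CIDs imply equal subgraphs, a less naive walk could stop descending as soon as it meets a CID already seen under the other root, which is likely where most of the savings for a replace-pin operation would come from.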
I know dag-pb is not compatible with ADLs yet (ipld/go-ipld-prime#258), and it may not even be a proper abstraction (since we want low-level dag inspection). Perhaps we could special-case/shim it for now, and make |
Working on some special case variant that will at least cover the basic |
Some additional thoughts on

On operating at the right abstraction level

The output of The My take is that This means the

On output format

I believe The returned diff would be in JSON Patch format, as specified in RFC 6902 from the IETF. Rationale: IPLD Data Model is closer to JSON than anything else, and we should reuse existing RFC standards whenever possible.

On diffing DagPB nodes using Logical Format
We now have an IPLD Schema that describes the Logical Format of dag-pb:

type PBNode struct {
  Links [PBLink]
  Data optional Bytes
}

type PBLink struct {
  Hash Link
  Name optional String
  Tsize optional Int
}

I think a safe approach here is to special-case dag-pb and in the This is already what we do in A dag-pb block passed to

{
  "Data": {
    "/": {
      "bytes": "CAIYBw"
    }
  },
  "Links": [
    {
      "Hash": {
        "/": "bafyfoo"
      },
      "Name": "foo",
      "Tsize": 42
    }
  ]
}

Unixfs is also raw blocks, and those return:

{
  "/": {
    "bytes": "CAIYBw"
  }
}

With this convention, comparing JSON representations of DAG-PB Logical Format, we have JSON Patch output like this:

{ "op": "replace", "path": "Data", "value": { "/": { "bytes": "CAIYBw" } } }

If the diff is in the file name, size, or CID of the first linked node:

{ "op": "replace", "path": "Links/0/Name", "value": "NewName" }
{ "op": "replace", "path": "Links/0/Tsize", "value": 42 }
{ "op": "replace", "path": "Links/0/Hash", "value": { "/": "bafyfoo" } }

etc. By using Logical Format for dag-pb the

cc @Stebalien @warpfork @achingbrain @rvagg for sanity check around this approach for unixfs |
Yep, all reasonable, and I'd like to see what we're calling the "Logical Format" become standard - already with the new DAG API changes it is because go-codec-dagpb instantiates it this way, the older format exists in the node implementations in go-merkledag but mainly exists to serve named pathing. So a As for diff output format, I don't have strong opinions and json patch doesn't seem unreasonable to me, at least it's an existing standard and we can just run the patch data through dag-json to get proper bytes and cid output with it. @warpfork I'm pretty sure you have some existing opinions here and some experimentation - and doesn't this overlap with your ipld-prime patch API work? Can we leverage that here at all? |
I'm pretty much thumbs up across the board, but I'll highlight bits and thumbs up them individually since there was a lot in that post. ^^
👍 👍 👍
👍
👍 I wrote up an issue about a hypothetical "IPLD Patch" recently but it mostly says "JSON Patch is pretty good and we should follow it".
👍 yeah it's honestly pretty readable.
👍 🙌 and I totally love that we're using it to describe this. Nit: I'm a little confused by the line after that where it's described as "special-case" for dag-pb... I think all the rest of the examples you put there are not special cases now :) This is the structure that go-codec-dagpb will give you now, as @rvagg said. No fuss :D
🙌 🙌 🙌 🙌 Yep. That's exactly the kind of clarity I'd really want to be able to get from a gadget called 'diff'. We might have more than one mode later, and maybe some that do higher level stuff, but ISTM that this should indeed be the base case. |
👍 @schomatis any preference around JSON Patch library? I've found https://github.com/evanphx/json-patch – actively maintained and supports both JSON Patch (RFC 6902), and also JSON Merge Patch (RFC 7396, which may be useful later for #4782) |
I just gave some looks at https://github.com/evanphx/json-patch --
I'm not opposed to anything that makes solutions happen (especially if the result is something that has a stable API that we can happily keep to even while replacing the implementation), and it seems like the "tightly bound to JSON" thing can be ignored then. But I don't know if that library will do the right kind of patch generation at all, unfortunately. |
No experience with this so no preference. From the library recommendations in jsonpatch that seems the more maintained one so let's go with that. Ideally the library used to perform the diff shouldn't be too entangled with the code here so changing down the road if we want won't be a complex operation. |
Per Lidel's request in ipfs/go-merkledag#82 (comment) going with diffing JSON output instead of implementing a specialized DAG-PB diff logic. |
👍 A quick explainer of my thinking is we want a basic poc to see if we are happy with JSON Patch output / ergonomics, and then decide on next steps, namely:
|
It seems that library doesn't create patches/diffs; it only applies them or checks for equality. Trying https://github.com/mattbaird/jsonpatch as a first attempt. |
Agreed; that's what I'll provide. A basic comparison of JSON output from |
Sorry @warpfork, I see that you were already flagging this. I got confused by all the different names and RFCs. |
Relevant IPLD spec: https://ipld.io/specs/patch/ |
Use-case: replace ipfs object diff with a generalized version that works for generalized IPLD DAGs.
Text syntax:
+ for added items.
- for removed items.
~ for replaced items.
JSON syntax: Use JSON patch.
Generalized syntax: Basically, JSON patch with extra CBOR types.
Blocker: Pathing through DagPB nodes in go-ipfs isn't correct per the IPLD spec (paths don't reflect the object structure). See ipld/specs#55.
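One way the text syntax could relate to the JSON syntax is sketched below in Go: a JSON Patch document is parsed and each operation is rendered as a line prefixed with `+`, `-`, or `~`. The `patchOp` type, the `marker` and `renderText` helpers, and the exact line layout (marker, path, then the raw JSON value) are assumptions made for illustration; the issue only fixes the three markers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// patchOp mirrors the RFC 6902 fields this sketch cares about.
type patchOp struct {
	Op    string          `json:"op"`
	Path  string          `json:"path"`
	Value json.RawMessage `json:"value,omitempty"`
}

// marker maps an RFC 6902 op name to the proposed text prefix.
func marker(opName string) (string, bool) {
	switch opName {
	case "add":
		return "+", true
	case "remove":
		return "-", true
	case "replace":
		return "~", true
	}
	return "", false
}

// renderText converts a JSON Patch document into one text line per
// operation. Ops other than add, remove, and replace are rejected.
func renderText(patch []byte) ([]string, error) {
	var ops []patchOp
	if err := json.Unmarshal(patch, &ops); err != nil {
		return nil, err
	}
	lines := make([]string, 0, len(ops))
	for _, o := range ops {
		m, ok := marker(o.Op)
		if !ok {
			return nil, fmt.Errorf("unsupported op %q", o.Op)
		}
		line := m + " " + o.Path
		if len(o.Value) > 0 {
			line += " " + string(o.Value)
		}
		lines = append(lines, line)
	}
	return lines, nil
}

func main() {
	patch := []byte(`[
		{"op": "replace", "path": "/Links/0/Name", "value": "NewName"},
		{"op": "remove", "path": "/Links/1"}
	]`)
	lines, err := renderText(patch)
	if err != nil {
		panic(err)
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```

Deriving the text form from the JSON form would keep a single source of truth for the diff, so the two outputs cannot disagree.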