-
Notifications
You must be signed in to change notification settings - Fork 12
Implement stripping LD directives from IPLD Node, and transforming IPLD to JSON-LD #7
Conversation
3cfc39a
to
d1666d4
Compare
I rebased my branch to get small and nice commits. @jbenet, could you review this and tell me what you think ? What this PR does is escape keys in JSON for special characters like @, and add transformation to plain JSON without directives, and to JSON-LD. The commit history tells you what you need to know:
|
ok, i think this matches https://github.com/diasdavid/node-ipld 👍
i had it this way because it's a nicer UX to pass in a single string. I'm explicitly matching the UX expected by most golang users: http://golang.org/pkg/path/filepath/#WalkFunc 👎
I think this belongs in a different package, one specific to using JSON-LD with IPLD, not here. I'm 👎 on including all this here. |
It's nice to have a nice API, but we are not dealing with paths, really, but JSON keys that can sometimes contain '/' characters (as opposed to filesystem paths). If you want to handle JSON-LD or arbitrary JSON, you must be able to handle that nicely. And as a programmer, I prefer dealing with a It's always possible to provide another function to have an API using plain string paths. You might argue that you specifically disallow keys with
I would perhaps agree with you, but in the future. Having it here has the advantage that we don't forget JSON and JSON-LD compatibility. Moving it elsewhere means it's easy to make a change that will break this compatibility. Having tests to ensure that is something nice. If ou don't think this belong in this repository, what's go-ipls for ? Is it just useful to serialize/unserialize data and find the list of links ? I would think this was also the place where you can use the IPLD format in various useful ways. If not here, where would this belong to ? |
I have an additional question, how do you handle the case of the link named And it won't be nice to do the work internally (escape the special characters) and not provide an external API to use that (forcing reimplementation elsewhere) |
Agreed about tests, but these can be in a different package.
This package is just for the bare minimum needed to read/write ipfs data to disk/wire/etc. the bare minimum everyone is going to have to write in other languages, and understand, to grasp the format. We can make it a subpackage of this repo. maybe
Yeah, as i was discussing with the json-ld folks, perhaps we should. @diasdavid?
yeah if we'll do that, we should have functions here for sure. |
Ok, should the |
|
d1666d4
to
6f3d1c5
Compare
I updated this pull request, most of the code is still here but organized a bit differently:
@jbenet If you're ok with the Mildred |
@@ -48,6 +47,36 @@ func (d Node) Context() interface{} { | |||
return d[CtxKey] | |||
} | |||
|
|||
// Like StripDirectivesAll but on the root node only, for use in Walk | |||
func (d Node) StripDirectives() Node { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's not have this on the Node
, make it a separate function, like:
func StripDirectives(n Node) Node { ... }
(want to keep the interface of Node
very, very minimal. same reason things like walk are separate functions)
@mildred thanks for the revision!
ok thanks.
This escaping should just happen on the path string, not on the node itself.
Filtering may be ok.
I'm still not sure about this.
thanks!
I think I am ok with this, and +1 on escaping However, before we pull the trigger, we should confirm what we win by doing this now instead of later? and whetehr doing it later is ok? Like, do we lose anything by providing a raw-json-without-and-no-@context-special-handling-at-all interface? (that just looks for specially formed link ojects within it) If we can't do it in the future without breaking anything (i suspect we cant), then it makes sense to put it it now, but would be good to have this concretely shown why now. (I'm sorry if this is repeating oursleves-- this stuff is seriously confusing, and worry about getting it either wrong or incompatible with what @diasdavid is doing in node).
Excellent, me too. i've wanted this done for a long time. sorry to keep massaging this but just want to make sure because this is THE core api, it is the core of IPFS as a whole, so it best be very right, and very polished. Whatever we do here also means @diasdavid may have to change things on his end too, so i'd like to minimize that right now.
Very much agreed! Thanks for bearing with me, and pushing this stuff! Juan |
The idea was riven by the compatibility we would have with JSON-LD. The problem is we want to nest index maps (the @container: @index), and JSON-LD doesn't permit it. So we must define a way to translate our model to JSON-LD. By formulating that, I realize this algorithm shouldn't probably take place when looking for links, but rather when converting to JSON-LD. So the code can probably be moved in the jsonld package and the Links() function can be simplified. Anyway, to understand how this algorithm works, we transform the following JSON (IPLD wire format):
To the following JSON-LD file:
And, this was also converted (by mistake I now think) to the same JSON when stripping directives (minus the directives). I'll have to change that, thank you for finding this mistake. This would lead to the path I'll answer the rest later (and ipdate the PR probably) |
What's implemented is escaping of all
We gain time to think about the format and its implications by not doing the escaping right now, and that would be welcomed. I find it difficult to have a clear idea on what it is that we want really.
Having said that, I'm quite in favor of escaping |
I think escaping all is "less magic-y" so we probably should do that
agreed. me too.
I think we should probably escape
what is your reason again for escaping with the problem is that escaping with it would be most intuitive if we keep |
|
Sorry, I wasn't available these past 4 days, but I'll work back on this issue starting soon.
Windows, I wouldn't even think of it. I haven't been using it since like forever. The thing is that |
That's bundling many assumptions. Also i do not think we should reuse I think it is cleanest to stick to the very well-established expectation: special characters are escaped with |
@jbenet, I'll probably close this PR and make another one with just the character escaping implemented. But before I do that, I'd like to get a few things right first: On the characters we want to escape (the special characters mentioned before):
Once we decided what we want to escape (
I don't like the For example the filename Can we make a decision on those points? I think you already mentionned your preference for the The next step would be to implement the character escaping, and at a later stage, implement IPLD directives (using the chosen special character) to transform our JSON to JSON-LD if we want to. But that's not as important as the rest. |
thanks for all the help here!
I think we should disallow it for now, but not close the door on allowing it and escaping it later.
right. but to confirm, if users provide JSON-LD as input (their own) we can take it, right? (supposing we escape things as needed)
I'm for this one. I think (1) So: Let's say we use
One worry of clashing with json-ld, is if we may want to have a directive like a type --
I think we should escape them. i want to allow people to give us any data (the paths being the only exception, and may not be an exception later). (2) So: let's just escape Once we decided what we want to escape (
This is common, but i've seen it confuse many people. So i've taught many programming classes in the past, so i've a sense of what trips people up, and having different escape characters across programs can be a very confusing thing. People can grasp double escapes much easier (though yes, it's a bit annoying too), than recalling which escape character is for which layered system.
I really, really dislike URL encoding style. It's very frustrating for people.
This is precisely why we need it. the cognitive overload on users, particularly people who are new to programming in general, is very very low.
Yes, but this is also the simplest, most common and familiar thing to do.
Regexping the raw data? you'd be regexping over cbor most of the time. if converting to json, can also pull out the strings. And yes, in the worst case, you count over
Is the conflict risk very high? i really don't want to invent a new, complex language with all sorts of things. If at all possible it would be a watered down version of JSON-LD. So far, I strongly think:
If clashes with JSON-LD is seen as a major problem, then:
But i think this (
Yep. Steps:
Actually i really like the |
Even without escaping, JSON-LD is JSON, so we can. With escaping we wouldn't create links from JSON-LD directives. The problem is that the links generated might not be the most interesting ones.
The major need would be for wrapping index container in intermediate LD objects. I proposed So ok, All JSON keys participate in IPLD links, except keys containing a
What do you propose this should do? We allow its inclusion (we already do). I don't suppose you want to implement a full JSON-LD processor in IPLD, do you? The I'd say, we include it inline in the JSON (I'd love to have it on the context, I just don't see how). |
6f3d1c5
to
b5a26b3
Compare
haha no :) -- was thinking of how to handle the
Ok. We can just punt on this until later. one step at a time :). thanks for all the discussion here @mildred! 👏 👏 |
@jbenet What do you think of the code ? |
// data structure. | ||
func EscapePathComponent(comp string) string { | ||
comp = strings.Replace(comp, "\\", "\\\\", -1) | ||
comp = strings.Replace(comp, "@", "\\@", -1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it double because json? or what?
(outside of json, in our own datastruct, we may not need to escape twice?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(or maybe not, to preserve the data nicely. not sure)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's double because it's the Go string escaping. Is \
still the Go escape character for double quote strings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh, right :) 👍 (i asked for it ;) )
@mildred sorry for the delay here-- lost track. Code LGTM. I'm good to merge it. You? |
|
||
return ts == LinkType | ||
_, ok = vn[LinkKey].(string) | ||
return ok; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need for ;
. would've expected go fmt
to remove it.
I'll just merge it and we can go from there. |
Implement stripping LD directives from IPLD Node, and transforming IPLD to JSON-LD
Thank you |
@mildred no, thank you, this is great! 👍 👏 👏 |
This is a work in progress, but I wanted to have a place to discuss implementation
@jbenet, can you review the test case
ipld_test.go
and tell me if you think this goes in the right direction. Every test case has a few keys I want ti implement:At first, I wrote flattening algorithm without using the Walk function you created, and I then noticed that I should converge with your implementation. I modified the Walk function to allow preprocessing of Nodes (transforming the JSON tree). This would allow walking either the IPLD node unmodified, the node without the LD directives, or the LD model. You can walk the model using three methods.
I still have to write a few more test cases, and I plan to remove most of the code in
node_index.go
and put it somewhere else.