Relationship with Protocol Buffers legacy IPFS node format #59
What is added:

## Relationship with Protocol Buffers legacy IPFS node format

IPLD has a known conversion with the legacy Protocol Buffers format. This format is defined with the Protocol Buffers syntax as:

The conversion to the IPLD data model must have the following properties:

There are multiple ways to do that, which are described next.

### Current encoding in go-ipld

go-ipld implements the following conversion:

Notes:

### Other proposition that does away with escaping

We can imagine another transformation where the link names are not escaped. For example:

Notes:

### Other encodings

Speak up in comments if you have other encoding suggestions, or if you have another implementation with another encoding.

### Other proposition that does away with escaping

We can imagine another transformation where the link names are not escaped. For example:
i don't like this one (using `.`) as much. i prefer the `@attrs` with escaping
Even considering there is no escaping?
i think this should go in a separate doc that extends the spec, or as an example of a special case format. we can have it inform (and modify) the proper spec, but i think including this format in the main document will confuse people? we could link to another doc.
@mildred thanks for this!! 👍
In case we go the route of escaping, this needs to be linked in other parts of the spec, as it has an impact on them.
Added considerations about escaping in the path section. See also PR #60 for similar changes.
Based on IRC discussion and reading #60, it seems to me that traversing the IPLD structure should not be shoehorned into IP*S paths. It's more of a reflection / metadata thing for the paths themselves. Equivalent to
Putting arbitrary keys at the top level of an associative array (hash) also rubs me the wrong way. The metadata could look like this:
Especially considering that unixfs paths are separate from IPLD paths according to the latest state of PR #62. There is nothing preventing the data from looking like this, as @the8472 told us:
Is there a reason why we couldn't do it this way, @jbenet?
Links need not be present at the top level. Having them in a separate map removes all the complexity of key escaping.
Updated the conversion format.
The format is preceded by a multicodec header that tells which codec to use. In addition, older applications that do not yet use the multicodec header will transmit a protocol buffer stream. This can be detected by looking at the first byte:

- if the first byte is between 0 and 127, it is a multicodec header
- if the first byte is between 128 and 255, it is a protocol buffer stream
I believe I am wrong here. The multicodec header length is not limited to one byte, but can be encoded in multiple bytes if it is above 127. Setting the MSB to 1 is just the way varint works.
I also assumed that protocol buffers always started with a byte with MSB set to 1, but I don't know if that's true. Probably not.
So, it's probably not possible to detect in such an easy way whether we are transmitting a multicodec header or a protocol buffer message. I'm currently rewriting this part (as I am implementing it in go-ipld).
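To make the doubt above concrete, here is a minimal Go sketch, not taken from go-ipld. It assumes the standard protocol buffers wire format, where each field starts with a varint key `(field_number << 3) | wire_type`, and it assumes a multicodec header laid out as a varint length followed by the path and a newline. The helper names and the path `/example/codec` are hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// protobufKey returns the varint-encoded key that precedes a protobuf field:
// (fieldNumber << 3) | wireType.
func protobufKey(fieldNumber, wireType uint64) []byte {
	buf := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(buf, fieldNumber<<3|wireType)
	return buf[:n]
}

// multicodecHeader builds a header as a varint length followed by the path
// and a trailing newline (assumed layout; the path is illustrative only).
func multicodecHeader(path string) []byte {
	body := append([]byte(path), '\n')
	buf := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(buf, uint64(len(body)))
	return append(buf[:n], body...)
}

// msbSet reports whether the most significant bit of b is 1.
func msbSet(b byte) bool { return b&0x80 != 0 }

func main() {
	key := protobufKey(2, 2) // field 2, wire type 2 (length-delimited)
	hdr := multicodecHeader("/example/codec")
	fmt.Printf("protobuf key first byte: 0x%02x, MSB set: %v\n", key[0], msbSet(key[0]))
	fmt.Printf("multicodec first byte:   0x%02x, MSB set: %v\n", hdr[0], msbSet(hdr[0]))
}
```

Under these assumptions, a message whose first field number is below 16 starts with a byte whose MSB is clear, and so does a header shorter than 128 bytes, so the first byte alone cannot tell the two apart.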
Fix the paragraph about the first byte being able to determine whether the data is prefixed by a multicodec header or is a protocol buffer object.

The conversion to the IPLD data model must have the following properties:

- It should be convertible back to protocol buffers, resulting in an identical byte stream (so the hash corresponds). This implies that ordering and duplicate links must be preserved in some way.
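
A small Go sketch of why this property matters, under stated assumptions: `Link` and `asMap` are hypothetical names, the hash strings are placeholders, and this is not the go-ipld representation. It shows that indexing links by name alone drops duplicates, whereas an ordered list keeps what is needed to re-serialize the same bytes.

```go
package main

import "fmt"

// Link is a hypothetical, simplified stand-in for a legacy node link.
type Link struct {
	Name string
	Hash string // placeholder value, not a real multihash
}

// asMap indexes links by name. Later duplicates overwrite earlier ones and
// Go map iteration order is unspecified, so the original sequence is lost.
func asMap(links []Link) map[string]string {
	m := make(map[string]string, len(links))
	for _, l := range links {
		m[l.Name] = l.Hash
	}
	return m
}

func main() {
	// Two links share the name "b"; the slice keeps both, in order.
	links := []Link{{"b", "hash1"}, {"a", "hash2"}, {"b", "hash3"}}
	fmt.Println("links in ordered list:", len(links))
	fmt.Println("links in name-keyed map:", len(asMap(links)))
}
```

Keeping an ordered list alongside (or instead of) any name-keyed view is one way to satisfy the round-trip requirement; the source leaves the exact mechanism open.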
- "- It MUST be ..."
minor things, otherwise LGTM
@jbenet I pushed some fixes. Also, what do you think of the ordered-links vs named-links issue?
@jbenet I removed the named links section and made it an option for the implementations as it is not an important part of the spec.
SGTM! 👍 thanks
Relationship with Protocol Buffers legacy IPFS node format
In PR #37, we left an important part of the spec aside: the relationship with protocol buffer serialization. This ought to be described, as it has effects that may be far-reaching.
TODO items:
`@attrs` in one proposition, with `@` escaping, `.` in the other proposition): not needed