docs: add peer id spec #100
We should probably say: Implementations SHOULD support RSA and Ed25519. Implementations MAY support Secp256k1 and ECDSA but nodes using those keys may not be able to connect to all nodes.
+1 for using normative language
Any update here? I'd love to have some specs around here to link to ;)
In the interest of getting our specs PRs merged, I did a review of this one. Overall, this LGTM, I just think we can delete a few sentences that specify implementation details.
peer-ids/peer-ids.md
Outdated
## Keys

Our key pairs are stored on disk using a simple protobuf defined in [libp2p/go-libp2p-crypto/pb/crypto.proto#L5](https://github.com/libp2p/go-libp2p-crypto/blob/master/pb/crypto.proto#L5):
- The specs shouldn't link to code; it should be the other way round.
- How keys are stored on disk doesn't need to be specified; it's an implementation decision. We only need to specify things that affect the interoperability of implementations.
peer-ids/peer-ids.md
Outdated
3. If the length of the serialized bytes <= 42, then we compute the "identity" multihash of the serialized bytes. In other words, no hashing is performed, but the [multihash format is still followed](https://github.com/multiformats/multihash) (byte plus varint plus serialized bytes). The idea here is that if the serialized byte array is short enough, we can fit it in a multihash proto without having to condense it using a hash function.
4. If the length is >42, then we hash it using the SHA256 multihash.

For more information, refer to this block in [libp2p/go-libp2p-peer/peer.go](https://github.com/libp2p/go-libp2p-peer/blob/master/peer.go):
Same here. I think the text already describes the logic pretty well, so we don't need to cite this comment.
Agree.
peer-ids/peer-ids.md
Outdated
Implementations SHOULD support RSA and Ed25519. Implementations MAY support Secp256k1 and ECDSA, but nodes using those keys may not be able to connect to all other nodes.

Keys are passed around in code as byte arrays. Keys are encoded within these arrays differently depending on the type of key.
That seems like an implementation decision. Remove this sentence?
peer-ids/peer-ids.md
Outdated
To sign a message, we first hash it with SHA-256 and then sign it using the RSASSA-PKCS1-V1.5-SIGN from RSA PKCS#1 v1.5.

See [libp2p/go-libp2p-crypto/rsa.go](https://github.com/libp2p/go-libp2p-crypto/blob/master/rsa.go) for details
Same here.
peer-ids/peer-ids.md
Outdated
Ed25519 signatures follow the normal Ed25519 standard.

See [libp2p/go-libp2p-crypto/ed25519.go](https://github.com/libp2p/go-libp2p-crypto/blob/master/ed25519.go) for details
And here.
peer-ids/peer-ids.md
Outdated
To sign a message, we hash the message with SHA 256, and then sign it with the ECDSA standard algorithm, then we encode it using DER-encoded ASN.1.

See [libp2p/go-libp2p-crypto/ecdsa.go](https://github.com/libp2p/go-libp2p-crypto/blob/master/ecdsa.go) for details.
And here.
peer-ids/peer-ids.md
Outdated
To sign a message, we hash the message with SHA 256, then sign it using the standard Bitcoin EC signature algorithm (BIP0062), and then use standard Bitcoin encoding.

See [libp2p/go-libp2p-crypto/secp256k1.go](https://github.com/libp2p/go-libp2p-crypto/blob/master/secp256k1.go) for details.
And here.
We do not do any special additional encoding for Ed25519 public keys.

The encoding for Ed25519 private keys is a little unusual. There are two formats that we encourage implementors to support:
This seems like an implementation decision, so we probably don't need to specify it.
Not entirely. We do want users to be able to port keys from one implementation to another.
peer-ids/peer-ids.md
Outdated
}
```

As should be apparent from the above code block, this proto simply encodes for transmission a public/private key pair along with an enum specifying the type of keypair.
Is there any situation where we want to transmit the PrivateKey? That seems... dangerous. If not, we don't need to specify the PrivateKey here at all.
Yeah, storage of private keys is implementation-specific, so there's no need to cover it in this doc, I think.
Unfortunately, users do need to be able to take their private keys with them (especially because we use these for things like IPNS).
It's true that removing the private key format from this doc leaves a gap. We still need to specify somewhere how we handle them.
We could bring back the private key references and add a call-out at the top of the doc that they're not related to peer-id calculation and are shown for reference.
Really, we should probably rename this doc to the "libp2p key spec" and make peer ID calculation a part of that.
👍 for that
peer-ids/peer-ids.md
Outdated
1. Encode the public key into the protobuf.
2. Serialize the protobuf containing the public key into bytes using the [canonical protobuf encoding](https://developers.google.com/protocol-buffers/docs/encoding).
3. If the length of the serialized bytes <= 42, then we compute the "identity" multihash of the serialized bytes. In other words, no hashing is performed, but the [multihash format is still followed](https://github.com/multiformats/multihash) (byte plus varint plus serialized bytes). The idea here is that if the serialized byte array is short enough, we can fit it in a multihash proto without having to condense it using a hash function.
4. If the length is >42, then we hash it using the SHA256 multihash.
We should say something about how these are commonly represented as strings: raw base58btc encoding, without a multibase prefix.
I added a bit about base58btc, but didn't mention multibase, since we hadn't defined it yet in the doc. Should I bring it up? I think if people are likely to expect peer IDs to use multibase we should clarify.
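For readers following the thread, here is a minimal Python sketch of the four quoted steps plus the raw base58btc string form discussed above. It is an illustration, not the reference implementation: the two-field message layout (field 1 = key type, field 2 = key bytes), the enum values, and every function and constant name are assumptions made for this example. The multihash codes used are `0x00` for identity and `0x12` for sha2-256.

```python
import hashlib

# Assumed layout of the PublicKey protobuf: field 1 = key type (varint enum),
# field 2 = key bytes (length-delimited). The enum values are assumptions too.
KEY_TYPE_RSA = 0
KEY_TYPE_ED25519 = 1

MAX_INLINE_KEY_LENGTH = 42  # threshold quoted in steps 3 and 4
BASE58BTC_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint (7 bits per byte)."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_public_key(key_type: int, key_bytes: bytes) -> bytes:
    """Steps 1 and 2: serialize both fields in field-number order."""
    return (
        b"\x08" + encode_varint(key_type)          # tag for field 1, varint
        + b"\x12" + encode_varint(len(key_bytes))  # tag for field 2, bytes
        + key_bytes
    )


def peer_id_multihash(serialized_key: bytes) -> bytes:
    """Steps 3 and 4: identity multihash if short, SHA-256 multihash otherwise."""
    if len(serialized_key) <= MAX_INLINE_KEY_LENGTH:
        # identity: code 0x00, varint length, then the bytes themselves
        return b"\x00" + encode_varint(len(serialized_key)) + serialized_key
    digest = hashlib.sha256(serialized_key).digest()
    return b"\x12\x20" + digest  # sha2-256 code, 32-byte digest length


def base58btc_encode(data: bytes) -> str:
    """Raw base58btc string form, with no multibase prefix."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = BASE58BTC_ALPHABET[remainder] + encoded
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded  # each leading zero byte becomes "1"


placeholder_key = bytes(32)  # all-zero stand-in for a 32-byte Ed25519 key
serialized = encode_public_key(KEY_TYPE_ED25519, placeholder_key)
print(len(serialized))  # 36: two bytes per field header plus the 32 key bytes
print(base58btc_encode(peer_id_multihash(serialized)))
```

Under the assumed layout, a 32-byte key stays below the 42-byte threshold and is inlined, while anything the size of an RSA key takes the SHA-256 branch.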
@yusefnapora if you wanna do some spec herding, this is a quick win I think. Pretty good consensus.
I did a quick edit to remove references to private keys and serialization on disk. I also removed links to go code and added some links to e.g. the RSA signing spec, etc. @mgoelzer do you mind if I push changes to this branch? I put up a new one here with the edits: https://github.com/libp2p/specs/blob/edit/peer-ids/peer-ids/peer-ids.md but it might be easier to discuss if I drop the commits here.
@yusefnapora gonna jump in and say yes. In the interest of moving forward, push to the branch in this PR. Thanks!
Co-Authored-By: Stebalien <steven@stebalien.com>
Adds "encode to byte array according to rules below" as first step, and makes explicit that we only use the public part of the keypair.
Sorry for sleeping on this for a while everyone :) I added a few commits to address some feedback. I think the most significant is the note about deterministic protobuf encoding. It basically says determinism is "desirable" and you should try to make it happen, but doesn't call it out as a MUST. Without requiring a bunch of changes to the protobuf spec, that might be the best we can do, but if anyone has a better way to put this, I'm all ears.
@Stebalien @arnetheduck @raulk - could you guys help me figure out the resolution to the deterministic encoding problem? If we definitely want to require a consistent / canonical encoding for peer ids, then I think I should write up a precise spec that requires a certain field ordering, etc. And we can have some tests that ensure your encoder handles edge cases well. But @Stebalien mentions that, because we're also not guaranteeing a canonical encoding for the key I can make up some arguments in favor of this view; for one, we can extend the @Stebalien could you elaborate a bit on your view? I think we should figure out if this is a blocker or not to merging the spec.
On mobile.

Re: deterministic key serde. I suggest we specify the format as proto3 + the extra requirements to reach a deterministic result (ordering, no unrecognised fields, no duplicates last wins, etc.) We should add an implementers note in the form of a SHOULD recommendation to use an OOTB protobuf encoder, where possible, and provide test vectors. For cases where that’s unfeasible, we should provide a boiled down serde spec in BNF or similar form. It’ll be super simple, the schema is so short and constrained we can express the serialised form manually without alluding to proto3 at all.

Re: using unrecognised fields for peer ID calculation, I’d like to hear what use cases you had in mind @Stebalien. In my view, the peer ID should be derived from the pubkey modulo metadata, if any. I don’t think user-defined metadata should yield a different identity. Seems like opening a trivial attack vector for sybils.
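To make the "boiled down serde" idea concrete, the sketch below shows what a hand-written strict decoder could look like in Python. It only illustrates the constraints listed in the comment above (fixed field order, no unrecognised fields, no duplicates) plus a minimal-varint check added for this example. The two-field layout (type in field 1, key bytes in field 2) and all function names are assumptions, not something this PR specifies.

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a minimal unsigned varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Read a varint at pos; reject truncated or non-minimal encodings."""
    start = pos
    value = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            break
    if encode_varint(value) != buf[start:pos]:
        raise ValueError("non-minimal varint")
    return value, pos


def decode_public_key_strict(buf: bytes) -> Tuple[int, bytes]:
    """Accept exactly one serialized form: field 1, then field 2, nothing else."""
    if buf[0:1] != b"\x08":  # assumed tag for field 1 (key type, varint)
        raise ValueError("expected key type field first")
    key_type, pos = read_varint(buf, 1)
    if buf[pos:pos + 1] != b"\x12":  # assumed tag for field 2 (key bytes)
        raise ValueError("expected key data field second")
    length, pos = read_varint(buf, pos + 1)
    data = buf[pos:pos + length]
    if len(data) != length:
        raise ValueError("truncated key data")
    if pos + length != len(buf):
        raise ValueError("trailing bytes: unknown or duplicate field")
    return key_type, data


print(decode_public_key_strict(b"\x08\x01\x12\x03abc"))
```

A decoder this strict accepts one byte string per key, which is the property test vectors would pin down. The trade-off raised later in the thread still applies: fields added in the future would be rejected rather than carried into the hash.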
Note: I'm really not expecting much if any user defined metadata. However, we may want to add new fields in the future and it's hard to do this if we don't include them in the hash.
There are two cases:
IMO, "same key" should mean "same bytes". That is, if I change anything about the bytes of the serialized key, I get a new key. I'm concerned that not all key formats will have a "canonical" encoding, some libraries may strip certain metadata while others preserve it, etc. This will lead to hard-to-track-down bugs. This has already been a real pain for us in IPLD and our solution there is to avoid re-serializing unless we change something.

My unconcern with having multiple valid peer IDs for the same underlying cryptographic key is that there's likely nothing we can do to stop this. I haven't audited our key formats/algorithms but it's likely possible to make some small changes to a public key and have it continue to work with the private key.
Also, @raulk, @vyzo & @Stebalien I nominated you guys as the Interest Group for this one 😄 Others are welcome to join in if they like
I would like to reiterate the value in migrating to embedded-key base32 encoded canonical representation of PeerIDs
How do we feel about merging this? 870b71a kicks the deterministic encoding issue down the road by saying a future version of the spec may require more strict encoding than the protobuf spec, but until then, don't extend the @mgoelzer @Stebalien @raulk @vyzo
Sorry for commenting on an old PR, but I have not found an answer anywhere else. What's the motivation for number
When would one expect keys to be longer than 42 bytes? Thank you!
I believe it's the size of ed25519 public keys inside of protobuf encoding, since ed25519 keys are supposed to be inlined into PeerIDs, so any keypair algorithm with public keys larger than 32 bytes would be encoded as more than 42 bytes and therefore not be inlined
uh, so it's 32 + 10 bytes. 30% for serialization is such a big number, I didn't even think about that. thank you, that makes it clearer🙏
And I vaguely remember that 42 would be small enough to then fit into a single DNS segment. Though I might be misremembering that.
This updates the peer ID spec to explain what keypairs are supported and how peer IDs are encoded for each key type. Thanks to @Stebalien for figuring this out with me.