Skip to content
This repository has been archived by the owner on Dec 7, 2019. It is now read-only.

change how we calculate inline peer IDs. #30

Merged
merged 2 commits into from
Mar 22, 2018
Merged

Conversation

Stebalien
Copy link
Member

@Stebalien Stebalien commented Feb 8, 2018

Instead of trying to get fancy with multicodecs and such, just inline the public key's protobuf (if it's small enough).

Before:

Currently, we always use sha256-256 multihash's except when the user explicitly uses the IDFromEd25519PublicKey function. That function returns the identity multihash of the ed25519 multicodec concatenated with the ed25519 public key).

After:

  1. Serialize the public key protobuf.
  2. If it's <= 40 bytes long, return the identity multihash as the peer ID.
  3. If it's > 40 bytes long, return the sha256-256 multihash as the peer ID.

Eventually, we can try to switch to true IPLD but this is, IMO, a step in the right direction.

This is a massively breaking change for anyone using ed25519 keys but:

  1. Ensures that any given key always has the same peer ID.
  2. Removes key-specific logic. If the key protobuf fits in under 40 bytes (32 bytes for the key + 8 for the key type, protobuf stuff, and maybe an additional field).
  • Is 40 bytes the right size? Once we decide, we should set this in stone. Otherwise, we'll end up with disagreeing peer IDs again.
  • What about the hash function? We currently hard-code sha2-256... punt? Make this a part of the key "type"?

fixes #29

@ghost ghost assigned Stebalien Feb 8, 2018
@ghost ghost added the in progress label Feb 8, 2018
@Stebalien Stebalien requested review from whyrusleeping, a user and daviddias February 8, 2018 00:31
peer.go Outdated
if err != nil {
return "", err
var alg uint64 = mh.SHA2_256
if len(b) <= 40 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you might want to define the algorithm and the magic 40 as consts.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And now the world is a little less magical 😿.

@vyzo
Copy link
Contributor

vyzo commented Feb 8, 2018

i think it's important to ensure that IDFromPublicKey returns the same value as the old implementation; it's perhaps ok to break the fringe ed25519-specific method.
Breaking IDFromPublicKey would be a very nasty surprise for many applications.

@Stebalien
Copy link
Member Author

i think it's important to ensure that IDFromPublicKey returns the same value as the old implementation; it's perhaps ok to break the fringe ed25519-specific method.

It will except for ed25519 keys which, according to @whyrusleeping, nobody is using.

@Kubuxu
Copy link
Member

Kubuxu commented Mar 4, 2018

Wouldn't it be better, in case of ed25519 to skip the protobuf? Public key is bytearray of 32 bytes. Prepend it with ed's multicodec packed and it should be good to go.

@Stebalien
Copy link
Member Author

Wouldn't it be better, in case of ed25519 to skip the protobuf? Public key is bytearray of 32 bytes. Prepend it with ed's multicodec packed and it should be good to go.

The motivations for this change are:

  1. The ID generated for any given key should always be the same.
  2. We shouldn't be special casing different key types (inlining is a property of the key size not *type).

Currently, we have a separate special function for generating inlined ed25519 keys (that currently slices the serialized protobuf to extract the key 😱) and another for extracting such keys from peer IDs. My first attempt at fixing this issue involved:

  1. Modifying the key interface to expose the key type/key data.
  2. Adding a table mapping key types (0, 1, 2) to key multicodecs.
  3. Adding an additional deserialization system for deserializing inlined keys.

However, even after doing this, we'd still have two separate serialization formats which seems kind of silly. Maybe we should have serialized keys as <multicodec><keydata> from the get-go but we didn't and I'm not convinced that saving a few bytes is really worth having two formats.

So, we could go with the two format system but it really only saves us 3 bytes. The inlined peer IDs are currently 35 bytes. After this change, they'll be 38 bytes. I picked a 40 byte inline cutoff so we can get a bit of breathing room (although we could also just pick 42 bytes as a nice round number).

@daviddias
Copy link
Member

daviddias commented Mar 8, 2018

Given that we are going "full breaking change mode"...Is ipfs/specs#58 being taken in consideration here at all? @ianopolous brought some really valid points during forest (Keys will eventually be huge) and we should just treat the key as a File with metadata that we can then chunk and retrieve through IPFS itself.

This proposal -- ipfs/specs#58 (comment) (and comments below) -- should meet all the requirements for all types of keys, existing now and future ones.

@Stebalien
Copy link
Member Author

Stebalien commented Mar 8, 2018

@diasdavid we aren't quite going full breaking change mode. This PR won't break any existing IPFS nodes as it doesn't change how we compute RSA peer IDs.

Eventually, I'd like to treat keys as files (or, really, IPLD DAGs) but that's a larger change (with backcompat issues).

Instead of trying to get fancy with multicodecs and such, just inline the public
key's protobuf.

Eventually, we can try to switch to true IPLD but this is, IMO, a step in the
right direction.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

strange issue with ED25519 keys
4 participants