Skip to content
This repository has been archived by the owner on Feb 8, 2023. It is now read-only.

Introduce Ed25519 public key IPFS identities #241

Closed
JustinDrake opened this issue Apr 21, 2017 · 26 comments
Closed

Introduce Ed25519 public key IPFS identities #241

JustinDrake opened this issue Apr 21, 2017 · 26 comments

Comments

@JustinDrake
Copy link

At the moment IPFS identities are either SHA256 hashes of RSA public keys, or SHA256 hashes of Ed25519 public keys. I propose introducing a new type of IPFS identity (by adding a new multihash codec) that are simply Ed25519 public keys (not hashed). This is for several reasons:

  1. Ed25519 public keys are 256 bits, the same size as SHA256 hashes. So identities remain short.
  2. IPFS publishes public keys to the libp2p DHT. These DHT public key records take up resources and need to be maintained otherwise they expire.
  3. The process of broadcasting and retrieving these public keys is unnecessary complexity, and unnecessarily adds brittleness in the system.
  4. In the context of OpenBazaar, we want to maximise buyer privacy and protect buyers against spam/DoS attacks. A radical way to protect buyers is simply to never publicly broadcast their peer id (i.e. only share it with vendors they buy stuff from). Having the peer id be the Ed25519 public key would remove the need to publish the peer id to the DHT.

Thoughts?

cc @jackkleeman

@whyrusleeping
Copy link
Member

@JustinDrake Very interesting idea, Though if your concern is just not wanting to publish public keys to the dht, You can just disable that. Public keys are exchanged when peers connect to eachother normally (via secio and/or the identify handshake)

@JustinDrake
Copy link
Author

JustinDrake commented Apr 22, 2017

Public keys are exchanged when peers connect to eachother normally (via secio and/or the identify handshake)

Oh that's awesome to know, thanks! I'll look into it for secio we use. I don't know what the identify handshake is. Could you point to some documentation or code?

@JustinDrake
Copy link
Author

@whyrusleeping In the context of OpenBazaar, there are two paths. The online path uses secio, whereas the offline path uses pointers posted to the DHT. We could disable publishing public key for online messages, but offline messages wouldn't work out of the box.

@whyrusleeping
Copy link
Member

@JustinDrake the identify handshake happens any time two peers connect with libp2p. The code is here: https://github.com/libp2p/go-libp2p/tree/master/p2p/protocol/identify And it is initiated by the basichost.

For offline, that all wouldnt work, youre right. We've talked for a while about having an 'identity' multihash function (identity function referring to f(x) = x), I think thats what youre looking to do here. I'm on board.

The first step will be adding the identity function code to the multihash table. After that, there will need to be a lot of code changes to be able to properly handle peerIDs not being the sha256 hash of the serialized public key protobuf. I guess the first place to start there would be to make a big list of all the places that assumptions are made about public keys.

Then once thats done, you should be able to plug code in that checks if a peer ID is a full public key when searching for a peers public key.

@JustinDrake
Copy link
Author

We've talked for a while about having an 'identity' multihash function

@whyrusleeping Yes it's the identity multihash function. Having said that I think there needs to be an explicit marker in the prefix to indicate that it's specifically an ed25519 public key (as opposed to some other piece of data with a different interpretation). This table uses 0x00 for the identity codec. Am I correct that we want to add a new code (e.g. 0x01) for the "identity-ed25519" codec? How can we get this process kickstarted?

I guess the first place to start there would be to make a big list of all the places that assumptions are made about public keys.

I would be happy to participate in itemising the changes required 👍

@whyrusleeping
Copy link
Member

I think the best way to do this would be to use the 0x00 identity multihash codec, then after that put a codec for ed25519. (can allocate one here: https://github.com/multiformats/multicodec/blob/master/table.csv). So then the peer ID would look like: <identity mc><length><ed25519 mc><ed25519 pubkey>

@whyrusleeping
Copy link
Member

To allocate a code for ed25519, just PR to that table

@Kubuxu
Copy link
Member

Kubuxu commented Apr 22, 2017

You would also have to check if go-multihash supports identity function.

JustinDrake added a commit to JustinDrake/multicodec that referenced this issue Apr 22, 2017
See ipfs/notes#241 (comment)

I suggest using the code `0xed` so that's it's visually obvious and easy to remember it's associated with ed25519.
@JustinDrake
Copy link
Author

@whyrusleeping Your suggestion to use <identity mc><length><ed25519 mc><ed25519 pubkey> makes a lot of sense. I have submitted a PR for table.csv here.

Below is a partial list of changes that need to be made. Have I missed anything obvious?

  • multiformats/multicodec
    • table.csv: add Ed25519 codec
  • multiformats/go-multihash
    • multihash.go: add support for the identity hash
  • ipfs/go-ipfs-util
    • util.go: Generalise Hash function beyond SHA2_256
  • libp2p/go-libp2p-peer
    • peer.go: Generalise IDFromPublicKey and IDFromPrivateKey functions beyond SHA2_256
    • peer.go: Make String support new identity type
  • ipfs/go-cid
    • cid.go: Generalise Decode to support new identity type
  • libp2p/go-libp2p-kad-dht
    • records.go: Have GetPublicKey extract public key instead of querying caches or the DHT for new identity type
  • ipfs/go-ipfs
    • namesys/publisher.go: Have PutRecordToRouting refrain from publishing public key to the DHT for new identity type
  • other
    • Add argument specifying hash type to all calls to Hash, IDFromPublicKey and IDFromPublicKey
    • Add command line option to use the new identity type when generating new identities
    • Write documentation and tests for the new identity type
    • Do the same thing for js-ipfs

@cryptix
Copy link

cryptix commented Apr 25, 2017

ipfs/libp2p could also integrate secret-handshake when this is done.

@JustinDrake
Copy link
Author

I have talked to the OB1 team (@cpacia @hoffmabc) and we see great value for OpenBazaar 2.0 to launch with this new identity scheme (disallowing the other two identity schemes). The biggest constraint is time. The OB1 team want to launch a beta in 3 months and would prefer having the new identity scheme put in place by then.

@whyrusleeping What are the next steps? Is 3 months a doable timeline for the IPFS team?

@jbenet
Copy link
Member

jbenet commented Apr 30, 2017

@JustinDrake

  • in general i'm very +1 for ed25519 keys (part of why we made keys swappable in the first place)
  • i very much want to see these changes happen, and have been thinking about similar things for a while
  • do make sure the changes account for the different types of keys (e.g. GetPublicKey should check what type of key it is, if it's a "key in peerID" use it, else go fetch it).

i do think it is complex to try to change all of this on short time frame:

  • the changes here are all doable, but to make sure it all goes according to plan, and safely, you have to allot time for testing and for bugs to surface
  • have you measured the problems with the current keys?
    • i would love to see a graph that shows this is a serious problem. (i dont doubt it is, i just would love to see it validated with data)
    • you may see a speedup on a subset of lookups
    • but i dont think you'll see a speedup on general DHT lookups (keys are exchanged when connecting)
    • you may see a speedup on provider lookups
  • why is the key storage a concern? it should be negligible for a network of ~10-100K ephemeral nodes with ~300-3K long-lived:
    • (512B (size per rsa key + wrapping) * 10,000 (nodes) / log_2 300 = (5MB/8) => ~600KB per long-lived node in keys for 10K nodes.
    • (512B (size per rsa key + wrapping) * 100,000 (nodes) / log_2 1000 = (50MB/10) => ~5MB per long-lived node in keys for 100K nodes.

@jbenet
Copy link
Member

jbenet commented Apr 30, 2017

also, btw:

  • Ed25519 public keys are 256 bits, the same size as SHA256 hashes. So identities remain short.

agreed.

  • IPFS publishes public keys to the libp2p DHT. These DHT public key records take up resources and

dont think they take up much resources. would like to see graph or calculations comparing to other content.

need to be maintained otherwise they expire.

yeah that sucks

The process of broadcasting and retrieving these public keys is unnecessary complexity, and unnecessarily adds brittleness in the system.

yeah agreed there. it will still be necessary for some key types.

In the context of OpenBazaar, we want to maximise buyer privacy and protect buyers against spam/DoS attacks. A radical way to protect buyers is simply to never publicly broadcast their peer id (i.e. only share it with vendors they buy stuff from). Having the peer id be the Ed25519 public key would remove the need to publish the peer id to the DHT.

  • wait, i would NOT rely on peer-id's staying private as a privacy measure, at all. peerIDs in IPFS are meant to be pseudonymous identifiers and used only for content distribution. peerIDs travel with provider records, general swarm observations, and so on.
  • if the concern here is privacy, then make the peerID irrelevant (maybe ephemeral? and dont tie it to the application?).
    • clients can just re-roll ephemeral keys whenever
    • could have two keys, one OB key (application), one ipfs key (network transport). ipfs key could be rerolled often.
  • happy to talk about this in detail (call or something). i want to make sure you're doing the right thing re: privacy. it's very hard to do right

@hoffmabc
Copy link

Thanks for all of this feedback. My biggest concern when we chatted about it was the time frame because it would be very quick to get this into IPFS, test it and then get it into a release before we were ready to drop our first version of OB 2.0. At least that's my assumption.

I think the ephemeral peerID is probably a better approach to privacy than just obfuscation of peerIDs. We could probably set up a call for sometime next week if you like to talk through this.

@jbenet
Copy link
Member

jbenet commented Apr 30, 2017

@hoffmabc @JustinDrake -- yeah happy to talk next week. we can move to email to schedule @JustinDrake can you email me and then i can organize? (feel free to keep going regardless, most changes here will be useful)

@hoffmabc
Copy link

👍

@whyrusleeping
Copy link
Member

whyrusleeping commented May 1, 2017

@JustinDrake some thoughts on the list:

  • multiformats/multicodec
    • table.csv: add Ed25519 codec 👍
  • multiformats/go-multihash
    • multihash.go: add support for the identity hash 👍
  • ipfs/go-ipfs-util
    • util.go: Generalise Hash function beyond SHA2_256
    • === We should actually just remove all dependence on this method. I've been wanting to kill off that package for a while.
  • libp2p/go-libp2p-peer
    • peer.go: Generalise IDFromPublicKey and IDFromPrivateKey functions beyond SHA2_256
      • === This one might be tricky, we will have to think about how we want to select for this
    • peer.go: Make String support new identity type
      • === It should just be able to print the base58 encoded multihash, no real changes required here AFAIK
  • ipfs/go-cid
    • cid.go: Generalise Decode to support new identity type
      • === We probably don't have to touch the cids, since identities are their own separatish thing
  • libp2p/go-libp2p-kad-dht
    • records.go: Have GetPublicKey extract public key instead of querying caches or the DHT for new identity type 👍
  • ipfs/go-ipfs
    • namesys/publisher.go: Have PutRecordToRouting refrain from publishing public key to the DHT for new identity type
  • other
    • Add argument specifying hash type to all calls to Hash, IDFromPublicKey and IDFromPublicKey
    • Add command line option to use the new identity type when generating new identities
    • Write documentation and tests for the new identity type
    • Do the same thing for js-ipfs

@Kubuxu
Copy link
Member

Kubuxu commented May 1, 2017

You also have to add the ed-code to go-multicodec-packed.

@JustinDrake
Copy link
Author

@jbenet

have you measured the problems with the current keys?

The problems are much more qualitative than quantitative. DHT resource usage is the least of problems (sorry for bringing it up!). For https://openbazaar.chat the biggest pain point is interacting with the DHT. (Direct communications from js-libp2p to go-libp2p work well; it's awesome!) One particularly flaky point for us is the publishing and querying of public keys. We've spent a lot of engineering resources to make it workable in the context of a web browser and it's still far from perfect. (For context, the public keys are required for a custom offline messaging scheme used in OpenBazaar.)

Regarding privacy, I totally agree we want to make peer ids as irrelevant as possible. (I have been rooting for this in openbazaar-go since Nov 2016.) At the transport layer ephemeral peer IDs work especially well. At the application layer openbazaar-go assumes that peer IDs are long lived. For buyers specifically (which is Duo's focus) the peer ID is required during the whole purchase flow which can last months. Peer IDs are also used for chat which is a fundamentally open-ended and long-lived thing. OpenBazaar 2.0 also has a Bitcoin wallet with the ability to send funds to a peer ID, and this is also an open-ended and long-lived thing.

The good news is that for privacy-conscious buyers (which is the target market for www.openbazaar.chat and Duo) the OpenBazaar application peer ID does not need to be publicly shared, ever. Buyers make no content distribution (so no provider records, general swarm observations, etc.). OpenBazaar does have its own home-rolled DHT "pointer records" used for offline messaging but those do not leak peer IDs thanks to a prefix scheme similar to stealth addresses developed by @cpacia. OpenBazaar also makes uses of IPNS records pointing to IPFS assets such as a profile and listings, but this is optional. The only wrinkle in all of the above is that we currently need DHT records for the public keys (required for offline messaging at the OpenBazaar application layer) and those do leak application peer IDs.

@jbenet
Copy link
Member

jbenet commented May 4, 2017

getting the changes in place

the remainder is:

Did i miss anything?

@daviddias
Copy link
Member

Funny story, I initially started implementing js-ipfs back in 2015 with ed25519 keys and later replaced for RSA to match go-ipfs and because we didn't have multikey yet.

@whyrusleeping should we take this chance to make PeerIDs full CIDs instead of base58 multihashes only? Also, double checking, @whyrusleeping are you migrating keys to become multikeys?

@JustinDrake
Copy link
Author

@diasdavid I think having PeerIDs be full CIDs makes sense. I just hope we can finish off ipfs/kubo#3896 within a couple weeks for OpenBazaar 2.0. @whyrusleeping did you receive my email? Any help would really appreciated.

@djdv
Copy link

djdv commented Aug 30, 2017

I don't know if this is the appropriate place to mention it or not but I'd like to point out that ipfs key list has only 1 parameter: -l and it doesn't show anything about the key's type, size/length, etc. Since identities would no longer be limited to RSA only, it might be useful to expose more information about the keys when listed. Not sure if this has been raised or handled yet.

@ghost
Copy link

ghost commented Sep 1, 2017

Hi @whyrusleeping, since there was some discussion on this, just wanted to mention that I'm working on a Haskell implementation of the libp2p-crypto here: MatrixAI/haskell-libp2p-crypto

It would be nice to have multikeys so that parsing keys such as Ed25519 is unambiguous. This isn't a problem for RSA because it has headers/ASN1 format, but is a problem for keys such as Ed25519 which are just random bytes. Would multikeys have a similar format to multihash (i.e. a keytype prefix with a varint for length)?
MatrixAI/hs-libp2p-crypto#3


Also, as I understand, the byte representation for Ed25519 private keys in go-libp2p-crypto currently appends the public key bytes twice to the secret key bytes? https://git.io/v54Se

@whyrusleeping
Copy link
Member

@JustinDrake I dont think I got your email, did you send it to why at ipfs.io ?

@lidel
Copy link
Member

lidel commented Feb 11, 2022

In 2022, ed25519 keys are the default in both go-ipfs and js-ipfs ✨

@lidel lidel closed this as completed Feb 11, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

9 participants