RFC: Signed Address Records #217

yusefnapora · 2019-10-04T20:11:50Z

Hey all, this introduces an RFC for a self-certified address record that could be published to a DHT or gossiped about in a pubsub topic without worrying about it being altered in-flight.

This is motivated by the discussion in libp2p/libp2p#47 and libp2p/go-libp2p#436.

I wrote up the record as an IPLD schema, but we could go with another format if we'd rather not have a dependency on IPLD.

I also wasn't sure whether we might want to issue records about peers other than ourselves - this draft does not, but we could have records with distinct subject and issuer peers.

There are some TODOs at the bottom that are better tracked here:

how to store signed records
- should be separate from "working set" that's optimized for retrieval
- need to store unaltered bytes
figure out if IPLD is the way to go here. If not, what serialization format, etc.
extend identify protocol to include signed records?

@bigs @raulk @mgoelzer @Stebalien @vyzo

ghost · 2019-10-05T00:47:10Z

cc @whyrusleeping

yusefnapora · 2019-10-07T13:43:56Z

What do people think about IPLD? I mostly just thought the schema notation is pretty cool, but I think if we want to use it for real we'd need a block store (at least an in-memory one), which doesn't feel like it's really a libp2p-layer concern. Maybe I'm overthinking that though.

Other options that I can think of are protobuf, which plays nice with most other libp2p things, or CBOR. Raw CBOR is as annoying to work with as raw JSON, but it does have a canonical encoding that is deterministic, which protobuf lacks. I'm not sure that a deterministic encoding is strictly necessary, and as far as I can tell, IPLD doesn't enforce one either. Anyway, protobuf is likely the simplest to use, so that seems like a decent way to go.

yusefnapora · 2019-10-07T13:47:33Z

I should note that I haven't tried actually using IPLD yet :) I'm going to play with it a bit and see how it works in practice. If anyone has experience with it, how is it?

yusefnapora · 2019-10-10T14:50:44Z

I've been thinking about this some more and just wanted to write up where my head is at before I start revising.

Ultimately I think IPLD is overkill for this, and we should just use protobufs. Earlier I was leaning away from this because there's no guarantees of deterministic encoding for protobufs, which was an issue with the peer id spec. However, as long as we sign, transmit and store serialized protobufs, we can still validate the record before deserializing it, and we don't really have to worry about whether our encoder has the same behavior as the one that produced the record.

I think I want to tease this apart into two RFCs:

An all-purpose "signed envelope" record containing a peer id, byte string payload, and a signature of the payload. May optionally include public key (which must be consistent with peer id). Will require a "signature service" that can sign and validate.
An address record with routability & confidence scores. This will get serialized to a byte string, and the encoded record will be wrapped in a signed envelope. The peer store will accept these directly, or inside a signed envelope. If inside an envelope, peer store will reject if signature is invalid. If signature is valid, mark the record as certified & surface that via peer store APIs.

This is basically the same idea as the original draft, except instead of an AddressEnvelope struct that stores a specific structured data type, the envelope only stores opaque binary blobs.

e.g.:

message SignedEnvelope {
  bytes peerId = 1;
  bytes contents = 2;
  bytes signature = 3;

  optional PublicKey pubkey = 4;
}

message AddressInfo {
  bytes addr = 1;
  Routability routability = 2;
  Confidence confidence = 3;
}

message AddressState {
  bytes subject = 1;
  repeated AddressInfo addresses = 2;
  // etc...
}

Storing serialized protobufs as a byte string inside another protobuf feels weird, but it seems necessary, given the lack of a deterministic encoder.

Does this make sense?

Stebalien · 2019-10-11T01:12:55Z

Instead of requiring the peer ID and optionally having a key, how about we just always have the key and then always derive the peer ID? That way it's impossible to just assume that the key matches the peer ID?
Yeah, a byte string within a protobuf seems like the right tactic if we're using protobufs.

Also note: we need to tag the data with some kind of "purpose" string so we can't reuse these records as messages in other protocols with signed messages. We do this in pubsub by prepending libp2p-pubsub: before signing/verifying the signature (but not when serializing) but I wonder if we should have a better solution. Maybe prepend a length-prefixed string? Null terminated string?

yusefnapora · 2019-10-14T13:44:05Z

how about we just always have the key and then always derive the peer ID

That is better 👍

Good question about the "domain separation" string. Maybe we could have a fixed string, e.g. "libp2p-signed-message:" but also allow the user to add an arbitrary "purpose" string that gets appended to the fixed string. This could be included in plain text as a field in the SignedEnvelope message, which might be a nice way to let users indicate how to process the contents of the message.

Something like

{
  publicKey: { 
    //...
  },
  contents: Buffer(),
  purpose: "AddressState",
  signature: Buffer()
}

And you prepend "libp2p-signed-message:" ++ purpose to the contents to get the signature ("libp2p-signed-message:AddressState" in the example).

Stebalien · 2019-10-21T18:48:00Z

This could be included in plain text as a field in the SignedEnvelope message, which might be a nice way to let users indicate how to process the contents of the message.

That will take extra space in the message itself which may discourage high entropy names.

I actually like forcing users to explicitly specify the purpose/domain. If we include it in the message, users can be lazy and ignore it. If we don't, users will have to call some function like:

func Open(domain string, envelope []byte) (message []byte, author PublicKey, err error) { ... }

And you prepend "libp2p-signed-message:"

I'm not sure if libp2p-signed-message is all that important as long as users specify purpose strings. However, I guess it can't hurt.

Maybe prepend a length-prefixed string? Null terminated string?

BTW, my concern here is that foo is a prefix of foobar which would allow an attacker to add bar to the signed data and pass it off as a foo message. This isn't an issue if we require this prefix to end in : but it may be better to just length-prefix it.

yusefnapora · 2019-10-22T16:53:34Z

Ah, I see where you're coming from with the out-of-band domain string, that makes sense.

Using only user-controlled data for the domain string makes me nervous, although I'm not sure I can really come up with a plausible attack. Still, there's not really any cost to using the fixed string plus the user-provided one, so might be worth doing just in case.

ghost · 2019-10-22T19:00:41Z

From my experience with security auditors, letting arbitrary or user-defined input control the interpretation of a message is going to raise a yellow flag for them even if it's not exploitable. (A plain text field in the protobuf does not seem as likely to raise those concerns.) Perhaps something to consider since we've fixed a lot of other stuff that was not technically broken (but is a nonstandard approach) to satisfy auditor concerns.

Stebalien · 2019-10-22T23:10:15Z

This is less of an arbitrary user-defined input than the bytes being signed as the purpose string is fixed at compile time.

My only concerns are:

Ideally, we should always combine the purpose with the signed data the same way.
We need to structure the signed data such that we can split $purpose$signedData into the $purpose and $$ignedData without external information. Otherwise, as noted above, messages for protocol foobar could be passed to protocol foo.

purpose:[data] makes (1) easy as we already do this in pubsub, TLS, and QUIC (once we cut a release). However, that does mean that : can't appear in the purpose.

yusefnapora · 2019-10-23T16:30:11Z

Good point - the "user" that controls the domain string is another libp2p subsystem, and there's no external input at runtime.

I guess I lean towards using a : separator and prohibiting : in the purpose string, since it's consistent with what we're doing elsewhere.

raulk

Shaping up well. Just a handful of initial comments.

RFC/0002-signed-envelopes.md

raulk · 2019-10-30T16:52:19Z

RFC/0002-signed-envelopes.md

+
+```protobuf
+message SignedEnvelope {
+  PublicKey publicKey = 1; // see peer id spec for definition


Food for thought. Including the pubkey may be superfluous for some signature schemes.

Given an ECDSA signature, one can recover the public key provided we know the curve, the hash function, and the plaintext that was signed. Bitcoin and Ethereum use that trick heavily to validate transactions.

See:

https://crypto.stackexchange.com/questions/18105/how-does-recovering-the-public-key-from-an-ecdsa-signature-work
https://crypto.stackexchange.com/questions/60218/recovery-public-key-from-secp256k1-signature-and-message

RFC/0002-signed-envelopes.md

raulk · 2019-10-30T16:55:55Z

RFC/0003-address-records.md

+  LOCAL = 3;
+
+  // public internet
+  GLOBAL = 4;


We'll need a PROXIMITY enum value for network-less transports that rely on physical proximity, e.g. Bluetooth.

Suggested change

GLOBAL = 4;

PUBLIC = 4;

raulk · 2019-10-30T16:57:53Z

RFC/0003-address-records.md

+  }
+```
+
+If Alice wants to publish her address to a public shared resource like a DHT,


Should we recommend that records are rejected outright when they leak addresses that are outside the scope of the discovery mechanism?

raulk · 2019-10-30T16:59:00Z

RFC/0003-address-records.md

+
+// Confidence indicates how much we believe in the validity of the
+// address.
+enum Confidence {


What classes of problems do you foresee communicating our perceived confidence of our own addresses would solve?

If it's to signal which dials to prioritise, we can simplify this by conveying a 1-byte precedence value in the range (-128, 128).

raulk

Notes from a sync call with @yusefnapora.

Regarding the peer routing record:

For now, modelling confidence and routability should be out of scope, because these two concepts are not yet modelled in the libp2p stack.
We should think about "record extensions", where confidence and routability would be two examples of future extensions.
Another extension I'd like to see is a "service bloom filter" that allows peers to advertise their protocols compactly and efficiently (without enumerating them, therefore with a degree of privacy preservation), and receivers of that record would be able to probabilistically test whether that authoring peer supports a protocol they're interested in, without having to connect to them.

Regarding the envelope:

Add a type field: not sure if this should be a string, an enumeration, a multicodec.
Hoist the sequence number from the peer routing record to the envelope.
These two fields need to be part of the plaintext to sign.

raulk

More thoughts regarding the "routability" aspect:

What addresses do we want to communicate in a peer routing record?
Discovery mechanisms have different scopes of operation, e.g. mDNS operates on LAN and loopback, PEX and the DHT are mostly public, but we do want to facilitate two local peers finding each other and establishing a local connection.
For PEX and DHT, maybe we shouldn't even publish local addresses? Doing so has implications in security (you can order a machine to perform local dials to some process in their local network).
Instead of publishing local addresses in a global venue (e.g. DHT), we should be having a mechanism so that peers establishing a connection via a public endpoint can detect if they're behind the same network, and "upgrade" to a better route.
I guess in some scenarios, NAT hairpinning takes care of the above.

A possibility is having the address filtering be explicit. In most cases, we can communicate local and public endpoints, filtering out the localhost endpoints.

If the user wants to enable localhost endpoint advertisement, they can do so by enabling a flag on the Host.

In the implementation realm, assuming there'll be a peer routing record generator component, which knows all addresses and spits out records whenever they change, notifying downstream components (e.g. DHT, PEX, etc.), those components could provide an address filter when registering.

The problem with the above approach is that it can lead to a "chimera" scenario. If two peers find each other via two venues (e.g. mDNS, DHT), the routing records with different scopes could collide and override each other. Unless we isolate routing records per domain/scope in the peerstore.

Food for thought.

raulk · 2019-11-01T16:22:00Z

@yusefnapora aims to have a final draft of incorporating comments by today, in order to be reviewed and merged on Monday, to have a minimal implementation ready by the end of the week in Go.

Stebalien · 2019-11-01T19:08:42Z

Another extension I'd like to see is a "service bloom filter" that allows peers to advertise their protocols compactly and efficiently (without enumerating them, therefore with a degree of privacy preservation),

Hm, not really. If these bloom filters are at all useful, peers will be able to ask "does peer X likely speak protocol Y". I'd expect an attacker to ask "which peers speak protocol Y" (e.g., trying to find all the IPFS, Filecoin, etc. peers).

However, shrinking the size of the protocol list is likely worth it (but multicodecs!).

Stebalien · 2019-11-01T19:14:24Z

We should think about "record extensions", where confidence and routability would be two examples of future extensions.

YES!

Stebalien · 2019-11-01T19:18:49Z

Hoist the sequence number from the peer routing record to the envelope.

Why? Are we planning on duplicating these? I thought the idea was to make the envelope general-purpose.

Add a type field: not sure if this should be a string, an enumeration, a multicodec.

Multicodec prefixed byte string? That allows:

/path/thing... (/ is a valid multicodec 😁)
Multicodec.
Cids if we merge Make CID versions multicodecs multiformats/multicodec#49.

This is also consistent with ENS records.

yusefnapora · 2019-11-01T20:08:50Z

@Stebalien about moving the sequence number to the envelope, we were thinking it might be broadly useful for other kinds of payload, but as I was writing it up, I realized that each system that uses envelopes would need to have their own rules about how to interpret the sequence number (e.g. whether to invalidate & replace old records, merge them, etc). So it may be best to just leave it in the routing record and keep the envelope as simple as we can make it.

For the type field, I think your suggestion makes sense and gives us a lot of flexibility. 👍

Since we're talking about prefixing multiple fields to the content to sign instead of just the domain string, I'm now thinking that length-prefixing the fields is better than using a : separator. Especially if we want the type field to contain a CID, since a valid CID might randomly contain the UTF-8 code point for :.

yusefnapora · 2019-11-08T14:29:05Z

Just wanted to update this, since I've been a bit quiet this week. I've been working on an initial implementation in Go that I'm hoping to make PRs for today.

I diverged from what's written up here in one place. It seemed a bit silly to use varints for length-prefixing when building up the buffer to sign, since we don't need to save space on the wire or anything. So I just used uint64, which seems like way more than enough to fit any payload we might want to use envelopes for. I'll update this RFC to reflect that in a second.

Also, how does everyone feel about the name "routing records"? In the implementation I ended up calling them RoutingStateRecords to emphasize that they contain transient state, but I don't know if that's really any better. I'm also wondering if an even more generic name like "peer records" makes sense, since we're thinking about putting things like the protocol bloom filter in here, which isn't strictly a routing concern. Maybe "availability records"? IDK 🚲 🏠

Anyway, my plan for today is to fix up a few things in the go implementation and push those branches. Then I was thinking I'd merge this RFC in and start new PRs to turn the RFCs into Working Draft specs and move the conversation over to those PRs.

RFC/0003-routing-records.md

raulk

We've already merged a reference implementation into go-libp2p, so this can be promoted to a spec and be tagged with status 2A (Candidate Recommendation).

@vasco-santos -- are you planning to implement in js-libp2p?

RFC/0003-routing-records.md

RFC/0002-signed-envelopes.md

Co-authored-by: tmakarios <githubcorrespondence@freespoken.nz>

jacobheun · 2020-11-13T12:46:51Z

Now that this has been implemented in Go and JS I intend to merge this later today unless there is a reasonable objection. Ideally we would couple the approved RFC in a merge with its spec per #198, but I'd prefer this PR not sit open, completed, until someone can dedicate time to completing it.

aarshkshah1992 · 2020-11-17T12:33:53Z

@jacobheun Should we go ahead and merge this ?

This commit upgrades the current gossipsub implementation to support the [v1.1 spec](https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.1.md). It adds a number of features, bug fixes and performance improvements. Besides support for all new 1.1 features, other improvements that are of particular note: - Improved duplicate LRU-time cache (this was previously a severe bottleneck for large message throughput topics) - Extended message validation configuration options - Arbitrary topics (users can now implement their own hashing schemes) - Improved message validation handling - Invalid messages are no longer dropped but sent to the behaviour for application-level processing (including scoring) - Support for floodsub, gossipsub v1 and gossipsub v2 - Protobuf encoding has been shifted into the behaviour. This has permitted two improvements: 1. Message size verification during publishing (report to the user if the message is too large before attempting to send). 2. Message fragmentation. If an RPC is too large it is fragmented into its sub components and sent in smaller chunks. Additional Notes The peer eXchange protocol defined in the v1.1 spec is inactive in its current form. The current implementation permits sending `PeerId` in `PRUNE` messages, however a `PeerId` is not sufficient to form a new connection to a peer. A `Signed Address Record` is required to safely transmit peer identity information. Once these are confirmed (libp2p/specs#217) a future PR will implement these and make PX usable. Co-authored-by: Max Inden <mail@max-inden.de> Co-authored-by: Rüdiger Klaehn <rklaehn@protonmail.com> Co-authored-by: blacktemplar <blacktemplar@a1.net> Co-authored-by: Rüdiger Klaehn <rklaehn@gmail.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Roman S. Borschel <roman@parity.io> Co-authored-by: Roman Borschel <romanb@users.noreply.github.com> Co-authored-by: David Craven <david@craven.ch>

rfc: 1st draft for signed address records

77e3b66

yusefnapora added 3 commits October 21, 2019 10:32

wip - use protobuf instead of IPLD

5351d94

split into RFCs for signed envelope / addr records

8d10f25

wip discussion of peerstore API changes

59f660b

domain separation, remove CID and local storage

b8f1c5e

raulk mentioned this pull request Oct 25, 2019

Simple, minimal peer exchange protocol #222

Open

raulk reviewed Oct 30, 2019

View reviewed changes

raulk reviewed Nov 1, 2019

View reviewed changes

yusefnapora added 2 commits November 1, 2019 17:04

add type hints, length-prefix sig components

107ddde

trim scope & rename to "routing records"

cba046f

yusefnapora added 2 commits November 8, 2019 10:21

encode lengths in sig buffer as uint64

35fda19

rename from RoutingRecord to RoutingState

627a57c

use varints for length-prefixes in sig buffer

47606a0

raulk mentioned this pull request Dec 11, 2019

Design blueprint for signed peer records libp2p/go-libp2p#761

Open

11 tasks

yusefnapora mentioned this pull request Jan 23, 2020

Signed Peer Records libp2p/go-libp2p#776

Closed

10 tasks

jacobheun mentioned this pull request Mar 25, 2020

feat: address and proto books libp2p/js-libp2p#590

Merged

7 tasks

vasco-santos reviewed Mar 26, 2020

View reviewed changes

RFC/0003-routing-records.md Show resolved Hide resolved

daviddias mentioned this pull request Apr 6, 2020

Alan's (first time) review on go-libp2p-pubsub libp2p/go-libp2p-pubsub#277

Closed

7 tasks

raulk approved these changes Apr 28, 2020

View reviewed changes

raulk mentioned this pull request Apr 28, 2020

Some feedback and thoughts on integration vacp2p/multiprotocol#2

Open

wemeetagain mentioned this pull request May 12, 2020

gossipsub v1.1: Security extensions to improve on attack resilience and bootstrapping ChainSafe/js-libp2p-gossipsub#73

Closed

26 tasks

vasco-santos mentioned this pull request Jun 2, 2020

Signed Peer Records libp2p/js-libp2p#653

Closed

This was referenced Jun 16, 2020

feat: record interface libp2p/js-libp2p-interfaces#52

Merged

feat: signed peer records data types libp2p/js-libp2p#681

Merged

jacobheun reviewed Jun 22, 2020

View reviewed changes

RFC/0003-routing-records.md Outdated Show resolved Hide resolved

jacobheun reviewed Jun 29, 2020

View reviewed changes

RFC/0002-signed-envelopes.md Outdated Show resolved Hide resolved

vasco-santos mentioned this pull request Jul 21, 2020

fix: rendezvous spec #284

Merged

update signed records rfc to match Go implementation

377f05a

AgeManning mentioned this pull request Aug 27, 2020

Gossipsub v1.1 libp2p/rust-libp2p#1720

Merged

tmakarios reviewed Sep 24, 2020

View reviewed changes

RFC/0002-signed-envelopes.md Outdated Show resolved Hide resolved

AgeManning mentioned this pull request Oct 19, 2020

Signed peer records update libp2p/rust-libp2p#1799

Closed

blacktemplar mentioned this pull request Oct 19, 2020

Signed Address Records libp2p/rust-libp2p#1803

Closed

fix: correct link in RFC/0002-signed-envelopes.md

e401b14

Co-authored-by: tmakarios <githubcorrespondence@freespoken.nz>

jacobheun approved these changes Nov 13, 2020

View reviewed changes

jacobheun merged commit b70ccf2 into master Nov 19, 2020

jacobheun deleted the rfc/address-records branch November 19, 2020 14:05

Frederik-Baetens mentioned this pull request Dec 17, 2021

gossipsub peer sharing libp2p/rust-libp2p#2391

Closed

drHuangMHT mentioned this pull request Sep 27, 2023

gossipsub: documented TODO libp2p/rust-libp2p#4557

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFC: Signed Address Records #217

RFC: Signed Address Records #217

yusefnapora commented Oct 4, 2019 •

edited

Loading

ghost commented Oct 5, 2019

yusefnapora commented Oct 7, 2019

yusefnapora commented Oct 7, 2019

yusefnapora commented Oct 10, 2019

Stebalien commented Oct 11, 2019

yusefnapora commented Oct 14, 2019

Stebalien commented Oct 21, 2019

yusefnapora commented Oct 22, 2019

ghost commented Oct 22, 2019

Stebalien commented Oct 22, 2019

yusefnapora commented Oct 23, 2019

raulk left a comment

raulk Oct 30, 2019

raulk Oct 30, 2019

raulk Oct 30, 2019

raulk Oct 30, 2019

raulk Oct 30, 2019

raulk Oct 30, 2019

raulk left a comment

raulk left a comment

raulk commented Nov 1, 2019

Stebalien commented Nov 1, 2019

Stebalien commented Nov 1, 2019

Stebalien commented Nov 1, 2019

yusefnapora commented Nov 1, 2019

yusefnapora commented Nov 8, 2019

raulk left a comment

jacobheun commented Nov 13, 2020

aarshkshah1992 commented Nov 17, 2020

RFC: Signed Address Records #217

RFC: Signed Address Records #217

Conversation

yusefnapora commented Oct 4, 2019 • edited Loading

ghost commented Oct 5, 2019

yusefnapora commented Oct 7, 2019

yusefnapora commented Oct 7, 2019

yusefnapora commented Oct 10, 2019

Stebalien commented Oct 11, 2019

yusefnapora commented Oct 14, 2019

Stebalien commented Oct 21, 2019

yusefnapora commented Oct 22, 2019

ghost commented Oct 22, 2019

Stebalien commented Oct 22, 2019

yusefnapora commented Oct 23, 2019

raulk left a comment

Choose a reason for hiding this comment

raulk Oct 30, 2019

Choose a reason for hiding this comment

raulk Oct 30, 2019

Choose a reason for hiding this comment

raulk Oct 30, 2019

Choose a reason for hiding this comment

raulk Oct 30, 2019

Choose a reason for hiding this comment

raulk Oct 30, 2019

Choose a reason for hiding this comment

raulk Oct 30, 2019

Choose a reason for hiding this comment

raulk left a comment

Choose a reason for hiding this comment

raulk left a comment

Choose a reason for hiding this comment

raulk commented Nov 1, 2019

Stebalien commented Nov 1, 2019

Stebalien commented Nov 1, 2019

Stebalien commented Nov 1, 2019

yusefnapora commented Nov 1, 2019

yusefnapora commented Nov 8, 2019

raulk left a comment

Choose a reason for hiding this comment

jacobheun commented Nov 13, 2020

aarshkshah1992 commented Nov 17, 2020

yusefnapora commented Oct 4, 2019 •

edited

Loading