discv5: protocol version v5.1 #157

Merged
merged 40 commits into from
Oct 7, 2020
Changes from 17 commits
b48b231
discv5: WIP protocol version v5.1
fjl Apr 28, 2020
4f3c7dc
discv5: move handshake to theory document and import the new packet f…
fjl Jun 5, 2020
ee30374
discv5: improve FINDNODE, NODES text
fjl Jun 30, 2020
3b03774
discv5: add description of header masking
fjl Aug 31, 2020
55160b6
discv5: update packet layout images
fjl Aug 31, 2020
1fd2385
discv5: clarify FINDNODE processing
fjl Aug 31, 2020
172c117
discv5: remove special case for empty request-id in TALKREQ
fjl Aug 31, 2020
92bbbd2
discv5: require sending record when enr-seq == 0
fjl Aug 31, 2020
da37f72
discv5: update handshake section cross links in discv5-wire.md
fjl Aug 31, 2020
47a94d6
discv5: fix markdown lint issues in discv5-theory.md
fjl Aug 31, 2020
d1903ec
discv5: improve handshake text and re-add id-signature definition
fjl Aug 31, 2020
fc5fdd0
discv5: listify FINDNODE distances
fjl Aug 31, 2020
1086cdd
discv5: remove auth-resp-key KDF output
fjl Sep 2, 2020
13c2302
discv5: delete unused image discv-topic-queue-diagram.png
fjl Sep 2, 2020
34b96c8
discv5: fix typo
fjl Sep 2, 2020
6e32e21
discv5: fix issue in description of AES/GCM authenticated data
fjl Sep 17, 2020
88b82a2
discv5: remove definition of rlp_bytes
fjl Sep 17, 2020
6e96b92
discv5: fix message-ids
fjl Sep 25, 2020
163c73f
discv5: limit request ID size
fjl Sep 25, 2020
4a9cc85
discv5: document TALKEQ/TALKRESP field types
fjl Sep 25, 2020
d244367
discv5: bump request-id max size to 8 bytes
fjl Sep 25, 2020
c13e4cf
discv5: improve the handshake text
fjl Sep 25, 2020
d19fbe7
discv5: add destination ID into id-sig-input
fjl Sep 25, 2020
995e68e
discv5: fix link
fjl Sep 25, 2020
5b59e88
discv5: put version into protocol-id
fjl Sep 30, 2020
60d9107
discv5: reduce id-nonce size and use IV in handshake
fjl Sep 30, 2020
2acb646
discv5: change text in id-signature-input
fjl Sep 30, 2020
1d133e6
discv5: be more exact about packet names
fjl Sep 30, 2020
23e35a7
discv5: define minimum packet size
fjl Sep 30, 2020
eae09e8
discv5: add text about limit for total nodes responses
fjl Sep 30, 2020
74c4af8
discv5: move src-id into authdata, nonce into static header
fjl Sep 30, 2020
a785698
discv5: update test vectors
fjl Sep 30, 2020
0c1b004
discv5: delete old auth message test vectors
fjl Sep 30, 2020
98c7cf9
discv5: update low-level test vectors
fjl Sep 30, 2020
a0a45ae
discv5: text fixes in test vectors document
fjl Sep 30, 2020
96e6b4b
discv5: text improvements
fjl Oct 1, 2020
6cf063c
discv5: use all of WHOAREYOU for id-signature and KDF
fjl Oct 2, 2020
815ef40
discv5: use entire unmasked packet header as authenticated data
fjl Oct 2, 2020
2526ef0
discv5: update test vectors
fjl Oct 2, 2020
1a4a974
discv5: fix challenge-data in test vector
fjl Oct 2, 2020
2 changes: 1 addition & 1 deletion discv5/discv5-rationale.md
@@ -1,6 +1,6 @@
# Node Discovery Protocol v5 - Rationale

-**Draft of October 2019**
+**Protocol version v5.1**

Note that this specification is a work in progress and may change incompatibly without
prior notice.
155 changes: 146 additions & 9 deletions discv5/discv5-theory.md
@@ -1,9 +1,8 @@
# Node Discovery Protocol v5 - Theory

-**Draft of October 2019.**
+**Protocol version v5.1**

-Note that this specification is a work in progress and may change incompatibly without
-prior notice.
+This document explains the algorithms and data structures used by the protocol.

## Nodes, Records and Distances

@@ -26,7 +25,7 @@ used in place of the actual distance.

logdistance(n₁, n₂) = log2(distance(n₁, n₂))
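
As a rough illustration of the logarithmic distance above, the following Python sketch assumes the XOR metric over node IDs read as big-endian integers (the definition of `distance` sits outside this hunk). Rounding log2 down to an integer and rejecting identical IDs are choices made for this sketch, not something the diff specifies.

```python
def distance(n1: bytes, n2: bytes) -> int:
    """XOR distance between two equal-length node IDs (assumed metric)."""
    if len(n1) != len(n2):
        raise ValueError("node IDs must have the same length")
    return int.from_bytes(n1, "big") ^ int.from_bytes(n2, "big")


def logdistance(n1: bytes, n2: bytes) -> int:
    """floor(log2(distance(n1, n2))), computed with integer arithmetic."""
    d = distance(n1, n2)
    if d == 0:
        # log2(0) is undefined; how identical IDs are handled is an assumption.
        raise ValueError("logdistance is undefined for identical node IDs")
    return d.bit_length() - 1
```

Some implementations may prefer to return the bit length itself so that 0 can stand for "same node"; that would shift every value by one relative to this sketch.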

-## Maintaining The Local Node Record
+### Maintaining The Local Node Record

Participants should update their record, increase the sequence number and sign a new
version of the record whenever their information changes. This is especially important for
@@ -41,6 +40,141 @@ IP address and port.
If the endpoint cannot be determined (e.g. when the NAT doesn't support 'full-cone'
translation), implementations should omit IP address and UDP port from the record.

## Sessions

Discovery communication is encrypted and authenticated using session keys, established in
the handshake.

### Handshake

Since every node participating in the network acts as both client and server, a handshake
can be initiated by either side of communication at any time. In the following
definitions, we assume that node A wishes to communicate with node B, e.g. to send a
FINDNODE query.

Node A must have a node record for node B and know B's node ID to communicate with it. If
node A has session keys from prior communication, it encrypts its request with those keys.
If no keys are known, it initiates the handshake by sending a packet with random content.

A -> B FINDNODE encrypted with unknown key or random-packet

Node B receives the initial packet, extracts the source node ID from the packet (see
[encoding section]), and continues the handshake by responding with [WHOAREYOU]. The
WHOAREYOU packet contains the id-nonce value to be signed by A as well as the highest
known ENR sequence number of node A's record.

A <- B WHOAREYOU including id-nonce, enr-seq

Node A now knows that node B is alive and is ready to perform the handshake. The handshake
proceeds by re-sending the original request message in a [handshake packet].

The handshake packet includes an ephemeral public key in the cryptosystem used by B's
identity scheme (e.g. an elliptic curve key on the secp256k1 curve if node B uses the "v4"
scheme). This key is used to perform Diffie-Hellman key agreement with B's static public
key and the session keys are derived from it using the HKDF key derivation function.

ephemeral-key = random private key generated by node A
ephemeral-pubkey = public key corresponding to ephemeral-key
dest-pubkey = public key of node B
secret = ecdh(dest-pubkey, ephemeral-key)
info = "discovery v5 key agreement" || node-id-A || node-id-B
prk = HKDF-Extract(secret, id-nonce)

initiator-key, recipient-key = HKDF-Expand(prk, info)
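
The derivation above could be sketched in Python roughly as follows, using only the standard library. This is a sketch under stated assumptions: SHA-256 as the HKDF hash, 16-byte keys, `id-nonce` acting as the HKDF salt with the ECDH secret as input keying material, and the two keys taken from the expanded output in order. `derive_session_keys` and `KEY_LEN` are hypothetical names; the ECDH step is assumed to have already produced `secret`.

```python
import hashlib
import hmac

KEY_LEN = 16  # assumption: 16-byte session keys


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract (RFC 5869) with SHA-256: HMAC keyed by the salt."""
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869) with SHA-256."""
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_session_keys(secret: bytes, id_nonce: bytes,
                        node_id_a: bytes, node_id_b: bytes):
    """Return (initiator_key, recipient_key) for the A -> B handshake."""
    info = b"discovery v5 key agreement" + node_id_a + node_id_b
    # Assumption: id-nonce is the salt, the ECDH secret is the keying material.
    prk = hkdf_extract(salt=id_nonce, ikm=secret)
    okm = hkdf_expand(prk, info, 2 * KEY_LEN)
    return okm[:KEY_LEN], okm[KEY_LEN:]
```

Because `info` contains both node IDs in initiator-then-recipient order, swapping the roles of A and B yields different keys even for the same secret.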

The handshake packet also contains a signature over `id-nonce` as well as node A's record
if the local sequence number is higher than `enr-seq`. The signature proves that node A
controls the identity key which signed the record and also prevents replay of the
handshake.

id-nonce-input = sha256("discovery-id-nonce" || id-nonce || ephemeral-pubkey)
Review comment:

The ephemeral-pubkey that goes over the wire is now in the compressed format (33 bytes).
Do we still use the uncompressed format when creating the id-nonce-input, as before, or the compressed format directly?

Reply (Collaborator, Author):

It should be the compressed format for the id signature as well.

Reply (fjl, Collaborator, Author, Sep 25, 2020):

Not sure if this needs additional spec text or if it's fine the way it is now.

id-signature = id_sign(id-nonce-input)
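
For illustration, the hash that feeds `id_sign` might be computed as in the Python sketch below. It follows the review thread in assuming the 33-byte compressed encoding of `ephemeral-pubkey`; the function name `id_nonce_input` is hypothetical, and the signing step itself is left out because it needs a secp256k1 library.

```python
import hashlib


def id_nonce_input(id_nonce: bytes, ephemeral_pubkey: bytes) -> bytes:
    """sha256("discovery-id-nonce" || id-nonce || ephemeral-pubkey)."""
    if len(ephemeral_pubkey) != 33:
        # Assumption taken from the review thread: compressed key encoding.
        raise ValueError("expected a 33-byte compressed public key")
    return hashlib.sha256(
        b"discovery-id-nonce" + id_nonce + ephemeral_pubkey
    ).digest()
```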

The request is now re-sent:

A -> B FINDNODE as handshake packet, message encrypted with new initiator-key

Node B receives the handshake packet and verifies that the signature over `id-nonce` was
created by node A's public key. To verify the signature, it looks at node A's record which
it either already has a copy of or which was received in the header.

Node B then performs key agreement/derivation using its own static private key and
`ephemeral-pubkey`.

If the `id-nonce` signature is valid, Node B considers the new session keys valid,
decrypts the message contained in the packet and responds to it. In our example case, the
response is a `NODES` message:

A <- B NODES encrypted with new recipient-key

Node A receives the response and authenticates/decrypts it with the new session keys. If
decryption succeeds, node B's identity is verified. Node A now considers the new session
keys valid and uses them for all further communication.

### Handshake Implementation Considerations

Since a handshake may happen at any time, implementations should keep a reference to all
sent request packets until the request either times out, is answered by the corresponding
response packet or answered by WHOAREYOU. If WHOAREYOU is received as the answer to a
request, the request must be re-sent with an authentication header containing new keys.

Multiple requests may be pending when WHOAREYOU is received, as in the following example:

A -> B FINDNODE
A -> B PING
A -> B TOPICQUERY
A <- B WHOAREYOU (token references PING)

In those cases, pending requests can be considered invalid (the remote end cannot decrypt
them) and the packet referenced by WHOAREYOU (example: PING) must be re-sent with an
authentication header. When the response to the re-sent request (example: PONG) is
received, the new session is established and other pending requests (example: FINDNODE,
TOPICQUERY) may be re-sent.

Note that WHOAREYOU is only ever valid as a response to a previously sent request. If
WHOAREYOU is received but no requests are pending, the handshake attempt can be ignored.
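
One possible way to keep track of pending requests per peer is sketched below in Python. The class and method names are hypothetical, and the sketch assumes that WHOAREYOU identifies the request it answers by a token or nonce usable as a dictionary key. It also ignores a WHOAREYOU whose token matches no pending request, which is slightly stricter than the text requires.

```python
from dataclasses import dataclass, field


@dataclass
class PendingRequests:
    """Unanswered requests sent to a single peer (illustrative only)."""
    by_token: dict = field(default_factory=dict)   # token -> request
    deferred: list = field(default_factory=list)   # held until the session exists

    def sent(self, token: bytes, request: str) -> None:
        self.by_token[token] = request

    def on_whoareyou(self, token: bytes):
        """Return the request to re-send in the handshake, or None to ignore."""
        request = self.by_token.pop(token, None)
        if request is None:
            return None  # WHOAREYOU is only valid as a reply to a pending request
        # The peer cannot decrypt the other pending requests; hold them back
        # until the new session is established.
        self.deferred = list(self.by_token.values())
        self.by_token.clear()
        return request

    def on_session_established(self) -> list:
        """Return the requests that may now be re-sent with the new keys."""
        ready, self.deferred = self.deferred, []
        return ready
```

Applied to the example above, `on_whoareyou` would return PING, and `on_session_established` would return FINDNODE and TOPICQUERY once PONG arrives.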

### Identity-Specific Cryptography in the Handshake

Establishment of session keys is dependent on the identity scheme of the recipient (i.e.
the node which sends WHOAREYOU). Similarly, the signature over `id-nonce-input` is made by
the identity key of the initiator. Although initiator and recipient might not be using the
same identity scheme in their respective node records, implementations must be able to
handle handshaking for all supported identity schemes.

At this time, the only supported identity scheme is "v4".

`id_sign(data)` creates a signature over `data` using the node's static private key. The
signature is encoded as the 64-byte array `r || s`, i.e. as the concatenation of the
signature values.

`ecdh(pubkey, privkey)` creates a secret through elliptic-curve Diffie-Hellman key
agreement. The public key is multiplied by the private key to create a secret ephemeral
key `eph = pubkey * privkey`. The 33-byte secret output is `y || eph.x` where `y` is
`0x02` when `eph.y` is even or `0x03` when `eph.y` is odd.
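
To make the `ecdh` output format concrete, here is a small pure-Python sketch over secp256k1 (the curve of the "v4" scheme). It uses affine coordinates, is not constant-time, and performs no validation of the peer's public key, so it is meant only to illustrate the `y || eph.x` encoding; a real implementation would rely on a vetted cryptographic library. Function names other than `ecdh` are hypothetical.

```python
# secp256k1 field prime and generator point
P = 2**256 - 2**32 - 977
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow((2 * y1) % P, -1, P) % P   # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P    # chord slope
    x3 = (lam * lam - x1 - x2) % P
    return x3, (lam * (x1 - x3) - y1) % P


def scalar_mult(k: int, point):
    """Double-and-add multiplication of a point by the integer k."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def ecdh(pubkey, privkey: int) -> bytes:
    """Return the 33-byte secret: 0x02 or 0x03 by parity of eph.y, then eph.x."""
    eph = scalar_mult(privkey, pubkey)
    if eph is None:
        raise ValueError("invalid key: result is the point at infinity")
    x, y = eph
    prefix = 0x02 if y % 2 == 0 else 0x03
    return bytes([prefix]) + x.to_bytes(32, "big")
```

Both sides arrive at the same secret: `ecdh(pub_b, priv_a)` equals `ecdh(pub_a, priv_b)` when each public key is `scalar_mult(priv, G)`.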

### Session Cache

Nodes should store session keys for communication with other recently-seen nodes. Since
sessions are ephemeral and can be re-established whenever necessary, it is sufficient to
store a limited number of sessions in an in-memory LRU cache.

To prevent IP spoofing attacks, implementations must ensure that session secrets and the
handshake are tied to a specific UDP endpoint. This is simple to implement by using the
node ID and IP/port as the 'key' into the in-memory session cache. When a node switches
endpoints, e.g. when roaming between different wireless networks, sessions will need to be
re-established by handshaking again. This requires no effort on behalf of the roaming node
because the recipients of protocol messages will simply refuse to decrypt messages from
the new endpoint and reply with WHOAREYOU.
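
A minimal sketch of such a cache in Python, keyed by node ID and UDP endpoint, could look like this. The class name, method names and default capacity are assumptions made for the example.

```python
from collections import OrderedDict


class SessionCache:
    """In-memory LRU cache of session keys, keyed by (node ID, IP, port)."""

    def __init__(self, capacity: int = 1000):  # capacity is an arbitrary choice
        self.capacity = capacity
        self._entries = OrderedDict()

    def put(self, node_id: bytes, ip: str, port: int, keys) -> None:
        key = (node_id, ip, port)
        self._entries[key] = keys
        self._entries.move_to_end(key)          # mark as most recently used
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)   # evict the least recently used

    def get(self, node_id: bytes, ip: str, port: int):
        key = (node_id, ip, port)
        keys = self._entries.get(key)
        if keys is not None:
            self._entries.move_to_end(key)
        return keys
```

A lookup for a known node ID from a new endpoint returns `None`, which is the point at which the receiving node would reply with WHOAREYOU.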

The number of messages which can be encrypted with a certain session key is limited
because encryption of each message requires a unique nonce for AES-GCM. In addition to the
keys, the session cache must also keep track of the count of outgoing messages to ensure
the uniqueness of nonce values. Since the wire protocol uses 96 bit AES-GCM nonces, it is
strongly recommended to generate them by encoding the current outgoing message count into
the first 32 bits of the nonce and filling the remaining 64 bits with random data
generated by a cryptographically secure random number generator.
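
The recommended nonce construction might be sketched in Python as follows. Big-endian encoding of the counter and raising an error once the 32-bit counter is exhausted are assumptions of this sketch; `NonceGenerator` is a hypothetical name.

```python
import secrets


class NonceGenerator:
    """Produces 96-bit AES-GCM nonces for one session key."""

    def __init__(self):
        self.count = 0  # number of outgoing messages encrypted so far

    def next(self) -> bytes:
        if self.count >= 2**32:
            # The counter no longer fits into 32 bits; a new session is needed.
            raise OverflowError("message counter exhausted for this session key")
        # First 32 bits: message count. Remaining 64 bits: CSPRNG output.
        nonce = self.count.to_bytes(4, "big") + secrets.token_bytes(8)
        self.count += 1
        return nonce
```

The counter prefix guarantees uniqueness within a session even if the random part were to repeat.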

## Node Table

Nodes keep information about other nodes in their neighborhood. Neighbor nodes are stored
@@ -95,7 +229,7 @@ initiator has queried and gotten responses from the `k` closest nodes it has see
To improve the resilience of lookups against adversarial nodes, the algorithm may be
adapted to perform network traversal on multiple disjoint paths. Not only does this
approach benefit security, it also improves effectiveness because more nodes are visited
-during a single lookup. The initial `k` closest nodes are partioned into multiple
+during a single lookup. The initial `k` closest nodes are partitioned into multiple
independent 'path' buckets, and concurrent FINDNODE requests executed as described above,
with one difference: results discovered on one path are not reused on another, i.e. each
path attempts to reach the closest nodes to the lookup target independently without
@@ -321,12 +455,15 @@ encountered during lookup are asked for topic queue entries using the [TOPICQUER
Radius estimation for topic search is similar to the estimation procedure for
advertisement, but samples the average number of results from TOPICQUERY instead of
average time to registration. The radius estimation value can be shared with the
-registration algorithm if the the same topic is being registered and searched for.
+registration algorithm if the same topic is being registered and searched for.

[EIP-778]: ../enr.md
[encoding section]: ./discv5-wire.md#packet-encoding
[handshake packet]: ./discv5-wire.md#handshake-message-packet-flag--2
[WHOAREYOU]: ./discv5-wire.md#whoareyou-packet-flag--1
[PING]: ./discv5-wire.md#ping-request-0x01
[PONG]: ./discv5-wire.md#pong-response-0x02
[FINDNODE]: ./discv5-wire.md#findnode-request-0x03
-[REGTOPIC]: ./discv5-wire.md#regtopic-request-0x05
-[REGCONFIRMATION]: ./discv5-wire.md#regconfirmation-response-0x07
-[TOPICQUERY]: ./discv5-wire.md#topicquery-request-0x08
+[REGTOPIC]: ./discv5-wire.md#regtopic-request-0x07
+[REGCONFIRMATION]: ./discv5-wire.md#regconfirmation-response-0x09
+[TOPICQUERY]: ./discv5-wire.md#topicquery-request-0x10