discv5: protocol version v5.1 (#157)
This defines the first stable (non-draft) spec version of Discovery v5.

- The packet format is changed and all plaintext RLP is removed.
- Packet header obfuscation was added.
- Packets sent to the wrong node ID can no longer be mistaken for WHOAREYOU.
- The handshake description is much more detailed and has moved to the theory document.
- FINDNODE now uses a list of multiple distances as the parameter.
- TALKREQ/TALKRESP packets have been added.

Co-authored-by: Mikhail Kalinin <noblesse.knight@gmail.com>
fjl and mkalinin authored Oct 7, 2020
1 parent 4ba94c8 commit 56a498e
Showing 8 changed files with 703 additions and 494 deletions.
2 changes: 1 addition & 1 deletion discv5/discv5-rationale.md
@@ -1,6 +1,6 @@
# Node Discovery Protocol v5 - Rationale

**Draft of October 2019**
**Protocol version v5.1**

Note that this specification is a work in progress and may change incompatibly without
prior notice.
223 changes: 214 additions & 9 deletions discv5/discv5-theory.md
@@ -1,9 +1,8 @@
# Node Discovery Protocol v5 - Theory

**Draft of October 2019.**
**Protocol version v5.1**

Note that this specification is a work in progress and may change incompatibly without
prior notice.
This document explains the algorithms and data structures used by the protocol.

## Nodes, Records and Distances

@@ -26,7 +25,7 @@ used in place of the actual distance.

    logdistance(n₁, n₂) = log2(distance(n₁, n₂))

## Maintaining The Local Node Record
### Maintaining The Local Node Record

Participants should update their record, increase the sequence number and sign a new
version of the record whenever their information changes. This is especially important for
@@ -41,6 +40,206 @@ IP address and port.
If the endpoint cannot be determined (e.g. when the NAT doesn't support 'full-cone'
translation), implementations should omit IP address and UDP port from the record.

## Sessions

Discovery communication is encrypted and authenticated using session keys, established in
the handshake. Since every node participating in the network acts as both client and
server, a handshake can be initiated by either side of the communication at any time.

### Handshake Steps

#### Step 1: Node A sends message packet

In the following definitions, we assume that node A wishes to communicate with node B,
e.g. to send a FINDNODE message. Node A must have a copy of node B's record in order to
communicate with it.

If node A has session keys from prior communication with B, it encrypts its request with
those keys. If no keys are known, it initiates the handshake by sending an ordinary
message packet with random message content.

    A -> B   FINDNODE message packet encrypted with unknown key

#### Step 2: Node B responds with challenge

Node B receives the message packet and extracts the source node ID from the packet header.
If node B has session keys from prior communication with A, it attempts to decrypt the
message data. If decryption and authentication of the message succeeds, there is no need
for a handshake and node B can simply respond to the request.

If node B does not have session keys or decryption is not successful, it must initiate a
handshake by responding with a [WHOAREYOU packet].

It first generates a unique `id-nonce` value and includes it in the packet. Node B also
checks if it has a copy of node A's record. If it does, it also includes the sequence
number of this record in the challenge packet, otherwise it sets the `enr-seq` field to
zero.

Node B must also store A's record and the WHOAREYOU challenge for a short duration
after sending it to node A, because they will be needed again in step 4.

    A <- B   WHOAREYOU packet including id-nonce, enr-seq
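
As a rough illustration of step 2, the following Python sketch builds a challenge and remembers it for step 4. The names `Challenge`, `pending_challenges` and `make_challenge` are hypothetical, and the 16-byte `id-nonce` size is an assumption made here; the wire document defines the actual field sizes.

```python
import os
import time
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

ID_NONCE_SIZE = 16  # assumed size in bytes; see the wire document for the real value


@dataclass
class Challenge:
    request_nonce: bytes  # nonce of the message packet that could not be decrypted
    id_nonce: bytes       # fresh random value, unique per challenge
    enr_seq: int          # sequence number of A's record known to B, or zero
    created_at: float = field(default_factory=time.monotonic)


# (node ID, IP, port) -> challenge that is waiting for a handshake message packet.
pending_challenges: Dict[Tuple[bytes, str, int], Challenge] = {}


def make_challenge(node_id: bytes, ip: str, port: int, request_nonce: bytes,
                   known_enr_seq: Optional[int]) -> Challenge:
    """Create a WHOAREYOU challenge and store it until the handshake completes."""
    challenge = Challenge(
        request_nonce=request_nonce,
        id_nonce=os.urandom(ID_NONCE_SIZE),
        # enr-seq is zero when B has no copy of A's record.
        enr_seq=known_enr_seq if known_enr_seq is not None else 0,
    )
    pending_challenges[(node_id, ip, port)] = challenge
    return challenge
```

The `created_at` timestamp is only there so that a caller could expire stale challenges after the short duration mentioned above; the expiry policy itself is left out of the sketch.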

#### Step 3: Node A processes the challenge

Node A receives the challenge sent by node B, which confirms that node B is alive and is
ready to perform the handshake. The challenge can be traced back to the request packet
which solicited it by checking the `nonce`, which mirrors the request packet's `nonce`.

Node A proceeds with the handshake by re-sending the FINDNODE request as a [handshake
message packet]. This packet contains three parts in addition to the message:
`id-signature`, `ephemeral-pubkey` and `record`.

The handshake uses the unmasked WHOAREYOU challenge as an input:

    challenge-data     = masking-iv || static-header || authdata

Node A can now derive the new session keys. To do so, it first generates an ephemeral key
pair on the elliptic curve used by node B's identity scheme. As an example, let's assume
the node record of B uses the "v4" scheme. In this case the `ephemeral-pubkey` will be a
public key on the secp256k1 curve.

    ephemeral-key      = random private key generated by node A
    ephemeral-pubkey   = public key corresponding to ephemeral-key

The ephemeral key is used to perform Diffie-Hellman key agreement with node B's static
public key and the session keys are derived from it using the HKDF key derivation
function.

    dest-pubkey        = public key corresponding to node B's static private key
    secret             = ecdh(dest-pubkey, ephemeral-key)
    kdf-info           = "discovery v5 key agreement" || node-id-A || node-id-B
    prk                = HKDF-Extract(secret, challenge-data)
    key-data           = HKDF-Expand(prk, kdf-info)
    initiator-key      = key-data[:16]
    recipient-key      = key-data[16:]
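
The derivation above can be sketched in Python using only the standard library. Two things are assumptions of this sketch rather than statements of the text above: SHA-256 is used as the HKDF hash function, and `HKDF-Extract(secret, challenge-data)` is read as "input keying material = `secret`, salt = `challenge-data`". The function names are hypothetical.

```python
import hashlib
import hmac
from typing import Tuple


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract as defined in RFC 5869, instantiated with SHA-256."""
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand as defined in RFC 5869, instantiated with SHA-256."""
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_session_keys(secret: bytes, challenge_data: bytes,
                        node_id_a: bytes, node_id_b: bytes) -> Tuple[bytes, bytes]:
    """Return (initiator-key, recipient-key), 16 bytes each."""
    kdf_info = b"discovery v5 key agreement" + node_id_a + node_id_b
    # Assumption: challenge-data acts as the salt, secret as the keying material.
    prk = hkdf_extract(salt=challenge_data, ikm=secret)
    key_data = hkdf_expand(prk, kdf_info, 32)
    return key_data[:16], key_data[16:]
```

Both sides run the same function with the same inputs, so node B obtains identical keys in step 4 once it has computed the shared `secret`.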

Node A creates the `id-signature`, which proves that it controls the private key which
signed its node record. The signature also prevents replay of the handshake.

    id-signature-text  = "discovery v5 identity proof"
    id-signature-input = id-signature-text || challenge-data || ephemeral-pubkey || node-id-B
    id-signature       = id_sign(sha256(id-signature-input))
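
A minimal Python sketch of the hashing step follows. It only builds the input and computes the digest; the actual signing with the node's static private key is left to a secp256k1 library and is not shown. The function name `id_signature_hash` is hypothetical.

```python
import hashlib

ID_SIGNATURE_TEXT = b"discovery v5 identity proof"


def id_signature_hash(challenge_data: bytes, ephemeral_pubkey: bytes,
                      node_id_b: bytes) -> bytes:
    """Return sha256(id-signature-input), the value that id_sign() signs."""
    id_signature_input = (
        ID_SIGNATURE_TEXT + challenge_data + ephemeral_pubkey + node_id_b
    )
    return hashlib.sha256(id_signature_input).digest()
```

Because `challenge-data` contains the fresh `id-nonce` chosen by node B, a signature recorded from one handshake produces a different digest under any later challenge, which is what makes replay ineffective.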

Finally, node A compares the `enr-seq` element of the WHOAREYOU challenge against its own
node record sequence number. If the sequence number in the challenge is lower, it includes
its record in the handshake message packet.

The request is now re-sent, with the message encrypted using the new session keys.

    A -> B   FINDNODE handshake message packet, encrypted with new initiator-key

#### Step 4: Node B receives handshake message

When node B receives the handshake message packet, it first loads the node record and
WHOAREYOU challenge which it sent and stored earlier.

If node B did not have the node record of node A, the handshake message packet must
contain a node record. A record may also be present if node A determined that its record
is newer than B's current copy. If the packet contains a node record, B must first
validate it by checking the record's signature.

Node B then verifies the `id-signature` against the identity public key of A's record.

After that, B can perform the key derivation using its own static private key and the
`ephemeral-pubkey` from the handshake packet. Using the resulting session keys, it
attempts to decrypt the message contained in the packet.

If the message can be decrypted and authenticated, Node B considers the new session keys
valid and responds to the message. In our example case, the response is a `NODES` message:

    A <- B   NODES encrypted with new recipient-key

#### Step 5: Node A receives response message

Node A receives the message packet response and authenticates/decrypts it with the new
session keys. If decryption/authentication succeeds, node B's identity is verified and
node A also considers the new session keys valid.

### Identity-Specific Cryptography in the Handshake

Establishment of session keys is dependent on the [identity scheme] used by the recipient
(i.e. the node which sends WHOAREYOU). Likewise, the signature over `id-signature-input` is made
by the identity key of the initiator. It is not required that initiator and recipient use
the same identity scheme in their respective node records. Implementations must be able to
perform the handshake for all supported identity schemes.

At this time, the only supported identity scheme is "v4".

`id_sign(hash)` creates a signature over `hash` using the node's static private key. The
signature is encoded as the 64-byte array `r || s`, i.e. as the concatenation of the
signature values.

`ecdh(pubkey, privkey)` creates a secret through elliptic-curve Diffie-Hellman key
agreement. The public key is multiplied by the private key to create a secret ephemeral
key `eph = pubkey * privkey`. The 33-byte secret output is `y || eph.x` where `y` is
`0x02` when `eph.y` is even or `0x03` when `eph.y` is odd.
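
To make the output encoding of `ecdh` concrete, here is a pure-Python sketch over secp256k1 for the "v4" scheme. It is meant for illustration only: it is not constant-time, it does not validate that the public key lies on the curve, and a real implementation should use a vetted cryptography library. The helper names `point_add` and `scalar_mult` are hypothetical; the curve constants are the standard secp256k1 parameters.

```python
from typing import Optional, Tuple

Point = Optional[Tuple[int, int]]  # None represents the point at infinity

# secp256k1 field prime and generator point.
P = 2**256 - 2**32 - 977
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a: Point, b: Point) -> Point:
    """Add two points on y^2 = x^3 + 7 over GF(P), in affine coordinates."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a and b are inverses of each other
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P  # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P  # chord slope
    x3 = (lam * lam - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k: int, pt: Point) -> Point:
    """Multiply a point by a scalar using double-and-add."""
    result: Point = None
    while k:
        if k & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
        k >>= 1
    return result


def ecdh(pubkey: Tuple[int, int], privkey: int) -> bytes:
    """Return the 33-byte secret `y || eph.x` where eph = pubkey * privkey."""
    eph = scalar_mult(privkey, pubkey)
    if eph is None:
        raise ValueError("invalid key: result is the point at infinity")
    prefix = 0x02 if eph[1] % 2 == 0 else 0x03
    return bytes([prefix]) + eph[0].to_bytes(32, "big")
```

Node A would call `ecdh(dest_pubkey, ephemeral_key)` and node B would call `ecdh(ephemeral_pubkey, static_privkey)`; both calls yield the same 33 bytes because scalar multiplication commutes.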

### Handshake Implementation Considerations

Since a handshake may happen at any time and UDP packets may be reordered by transmitting
networking equipment, implementations must deal with certain subtleties regarding the
handshake.

In general, implementations should keep a reference to all sent request packets until the
request either times out, is answered by the corresponding response packet or answered by
WHOAREYOU. If WHOAREYOU is received as the answer to a request, the request must be
re-sent as a handshake packet.

If an implementation supports sending concurrent requests, multiple responses may be
pending when WHOAREYOU is received, as in the following example:

    A -> B   FINDNODE
    A -> B   PING
    A -> B   TOPICQUERY
    A <- B   WHOAREYOU (nonce references PING)

When this happens, all buffered requests can be considered invalid (the remote end cannot
decrypt them) and the packet referenced by the WHOAREYOU `nonce` (in this example: PING)
must be re-sent as a handshake. When the response to the re-sent request is received, the new
session is established and other pending requests (example: FINDNODE, TOPICQUERY) may be
re-sent.

Note that WHOAREYOU is only ever valid as a response to a previously sent request. If
WHOAREYOU is received but no requests are pending, the handshake attempt can be ignored.
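
The bookkeeping described above might look like the following Python sketch. It tracks pending requests for a single remote node; the class `PendingRequests` and its methods are hypothetical, and requests are represented as plain strings to keep the example short.

```python
from typing import Dict, List, Optional, Tuple


class PendingRequests:
    """Requests sent to one remote node that have not been answered yet."""

    def __init__(self) -> None:
        # Packet nonce -> request; dicts preserve insertion (send) order.
        self._by_nonce: Dict[bytes, str] = {}

    def sent(self, nonce: bytes, request: str) -> None:
        self._by_nonce[nonce] = request

    def on_whoareyou(self, nonce: bytes) -> Optional[Tuple[str, List[str]]]:
        """Handle WHOAREYOU whose nonce mirrors one of our request packets.

        Returns (request to re-send as handshake, requests to re-send once the
        session is established), or None if the challenge matches no pending
        request and should be ignored.
        """
        if nonce not in self._by_nonce:
            return None
        handshake_request = self._by_nonce.pop(nonce)
        # The remote end cannot decrypt any of the other buffered requests.
        deferred = list(self._by_nonce.values())
        self._by_nonce.clear()
        return handshake_request, deferred
```

With the example above, receiving WHOAREYOU for the PING nonce returns PING as the handshake request and FINDNODE and TOPICQUERY as deferred requests. Timeouts and removal on a normal response are omitted here.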

Another important issue is the processing of message packets while a challenge is
pending: consider the case where node A has sent a packet that B cannot decrypt, and B
has responded with WHOAREYOU.

    A -> B   FINDNODE
    A <- B   WHOAREYOU

Node B is now waiting for a handshake message packet to complete the new session, but
instead receives another ordinary message packet.

    A -> B   ORDINARY MESSAGE PACKET

In this case, implementations should respond with a new WHOAREYOU challenge referencing
the message packet.

### Session Cache

Nodes should store session keys for communication with other recently-seen nodes. Since
sessions are ephemeral and can be re-established whenever necessary, it is sufficient to
store a limited number of sessions in an in-memory LRU cache.

To prevent IP spoofing attacks, implementations must ensure that session secrets and the
handshake are tied to a specific UDP endpoint. This is simple to implement by using the
node ID and IP/port as the 'key' into the in-memory session cache. When a node switches
endpoints, e.g. when roaming between different wireless networks, sessions will need to be
re-established by handshaking again. This requires no effort on behalf of the roaming node
because the recipients of protocol messages will simply refuse to decrypt messages from
the new endpoint and reply with WHOAREYOU.
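
A minimal sketch of such a cache in Python, assuming a small fixed capacity and using `collections.OrderedDict` for LRU ordering. The `Session` and `SessionCache` names are hypothetical.

```python
from collections import OrderedDict
from typing import NamedTuple, Optional, Tuple


class Session(NamedTuple):
    initiator_key: bytes
    recipient_key: bytes


CacheKey = Tuple[bytes, str, int]  # (node ID, IP address, UDP port)


class SessionCache:
    """In-memory LRU cache of sessions, keyed by node ID and endpoint."""

    def __init__(self, capacity: int) -> None:
        self._capacity = capacity
        self._entries: "OrderedDict[CacheKey, Session]" = OrderedDict()

    def put(self, node_id: bytes, ip: str, port: int, session: Session) -> None:
        key = (node_id, ip, port)
        self._entries[key] = session
        self._entries.move_to_end(key)
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict the least recently used

    def get(self, node_id: bytes, ip: str, port: int) -> Optional[Session]:
        key = (node_id, ip, port)
        session = self._entries.get(key)
        if session is not None:
            self._entries.move_to_end(key)  # mark as recently used
        return session
```

Because the endpoint is part of the key, a lookup for a known node ID arriving from a new IP or port misses the cache, and the receiver falls back to WHOAREYOU as described above.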

The number of messages which can be encrypted with a certain session key is limited
because encryption of each message requires a unique nonce for AES-GCM. In addition to the
keys, the session cache must also keep track of the count of outgoing messages to ensure
the uniqueness of nonce values. Since the wire protocol uses 96 bit AES-GCM nonces, it is
strongly recommended to generate them by encoding the current outgoing message count into
the first 32 bits of the nonce and filling the remaining 64 bits with random data
generated by a cryptographically secure random number generator.
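
The recommended nonce layout can be sketched as follows. Big-endian encoding of the counter and the behaviour on counter exhaustion are assumptions of this sketch, and `NonceGenerator` is a hypothetical name.

```python
import os
import struct


class NonceGenerator:
    """Produces 96-bit AES-GCM nonces for one session key."""

    def __init__(self) -> None:
        self._count = 0  # number of outgoing messages encrypted so far

    def next(self) -> bytes:
        if self._count >= 2**32:
            # The 32-bit counter is exhausted; a new session is required.
            raise OverflowError("message counter exhausted for this session key")
        # First 32 bits: message counter. Remaining 64 bits: CSPRNG output.
        nonce = struct.pack(">I", self._count) + os.urandom(8)
        self._count += 1
        return nonce
```

The counter guarantees uniqueness within a session even if the random part repeats, while the random part reduces the chance of nonce reuse if the counter is ever reset by mistake.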

## Node Table

Nodes keep information about other nodes in their neighborhood. Neighbor nodes are stored
@@ -70,6 +269,9 @@ bucket addition and occasionally verify that a random node in a random bucket is
sending [PING]. When the PONG response indicates that a new version of the node record is
available, the liveness check should pull the new record and update it in the local table.

If a node's liveness has been verified many times, implementations may consider occasional
non-responsiveness permissible and assume the node is live.

When responding to FINDNODE, implementations must avoid relaying any nodes whose liveness
has not been verified. This is easy to achieve by storing an additional flag per node in
the table, tracking whether the node has ever successfully responded to a PING request.
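
One possible shape for this flag, sketched in Python with hypothetical names (`TableEntry`, `mark_pong_received`, `relayable_records`):

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TableEntry:
    node_id: bytes
    record: bytes                    # encoded node record
    liveness_verified: bool = False  # set once the node has answered a PING


def mark_pong_received(entry: TableEntry) -> None:
    """Record that the node successfully responded to a PING request."""
    entry.liveness_verified = True


def relayable_records(entries: List[TableEntry]) -> List[bytes]:
    """Select the records that may be relayed in response to FINDNODE."""
    return [e.record for e in entries if e.liveness_verified]
```

The flag is never cleared in this sketch, which matches "has ever successfully responded"; implementations with stricter liveness policies would need additional state.
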
@@ -95,7 +297,7 @@ initiator has queried and gotten responses from the `k` closest nodes it has see
To improve the resilience of lookups against adversarial nodes, the algorithm may be
adapted to perform network traversal on multiple disjoint paths. Not only does this
approach benefit security, it also improves effectiveness because more nodes are visited
during a single lookup. The initial `k` closest nodes are partioned into multiple
during a single lookup. The initial `k` closest nodes are partitioned into multiple
independent 'path' buckets, and concurrent FINDNODE requests executed as described above,
with one difference: results discovered on one path are not reused on another, i.e. each
path attempts to reach the closest nodes to the lookup target independently without
@@ -321,12 +523,15 @@ encountered during lookup are asked for topic queue entries using the [TOPICQUER
Radius estimation for topic search is similar to the estimation procedure for
advertisement, but samples the average number of results from TOPICQUERY instead of
average time to registration. The radius estimation value can be shared with the
registration algorithm if the the same topic is being registered and searched for.
registration algorithm if the same topic is being registered and searched for.

[EIP-778]: ../enr.md
[identity scheme]: ../enr.md#record-structure
[handshake message packet]: ./discv5-wire.md#handshake-message-packet-flag--2
[WHOAREYOU packet]: ./discv5-wire.md#whoareyou-packet-flag--1
[PING]: ./discv5-wire.md#ping-request-0x01
[PONG]: ./discv5-wire.md#pong-response-0x02
[FINDNODE]: ./discv5-wire.md#findnode-request-0x03
[REGTOPIC]: ./discv5-wire.md#regtopic-request-0x05
[REGCONFIRMATION]: ./discv5-wire.md#regconfirmation-response-0x07
[TOPICQUERY]: ./discv5-wire.md#topicquery-request-0x08
[REGTOPIC]: ./discv5-wire.md#regtopic-request-0x07
[REGCONFIRMATION]: ./discv5-wire.md#regconfirmation-response-0x09
[TOPICQUERY]: ./discv5-wire.md#topicquery-request-0x0a