discv5: protocol version v5.1 #157

Merged
merged 40 commits into from
Oct 7, 2020
Changes from 1 commit
Commits
b48b231
discv5: WIP protocol version v5.1
fjl Apr 28, 2020
4f3c7dc
discv5: move handshake to theory document and import the new packet f…
fjl Jun 5, 2020
ee30374
discv5: improve FINDNODE, NODES text
fjl Jun 30, 2020
3b03774
discv5: add description of header masking
fjl Aug 31, 2020
55160b6
discv5: update packet layout images
fjl Aug 31, 2020
1fd2385
discv5: clarify FINDNODE processing
fjl Aug 31, 2020
172c117
discv5: remove special case for empty request-id in TALKREQ
fjl Aug 31, 2020
92bbbd2
discv5: require sending record when enr-seq == 0
fjl Aug 31, 2020
da37f72
discv5: update handshake section cross links in discv5-wire.md
fjl Aug 31, 2020
47a94d6
discv5: fix markdown lint issues in discv5-theory.md
fjl Aug 31, 2020
d1903ec
discv5: improve handshake text and re-add id-signature definition
fjl Aug 31, 2020
fc5fdd0
discv5: listify FINDNODE distances
fjl Aug 31, 2020
1086cdd
discv5: remove auth-resp-key KDF output
fjl Sep 2, 2020
13c2302
discv5: delete unused image discv-topic-queue-diagram.png
fjl Sep 2, 2020
34b96c8
discv5: fix typo
fjl Sep 2, 2020
6e32e21
discv5: fix issue in description of AES/GCM authenticated data
fjl Sep 17, 2020
88b82a2
discv5: remove definition of rlp_bytes
fjl Sep 17, 2020
6e96b92
discv5: fix message-ids
fjl Sep 25, 2020
163c73f
discv5: limit request ID size
fjl Sep 25, 2020
4a9cc85
discv5: document TALKEQ/TALKRESP field types
fjl Sep 25, 2020
d244367
discv5: bump request-id max size to 8 bytes
fjl Sep 25, 2020
c13e4cf
discv5: improve the handshake text
fjl Sep 25, 2020
d19fbe7
discv5: add destination ID into id-sig-input
fjl Sep 25, 2020
995e68e
discv5: fix link
fjl Sep 25, 2020
5b59e88
discv5: put version into protocol-id
fjl Sep 30, 2020
60d9107
discv5: reduce id-nonce size and use IV in handshake
fjl Sep 30, 2020
2acb646
discv5: change text in id-signature-input
fjl Sep 30, 2020
1d133e6
discv5: be more exact about packet names
fjl Sep 30, 2020
23e35a7
discv5: define minimum packet size
fjl Sep 30, 2020
eae09e8
discv5: add text about limit for total nodes responses
fjl Sep 30, 2020
74c4af8
discv5: move src-id into authdata, nonce into static header
fjl Sep 30, 2020
a785698
discv5: update test vectors
fjl Sep 30, 2020
0c1b004
discv5: delete old auth message test vectors
fjl Sep 30, 2020
98c7cf9
discv5: update low-level test vectors
fjl Sep 30, 2020
a0a45ae
discv5: text fixes in test vectors document
fjl Sep 30, 2020
96e6b4b
discv5: text improvements
fjl Oct 1, 2020
6cf063c
discv5: use all of WHOAREYOU for id-signature and KDF
fjl Oct 2, 2020
815ef40
discv5: use entire unmasked packet header as authenticated data
fjl Oct 2, 2020
2526ef0
discv5: update test vectors
fjl Oct 2, 2020
1a4a974
discv5: fix challenge-data in test vector
fjl Oct 2, 2020
209 changes: 137 additions & 72 deletions discv5/discv5-theory.md
@@ -43,108 +43,130 @@ translation), implementations should omit IP address and UDP port from the recor
## Sessions

Discovery communication is encrypted and authenticated using session keys, established in
the handshake.
the handshake. Since every node participating in the network acts as both client and
server, a handshake can be initiated by either side of communication at any time.

### Handshake
### Handshake Steps

Since every node participating in the network acts as both client and server, a handshake
can be initiated by either side of communication at any time. In the following
definitions, we assume that node A wishes to communicate with node B, e.g. to send a
FINDNODE message.
#### Step 1: Node A sends message packet

Node A must have a node record for node B and know B's node ID to communicate with it. If
node A has session keys from prior communication, it encrypts its request with those keys.
If no keys are known, it initiates the handshake by sending a packet with random content.
In the following definitions, we assume that node A wishes to communicate with node B,
e.g. to send a FINDNODE message. Node A must have a copy of node B's record in order to
communicate with it.

If node A has session keys from prior communication with B, it encrypts its request with
those keys. If no keys are known, it initiates the handshake by sending an ordinary
message packet with random message content.

A -> B FINDNODE message packet encrypted with unknown key

Node B receives the initial packet, extracts the source node ID from the packet (see
[encoding section]), and continues the handshake by responding with [WHOAREYOU]. The
WHOAREYOU packet contains the id-nonce value to be signed by A as well as the highest
known ENR sequence number of node A's record.
#### Step 2: Node B responds with challenge

Node B receives the message packet and extracts the source node ID from the packet header.
If node B has session keys from prior communication with A, it attempts to decrypt the
message data. If decryption and authentication of the message succeed, there is no need
for a handshake and node B can simply respond to the request.

If node B does not have session keys or decryption is not successful, it must initiate a
handshake by responding with a [WHOAREYOU packet].

It first generates a unique `id-nonce` value and includes it in the packet. Node B also
checks if it has a copy of node A's record. If it does, it also includes the sequence
number of this record in the challenge packet, otherwise it sets the `enr-seq` field to
zero.

Node B must also store the WHOAREYOU challenge and A's record for a short duration after
sending it to node A because they will be needed again in step 4.

A <- B WHOAREYOU packet including id-nonce, enr-seq
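
The bookkeeping node B performs in step 2 can be sketched as follows. This is a minimal illustration, not a reference implementation: the names `ChallengeStore`, `Challenge` and `CHALLENGE_TTL_SECONDS` are hypothetical, the one-second lifetime is an assumed value (the text only asks for "a short duration"), keying the store by node ID alone is a simplification, and storing A's record alongside the challenge is left out.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical lifetime; the text only asks for "a short duration".
CHALLENGE_TTL_SECONDS = 1.0


@dataclass
class Challenge:
    request_nonce: bytes  # nonce of the packet that could not be decrypted
    id_nonce: bytes       # 16 random bytes, unique per challenge
    enr_seq: int          # sequence number of B's copy of A's record, 0 if none
    created_at: float


class ChallengeStore:
    """Remembers sent WHOAREYOU challenges, keyed by the source node ID."""

    def __init__(self) -> None:
        self._pending: Dict[bytes, Challenge] = {}

    def create(self, src_id: bytes, request_nonce: bytes,
               known_enr_seq: Optional[int],
               now: Optional[float] = None) -> Challenge:
        now = time.monotonic() if now is None else now
        challenge = Challenge(
            request_nonce=request_nonce,
            id_nonce=secrets.token_bytes(16),
            # enr-seq is zero when B has no copy of A's record.
            enr_seq=known_enr_seq if known_enr_seq is not None else 0,
            created_at=now,
        )
        self._pending[src_id] = challenge
        return challenge

    def take(self, src_id: bytes,
             now: Optional[float] = None) -> Optional[Challenge]:
        """Return and forget the stored challenge, or None if missing or expired."""
        now = time.monotonic() if now is None else now
        challenge = self._pending.pop(src_id, None)
        if challenge is None or now - challenge.created_at > CHALLENGE_TTL_SECONDS:
            return None
        return challenge
```

In step 4, node B would call `take` with the source node ID of the handshake message packet and drop the packet if no live challenge is returned.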

Node A now knows that node B is alive and is ready to perform the handshake. The handshake
proceeds by re-sending the FINDNODE request in a [handshake packet].
#### Step 3: Node A processes the challenge

The handshake packet includes an ephemeral public key in the cryptosystem used by B's
identity scheme (e.g. an elliptic curve key on the secp256k1 curve if node B uses the "v4"
scheme). This key is used to perform Diffie-Hellman key agreement with B's static public
key and the session keys are derived from it using the HKDF key derivation function.
Node A receives the challenge sent by node B, which confirms that node B is alive and is
ready to perform the handshake. The challenge can be traced back to the request packet
which solicited it by checking the `nonce`, which mirrors the request packet's `nonce`.

ephemeral-key = random private key generated by node A
ephemeral-pubkey = public key corresponding to ephemeral-key
dest-pubkey = public key of node B
secret = ecdh(ephemeral-key, dest-pubkey)
info = "discovery v5 key agreement" || node-id-A || node-id-B
prk = HKDF-Extract(secret, id-nonce)
Node A proceeds with the handshake by re-sending the FINDNODE request as a [handshake
message packet]. This packet contains three parts in addition to the message:
`id-signature`, `ephemeral-pubkey` and `record`.

initiator-key, recipient-key = HKDF-Expand(prk, info)
Creating the handshake packet requires information from the received WHOAREYOU packet:

The handshake packet also contains a signature over `id-nonce` as well as node A's record
if the local sequence number is higher than `enr-seq`. The signature proves that node A
controls the identity key which signed its record and also prevents replay of the
handshake.
challenge-id-input = masking-iv || id-nonce
challenge-salt = masking-iv || authdata

id-sig-input = "discovery-id-nonce" || id-nonce || ephemeral-pubkey || node-id-B
id-signature = id_sign(sha256(id-sig-input)
Node A can now derive the new session keys. To do so, it first generates an ephemeral key
pair on the elliptic curve used by node B's identity scheme. As an example, let's assume
the node record of B uses the "v4" scheme. In this case the `ephemeral-pubkey` will be a
public key on the secp256k1 curve.

The request is now re-sent:
ephemeral-key = random private key generated by node A
ephemeral-pubkey = public key corresponding to ephemeral-key

A -> B FINDNODE message as handshake packet, encrypted with new initiator-key
The ephemeral key is used to perform Diffie-Hellman key agreement with node B's static
public key and the session keys are derived from it using the HKDF key derivation
function.

Node B receives the handshake packet and verifies that the signature over `id-nonce` was
created by node A's public key. To verify the signature, it looks at node A's record which
it either already has a copy of or which was received in the header.
dest-pubkey = public key corresponding to node B's static private key
secret = ecdh(ephemeral-key, dest-pubkey)
kdf-info = "discovery v5 key agreement" || node-id-A || node-id-B
prk = HKDF-Extract(secret, challenge-salt)
key-data = HKDF-Expand(prk, kdf-info)
initiator-key = key-data[:16]
recipient-key = key-data[16:]
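
The derivation above can be sketched with the Python standard library. Several points here are assumptions rather than statements from the text: SHA-256 is used as the HKDF hash, `challenge-salt` is passed as the HKDF salt and `secret` as the input keying material (RFC 5869 puts the salt first, while the notation above lists `secret` first), and 32 bytes of output are requested so that each key is 16 bytes. The ECDH step is not shown; `secret` is taken as given.

```python
import hashlib
import hmac
from typing import Tuple


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract as defined in RFC 5869, with SHA-256 (assumed hash)."""
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand as defined in RFC 5869, with SHA-256 (assumed hash)."""
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_session_keys(secret: bytes, challenge_salt: bytes,
                        node_id_a: bytes, node_id_b: bytes) -> Tuple[bytes, bytes]:
    """Return (initiator-key, recipient-key), 16 bytes each."""
    kdf_info = b"discovery v5 key agreement" + node_id_a + node_id_b
    prk = hkdf_extract(challenge_salt, secret)
    key_data = hkdf_expand(prk, kdf_info, 32)
    return key_data[:16], key_data[16:]
```

Both sides run the same function: node A with the secret computed from its ephemeral private key, node B with the secret computed from its static private key and `ephemeral-pubkey`.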

Node B then performs key agreement/derivation using its own static private key and
`ephemeral-pubkey`. It can now decrypt the FINDNODE message.
Node A creates the `id-signature`, which proves that it controls the private key which
signed its node record. The signature also prevents replay of the handshake.

If the `id-nonce` signature is valid, and the message could be decrypted, Node B considers
the new session keys valid and responds to the request. In our example case, the response
is a `NODES` message:
id-signature-input = "discovery-id-nonce" || challenge-id-input || ephemeral-pubkey || node-id-B
id-signature = id_sign(sha256(id-signature-input))
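
As a rough illustration of how the signature input is assembled, the sketch below builds `id-signature-input` and hashes it. The function names are hypothetical, and the actual signing operation is passed in as a callable named `sign` because the standard library has no secp256k1 support; any signer that returns the 64-byte `r || s` encoding could be plugged in.

```python
import hashlib
from typing import Callable


def id_signature_hash(masking_iv: bytes, id_nonce: bytes,
                      ephemeral_pubkey: bytes, node_id_b: bytes) -> bytes:
    """Return sha256(id-signature-input) for the given challenge values."""
    challenge_id_input = masking_iv + id_nonce
    id_signature_input = (
        b"discovery-id-nonce" + challenge_id_input + ephemeral_pubkey + node_id_b
    )
    return hashlib.sha256(id_signature_input).digest()


def make_id_signature(sign: Callable[[bytes], bytes], masking_iv: bytes,
                      id_nonce: bytes, ephemeral_pubkey: bytes,
                      node_id_b: bytes) -> bytes:
    """Sign the hash with a caller-supplied signer (hypothetical interface)."""
    signature = sign(id_signature_hash(masking_iv, id_nonce,
                                       ephemeral_pubkey, node_id_b))
    if len(signature) != 64:
        raise ValueError("id-signature must be the 64-byte encoding r || s")
    return signature
```

Because `node-id-B` is part of the input, a signature made for one recipient does not verify in a handshake with a different node.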

A <- B NODES encrypted with new recipient-key
Finally, node A compares the `enr-seq` element of the WHOAREYOU challenge against its own
node record sequence number. If the sequence number in the challenge is lower, it includes
its own record in the handshake message packet.

Node A receives the response and authenticates/decrypts it with the new session keys. If
decryption succeeds, node B's identity is verified. Node A now considers the new session
keys valid and uses them for all further communication.
The request is now re-sent, with the message encrypted using the new session keys.

### Handshake Implementation Considerations
A -> B FINDNODE message as handshake packet, encrypted with new initiator-key

Since a handshake may happen at any time, implementations should keep a reference to all
sent request packets until the request either times out, is answered by the corresponding
response packet or answered by WHOAREYOU. If WHOAREYOU is received as the answer to a
request, the request must be re-sent with an authentication header containing new keys.
#### Step 4: Node B receives handshake message

Multiple responses may be pending when WHOAREYOU is received, as in the following example:
When node B receives the handshake message packet, it first loads the WHOAREYOU challenge
and node record which it stored earlier.

A -> B FINDNODE
A -> B PING
A -> B TOPICQUERY
A <- B WHOAREYOU (token references PING)
If node B did not have the node record of node A, the handshake message packet must
contain a node record. A record may also be present if node A determined that its record
is newer than B's current copy. If the packet contains a node record, B must first
validate it by checking the record's signature.

In those cases, pending requests can be considered invalid (the remote end cannot decrypt
them) and the packet referenced by WHOAREYOU (example: PING) must be re-sent with an
authentication header. When the response to the re-sent request (example: PONG) is
received, the new session is established and other pending requests (example: FINDNODE,
TOPICQUERY) may be re-sent.
Node B then verifies the `id-signature` against the identity public key of A's record.

Note that WHOAREYOU is only ever valid as a response to a previously sent request. If
WHOAREYOU is received but no requests are pending, the handshake attempt can be ignored.
After that, B can perform the key derivation using its own static private key and the
`ephemeral-pubkey` from the handshake packet. Using the resulting session keys, it
attempts to decrypt the message contained in the packet.

If the message can be decrypted and authenticated, Node B considers the new session keys
valid and responds to the message. In our example case, the response is a `NODES` message:

A <- B NODES encrypted with new recipient-key

#### Step 5: Node A receives response message

Node A receives the message packet response and authenticates/decrypts it with the new
session keys. If decryption/authentication succeeds, node B's identity is verified and
node A also considers the new session keys valid.

### Identity-Specific Cryptography in the Handshake

Establishment of session keys is dependent on the identity scheme of the recipient (i.e.
the node which sends WHOAREYOU). Similarly, the signature over `id-nonce-input` is made by
the identity key of the initiator. Although initiator and recipient might not be using the
same identity scheme in their respective node records, implementations must be able to
handle handshaking for all supported identity schemes.
Establishment of session keys is dependent on the [identity scheme] used by the recipient
(i.e. the node which sends WHOAREYOU). Likewise, the signature over `id-signature-input` is made
by the identity key of the initiator. It is not required that initiator and recipient use
the same identity scheme in their respective node records. Implementations must be able to
perform the handshake for all supported identity schemes.

At this time, the only supported identity scheme is "v4".

`id_sign(data)` creates a signature over `data` using the node's static private key. The
`id_sign(hash)` creates a signature over `hash` using the node's static private key. The
signature is encoded as the 64-byte array `r || s`, i.e. as the concatenation of the
signature values.

@@ -153,6 +175,49 @@ agreement. The public key is multiplied by the private key to create a secret ep
key `eph = pubkey * privkey`. The 33-byte secret output is `y || eph.x` where `y` is
`0x02` when `eph.y` is even or `0x03` when `eph.y` is odd.
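
The encoding of the 33-byte secret can be illustrated as below. The point multiplication itself is not shown; the sketch assumes the coordinates of `eph` are already available as integers and that `eph.x` is written as a 32-byte big-endian value, which is an assumption the text does not state explicitly. The function name is hypothetical.

```python
def encode_ecdh_secret(eph_x: int, eph_y: int) -> bytes:
    """Encode the shared point as y || eph.x (33 bytes)."""
    # Prefix is 0x02 for an even y coordinate, 0x03 for an odd one.
    prefix = b"\x02" if eph_y % 2 == 0 else b"\x03"
    # Assumed: x is serialized as a fixed-width 32-byte big-endian integer.
    return prefix + eph_x.to_bytes(32, "big")
```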

### Handshake Implementation Considerations

Since a handshake may happen at any time and UDP packets may be reordered by networking
equipment, implementations must deal with certain subtleties regarding the handshake.

In general, implementations should keep a reference to all sent request packets until the
request either times out, is answered by the corresponding response packet or answered by
WHOAREYOU. If WHOAREYOU is received as the answer to a request, the request must be
re-sent as a handshake packet.

If an implementation supports sending concurrent requests, multiple responses may be
pending when WHOAREYOU is received, as in the following example:

A -> B FINDNODE
A -> B PING
A -> B TOPICQUERY
A <- B WHOAREYOU (nonce references PING)

When this happens, all buffered requests can be considered invalid (the remote end cannot
decrypt them) and the packet referenced by the WHOAREYOU `nonce` (in this example: PING)
must be re-sent as a handshake. When the response to the re-sent request is received, the
new session is established and other pending requests (example: FINDNODE, TOPICQUERY) may
be re-sent.

Note that WHOAREYOU is only ever valid as a response to a previously sent request. If
WHOAREYOU is received but no requests are pending, the handshake attempt can be ignored.
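
One way to track sent requests along these lines is sketched below. The `PendingRequests` and `Request` names are hypothetical, and timeouts are omitted for brevity. The table is keyed by packet `nonce`, so a WHOAREYOU can be matched to the request that solicited it and ignored when nothing matches.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Request:
    nonce: bytes  # nonce of the packet the request was sent in
    name: str     # e.g. "PING"; stands in for the full message


class PendingRequests:
    """Requests sent to one remote node that are still awaiting an answer."""

    def __init__(self) -> None:
        self._by_nonce: Dict[bytes, Request] = {}  # keeps send order

    def sent(self, request: Request) -> None:
        self._by_nonce[request.nonce] = request

    def answered(self, nonce: bytes) -> None:
        self._by_nonce.pop(nonce, None)

    def on_whoareyou(self, request_nonce: bytes
                     ) -> Optional[Tuple[Request, List[Request]]]:
        """Split pending requests into (re-send as handshake, re-send later).

        Returns None when the challenge matches no pending request, in which
        case the WHOAREYOU packet can be ignored.
        """
        challenged = self._by_nonce.pop(request_nonce, None)
        if challenged is None:
            return None
        # The remote end cannot decrypt the remaining requests either.
        deferred = list(self._by_nonce.values())
        self._by_nonce.clear()
        return challenged, deferred
```

The caller would re-send the first element as a handshake packet, register it again with `sent` under its new nonce, and re-send the deferred requests once the response arrives.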

Another important issue is the processing of message packets while a handshake is in
progress: consider the case where node A has sent a packet that B cannot decrypt, and B
has responded with WHOAREYOU.

A -> B FINDNODE
A <- B WHOAREYOU

Node B is now waiting for a handshake message packet to complete the new session, but
instead receives another ordinary message packet.

A -> B ORDINARY MESSAGE PACKET

In this case, implementations should respond with a new WHOAREYOU challenge referencing
the message packet.

### Session Cache

Nodes should store session keys for communication with other recently-seen nodes. Since
Expand Down Expand Up @@ -458,9 +523,9 @@ average time to registration. The radius estimation value can be shared with the
registration algorithm if the same topic is being registered and searched for.

[EIP-778]: ../enr.md
[encoding section]: ./discv5-wire.md#packet-encoding
[handshake packet]: ./discv5-wire.md#handshake-message-packet-flag--2
[WHOAREYOU]: ./discv5-wire.md#whoareyou-packet-flag--1
[identity scheme]: ../enr.md#record-structure
[handshake message packet]: ./discv5-wire.md#handshake-message-packet-flag--2
[WHOAREYOU packet]: ./discv5-wire.md#whoareyou-packet-flag--1
[PING]: ./discv5-wire.md#ping-request-0x01
[PONG]: ./discv5-wire.md#pong-response-0x02
[FINDNODE]: ./discv5-wire.md#findnode-request-0x03
20 changes: 10 additions & 10 deletions discv5/discv5-wire.md
@@ -19,9 +19,9 @@ Here we present the notation that is used throughout this document.
`aesctr_encrypt(key, iv, pt)`\
    is unauthenticated AES/CTR symmetric encryption with the given `key` and `iv`.\
    Size of `key` and `iv` is 16 bytes (AES-128).\
`aesgcm_encrypt(key, nonce, pt, ad)`\
    is AES-GCM encryption/authentication with the given `key`, `nonce` and additional\
    authenticated data `ad`. Size of `key` is 16 bytes (AES-128), size of `nonce` 12 bytes.
`aesgcm_encrypt(key, nonce, pt)`\
    is AES-GCM encryption/authentication with the given `key` and `nonce`.\
    Size of `key` is 16 bytes (AES-128), size of `nonce` 12 bytes.

## UDP Communication

@@ -73,10 +73,10 @@ message.
Header information is 'masked' using symmetric encryption in order to avoid static
identification of the protocol by firewalls.

packet = iv || masked-header || message
iv = uint128 -- random data unique to packet
masked-header = aesctr_encrypt(masking-key, iv, header)
packet = masking-iv || masked-header || message
masked-header = aesctr_encrypt(masking-key, masking-iv, header)
masking-key = dest-id[:16]
masking-iv = uint128 -- random data unique to packet

The `masked-header` contains the actual packet header, which starts with a fixed-size
`static-header`, followed by a variable-length `authdata` section (of size `authdata-size`).
@@ -100,7 +100,7 @@ In ordinary message packets and handshake message packets, the packet contains a
authenticated message after the authdata section. For WHOAREYOU packets, the `message` is
empty. Implementations must generate a unique `nonce` value for every packet.

message = aesgcm_encrypt(initiator-key, nonce, message-pt, header)
message = aesgcm_encrypt(initiator-key, nonce, message-pt)
message-pt = message-type || message-data

The `flag` field of the header identifies the kind of packet and determines the encoding
@@ -121,9 +121,9 @@ In WHOAREYOU packets, the `authdata` section contains information for the verifi
procedure. The `message` field of WHOAREYOU packets is always empty.

authdata = request-nonce || id-nonce || enr-seq
authdata-size = 52
authdata-size = 36
request-nonce = uint96 -- nonce of request packet that couldn't be decrypted
id-nonce = uint256 -- random bytes
id-nonce = uint128 -- random bytes
enr-seq = uint64 -- ENR sequence number of the requesting node
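
With the updated field sizes (a 16-byte `id-nonce`, giving an `authdata-size` of 36), the WHOAREYOU `authdata` layout can be sketched as follows. The function names are hypothetical, and big-endian encoding of `enr-seq` is an assumption that this excerpt does not spell out.

```python
import struct
from typing import Tuple

AUTHDATA_SIZE = 36  # 12 (request-nonce) + 16 (id-nonce) + 8 (enr-seq)


def encode_whoareyou_authdata(request_nonce: bytes, id_nonce: bytes,
                              enr_seq: int) -> bytes:
    if len(request_nonce) != 12:
        raise ValueError("request-nonce must be 12 bytes (uint96)")
    if len(id_nonce) != 16:
        raise ValueError("id-nonce must be 16 bytes (uint128)")
    # Assumed: enr-seq is a big-endian unsigned 64-bit integer.
    return request_nonce + id_nonce + struct.pack(">Q", enr_seq)


def decode_whoareyou_authdata(authdata: bytes) -> Tuple[bytes, bytes, int]:
    if len(authdata) != AUTHDATA_SIZE:
        raise ValueError("WHOAREYOU authdata must be exactly 36 bytes")
    request_nonce = authdata[:12]
    id_nonce = authdata[12:28]
    (enr_seq,) = struct.unpack(">Q", authdata[28:])
    return request_nonce, id_nonce, enr_seq
```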

![whoareyou packet layout](./img/whoareyou-packet-layout.png)
@@ -293,7 +293,7 @@ topic.
A collection of test vectors for this specification can be found at
[discv5 wire test vectors].

[handshake section]: ./discv5-theory.md#handshake
[handshake section]: ./discv5-theory.md#handshake-steps
[topic queue]: ./discv5-theory.md#topic-table
[theory section on tickets]: ./discv5-theory.md#tickets
[EIP-778]: ../enr.md