Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

discv5: protocol version v5.1 #157

Merged
merged 40 commits into from
Oct 7, 2020
Merged

discv5: protocol version v5.1 #157

merged 40 commits into from
Oct 7, 2020

Conversation

fjl
Copy link
Collaborator

@fjl fjl commented Jul 29, 2020

This change defines the first stable (non-draft) spec version of discv5.

Work in progress. Still needs:

Closes #152
Closes #141
Closes #135
Closes #131
Closes #117
Closes #83

@pipermerriam
Copy link
Member

Looking at the wire protocol document it doesn't appear to specify the computation of the checksum field. It does appear to be defined in #152 as crc64("discv5" || dest-id)

@pipermerriam
Copy link
Member

I'm realizing that the new documents don't mention the checksum and it is only present in the svg images that visualize the packet encoding.

@fjl
Copy link
Collaborator Author

fjl commented Aug 1, 2020

Yeah, sorry, I need to update the images. In a discussion a couple weeks ago, we realized that the checksum is silly because the obfuscation layer hides away the plaintext protocol identifier, and the recipient node won't be able to decrypt the packet header if it is sent to the wrong node.

@pipermerriam
Copy link
Member

@fjl are you referring to "Proposal 2" from #152 which uses aesctr_encrypt(masking-key, iv, plain-header) to obfuscate the header?

@fjl
Copy link
Collaborator Author

fjl commented Aug 3, 2020

Yes

@pipermerriam
Copy link
Member

pipermerriam commented Aug 5, 2020

I've generated the following test vectors that I'd be interested in validating since they represent my collective interpretation of the spec as documented in this PR, #152, as well as potentially some inference in areas where neither fully documents something yet.

# MessagePacket (flag=0)
initiator-key: 0x01010101010101010101010101010101
nonce: 0x020202020202020202020202
source-node-id: 0x0303030303030303030303030303030303030303030303030303030303030303
dest-node-id: 0x0404040404040404040404040404040404040404040404040404040404040404
message: PING: request-id=0x01  enr-sequence-number=0x00
auth-data: MessagePacket aesgcm-nonce= 0x050505050505050505050505

encoded-packet: 0x0aa2702c8135c1276a1c00c81e04f6cec58e496f3d2dd3283f34cbdeb97364e15e201821954a5ad0f20a16d9385f1d56eb847b9d8b936c09661ee331db73fd4166474b1a962a42522611e34e608ad5f8b619eeef6ad820edf74d35

# WhoAreYouPacket (flag=1)
initiator-key: 0x01010101010101010101010101010101
nonce: 0x020202020202020202020202
source-node-id: 0x0303030303030303030303030303030303030303030303030303030303030303
dest-node-id: 0x0404040404040404040404040404040404040404040404040404040404040404
message: PING: request-id=0x01  enr-sequence-number=0x00
auth-data: WhoAreYouPacket: request-nonce=0x050505050505050505050505  id-nonce=0x0606060606060606060606060606060606060606060606060606060606060606  enr-sequence-number=0x7

encoded-packet: 0xdb05f3d612ef0bf450a932aab441ab0b794619e0818f750601eeeb25017f1be55e187c749f21dab55d3031ecf7a0395adddab0c85508c14a92cd3ecefb2536aec95c618998c914a8bf7f9998bce735cc305ae648e825eb28554b6abd79193724a72a043a3dabc6cdecb49ba9e35251522611e35ecf69051276c32e0b149b8fee2c6729

# HandshakePacket (flag-2) (WITHOUT ENR)
initiator-key: 0x01010101010101010101010101010101
nonce: 0x020202020202020202020202
source-node-id: 0x0303030303030303030303030303030303030303030303030303030303030303
dest-node-id: 0x0404040404040404040404040404040404040404040404040404040404040404
message: PING: request-id=0x01  enr-sequence-number=0x00
auth-data: HandshakePacket: 
    auth-data-head=(version=0x01 signature_size=0x40 ephemeral_key_size=0x21)  
    id-signature=0x05050505050505050505050505050505050505050505050505050505050505050505050505050505050505050505050505050505050505050505050505050505  
    ephemeral-public-key=0x060606060606060606060606060606060606060606060606060606060606060606  
    record=0x

encoded-packet:  0xba4358d5b59616eae9bba87bdf70363fdef69b3ca7006d54713a1bc41c72c465414fc0e2672960eb14b6094c1a38a7d1bd2f2dd6021b4050963d07eb88b6e3c4b8b8c3342690f6d53fe1f09682fb139ad46f5fa2734c2e4eecd835ca726d95e80e5d73d90d022505bc8353749eda90128b32ff00e08687ec40d03fb818736041b7e49d295b4c55f351be13ed6dc5bb5959f5c80e7931c1565e2e925a2c9ea2522611e35e7c358bfc3987df51a2770f0c7b2f7f

# HandshakePacket (flag-2) (WITH ENR)
initiator-key: 0x01010101010101010101010101010101
nonce: 0x020202020202020202020202
source-node-id: 0x0303030303030303030303030303030303030303030303030303030303030303
dest-node-id: 0x0404040404040404040404040404040404040404040404040404040404040404
message: PING: request-id=0x01  enr-sequence-number=0x00
auth-data: HandshakePacket: 
    auth-data-head=(version=0x01 signature_size=0x40 ephemeral_key_size=0x21)  
    id-signature=0x05050505050505050505050505050505050505050505050505050505050505050505050505050505050505050505050505050505050505050505050505050505  
    ephemeral-public-key=0x060606060606060606060606060606060606060606060606060606060606060606  
    record=0xf88bb840242198bffec59a15dd134b44a125e39a3f0330aa481ca7359f17dd1889e7427a3f8974a0c4e64cd518619e12ffec0ece01f85e7807d3aa62d07443b847b7566635826964827634826970848d3ca05389736563703235366b31a103e3449d6a29899c3b29b2c4bc27b6407070d7f33b5524fd2e7f1d0d3dfa9bc68483746370829ceb8375647082851d

encoded-packet:  0x2e47005ae2c0c50eea1f32fd86029a128e26ad45a6e1b5d0dc58759528d237b0ba5bf7d24087e3add447709db01dd1cf53304a64f26e02d6a1dd895932bc2621e60eef1301be21851a04d3dd2b9398d290573cb2baceafe8e5e438953c787832e8295642ed61c676122e98c15e1fe7cb4319ffc72125c63c5bfb760ea945c456d1a919239078d632585ae192184f6e7ddef806d9d2e8530803c07a18597d4631d675ec7815939aea0194d88c608cf4fc253eacb43d8d5cb71a5c722cd835050b7cc606e4d0e9e8b48cb9fd72c48a39c43c268e2307644f13743ba1ce404b68b3588e5b8eff48ae34175e15d5fbcd1ecb0a572abbea4310986a8176b01da5be2a9b42f40740eebd965dd31ec4a0f6acb3e4f9ff935352165ea9fe24dd4357d628df45ce79884255f83e38489c522611e3d9cf00dc787a383be694bef0bb69b456

Note: The ENR record in the last example was randomly generated. If these get elevated to a more official status, that record should re-generated with appropriate test vector data.

discv5/discv5-wire.md Outdated Show resolved Hide resolved
discv5/discv5-wire.md Outdated Show resolved Hide resolved
discv5/discv5-wire.md Outdated Show resolved Hide resolved
@pipermerriam
Copy link
Member

pipermerriam commented Aug 25, 2020

Consider this case. There is a handshake in progress between Initiator and Recipient. The Recipient has sent the WhoAreYouPacket and is waiting for the HandshakePacket to finalize the handshake.

The Recipient receives a MessagePacket.

At this stage there are two possibilities.

  • Case A
    • The MessagePacket arrived out of order. A HandshakePacket will arrive shortly, after which the MessagePacket can be decoded successfully with the derived session keys.
  • Case B:
    • The MessagePacket is actually the initiation of a new handshake (and there will not be a HandshakePacket arriving to finalize the current in-progress handshake).

The Recipient must make a choice.

  • If the situation is case A they should buffer the packet and decode it in a moment once session keys have been derived.
  • If the situation is case B they should initiate a new session and respond with a WhoAreYouPacket

Since the Recipient doesn't know which case it is, I believe we should encourage implementations to assume case B and to always respond with a WhoAreYouPacket when encountering this situation.

  • If the situation is actually case A, then the WhoAreYouPacket should simply be ignore since the Initiator would already consider the handshake completed and a WhoAreYouPacket will not incidentally trigger initiation of a new handshake.
  • If the situation if actually case B, then the Recipient will have responded correctly, allowing the new handshake to proceed, with the initial handshake eventually timing-out/expiring.

This is a subtle enough case that it seems potentially worth documenting in some sort of implementers guide since an implementation that chooses the other route will result in both Initiator and Recipient having incomplete in-progress handshakes that will have to wait to be timed-out/expired before either is likely to initiate again.

@fjl
Copy link
Collaborator Author

fjl commented Aug 31, 2020

I'm finally updating the spec with the header masking, and two new ideas came up:

  • In Lighthouse replies to undersized RANDOM PACKETS sigp/lighthouse#1568, it came to light that we need to have a minimum packet size for message packets. I was aware of this issue but didn't think about it lately, mostly because I didn't have a good solution.

    Ultimately, the issue is that WHOAREYOU packets are too large in comparison to message packets.
    This is because they need to carry 32 bytes of uniqueid-nonce random data back to the initiator.
    But now, with the header masking which we are adding, we already send 16 bytes of random iv data.
    This means we might be able to shrink the id-nonce by at least 16 bytes and redefine the ID-nonce hash formula to be sha256("discovery-id-nonce" || iv || authdata || eph-pubkey). Is this too risky? Need to figure it out!

  • Regarding the issue @pipermerriam described above, I think it might be a good idea to distinguish requests from responses via the flag field. One of the underlying assumptions has always been that responses (i.e. PONG, NODES, ...) should not initiate a new session. This assumption is in place because the handshake works by resending the request when WHOAREYOU is received as the response. If WHOAREYOU is received as the response to a response, it should be discarded anyway.

    If we add another flag value for responses, the recipient would know that it shouldn't respond with WHOAREYOU even if it can't decrypt the message for some reason.

@fjl
Copy link
Collaborator Author

fjl commented Aug 31, 2020

Damn, it looks like I wiped out the definitions of id-nonce-input and id-signature when moving the handshake to theory.md. Will put them back!

@fjl
Copy link
Collaborator Author

fjl commented Aug 31, 2020

Another thing we could still change: feed the dest-id into id-signature.

I'm suggesting this because I got mildly concerned that the removal of the encrypted auth-response might create an MITM risk. In the 5.0 handshake, we encrypted the id-signature with a separate session secret, but this was removed and the signature is now sent clear text (technically within the masked-header container, but that's no better than clear text).

Consider communication A -> X -> B, with X being the MITM node.

Preconditions:

  • A has node record of B, i.e. it knows B's ID
  • X also has the node record of B, and is positioned to intercept A's packets

Communication flow:

  • A sends a message packet (or a random packet).
  • Message packet arrives at X, and it can unmask the header. X learns about A's ID here.
  • X forwards the message packet to B, changing the return address.
  • B responds to X with WHOAREYOU containing id-nonce.
  • X forwards WHOAREYOU to A.
  • A derives session keys and responds to X with handshake packet containing id-signature + eph-pubkey.
  • X forwards the handshake packet. X can see signature and eph-pubkey, but it can't change eph-pubkey because it would invalidate id-signature. X can't decrypt message either because it doesn't have any of the private keys.
  • B verifies id-signature and derives session keys. B can now decrypt the message.

So, no MITM possible.

But can X replay the id-signature somehow to authenticate as A to a different node C? Technically not, because X can't influence the id-nonce sent by node C. But it feels too shaky with all those malleable parts. I think it might be safer to put B's ID into the signature to ensure this can't happen, if only to make the id-signature an even more specific statement.

@fjl fjl force-pushed the discv5-v1-update branch from 0675791 to 3b03774 Compare August 31, 2020 13:03
@zilm13
Copy link

zilm13 commented Aug 31, 2020

@fjl Do you have Geth fork with all in it?

@fjl
Copy link
Collaborator Author

fjl commented Sep 30, 2020

I've just pushed the last spec update. Thank you so much for all your amazing feedback on this pull request. I have tried to address all of the issues you guys have raised.

Here's the list of changes in the last couple commits:

  • The version field of the handshake message packet is gone. Instead, the protocol-id is now
    6 bytes (it was 8 before) and the two bytes after protocol-id are now the version.
  • id-nonce size is now 16 bytes
  • The definitions for id-signature and the key derivation have been changed to include
    the masking-iv and some other data. Please check the definitions against your implementation
    to see what you need to change. I will publish new test vectors for this part very soon.
  • The 'authenticated data' of AES/GCM is not used anymore.
  • The minimum packet size is now 63 bytes because the size of WHOAREYOU is
    63 bytes and all other packets are larger by definition.

We need to stop changing the spec now and focus on interop.

TODO: update images

@fjl fjl changed the title discv5: WIP protocol version v5.1 discv5: protocol version v5.1 Sep 30, 2020
@fjl fjl added this to the Discovery v5 milestone Oct 1, 2020
discv5/discv5-wire.md Outdated Show resolved Hide resolved
@fjl
Copy link
Collaborator Author

fjl commented Oct 2, 2020

Seeing all this confusion about the challenge fields and the concerns about packet header authentication, I have decided to use the simplest solution possible: just use the entire unmasked packet header as the challenge data for both id signature and KDF, and also use it as authenticated data in the AES/GCM step.

From a spec perspective, this change is very nice and I wonder why we didn't do it like that earlier.

From an implementation point of view, this change sucks, at least for my own implementation. Since the data dependencies have changed so much, I had to change the ordering of some of the encoding steps in order to be able to have the encoded headers and masking-iv available when encrypting the message. But it's implemented in go-ethereum now, and the test vectors are also updated.

BTW, I noticed that there was a problem with my HKDF usage, my own tests worked on Go 1.15 but not on earlier versions. This is now corrected and I hope you will be able to match the new test vectors.

@kdeme
Copy link

kdeme commented Oct 2, 2020

BTW, I noticed that there was a problem with my HKDF usage, my own tests worked on Go 1.15 but not on earlier versions. This is now corrected and I hope you will be able to match the new test vectors.

I can confirm that I can now decode your latest provided test vectors 👍

@fjl
Copy link
Collaborator Author

fjl commented Oct 2, 2020

Very nice! All of them?

If we get another implementation to agree, (and when I'm done updating the images) the PR can be merged!

@pipermerriam
Copy link
Member

I'm now passing all of the test vectors in my implementation.

@Nashatyrev
Copy link
Member

I'm passing them too with JVM implementation

@fjl fjl merged commit 56a498e into ethereum:master Oct 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
7 participants