Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify onionmessage decryption #1179

Conversation

rustyrussell
Copy link
Collaborator

I recently re-implemented this to get our code production ready, and I found the spec unclear and even misleading. This series seeks to rework it to be clearer and more coherent.

Most pointedly, it's now clear that blinding applies to the onion, not the internal encrypted blob: the spec was confused about this at various points. The onion decoding is now specified exactly, and is general. The old references to "realms" (from legacy onions) is removed.

Copy link

@thomash-acinq thomash-acinq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blinding applies to the onion, not the internal encrypted blob.

That's not what's implemented in eclair and phoenix, and it contradicts the Route Blinding Requirements section.
I guess the error comes from the fact that there are two ways to use the blinding:

  • For the onion itself we use $B_i = HMAC256(\text{"blinded\_node\_id"}, ss_i) * N_i$
  • For the encrypted blob inside the onion we use $rho_i = HMAC256(\text{"rho"}, ss_i)$

1. an initial introduction point (`first_node_id`)
2. an initial tweak to modify the first node_id to decrypt the onion (`blinding`)
3. a series of tweaked node ids (`path.blinded_node_id`)
4. a series binary blobs encrypted to the real node ids (`path.encrypted_recipient_data`)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incorrect, the blob is not encrypted to the real node id, but with a key that is derived from both the blinding and the node's private key:
$ss_i = SHA256(e_i * N_i) = SHA256(k_i * E_i)$ (ECDH shared secret known only by $N_r$ and $N_i$)
$rho_i = HMAC256(\text{"rho"}, ss_i)$ (key used to encrypt the payload for $N_i$ by $N_r$)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is what we have in LDK as well, although I'm not too familiar with the notation being used. E.g., for ss_0, we compute ecdh(N_0, session_priv), which I think is equivalent to SHA256(e_0 * N_0) as in the above comment. Eclair is the most interop tested with us so far so pretty sure we're on the same page.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incorrect, the blob is not encrypted to the real node id, but with a key that is derived from both the blinding and the node's private key: ssi=SHA256(ei∗Ni)=SHA256(ki∗Ei) (ECDH shared secret known only by Nr and Ni) rhoi=HMAC256("rho",ssi) (key used to encrypt the payload for Ni by Nr)

Well, yes. I should spell that out, indeed you use that same blinding as the ECDH ephemeral key.

What you don't do, is ssi=SHA256(ei*Bi). That would be a valid construction (which I would call "encrypting to the blinded node pubkey"). i.e. you don't use blinding as blinding at all here...

i.e. the onion is encrypted to the blinded node id, the inner encrypted_recipient_data is not. This is important!

@@ -290,13 +295,11 @@ The reader:
- If `encrypted_recipient_data` is present:
- If `blinding_point` is set in the incoming `update_add_htlc`:
- MUST return an error if `current_blinding_point` is present.
- MUST use that `blinding_point` as the blinding point for decryption.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you remove this? blinding_point needs to be used as $E_i$ for the Route Blinding Requirements section.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but it's not used as a blinding point. I changed this and the next one:

    - If `blinding_point` is set in the incoming `update_add_htlc`:
      - MUST return an error if `current_blinding_point` is present.
	  - MUST use `blinding_point` as $`E_i`$
    - Otherwise:
      - MUST return an error if `current_blinding_point` is not present.
      - MUST use `current_blinding_point` as $`E_i`$
      - SHOULD add a random delay before returning errors.
    - MUST return an error if `encrypted_recipient_data` does not decrypt to a valid `encrypted_data_tlv` as described in [Route Blinding](#route-blinding).

?

- Otherwise:
- MUST return an error if `current_blinding_point` is not present.
- MUST use that `current_blinding_point` as the blinding point for decryption.
- MUST use that `current_blinding_point` as `E_i` to derive the following blinding point.
Copy link

@thomash-acinq thomash-acinq Jul 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what you mean by "following blinding point" here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong, it's used both to derive the ss and to derive the next blinding point. Will fix...

Comment on lines 907 to 914
- if `blinding` is specified:
- Calculate the `blinding_ss` as ECDH(`blinding`, `node-privkey`)
- Tweak `public_key` by multiplying by $`HMAC256(\text{"blinded\_node\_id}", blinding_ss)`$

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is confusing. We need to tweak the private key. And then we need to use the tweaked key instead of the public one.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the alternative construction, where we tweak the onion ephemeral key (which is equivalent). (CLN needs to do this because we have an HSM and we didn't want to teach it to tweak, but it's also easier in the text).

We could specify both, but we do note the private key method alternative in the rationale?

- $`E_0 = e_0 \cdot G`$
- For every node in the route:
- let $`N_i = k_i * G`$ be the `node_id` ($`k_i`$ is $`N_i`$'s private key)
- $`ss_i = SHA256(e_i * N_i) = SHA256(k_i * E_i)$` (ECDH shared secret known only by $`N_r`$ and $`N_i`$)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- $`ss_i = SHA256(e_i * N_i) = SHA256(k_i * E_i)$` (ECDH shared secret known only by $`N_r`$ and $`N_i`$)
- $ss_i = SHA256(e_i * N_i) = SHA256(k_i * E_i)$ (ECDH shared secret known only by $N_r$ and $N_i$)

Markdown rendering is broken because of a misplaced `, we should probably remove all the ` when we already have $.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, I didn't even know that worked. But if we take out ` the formatting changes from monospaced to normal, which would be confusing unless we do it everywhere.

Fixed for now...

Revisiting our implementation recently, I found it difficult to follow.

In particular:
1. The onion is encrypted to the blinded node id, including (slightly redundantly) the introduction point.
2. The encrypted payload is encrypetd to the *unblinded* node id.

Make this clear, and give an example.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There's currently a *description* of how to decrypt an onion, and some requirements
in forwarding.  But it also applies to onion messages, so:

1. Turn the description into actual enumerated requirements.
2. Ensure the description covers both payload and messaging onions.
3. Leave the actual handling of the extracted payload (payment vs messaging onion) to those specific sections (e.g. reporting failure)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
…sages and payments.

Simply tie the sections together, and put a section on how to handle blinding when
decrypting onions.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We use blinding for the onion, not the `encrypted_recipient_data`.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. The writer creates a `blinded_path` object, so twealk requirements to
   refer directly to how to populate those fields.
2. This isn't just for payments, but also onion messages, so make language
   clearer.
3. There are two readers: the sender, who uses the blinded path to create
   an onion, and the other nodes who decrypt the encrypted_recipient_data.
   Make separate requirements for each.
4. Put the `encrypted_data_tlv` definition into its own section, even
   though requirements are in onion messages / payload sections.  Some
   rationale remains.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This was from the legacy onion, and is no longer present.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
You can't actually generate encrypted_data_tlv until you've created
the blinding points for the shared secrets (needed for both tweaking
the outer onion and decryting the `encrypted_recipient_data`).

It makes the explanation more complex, but the previous one glossed
over too much.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
@rustyrussell rustyrussell force-pushed the guilt/clarify-onionmessage-decryption branch from d3e24c6 to 465e470 Compare July 10, 2024 00:52
@rustyrussell
Copy link
Collaborator Author

OK, so I rolled in Thomas' feedback, which lead me to think that we should simply avoid calling the point "blinding" (as that's not all it's used for). Example commit which changes it to "path_key" at the end?

I think this is clearer. WDYT?

…ng blinding

We are handed an ephemeral key (E_i) to derive a shared secret, which derives
both the blinding tweak for the onion, *and* the key to decrypt the encrypted_recipient_data.

But we call it "blinding", which is confusing!  How about we call it "path_key"?

1. update_add_htlc_tlv's "blinding_point.blinding" -> "blinded_path.path_key"
2. payload's "current_blinding_point" -> "current_path_key"
3. blinded_path's "blinding" -> "first_path_key".
4. encrypted_data_tlv's "next_blinding_override.blinding" -> "next_path_key_override.path_key"

This sweep found other changes:
1. Writer of the TLV `payload`: simply refer here to Route Blinding requirements which
   says how to use the `blinded_path` if we have one (and gets it right on the two
   ways to use current_blinding_point).
2. Refer concretely to `blinded_path` (a type) rather than "blinded route" (a concept).
3. Use the term `encrypted_recipient_data` everywhere for consistency, and `encrypted_data_tlv` once it's decrypted.
4. Note explictly that you can't use the "next_path_key_override" if the prior node doesn't support route blinding.
5. Header "Blinding Ephemeral Keys" -> "Blinding Ephemeral Onion Keys" to avoid confusion with blinded paths!
6. When using `reply_path` for onion messages, simply refer to the Route Blinding requirements.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
@rustyrussell rustyrussell force-pushed the guilt/clarify-onionmessage-decryption branch from 78fe75f to 2646637 Compare July 10, 2024 02:50
Copy link
Collaborator

@t-bast t-bast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I stopped reviewing this half-way, when I reached the incorrect statement that Alice encrypts the onion to (Bob', Carol', Dave'). As detailed in my comment below, this is not how it works, and leads me to think that this updated version is actually less clear than the previous one.

But I do see where the confusion comes from: there is a fundamental difference between onion messages and payments, because onion messages combine two blinded paths while payments only use the recipient path. That's why I think the way it is specified currently on master makes more sense: how you decrypt the onion only depends on whether you received a path key or not (if you receive a path key, the onion is encrypted for your blinded node_id, otherwise it's encrypted to your normal node_id). Then, once you have a decrypted onion, if it contains an encrypted_data_tlv, you will use the path key you must have received somehow to derive the shared secret to decrypt it. That is a pretty simple rule to follow, so the reader requirements should be very straightforward. We may be missing some detailed writer requirements however, which is where there are different cases for payments and onion messages.

I think we can create a smaller patch to clear up those confusions than what this PR does, I'd like to try a different approach for that.

I like the renaming of blinding points to path keys though!

@@ -2040,8 +2040,10 @@ A receiving node:
- if other `id` violations occur:
- MAY send a `warning` and close the connection, or send an
`error` and fail the channel.
- if `blinding_point` is provided:
- MUST use the corresponding blinded private key to decrypt the `onion_routing_packet` (see [Route Blinding](04-onion-routing.md#route-blinding))
- MUST decrypt `onion_routing_packet` as specified in [Onion Decryption](04-onion-routing.md#onion-decryption) using `payment_hash` as `associated_data` (and `path_key` if specified).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it's a bit confusing to how path_key is mentioned here, it feels like we should include it in the associated_data (which is not the case), what about:

Suggested change
- MUST decrypt `onion_routing_packet` as specified in [Onion Decryption](04-onion-routing.md#onion-decryption) using `payment_hash` as `associated_data` (and `path_key` if specified).
- MUST decrypt `onion_routing_packet` as specified in [Onion Decryption](04-onion-routing.md#onion-decryption):
- MUST use `path_key` if specified to derive the onion decryption key.
- MUST use `payment_hash` as `associated_data`.

I've done that change in t-bast@c073ef8 to simplify updates.

- MUST decrypt `onion_routing_packet` as specified in [Onion Decryption](04-onion-routing.md#onion-decryption) using `payment_hash` as `associated_data` (and `path_key` if specified).
- If that fails, or the payload is not a valid `payload` TLV:
- MUST report the failure to the origin node as described in [Returning Errors](04-onion-routing.md#returning-errors)
- MUST follow the requirements for processing the payload under [Failure Messages](04-onion-routing.md#failure-messages)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: indenting is wrong:

Suggested change
- MUST follow the requirements for processing the payload under [Failure Messages](04-onion-routing.md#failure-messages)
- MUST follow the requirements for processing the payload under [Failure Messages](04-onion-routing.md#failure-messages)

Comment on lines +485 to +486
Alice encrypts an onion to Bob', Carol', Dave' and gives it to Bob
with the `first_path_key`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When used for payments, Alice doesn't encrypt the onion to Bob', she encrypts it to Bob and includes the first_path_key in the onion. That's what allows nodes between Alice and Bob to be unaware that a blinded path is used downstream. If the onion was encrypted to Bob', Bob would have no way of obtaining the path_key to decrypt it.

Alice uses Bob' only when she combines a blinded path from herself to Bob to the blinded path from Bob to Dave (which is what is done for onion messages).

In both cases, Bob only uses that path key to decrypt the encrypted_data_tlv: it's only nodes that are not the introduction node that use the path key to decrypt the onion.

I clarified that in t-bast@c073ef8

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree on the details, just not the framing.

I think of the broad case which I'm describing here as "encrypt to the entire blinded chain", with a special exception for the payment case.

This is spelled out clearly in the new requirements (which I just noticed did not replace blinding so uses the old terminology):

The reader of the `blinded_path`:
- MUST create its own onion to reach the `first_node_id`
- For the first entry in `path`:
  - if it is sending a payment:
    - MAY encrypt the first blinded path onion to `first_node_id` and include `blinding` as `current_blinding_point`
  - if it does not do that:
    - MUST encrypt the first blinded path onion to the first `blinded_node_id`.
    - MUST set `next_blinding_override` in the prior onion payload to `blinding`.
  - MUST include the first `path` `encrypted_recipient_data` in each onion payload within the blinded path.

And in the rationale:

Note that there are two ways for the sender to reach the introduction
point: one is to create a normal (unblinded) payment, and place the
initial blinding point in `current_path_key` along with the
`encrypted_recipient_data` in the onion payload for the introduction point to
start the blinded path. The second way (which is the only way for
onion messages) is to create a blinded path to
the introduction point, set `next_path_key_override` inside the
`encrypted_data_tlv` on the hop prior to the introduction point to the
`first_path_key`, so it is sent to the introduction node.  However, this
only works if that prior node supports blinded paths.

@t-bast
Copy link
Collaborator

t-bast commented Jul 10, 2024

I think that one of the things that makes this PR hard to review is that 5983830 and 570f6fa are unrelated to route blinding and could be extracted to its own PR. It's easier to review on its own and we should reach agreement quickly on it and merge it as a first step.

@rustyrussell
Copy link
Collaborator Author

I think that one of the things that makes this PR hard to review is that 5983830 and 570f6fa are unrelated to route blinding and could be extracted to its own PR. It's easier to review on its own and we should reach agreement quickly on it and merge it as a first step.

Agreed. It's only when I started editing I found these.

I will open a new PR for those, then rebase on that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants