Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

commit_sig, revoke_and_ack ordering in channel reestablish flow #794

Closed
Crypt-iQ opened this issue Aug 11, 2020 · 19 comments · Fixed by #810
Closed

commit_sig, revoke_and_ack ordering in channel reestablish flow #794

Crypt-iQ opened this issue Aug 11, 2020 · 19 comments · Fixed by #810

Comments

@Crypt-iQ
Copy link
Contributor

If we have the following state transitions:

A            B
<---add-----
----add---->
<---sig-----
----rev----x
----sig----x

In this scenario, Bob's node dies and the revocation and signature don't reach him.
Since Alice just revoked her last commitment, the next sig also covers the add that Bob sent, in addition to the one Alice sent. So the commit_sig has two htlc signatures.

If, on restart, Alice sends the wrong order:

----add-->
----sig-->
----rev-->

then Bob will interpret the commit_sig that Alice sends as only having one sig, when it has two. Bob should then force close the channel.

Thus, the ordering of these two messages matters on startup. We can't hardcode a specific ordering as the "incorrect" ordering could always happen. This came up because in lnd, we always send the revocation first. This should be made explicit in the spec, but thought I'd open an issue up first.

@t-bast
Copy link
Collaborator

t-bast commented Aug 12, 2020

Can you clarify what part of the spec (if any) you disagree with here?

I also don't see why Alice would send what you suggest.
This will depend on what commitment/revocation numbers Alice and Bob provide in their channel_reestablish.
We can't discuss what messages should be sent afterwards without detailing the contents of channel_reestablish, and that should indicate exactly what messages need to be re-sent.

@Crypt-iQ Crypt-iQ reopened this Aug 12, 2020
@pm47
Copy link
Collaborator

pm47 commented Aug 12, 2020 via email

@Crypt-iQ
Copy link
Contributor Author

If on restart Alice sends add-sig-rev with 2 htlcs included in the sig, then it is simply a protocol violation, there is no ambiguity there. Le mer. 12 août 2020 à 12:32, Eugene notifications@github.com a écrit :

Reopened #794 <#794>. — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#794 (comment)>, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAPPFPWXDFJ24JUMJPLMVL3SAJVVFANCNFSM4P3PBJGA .

Yes, my point is that if one side owes two updates, they must be sent out in order on reconnect. That's what should be explicit

@Crypt-iQ
Copy link
Contributor Author

Crypt-iQ commented Sep 4, 2020

If we have the following state transitions in a channel (starting from the beginning):

A            B
<---add-----
----add---->
<---sig-----
----rev----x
----sig----x

If these two messages do not reach Bob, then on reconnect:

  • Alice sends reestablish with next_commitment_number of 2, next_revocation_number of 1
  • Bob sends reestablish with next_commitment_number of 1, next_revocation_number of 1

Alice then owes Bob

  • one update_add_htlc message
  • one revoke_and_ack message
  • one commit_sig message

However, what matters is that Alice send these updates in the correct order on startup. The order in which she sent them before going down.

There are two ways that Alice could send these three messages that could cause confusion:
The correct way:

----add-->
----rev-->
----sig-->

The wrong way:

----add-->
----sig-->
----rev-->

If Alice sends the revoke_and_ack before commit_sig like she did initially, there will be no problems. In this scenario, Bob will correctly interpret the ordering of revoke_and_ack and commit_sig and recognize that the commit_sig should have signatures for two htlcs. However, if commit_sig comes first and has two signatures on it, Bob will interpret this as invalid as he can only accept a signature covering the htlc he sent if he receives a revoke_and_ack first. Does this clarify why Alice needs to remember the ordering she sent them in?

@Crypt-iQ
Copy link
Contributor Author

Hey is there anything I can help to move discussion on this?

@t-bast
Copy link
Collaborator

t-bast commented Sep 23, 2020

That sounds like a good candidate for discussion during the next spec meeting if you want more eyes on it.
Can you attend? It's next monday at 8pm UTC on the #lightning-dev IRC channel.
I won't be able to attend this instance but there should be many other implementers.

@Crypt-iQ
Copy link
Contributor Author

I can attend, I'm not sure about a best possible path to fix it. I think others may have had this issue come up perhaps in rust-lightning or c-lightning?

@Crypt-iQ
Copy link
Contributor Author

Crypt-iQ commented Sep 29, 2020

I forgot when the meeting was, when is the next one? Sorry about that.

@ysangkok
Copy link
Contributor

@Crypt-iQ The meeting is every two weeks as you can infer from the dates of the minutes.

@t-bast
Copy link
Collaborator

t-bast commented Sep 30, 2020

There's also a tracking issue here with the meeting agenda.

@t-bast
Copy link
Collaborator

t-bast commented Oct 12, 2020

If Alice sends the revoke_and_ack before commit_sig like she did initially, there will be no problems. In this scenario, Bob will correctly interpret the ordering of revoke_and_ack and commit_sig and recognize that the commit_sig should have signatures for two htlcs. However, if commit_sig comes first and has two signatures on it, Bob will interpret this as invalid as he can only accept a signature covering the htlc he sent if he receives a revoke_and_ack first. Does this clarify why Alice needs to remember the ordering she sent them in?

At first glance, I do agree with that statement.
Alice does have to re-send messages in the order you mention.

To be sure I understand, your whole point on this issue is that the spec should make the order of messages sent when reconnecting clearer, right?

@Crypt-iQ
Copy link
Contributor Author

Yes since there is both a commit_sig and revoke_and_ack message that Alice must send, the spec should clarify that the order does matter. I believe the order should be the original order before reconnection.

@ysangkok
Copy link
Contributor

@Crypt-iQ What is your IRC nick, are you going to be at the meeting in a few hours?

@Crypt-iQ
Copy link
Contributor Author

@ysangkok ill be able to join - I don't have an irc nick so it will probably be my gh handle with some numbers.

@rustyrussell
Copy link
Collaborator

Thanks for this!

The problem is that we generally treat "receive sig, send revocation" as an atomic operation. By acknowledging that we've received the sig, but not immediately sending the revocation, we can break this and Badness Happens.

The answer is indeed, to send the revocation first for this case. But this is not the only case

c-lightning indeed remembers what it sent before:

	/* We have to re-send in the same order we sent originally:
	 * revoke_and_ack (usually) alters our next commitment. */
	if (retransmit_revoke_and_ack && !peer->last_was_revoke)
		resend_revoke(peer);

	if (retransmit_commitment_signed)
		resend_commitment(peer, peer->last_sent_commit);

	/* This covers the case where we sent revoke after commit. */
	if (retransmit_revoke_and_ack && peer->last_was_revoke)
		resend_revoke(peer);

@t-bast
Copy link
Collaborator

t-bast commented Oct 13, 2020

I added a test in eclair for exactly that scenario (ACINQ/eclair@a7ec104).

First of all, just to make sure we start on the right foot, I think the next_revocation_number you mention are wrong (or it's eclair who's wrong). Let me know if that was a typo or if it is really lnd's behavior.

It should be:

  • Alice sends reestablish with next_commitment_number of 2, next_revocation_number of 0
  • Bob sends reestablish with next_commitment_number of 1, next_revocation_number of 0

I do agree with the conclusion though: revoke_and_ack needs to be sent before commit_sig.
We can choose to replay the update_add_htlc either before or after the revoke_and_ack, this one shouldn't have any impact.

There is already a section in Bolt 2 that mentions this:

if next_revocation_number is equal to the commitment number of the last revoke_and_ack the receiving node sent, AND the receiving node hasn't already received a closing_signed:
    - MUST re-send the revoke_and_ack.

It feels to me that we "simply" need to enrich this with an additional:

    - MUST not send any commit_sig message before it has re-sent the revoke_and_ack

WDYT?

@Crypt-iQ
Copy link
Contributor Author

Yes @rustyrussell I should have outlined the other case - I can't really parse the C code but it looks like what I'm describing. @t-bast there is another case to handle where the commit_sig is sent before revoke_and_ack and so the above suggestion would not work. Also I was wrong, next_revocation_number should be 0 in my example above.

Another scenario (starting from the beginning):

A            B
<---add-----
----add---->
<---sig-----
----sig----x
----rev----x

Here Alice sends a commit_sig with just one signature and then replies with the revoke_and_ack.
If these two messages do not reach Bob, then on reconnect:

  • Alice sends reestablish with next_commitment_number of 2, next_revocation_number of 0
  • Bob sends reestablish with next_commitment_number of 1, next_revocation_number of 0

However, Alice cannot send the revoke_and_ack here before the commit_sig or else Bob should fail the channel as the commit_sig does not contain two signatures.

@t-bast
Copy link
Collaborator

t-bast commented Oct 15, 2020

Also I was wrong, next_revocation_number should be 0 in my example above.

Great, that's a relief :)

However, Alice cannot send the revoke_and_ack here before the commit_sig or else Bob should fail the channel as the commit_sig does not contain two signatures.

I updated my test in ACINQ/eclair@77ff04b to repro this modified scenario. You are right, in that case Alice needs to re-transmit:

----add---->
----sig----> (with only one signature, for her HTLC)
----rev---->

And in that case everything works fine.
But that indeed means that my suggestion to require revoke_and_ack to always be sent before commit_sig isn't the right recommendation 😅, it's a bit harder than that.

I think the right behaviour is to view your outgoing messages as a queue, that you persist on disk.
You may remove messages from that queue once you know they have been committed on both sides.
Once reconnection, you send messages in the same order as what you initially sent, starting from a point that's fully determined by the content of channel_reestablish (we need to expand the details a bit more and give concrete examples to make it easier to grasp). WDYT?

@Crypt-iQ
Copy link
Contributor Author

@t-bast That sounds like the right behavior - if we owe any updates based on the interpretation of channel_reestablish, we need to send them out again in their queue order.

t-bast added a commit that referenced this issue Oct 23, 2020
The existing requirements were not specifying the case where both a
`commitment_signed` and `revoke_and_ack` need to be retransmitted.

This is an important case to specify because if the relative order is not
preserved, the channel will close.

Fixes #794
t-bast added a commit that referenced this issue Nov 23, 2020
The existing requirements were not specifying the case where both a
`commitment_signed` and `revoke_and_ack` need to be retransmitted.

This is an important case to specify because if the relative order is not
preserved, the channel will close.

Fixes #794
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants
@ysangkok @rustyrussell @pm47 @Crypt-iQ @t-bast and others