-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
htlcswitch+lnwallet+channeldb: invalid sig fix #3872
htlcswitch+lnwallet+channeldb: invalid sig fix #3872
Conversation
a422d10
to
88163f4
Compare
fb66fd1
to
012a488
Compare
When restoring an htlc fulfill update from disk, the payment hash wasn't restored previously.
c5d8cbb
to
0e8a372
Compare
0e8a372
to
5211a31
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great PR! I definitely think this is a nice way of solving this :)
My main concern is that we now restore the update logs from several sources: the local/remote commit, the pending remote commit and the dangling remote updates. Ideally we would reduce this, but that seems hard while still being backwards compatible, so this seems like the way to do it. We should just be careful that updates won't be restored from more that one of the sources, as that could lead to some subtle bugs.
lnwallet/channel.go
Outdated
localCommitmentHeight uint64) error { | ||
|
||
if len(danglingRemoteUpdates) == 0 { | ||
lc.log.Debugf("No dangling remote updates to restore") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debug felt appropriate to me, given that this functionality is newly introduced. The log volume is still acceptable for users out there to collect. Do you think this is too spammy? We are already quite verbose on debug and this seemed to fit into that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, we could conditionally call this method instead (lift the if statement out to the caller).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed special casing completely. The for loop below is skipped and the sanity checks may be useful also in case there are no updates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change uncovered that the sanity checks actually had false positives. A fee update may have already disappeared from the log, causing the last present update in the log not to equal the last committed update. Relaxed the check and brought it into the switch statement below.
|
||
switch payDesc.EntryType { | ||
case Add: | ||
lc.remoteUpdateLog.restoreHtlc(payDesc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some adds will be restored twice now, both when checking incoming HTLCs on our local commit, and here. But maybe that's okay?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm... maybe it isn't necessary then to save and restore add msgs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thought about this a bit more. The operation is idempotent, so it doesn't seem to hurt. But maybe we do need the save/restore for add msgs in case an htlc is added and later settled on the local commit. If we aren't able to sign during that time, we still need to process those messages later.
Although this is probably not even possible. We can only settle if we've also added the add to our remote commit tx. But it seems safer to do it like it is now.
5211a31
to
c643ee5
Compare
Yes, fully agree to this. Maybe it would have been better to persist the complete local and remote update logs, to not have to think so hard about reconstruction. |
Any updates that aren't committed are to be forgotten upon reconnection. If we persisted all updates, we'd have issues w.r.t preemptively attempting to evaluate a commitment view using abandoned updates. |
Haven't dug into the PR yet, but this seems incorrect. Instead, we should only store the updates when we receive a signature from the remote party. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excited to finally, finally (for real this time) squash this annoying bug. Completed an initial pass through the PR and may have found a bug that would cause us to store more than necessary (updates of theirs sent after their signature ).
lnwallet/channel.go
Outdated
localCommitmentHeight uint64) error { | ||
|
||
if len(danglingRemoteUpdates) == 0 { | ||
lc.log.Debugf("No dangling remote updates to restore") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, we could conditionally call this method instead (lift the if statement out to the caller).
b7e45b9
to
fce0658
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this one is getting close!
I really like the way the commits are structured, with a simple buildup of commits before the last, important commit, where the real meat of the change is found.
lnwallet/channel.go
Outdated
|
||
// Fetch the last remote updates that have been committed on both sides. | ||
lastRemoteCommitted := lc.remoteCommitChain.tip().theirMessageIndex | ||
lastLocalCommitted := lc.localCommitChain.tip().theirMessageIndex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically this should be tail
since that is the oldest unrevoked commitment == the last index we acked. In this case I don't think it matters since we will always revoke after receiving commitment, but I think it'll be more correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
This test asserts that remote updates that are locked-in on the local commitment, but haven't been signed for on the remote commitment, are properly restored after a restart.
This method is only used to update the local commitment transaction. Updated comment accordingly.
Extract functionality to methods as a preparation for serializing remote log updates.
Extract method in preparation for restoring dangling remote updates. We need to get rid of the early return.
This commit updates the channel state machine to persistently store remote updates that we have received a signature for, but that we haven't yet included in a commit signature of our own. Previously those updates were only stored in memory and dropped across restarts. This lead to the production of an invalid signature and channel force closure. The remote party expects us to include those updates.
fce0658
to
8257940
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 💯
|
||
lc.localUpdateLog.markHtlcModified(payDesc.ParentIndex) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: also add sanity check that the last update index matches the log index after all pending updates are restored? (doesn't have to be in this method, just after both state logs are fully restored)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, so I discovered that this does not always need to be the case. #3872 (comment)
@@ -3092,6 +3277,96 @@ func (lc *LightningChannel) createCommitDiff( | |||
}, nil | |||
} | |||
|
|||
// getUnsignedAckedUpdates returns all remote log updates that we haven't |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: "that we have acked, but haven't..."
Running this patch on testnet now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 💴
chanID := lnwire.NewChanIDFromOutPoint(&lc.channelState.FundingOutpoint) | ||
|
||
// Fetch the last remote update that we have signed for. | ||
lastRemoteCommitted := lc.remoteCommitChain.tip().theirMessageIndex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 w.r.t the slight reshuffling of logic here, I find it much easier to follow now.
This PR persistently stores updates received from the remote party until we've produced a signature that includes these updates. Previously those updates were only stored in memory and dropped across restarts. This lead to the production of an invalid signature and channel force closure.
Fixes #3822
Fixes #3126