htlcswitch: pipeline settles to switch #3143
@Crypt-iQ this looks great, only one minor comment. looking forward to benchmarking the performance improvement!
I have been running the flakehunter unit tester on the failed test in Travis and it hasn't failed... so I think it's unrelated. I have seen that error on my Mac though (unrelated to this branch).
@Crypt-iQ a couple other small comments, fixups can be squashed as well!
Not a blocker, but since there'll be duplicate settles we'll see some error logs that aren't necessary and might lead to users thinking there's an issue. Below are the trace logs for a completed payment:
2019-07-09 18:55:57.835 [DBG] HSWC: ChannelLink(668de44edf52c85c7ec6e54e604dcfe88d6a312cb08afb7d25bf49eb11ecedb4:0): sampled fee rate for 3 block conf: 12500 sat/kw
2019-07-09 18:56:02.893 [TRC] HSWC: Committing fresh circuits: ([]channeldb.CircuitKey) (len=1 cap=1) {
(channeldb.CircuitKey) (Chan ID=0:0:0, HTLC ID=3)
}
2019-07-09 18:56:02.919 [TRC] HSWC: ChannelLink(401:1:0) received switch packet inkey=(Chan ID=0:0:0, HTLC ID=3), outkey=(Chan ID=401:1:0, HTLC ID=0)
2019-07-09 18:56:02.919 [TRC] HSWC: ChannelLink(401:1:0) Received downstream htlc: payment_hash=71b691fe313daa0a8ff7b18824f755240561a11512a18fe12c7c745dee7ed532, local_log_index=2, batch_size=1
2019-07-09 18:56:02.919 [DBG] HSWC: ChannelLink(401:1:0) Queueing keystone of ADD open circuit: (Chan ID=0:0:0, HTLC ID=3)->(Chan ID=401:1:0, HTLC ID=2)
2019-07-09 18:56:02.972 [TRC] HSWC: Opening finalized circuits: ([]htlcswitch.Keystone) (len=1 cap=1) {
(htlcswitch.Keystone) (Chan ID=0:0:0, HTLC ID=3) --> (Chan ID=401:1:0, HTLC ID=2)
}
2019-07-09 18:56:03.008 [DBG] HSWC: ChannelLink(401:1:0) removing Add packet (Chan ID=0:0:0, HTLC ID=3) from mailbox
2019-07-09 18:56:03.008 [TRC] HSWC: Deleting resolved circuits: ([]channeldb.CircuitKey) <nil>
2019-07-09 18:56:03.067 [TRC] HSWC: ChannelLink(401:1:0) processing 0 remote adds for height 5
2019-07-09 18:56:03.211 [DBG] HSWC: Closed completed SETTLE circuit for 71b691fe313daa0a8ff7b18824f755240561a11512a18fe12c7c745dee7ed532: (0:0:0, 3) <-> (401:1:0, 2)
2019-07-09 18:56:03.211 [DBG] HSWC: Tearing down open circuit with SETTLE pkt, removing circuit=(Chan ID=0:0:0, HTLC ID=3) with keystone=(Chan ID=401:1:0, HTLC ID=2)
2019-07-09 18:56:03.218 [TRC] HSWC: Deleting resolved circuits: ([]channeldb.CircuitKey) (len=1 cap=1) {
(channeldb.CircuitKey) (Chan ID=0:0:0, HTLC ID=3)
}
2019-07-09 18:56:03.257 [DBG] HSWC: Closed completed SETTLE circuit for 71b691fe313daa0a8ff7b18824f755240561a11512a18fe12c7c745dee7ed532: (0:0:0, 3) <-> (401:1:0, 2)
2019-07-09 18:56:03.364 [TRC] HSWC: Deleting resolved circuits: ([]channeldb.CircuitKey) <nil>
2019-07-09 18:56:03.440 [DBG] HSWC: ChannelLink(401:1:0): settle-fail-filter &{1 [0]}
2019-07-09 18:56:03.440 [TRC] HSWC: ChannelLink(401:1:0) processing 0 remote adds for height 6
2019-07-09 18:56:03.440 [ERR] HSWC: Unable to find target channel for HTLC settle/fail: channel ID = 401:1:0, HTLC ID = 2
2019-07-09 18:56:03.441 [ERR] HSWC: ChannelLink(401:1:0) unhandled error while forwarding htlc packet over htlcswitch: Unable to find target channel for HTLC settle/fail: channel ID = 401:1:0, HTLC ID = 2
2019-07-09 18:56:12.673 [DBG] HSWC: Sent 100 satoshis and received 0 satoshis in the last 10 seconds (0.200000 tx/sec)
2019-07-09 18:56:18.475 [DBG] HSWC: Acked 1 settle fails: ([]channeldb.SettleFailRef) (len=1 cap=1) {
(channeldb.SettleFailRef) {
Source: (lnwire.ShortChannelID) 401:1:0,
Height: (uint64) 6,
Index: (uint16) 0
}
}
server.go
Outdated
@@ -430,6 +430,8 @@ func newServer(listenAddrs []net.Addr, chanDB *channeldb.DB, cc *chainControl,
htlcswitch.DefaultFwdEventInterval),
LogEventTicker: ticker.New(
htlcswitch.DefaultLogInterval),
AckEventTicker: ticker.New(
Style nit: wrapping should be (if it doesn't fit in 80 chars):
ticker.New(
htlcswitch.DefaultAckInterval,
)
// because the settles are pipelined to the switch and otherwise
// the bandwidth won't be updated by the time Alice receives a
// response here.
time.Sleep(2 * time.Second)
We could use a mock ticker to send a force tick and avoid the sleep.
I'm a little confused about how this would work. We can feed ticks to the AckEventTicker, but that won't help since we need to actually wait for the revocation from Bob (the duplicate settle doesn't matter here), and there doesn't seem to be a straightforward way to be notified upon receiving a revocation. We could loop and time out after some interval, but that's not so clean either...
It seems TestChannelLinkSingleHopPayment fails for a similar reason and can be updated with this.
Ah sorry, you're right. I see why the sleep is indeed needed, seems fine to me as is for now.
Hm, TestChannelLinkSingleHopPayment no longer fails on my machine with flake-unit, so maybe it's just my Mac.
Nevermind, it flakes!
Can we use lntest.WaitPredicate instead?
Get an import cycle then, but we could just copy the function?
@wpaulino I wasn't entirely sure what to do about the extra error logs - I could remove it if the packet has a |
09f980e to 9f065dc
Sounds good to me. I think @cfromknecht had some thoughts on possibly refactoring that portion of code to improve clarity of the new behavior, though I'd say it's not a blocker for this PR.
If we don't log the error, we should still return the error and it will still get reported in the link though. So idk if the change is worth it?
This diff turned out to be much smaller than I had originally anticipated, me gusta!
// DefaultAckInterval is the duration between attempts to ack any settle
// fails in a forwarding package.
DefaultAckInterval = 15 * time.Second
I would think we want this value to be much lower, like in the ms? The two above are for non-critical operations like logging.
This isn't too critical; it just needs to happen at some point so we stop resending the htlc internally. If it doesn't happen before shutdown, it will be done on the next connection.
6f1c1c3 to 029956e
LGTM! Nice work on this optimization @Crypt-iQ !! 💯
Needs a rebase @Crypt-iQ.
This commit makes the outgoing link pipeline the settle to the switch as soon as it receives it. Previously, it would wait for a revocation before sending it, which caused increased latency on payments as well as possibly never settling on the incoming link. A duplicate settle is still sent to the switch, but it is handled gracefully. A new AckEventTicker was added to the switch which acknowledges any pending settle / fail entries in an outgoing link's fwd pkgs in batch. This was needed in order to reduce the number of db txn's which would have been incurred by acking whenever we receive a duplicate settle without batching.
029956e to 00814dc
Rebased and squashed fixups @Roasbeef
LGTM ⛩
@Crypt-iQ Cheers to lower latency for anyone routing through LND!! 🍾🎉🔥
Created an issue for the unjustified error: #3656. It's not ideal that an error is always added to the log during the very basic operation of a successful payment.
This PR makes the outgoing link pipeline the settle to the
switch as soon as it receives it. Previously, it would wait for a
revocation before sending it, which caused increased latency on
payments as well as possibly never settling on the incoming link.
A duplicate settle is still sent to the switch, but it is handled
gracefully. A new AckEventTicker was added to the switch which
acknowledges any pending settle / fail entries in an outgoing
link's fwd pkgs in batch. This was needed in order to reduce the
number of db txn's which would have been incurred by acking whenever
we receive a duplicate settle without batching.
Fixes #3069
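To make the batching idea in the description concrete, here is a minimal, self-contained sketch. It is not the code from this PR: `settleFailRef`, `ackBatcher`, and the `flush` callback are hypothetical stand-ins, with `flush` playing the role of a single db txn. The point is only that duplicate settles queue a reference cheaply, and one write per tick acks everything queued so far.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// settleFailRef is a hypothetical stand-in for a reference to a
// settle/fail entry in an outgoing link's forwarding package.
type settleFailRef struct {
	Source string
	Height uint64
	Index  uint16
}

// ackBatcher (hypothetical) queues refs and acks them together.
type ackBatcher struct {
	mu      sync.Mutex
	pending []settleFailRef

	// flush stands in for one db txn that acks a whole batch.
	flush func([]settleFailRef) error
}

// add queues a ref without touching the database.
func (b *ackBatcher) add(ref settleFailRef) {
	b.mu.Lock()
	b.pending = append(b.pending, ref)
	b.mu.Unlock()
}

// flushPending acks all queued refs in one flush call. If the flush
// fails, the batch is re-queued so a later tick can retry it.
func (b *ackBatcher) flushPending() error {
	b.mu.Lock()
	batch := b.pending
	b.pending = nil
	b.mu.Unlock()

	if len(batch) == 0 {
		return nil
	}
	if err := b.flush(batch); err != nil {
		b.mu.Lock()
		b.pending = append(batch, b.pending...)
		b.mu.Unlock()
		return err
	}
	return nil
}

// run flushes on every tick until quit is closed.
func (b *ackBatcher) run(ticks <-chan time.Time, quit <-chan struct{}) {
	for {
		select {
		case <-ticks:
			if err := b.flushPending(); err != nil {
				fmt.Println("ack batch failed:", err)
			}
		case <-quit:
			return
		}
	}
}

func main() {
	txns := 0
	b := &ackBatcher{flush: func(refs []settleFailRef) error {
		txns++
		fmt.Printf("acked %d settle/fails\n", len(refs))
		return nil
	}}

	for i := 0; i < 3; i++ {
		b.add(settleFailRef{Source: "401:1:0", Height: 6, Index: uint16(i)})
	}

	// Drive the loop with a manual tick instead of a real ticker.
	ticks := make(chan time.Time)
	quit := make(chan struct{})
	done := make(chan struct{})
	go func() {
		b.run(ticks, quit)
		close(done)
	}()

	ticks <- time.Now()
	close(quit)
	<-done

	fmt.Println("db txns used:", txns)
}
```

In this sketch the three queued entries are written by a single `flush` call rather than three separate ones. Because acking is not latency-critical, a failed or missed flush simply leaves the refs queued for a later attempt.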