
[anchors] HTLC second level aggregation in the sweeper #4779

Merged
merged 25 commits into lightningnetwork:master
Dec 12, 2020

Conversation

@halseth (Contributor) commented Nov 18, 2020

This PR finishes the changes needed to make the sweeper bundle HTLC second-level transactions in a sweep tx, and re-sign the inputs.

To achieve this we extend the resolvers to include SignDetails containing the signatures and information needed to re-sign the inputs.

Depends on #4750
Depends on #4838

@halseth halseth added this to the 0.12.0 milestone Nov 18, 2020
@halseth halseth requested a review from Roasbeef November 18, 2020 18:41
@halseth halseth requested a review from cfromknecht as a code owner November 18, 2020 18:41
@@ -328,6 +328,12 @@ const (
AcceptedHtlcScriptSize = 3*1 + 20 + 5*1 + 33 + 8*1 + 20 + 4*1 +
33 + 5*1 + 4 + 8*1

// AcceptedHtlcScriptSizeConfirmed 143 bytes
//
// TODO(halseth): the non-confirmed version currently includes the
Member:

In that it's currently being overestimated?

Contributor Author:

Yes, that's why the overhead is commented out below. Aiming to fix this in #4775

Contributor:

iirc we decided having the extra constant wasn't worth it bc we were only overestimating by 3 bytes. what changed?

found the comment: https://github.com/lightningnetwork/lnd/pull/3821/files#r387578921

Contributor Author:

Yeah, that's right. I will look into whether it is worth keeping it this way in the other PR, but since we now need a new witness type for these inputs, I felt it was better to define separate weight constants for them.

// AcceptedHtlcSuccessWitnessSizeConfirmed 327 bytes
//
// Input to second level success tx, spending 1 CSV delayed HTLC output.
AcceptedHtlcSuccessWitnessSizeConfirmed = 1 + 1 + 1 + 73 + 1 + 73 + 1 + 32 + 1 +
Member:

All the defined constants should use the existing "unrolled annotated" comment structure to be more self documenting, and make it easier to catch any errors in the future.

Contributor Author:

Hm, not entirely sure what you mean, but probably a discussion for #4775.

Member:

I mean a comment annotating the set of integers being summed up here, can mostly be copy-pasted from prior versions.
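For context, the "unrolled annotated" constant style being requested labels every summand so errors are easy to spot. A sketch of what that could look like for the visible summands above — the labels are my assumptions about what each byte represents, not taken from lnd's actual comments:

```go
package main

import "fmt"

// annotatedWitnessSize illustrates the annotated style: one labelled
// summand per witness element. Labels here are illustrative guesses.
const annotatedWitnessSize = 1 + // number_of_witness_elements
	1 + // nil_length (CHECKMULTISIG dummy)
	1 + // sig_len
	73 + // remote_sig
	1 + // sig_len
	73 + // local_sig
	1 + // preimage_len
	32 + // preimage
	1 // witness_script_len (script size itself omitted here)

func main() {
	fmt.Println(annotatedWitnessSize) // prints 184
}
```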


// secondLevelConfTarget is the confirmation target we'll use when
// adding fees to our second-level HTLC transactions.
secondLevelConfTarget = 6
Member:

Good enough for now, but eventually this should start to be a function of the current height and the deadline to expiry (if the HTLC was contested).
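A deadline-aware target along these lines could clamp the default target to the blocks remaining until expiry — a hypothetical sketch, not the PR's code; confTarget and its parameters are made up for illustration:

```go
package main

import "fmt"

// confTarget sketches a deadline-aware confirmation target: use the
// default target, but never aim past the HTLC's expiry height.
func confTarget(currentHeight, expiryHeight, defaultTarget uint32) uint32 {
	if expiryHeight <= currentHeight {
		// At or past the deadline: confirm as soon as possible.
		return 1
	}
	if remaining := expiryHeight - currentHeight; remaining < defaultTarget {
		return remaining
	}
	return defaultTarget
}

func main() {
	fmt.Println(confTarget(100, 103, 6)) // deadline in 3 blocks -> 3
	fmt.Println(confTarget(100, 200, 6)) // far deadline -> default 6
}
```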

contractcourt/htlc_success_resolver.go (5 outdated comments, resolved)
contractcourt/htlc_timeout_resolver.go (2 outdated comments, resolved)
@cfromknecht (Contributor) left a comment:

👍 initial pass to familiarize myself with the changes

sweep/sweeper.go Outdated
@@ -833,6 +834,103 @@ func (s *UtxoSweeper) clusterBySweepFeeRate(inputs pendingInputs) []inputCluster
return inputClusters
}

// zipClusters merges pairwise clusters from as and bs such that cluster a from
Contributor:

cluster a

sweep/sweeper.go Outdated

// Go through each cluster in as, and merge with the next one from bs
// if it has at least the fee rate needed.
for i := range as {
Contributor:

for i, a := range as?

sweep/sweeper.go Outdated

// We can merge.
merged := mergeClusters(a, bs[j])
finalClusters = append(finalClusters, merged...)
Contributor:

what if there are multiple clusters in bs that can be merged with this a?

Contributor Author:

Then it will be merged with the first compatible one. This is definitely not an optimized version; there are certainly "smarter" ways of finding clusters to merge, but I wanted to keep it simple for now.
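The greedy behavior described here (each a absorbs the first compatible cluster from bs) can be sketched as follows. This is a simplified stand-in, not lnd's actual zipClusters: cluster, the merge condition, and the input representation are all assumptions for illustration:

```go
package main

import "fmt"

// cluster is a simplified stand-in for the sweeper's inputCluster.
type cluster struct {
	feeRate int64    // sat/kw
	inputs  []string // outpoints, simplified to strings
}

// zipClusters sketches the greedy pairwise merge: each cluster in as
// absorbs the first not-yet-used cluster from bs whose fee rate is at
// most its own; leftover bs clusters pass through unchanged.
func zipClusters(as, bs []cluster) []cluster {
	used := make([]bool, len(bs))
	var final []cluster
	for _, a := range as {
		for j, b := range bs {
			if used[j] || b.feeRate > a.feeRate {
				continue
			}
			// Merge: b's inputs now confirm at a's fee rate.
			a.inputs = append(a.inputs, b.inputs...)
			used[j] = true
			break // only the first compatible cluster
		}
		final = append(final, a)
	}
	for j, b := range bs {
		if !used[j] {
			final = append(final, b)
		}
	}
	return final
}

func main() {
	as := []cluster{{feeRate: 300, inputs: []string{"a1"}}}
	bs := []cluster{
		{feeRate: 250, inputs: []string{"b1"}}, // merged into a
		{feeRate: 400, inputs: []string{"b2"}}, // passes through
	}
	fmt.Println(len(zipClusters(as, bs))) // prints 2
}
```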

sweep/sweeper.go Outdated
sweepFeeRate += inputFeeRates[op]
}

sweepFeeRate /= chainfee.SatPerKWeight(len(inputs))
Contributor:

doesn’t this need to be a weighted average based on the inputs’ actual weight? this method assumes all inputs are of equal weight

Contributor Author:

I think this is correct (if we want to average the fee rates), since we are taking the average fee rate here, not the fee. This is pre-existing though; I'm just using the same logic as below.

@cfromknecht (Contributor) commented Nov 24, 2020:

hmm i'm not so sure, say you have two inputs A (100w, 1000sat/kw) and B (200w, 3000sat/kw). individually, input A would pay 100w * 1000sat/kw = 100 sat and B would pay 200w * 3000sat/kw = 600sat. combined they pay 700sat total for 300w, or 2.333sat/w.

the unweighted average of the fee rates gives you 2000sat/kw, so total of 300w * 2000sat/kw = 600sat.

however, a weighted average produces (1000sat/kw + 2 * 3000sat/kw)/3 = 2333sat/kw which gives us the expected fee rate. then 100w * 2333sat/kw + 200w * 2333sat/kw = 700sat as expected.

This is pre-existing though, I'm just using the same logic as below.

This may be, but I still think it's incorrect :P
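The arithmetic in this example can be checked with a short Go sketch (a standalone illustration; input and weightedAvgFeeRate are hypothetical names, not the sweeper's API):

```go
package main

import "fmt"

// input describes a pending sweep input by its estimated weight (in
// weight units) and its desired fee rate (in sat per 1000 wu).
type input struct {
	weight  int64
	feeRate int64
}

// weightedAvgFeeRate computes r_agg such that
// sum(w_i * r_i) = r_agg * sum(w_i): the single fee rate that pays
// the same total fee as each input paying its own rate individually.
func weightedAvgFeeRate(inputs []input) int64 {
	var totalWeight, weightedSum int64
	for _, in := range inputs {
		totalWeight += in.weight
		weightedSum += in.weight * in.feeRate
	}
	return weightedSum / totalWeight
}

func main() {
	// The example from the thread: A (100w, 1000 sat/kw) and
	// B (200w, 3000 sat/kw). Individually they pay 100 + 600 =
	// 700 sat over 300w; the unweighted average (2000 sat/kw)
	// would only pay 600 sat.
	inputs := []input{{100, 1000}, {200, 3000}}
	fmt.Println(weightedAvgFeeRate(inputs)) // prints 2333
}
```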

@halseth (Contributor Author) commented Nov 25, 2020:

which gives us the expected fee rate

Sure, but I don't think the "expected" fee rate is well defined here. We cluster inputs with different fee rates together, and need to decide which fee rate to use for all of them. Even though one is heavier than the other, I'm not sure it is "more correct" to choose a fee rate that is closer to that one.

I do agree though, that it might be a better strategy. Since many tiny outputs could bring the fee rate down (and since they are tiny they don't cost much anyway) it could be more correct to take the weighted average. You okay with me doing this as a follow up (must change it in the pre-existing case as well)?

@cfromknecht (Contributor) commented Nov 25, 2020:

okay, maybe it helps to clarify the objective. say you have the weights and fee rate for each input, (w_i, r_i), i in [1, n]. my assumption was that this is trying to compute r_agg where sum (w_i * r_i) = r_agg * sum (w_i), where r_agg represents the actual fee rate of the cluster after aggregating all the desired fees paid by individual inputs.

a different example: it might make sense to take r_max = max(r_i), which then forces all lower fee-rate inputs to pay the same fee rate as the input with highest priority. unlike r_agg, this approach creates a sweepFeeRate that never decreases the priority of an input, which seems useful if certain inputs are time-sensitive.

i could also see e.g. taking the most recent r_i, which is maybe viewed as the most up-to-date fee rate from our estimator/user, and forcing all inputs to readjust to that. this value could fluctuate though, so maybe it's not ideal for time-sensitive inputs.

i agree that there are different heuristics we can choose here, and one may not necessarily be more correct. if the goal is simply to pick an arbitrary value between min(r_i) and max(r_i), then an unweighted average works. however it still doesn't translate into anything meaningful, which is why imo it seems less correct than some of the other options available.

You okay with me doing this as a follow up (must change it in the pre-existing case as well)?

Sure! Yeah I'm okay with a follow up :)

Contributor Author:

Issue: #4812

@@ -83,14 +83,18 @@ type txInputSet struct {
wallet Wallet
}

func dustLimit(relayFee chainfee.SatPerKWeight) btcutil.Amount {
return txrules.GetDustThreshold(
Contributor:

this will be wrong for litecoin..

Contributor Author:

oops. Added to the tracking issue: #3946
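For reference, the dust computation under discussion follows Bitcoin Core's rule of three times the relay cost of creating plus spending the output (which is what makes it coin-specific). A minimal sketch for a P2WPKH output — illustrative only, not txrules.GetDustThreshold itself, with sizes taken from Core's dust rule:

```go
package main

import "fmt"

// dustLimitP2WPKH sketches the dust threshold for a P2WPKH output:
// 3x the relay fee for the output plus a future spend of it, with
// witness data discounted by a factor of 4.
func dustLimitP2WPKH(relayFeePerKvB int64) int64 {
	const (
		outputSize = 31 // 8 value + 1 script len + 22 script
		spendSize  = 67 // 36 prevout + 1 scriptsig len + 4 sequence + 107/4 witness
	)
	return (outputSize + spendSize) * 3 * relayFeePerKvB / 1000
}

func main() {
	// At the default 1000 sat/kvB relay fee this yields the familiar
	// 294 sat dust limit for P2WPKH outputs.
	fmt.Println(dustLimitP2WPKH(1000)) // prints 294
}
```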

sweep/txgenerator.go (outdated, resolved)
// If the input comes with a required tx out that is below dust, we
// won't add it.
reqOut := inp.RequiredTxOut()
if reqOut != nil && btcutil.Amount(reqOut.Value) < t.dustLimit {
Contributor:

Is this possible?

Contributor Author:

it depends on whether the relay fee changes, so I would say it's unlikely.

//
// TODO(halseth): the non-confirmed version currently includes the
// overhead.
AcceptedHtlcScriptSizeConfirmed = AcceptedHtlcScriptSize // + HtlcConfirmedScriptOverhead
Contributor:

intentional comment?

Contributor Author:

Yes.

@halseth halseth force-pushed the anchor-htlc-aggregation branch 2 times, most recently from 291f669 to e5c10db Compare November 23, 2020 12:08
@halseth (Contributor Author) commented Nov 23, 2020

Pushed new version that does checkpointing, and moved much of the logic into separate methods. PTAL

@cfromknecht (Contributor) left a comment:

latest version looking good! just some minor nits

contractcourt/briefcase.go (outdated, resolved)
contractcourt/htlc_success_resolver.go (outdated, resolved)
contractcourt/htlc_timeout_resolver.go (outdated, resolved)
@halseth halseth force-pushed the anchor-htlc-aggregation branch 3 times, most recently from 5e5ab68 to f536488 Compare November 27, 2020 14:35
@halseth (Contributor Author) commented Nov 27, 2020

Pushed an itest for the HTLC aggregation, and noticed that we cannot rely on the output script when finding the second-level output (since they can be equal for equal HTLCs). Now using the input index instead: https://github.com/lightningnetwork/lnd/compare/b80b30502c9a35fd23128543abb354c531ad0cdc..5e5ab68d384c547e83ac355fe69632e2ecaed08d#diff-b38dbdd572300de530a55756fa7af0afc2935a2a643bdcdef2f12f2253c1d313R356

Here are the full changes since last review (mostly tests and logs in addition to the mentioned bug fix): https://github.com/lightningnetwork/lnd/compare/b80b30502c9a35fd23128543abb354c531ad0cdc..5e5ab68d384c547e83ac355fe69632e2ecaed08d

and addressing the review comments: https://github.com/lightningnetwork/lnd/compare/5e5ab68d384c547e83ac355fe69632e2ecaed08d..f5364880827f01a9e2df7ff71ba1e4852f1898d6

@halseth halseth requested a review from cfromknecht November 27, 2020 14:41
if ok {
return fmt.Errorf("duplicate HashLock")
}
htlcHashes[string(htlc.HashLock)] = struct{}{}
htlcHashes[h] = struct{}{}
Member:

👍

input/size.go (resolved)
contractcourt/htlc_timeout_resolver.go (outdated, resolved)
contractcourt/commit_sweep_resolver.go (resolved)
contractcourt/htlc_success_resolver_test.go (outdated, resolved)
lntest/itest/lnd_multi-hop_htlc_aggregation_test.go (2 outdated comments, resolved)
// Carol will also sweep her anchor output in a separate tx (since it
// will be low fee).
if c == commitTypeAnchors {
expectedTxes = 4
Member:

Dem sweet sweet savings

// resolve them using the second level timeout and success transactions. In
// case of anchor channels, the second-level spends can also be aggregated and
// properly feebumped, so we'll check that as well.
func testMultiHopHtlcAggregation(net *lntest.NetworkHarness, t *harnessTest,
Member:

Love how comprehensive this test is!

@Roasbeef (Member):

Mainly only non-blocking comments left, can't wait to get this in! This is such an outstanding PR and really moves lnd forward a large step by finally starting to utilize more of what anchors gives us.

@halseth halseth force-pushed the anchor-htlc-aggregation branch 3 times, most recently from f50b1ba to 9eaf67f Compare November 30, 2020 14:37
@halseth (Contributor Author) commented Nov 30, 2020

Addressed review, and also added unit tests that should cover all scenarios for the timeout resolver as well: 558b386

While doing this I noticed I don't have to find the input index manually: 9eaf67f

And also found a small reporting bug for the re-ordered HTLC outputs: 55d1ae3

@halseth halseth requested a review from Roasbeef November 30, 2020 14:38
@Roasbeef (Member) commented Dec 1, 2020

Lingering linter error:

contractcourt/htlc_success_resolver_test.go:87:15: unlambda: replace `func(c ContractResolver,
	r ...*channeldb.ResolverReport) error {
	return testCtx.checkpoint(c, r...)
}` with `testCtx.checkpoint` (gocritic)
		Checkpoint: func(c ContractResolver,

@Roasbeef (Member) left a comment:

LGTM 🧗‍♀️

Should be ready to be rebased and land now! Once in tree, I also plan to run more experiments on testnet to exercise more general functionality and UX from the PoV of a CLI user.

contractcourt/briefcase.go (outdated, resolved)
continue
}
return fmt.Errorf("node %x didn't have the "+
"payHash %v active", node.PubKey[:],
payHash)
h)
Contributor:

can this commit be equally accomplished by replacing %v with %x? that also doesn't incur the 2x memory overhead

Contributor Author:

It would, but while debugging I found it useful to just spew the htlcHashes map, so it's nice to have them as hex there IMO.

Contributor:

seems like a simple change to make in the event one needs to debug, but it's not that important

log.Infof("%T(%x): waiting for CSV lock to expire at height %v",
h, h.htlc.RHash[:], waitHeight)

err := waitForHeight(waitHeight, h.Notifier, h.quit)
Contributor:

nit: much of the logic below looks duplicated from before, not sure if there is a way to reuse

Contributor Author:

The FindInputIndex is removed in a later commit (will squash with this one); most of the remainder is custom, I think.

Contributor:

yeah, was just looking at everything from waitForHeight to SpendInput; it is very similar but I think it's fine here

contractcourt/htlc_timeout_resolver_test.go (outdated, resolved)
contractcourt/htlc_success_resolver.go (outdated, resolved)
contractcourt/htlc_timeout_resolver.go (resolved)
Only value was populated for some, which would cause code to rely on the
PkScript being there to fail.
Since the tests set a quite high fee rate before the node goes to chain,
the HTLCs wouldn't be economical to sweep at this fee rate.

Pre sweeper handling of the second-level transactions this was not a
problem, since the fees were set when the second-levels were created,
before the fee estimate was increased.
We define the witness constants we need for fee estimation for this HTLC second-level type.
These will only be used for size upper-bound estimations by the sweeper.
…er pointer

To make the linter happy, make a pointer to the inner resolver.
Otherwise the linter would complain with

copylocks: literal copies lock value

since we'll add a mutex to the resolver in following commits.
…RemoteCommitOutput

This moves the logic for sweeping the HTLC output on the remote
commitment into its own method.
The sweep tx is not actually part of the resolver's encoded data, so the checkpointing was essentially a noop.
We add checkpoint assertions and resume the resolver from every
checkpoint to ensure it can handle restarts.
success tx

This commit makes the HTLC resolutions having non-nil SignDetails
(meaning we can re-sign the second-level transactions) go through the
sweeper. They will be offered to the sweeper which will cluster them and
arrange them on its sweep transaction. When that is done we will further
sweep the output on this sweep transaction as any other second-level tx.

In this commit we do this for the HTLC success resolver and the
accompanying HTLC success transaction.
Test success resolvers going through the sweeper.
This commit moves the code doing the initial spend of the HTLC output of
the commit tx into its own method.
…ansaction

This commit moves the logic for sweeping the confirmed second-level
timeout transaction into its own method.

We make a small change to the logic: when setting the spending tx in the
report, we use the detected commit spend instead of the presigned
timeout tx. This prepares for the coming change where the spending
transaction might actually be a re-signed timeout tx, and will therefore
have a different txid.
In this commit we make the sweeper handle second level transactions for
HTLC timeout resolvers for anchor channels.
Since we are checking HTLC aggregation, we must give the sweeper a bit
more time to aggregate them to avoid flakes.
In case of anchor channel types, we mine one less block before we expect
the second level sweep to appear in the mempool, since the sweeper
sweeps one block earlier than the nursery.
Now that the HTLC second-level transactions are going through the
sweeper instead of the nursery, there are a few things we must account
for.
1. The sweeper sweeps the CSV locked HTLC output one block earlier than
   the nursery.
2. The sweeper aggregates several HTLC second levels into one
   transaction. This also means it is not enough to check txids of the
   transactions spent by the final sweep, but we must use the actual
   outpoint to distinguish.
@halseth halseth force-pushed the anchor-htlc-aggregation branch from db60ade to 1627310 Compare December 10, 2020 13:24
@cfromknecht (Contributor) left a comment:

LGTM ⚓️

@cfromknecht cfromknecht merged commit 4af2415 into lightningnetwork:master Dec 12, 2020