lnwire+funding: introduce new protocol extension for explicit commitment type negotiation #5373

wpaulino · 2021-06-10T00:37:10Z

In this PR, we put forth an implementation of the explicit channel commitment type negotiation feature. Explicit commitment type negotiation prepares us for a future where no "default" channel type actually exists. Compared to implicit commitment type negotiation, rather than using a set of overlapping feature bits to determine which channel type is being used during funding negotiation, we instead have both sides signal what their desired channel type is to be. This prepares us for a future of dynamic commitments as the same commitment type reference will be used to change the commitment type in-flight. Additionally, moving to an explicit commitment type negotiation gives us a better way to remove unsupported commitment types, as we'd simply stop advertising the optional feature bit and reject all attempts to use that commitment type.

The set of multi-hop integration tests have been modified to open the third hop using the different supported commitment types through this new protocol extension.

Notes for reviewers:

IMO no new channels should be opened using the legacy commitment type (no static remote key). This type is still supported through explicit negotiation in the earlier revision of this PR, so I decided to leave it as is. Is this something we want?
The earlier revision of this PR also had a CommitmentType field for the AcceptChannel message. I decided to remove it from this PR as I deemed it not necessary, though maybe I'm missing some context on its use.

Replaces #5071.

Crypt-iQ

First-pass review through - still wrapping my head around it, but I think this patch takes a good direction, and paves the way for future improvements. Will review again after comments addressed!

lnwire/open_channel.go

Crypt-iQ · 2021-06-11T17:05:40Z

lnwire/lnwire_test.go

-			// shutdown script.
-			if r.Intn(2) == 0 {
-				// TLV type 1 of length 2.
-				req.ExtraData = []byte{1, 2, 0xff, 0xff}


This test has changed a bit so that ReadMessage will no longer encounter ExtraData. This is fine for this test, but to preserve the coverage we could have testing/quick of ReadMessage by itself instead of round-trip serialization + deserialization. WDYT?

I've changed to test to also include ExtraData, though I think the implementation may be wrong. IIUC, when writing out ExtraData, we should join whatever unknown records are there with the known records and create a new stream including both?

Crypt-iQ · 2021-06-11T17:31:07Z

lnrpc/rpc.proto

+    /*
+    Returned when the commitment type isn't known or unavailable.
+    */
+    UNKNOWN_COMMITMENT_TYPE = 0;


I don't know too much about how end-users use lnrpc, but is this a breaking API change? I would think that end-users would compare responses to the lnrpc types rather than their numerical values

Agreed, which is why I went with this approach. Also, before this change, the CommitmentType message is only ever included in responses and not requests.

It's a breaking change from the PoV of a client that has an older set of protos, and uses the existing enums/constants to make an assertion against these new values. However it was only used in listchannels and pendingchannels.

@wpaulino so the rationale w.r.t making the zero type "unknown" is that we can eventually enforce that users always set a type explicitly? How will a "default" value be added? We could instead go with a "this value is the default" approach similar to what we went w/ in Pool w.r.t selecting node tier.

@wpaulino so the rationale w.r.t making the zero type "unknown" is that we can eventually enforce that users always set a type explicitly? How will a "default" value be added? We could instead go with a "this value is the default" approach similar to what we went w/ in Pool w.r.t selecting node tier.

We want to have an "unknown" variant as the zero value so that we can easily detect when a user hasn't provided a channel type explicitly. This also allows us to set whatever default we deem reasonable from the RPC server without having to make proto changes.

We can achieve this in two ways:

Rework the existing CommitmentType proto to have this unknown variant, and be comfortable with breaking it.

Add a new CommitmentType proto alternative and deprecate the old one.

Gotcha, I think we're using unknown/default in the same manner, need to circle back on the diff, but I think the current approach will work, assuming we map "unknown" to w/e out default is at the time.

funding/manager.go

rustyrussell · 2021-06-12T02:07:49Z

May I suggest you look at the channel types defined in lightning/bolts#868. And maybe propose a spec change to use them in open/accept?

rustyrussell · 2021-06-21T02:10:38Z

May I suggest you look at the channel types defined in lightningnetwork/lightning-rfc#868. And maybe propose a spec change to use them in open/accept?

I've pulled the required changes just for open/accept into their own PR now: lightning/bolts#880

Roasbeef

Happy to finally move forward with this! Main blocker I see here is resolving the set up of the enums on the RPC level to allow us to default to the "safest" channel type when the user opts to not explicitly select their channel type.

Roasbeef · 2021-07-20T23:40:51Z

lnrpc/rpc.proto

+    /*
+    Returned when the commitment type isn't known or unavailable.
+    */
+    UNKNOWN_COMMITMENT_TYPE = 0;


It's a breaking change from the PoV of a client that has an older set of protos, and uses the existing enums/constants to make an assertion against these new values. However it was only used in listchannels and pendingchannels.

@wpaulino so the rationale w.r.t making the zero type "unknown" is that we can eventually enforce that users always set a type explicitly? How will a "default" value be added? We could instead go with a "this value is the default" approach similar to what we went w/ in Pool w.r.t selecting node tier.

rpcserver.go

lnwire/features.go

Roasbeef · 2021-07-20T23:46:59Z

funding/commitment_type_negotiation.go

+
+	// No features, use legacy commitment type.
+	channelFeatures := lnwire.RawFeatureVector(channelType)
+	if channelFeatures.OnlyContains() {


Won't this always return false? No bits passed in as arg, so the length check fails.

I think maybe we should go with a different name for this method also, not that clear IMO what it does at a glance w/o needing refer back to the implementation details.

Perhaps we should put certain behavior behind a build flag also? Can't think of a scenario outside of the itests that we actually want to allow using the legacy channel type.

Won't this always return false? No bits passed in as arg, so the length check fails.

The length check wouldn't fail since no bits are provided as arguments and a legacy channel has no features.

I think maybe we should go with a different name for this method also, not that clear IMO what it does at a glance w/o needing refer back to the implementation details.

I agree, renamed to IsEmpty.

Perhaps we should put certain behavior behind a build flag also? Can't think of a scenario outside of the itests that we actually want to allow using the legacy channel type.

Sounds good to me. Without explicit negotiation though, we already allow legacy channels to be opened if we downgrade the static_remote_key feature from required to optional to a peer we have an already opened legacy channel with. It seems this build flag would also need to handle that?

guggero

Did a drive-by review as a preparation to reviewing #5549.

guggero · 2021-07-27T14:54:39Z

lnwire/accept_channel.go

-	if len(tlvRecords) == 0 {
-		return nil, tlvRecords, nil
+	// Set the corresponding TLV types if they were included in the stream.
+	if _, ok := typeMap[ChannelTypeRecordType]; ok {


According to the godoc of the tlv.TypeMap we should also check that the byte slice is nil:

If the byte slice is nil, the type was successfully parsed. Otherwise the value is byte slice containing the encoded data.

Fixed. Seems we should fix these as well?

Yeah, would probably make sense to look into those cases, yes. Doesn't have to be this PR though IMO.

lnrpc/rpc.proto

lnwire/features.go

Roasbeef

Super close now! IMO the only lingering thing is how to handle the case where a user doesn't specify an explicit channel type (as most won't, since this field doesn't exist rn). As is, the PR falls back to implicit negotiation, but IMO it should attempt to propose our current preferred channel type. In the end, both routes seem to achieve the same thing other than when multiple feature bits are defined that each have a "default" level, but of which, one of them is more preferred from our PoV.

Roasbeef · 2021-07-29T00:28:49Z

rpcserver.go

+	var channelType *lnwire.ChannelType
+	switch in.CommitmentType {
+	case lnrpc.CommitmentType_UNKNOWN_COMMITMENT_TYPE:
+		break


Shouldn't this instead set the current "default" type, so we handle the case earlier vs needing to reoncile w/ a nil/unknown chan type by the time we get all the way into the funding manager.

The current negotiation flow will fail if we try explicit negotiation and the remote peer doesn't advertise the feature bit, so setting a default here would prevent opening channels to any node that doesn't. If we allow the negotiation to fall back to implicit, then we'd need a distinct parameter to determine whether the explicit type being negotiated is a default set by us or set by the RPC user. In the latter case, we wouldn't want to fall back to implicit as that would go against what the user requested.

Roasbeef · 2021-07-29T00:30:38Z

funding/commitment_type_negotiation.go

+		)
+	}
+
+	return implicitNegotiateCommitmentType(local, remote), nil


Re the existing comment, if a user doesn't set this new value when opening a channel, then we'll default to the old implicit negotiation. IMO instead, we want to map the unknown value (so them not setting a value at all), to our current "default" channel type, and still use the explicit chan negotiation.

Roasbeef · 2021-07-29T00:42:12Z

One other "nice to have" here, would be adding an arg on the CLI to allow specifying a custom channel type (we wouldn't ask them to specify feature bits ofc, just the enum type).

guggero

Great work, LGTM 🎉

Just one nit and the PR needs an entry in the release notes.

guggero · 2021-07-30T11:55:48Z

lnwire/features.go

+	// open/maintain types of several channels over its lifetime.
+	//
+	// TODO: Decide on actual feature bit value.
+	ExplicitChannelTypeRequired = 2020


Is this TODO meant to be resolved before merging the PR? Who can/should decide?

It needs to be agreed upon within the spec PR. I think @Roasbeef was fine with moving forward with this PR and updating the feature bit value once decided later on.

In this commit, we add a new TLV record that's intended to be used as an explicit channel commitment type for a new form of funding negotiation, and later on a dynamic commitment upgrade protocol. As defined, we have 3 channel types: base (the OG), tweakless, and anchors w/ zero fee HTLCs. We omit the original variant of anchors as it was never truly deployed from the PoV of lnd.

In this commit, we add a new ChannelType field as a new TLV record to the OpenChannel message. During this change, we make a few tweaks to the generic TLV encode/decode methods for the ExtraOpaqueData struct to have it work on the level of tlv.RecordProducer instead of tlv.Record, as this reduces line noise a bit. We also partially undo existing logic that would attempt to "prepend" any new TLV records to the end of the ExtraOpaqueData if one was already present within the struct. This is based on the assumption that if we've read a message from disk to order to re-send/transmit it, then the ExtraOpaqueData is fully populated so we'll write that as is. Otherwise, a message is being encoded for the first time, and we expect all fields that are known TLV fields to be specified within the struct itself. This change required the unit tests to be modified slightly, as we'll always encode a fresh set of TLV records if none was already specified within the struct.

If these bits are present, then both sides can examine the new CommitmentType TLV field that's present and use this in place of the existing implicit commiment type negotiation. With this change, it's now possible to actually deprecate old unsupported commitment types properly.

This field will be examined later down the stack along with the set of feature bits to determine if explicit channel commitment type negotiation is possible or not.

This commit adds the ability for a channel initiator/responder to determine whether the channel to be opened can use a specific commitment type through explicit negotiation. It also includes the existing implicit negotiation logic to fall back on if explicit negotiation is not supported.

In this commit, we modify the existing logic that defaults to implicit commitment type negotiation to support explicit negotiation if the new feature bit is set. This change allows us to ditch the notion of a "default" commitment type, as we'll now use feature bits to signal our understanding of a commiment type, but allow peers to select which commitment type they actually wish to use. In addition, this explicit negotiation removes the need for using the required bit of any commitment types. Instead, if an implementation wishes to no longer support a commitment type, they should simply stop advertising the optional bit.

guggero · 2021-08-05T11:27:19Z

I only took a superficial look at the PR, @Crypt-iQ I think it would be good if you could give this a full review still. Thanks!

Crypt-iQ · 2021-08-05T14:35:29Z

@guggero will review, have been a bit busy. Should one of us take over the pull now though?

t-bast · 2021-08-17T10:38:24Z

I've tested this branch against ACINQ/eclair#1867 and everything seems to run smoothly!
I had to tweak eclair to send feature bit 2021 though, that requirement should be removed from lnd.

Here is what I tested:

eclair opens static_remote_key channel type -> OK
eclair tries to open anchor_output (without zero-fee htlc) channel type -> correctly rejected by lnd
eclair opens legacy channel type -> OK
lnd opens static_remote_key channel type -> OK

Here are some scenarios I would have liked to test but couldn't test:

lnd opens legacy channel type -> this isn't exposed (only tweakless and anchors options available)
lnd tries to open static_remote_key but eclair has disabled the feature -> lnd sets static_remote_key to mandatory in init so we can't even connect without it
anchor outputs negotiation -> eclair doesn't support the 0-fee-htlc variant yet and lnd doesn't support the old one anymore

I don't think any of these missing tests is blocking though, and all the mechanisms described in lightning/bolts#880 seem to work!

Roasbeef · 2021-08-26T00:09:27Z

Replaced by #5669

Roasbeef · 2021-08-30T22:41:36Z

I had to tweak eclair to send feature bit 2021 though, that requirement should be removed from lnd.

@t-bast missed this the first time around. If a feature isn't defined up front, then it isn't possible for a node to handle the case where one side sends the channel type, and the other doesn't know about it. Worst case, this causes an invalid commitment signature to be exchanged w/o resulting in a breakdown of channel negotiation. Instead, IMO a feature bit should be used here to explicitly guard this new logic, and not have to handle weird edge cases where one side is trying to pick a channel type, and the other is falling back to "default" behavior.

t-bast · 2021-08-31T07:00:08Z

See rusty's comment here.
A feature bit here actually means more code and cases to handle.
I agree we could have added one, but it's an additional effort that doesn't bring any value IMHO.

Worst case, this causes an invalid commitment signature to be exchanged w/o resulting in a breakdown of channel negotiation.

I don't see why? The state machine is very clearly defined to be backwards-compatible, it will never lead to that (unless incorrectly implemented). What guards the behavior is the presence of the channel_type TLV instead of the presence of a feature bit.

Similarly to the quick close feature, it stays simple mostly because there is a message ordering: the funder always starts.
If the funder proposes a channel_type, and the fundee doesn't support this new negotiation and sends back accept_channel without a channel_type and the funder doesn't like the commitment that would be chosen based on feature bits alone, he can always bail out when receiving accept_channel (send an error and abort).

This is expected behavior for the funder (since they wanted a specific channel_type) and transparent for the fundee (because they didn't request a channel anyway, it just came to them and was aborted, they didn't lose anything).

Roasbeef · 2021-09-01T01:56:23Z

A feature bit here actually means more code and cases to handle.

Totally disagree. Either way you need to be able to handle the situation where the remote party doesn't understand what you're trying to do. A feature bit lets you just not do that behavior. Without it, the possibility still exists, but now you need to be able to recover/handle it after they've already done the action.

What guards the behavior is the presence of the channel_type TLV instead of the presence of a feature bit.

Yes, that's the case when the other party even knows the new TLV exists. What if they don't? Wouldn't it be nice for them to signal they actually understand it in the first place?

I mean, it's called a feature bit for a reason ;)

I think this whole "actually we don't need feature bits" direction is ill advised, and seems to be motivated by shaving 3 lines of code (likely just an extra conditional but we're srsly couting straws here), when you'd need the extra logic to handle the entire state machine space anyway. I think the fundamental difference of opinion here is that:

I prefer feature bits as they let you avoid needing to bail out at some later point in the protocol, since you know upfront if that new behavior is enabeled or not.
The route without a feature bit just boils down to: send it and see what happens.

The former here is 100% explicit and allows us to more precisely specific behavior, and imo actually saves on code since you don't need to be able to handle the scenario where things just bail out.

Once again, there's literally zero cost here, only upside: things are fully defined and it's easy to gate new behavior (a config/build flag starts to set the feature bit). I don't think

Roasbeef · 2021-09-01T02:00:19Z

Above everything else, the existence of the feature bit means that nodes can seek out other nodes that actually are able to select which channel type they want to use. I find it confusing how this isn't seen as a valuable protocol feature...

t-bast · 2021-09-01T06:44:26Z

Totally disagree. A feature bit lets you just not do that behavior.
I prefer feature bits as they let you avoid needing to bail out at some later point in the protocol, since you know upfront if that new behavior is enabeled or not.

That's the fundamental disagreement here then. I simply don't see how those claims can be true, unless you decide to leave a lot of undefined behavior in your code (which I'm pretty sure you don't).

Even with a feature bit, you need to handle the cases where your peer misbehaves and you need to bail out when they do.

Above everything else, the existence of the feature bit means that nodes can seek out other nodes that actually are able to select which channel type they want to use.

That's the only argument about this that I agree with you on.

If you want to be able to actively do this seek before everyone supports this on the network, then a feature bit makes sense.
I don't think we do (because all implementations signaled willingness to ship this soon and at this point in lightning nodes need to upgrade whenever a new version comes out because there are always security bugs that are fixed), but it's a matter of predicting the future so none of us can prove they're right until we see it happen (or not happen).

Roasbeef · 2021-09-07T21:58:24Z

Even with a feature bit, you need to handle the cases where your peer misbehaves and you need to bail out when they do.

Correct, independent of anything we try to specify, peers can just do the wrong thing accidentally or purposefully.

That's the only argument about this that I agree with you on.

Happy we agree on this! So from here, do you agree it's essentially zero cost on our end to add the bit, which'll allow the network to evolve more freely as peers are able to pre-filter those that support their desired feature set? The alternative is an interactive dance one needs to do with each peer in order to be able to suss out their core feature set.

I don't think it's wise to always assume that each new thing added with instantly be adopted by all nodes. Even if it were, I don't think that argues against the feature bit system, as there're still nodes that don't quite stay on the latest and greatest or they maintain their own fork for their company/service so they lang behind somewhat. At this point, I don't think we'll ever again have 100% protocol feature parity across all the majro implementations like we once did before (people have diff resources, priorities, roadmaps, etc, etc). That's fine though, we should embrace that by ensuring we place the proper feature bits where applicable so those that do have an overlapping feature set can find each other at the peer discovery phase.

wpaulino requested a review from Roasbeef June 10, 2021 00:37

wpaulino added this to the v0.14.0 milestone Jun 10, 2021

wpaulino added commitments Commitment transactions containing the state of the channel funding Related to the opening of new channels with funding transactions on the blockchain rpc Related to the RPC interface labels Jun 10, 2021

wpaulino mentioned this pull request Jun 10, 2021

lnwire+funding: introduce new protocol extension for explicit channel type negotiation #5071

Closed

3 tasks

wpaulino force-pushed the explicit-chan-negotiation branch from 9e0aac6 to 6ee6f48 Compare June 10, 2021 19:53

wpaulino requested a review from Crypt-iQ June 10, 2021 21:23

wpaulino force-pushed the explicit-chan-negotiation branch from 6ee6f48 to a1d49ad Compare June 11, 2021 00:39

Crypt-iQ reviewed Jun 11, 2021

View reviewed changes

Roasbeef mentioned this pull request Jun 21, 2021

BOLT 2: send channel_type in open/accept lightning/bolts#880

Merged

wpaulino force-pushed the explicit-chan-negotiation branch from a1d49ad to d139ae8 Compare July 13, 2021 01:42

wpaulino requested a review from Crypt-iQ July 13, 2021 01:43

wpaulino force-pushed the explicit-chan-negotiation branch 2 times, most recently from 93adef9 to 5563f72 Compare July 13, 2021 02:13

Roasbeef requested changes Jul 20, 2021

View reviewed changes

wpaulino mentioned this pull request Jul 21, 2021

Introduce new commitment format to enforce channel lease expirations in scripts #5549

Closed

wpaulino force-pushed the explicit-chan-negotiation branch from 5563f72 to 4daf82e Compare July 21, 2021 20:55

wpaulino requested a review from Roasbeef July 21, 2021 20:56

guggero reviewed Jul 27, 2021

View reviewed changes

lnwire/features.go Outdated Show resolved Hide resolved

wpaulino requested a review from guggero July 27, 2021 21:09

wpaulino force-pushed the explicit-chan-negotiation branch from 4daf82e to 9d0239f Compare July 27, 2021 21:10

Roasbeef requested changes Jul 29, 2021

View reviewed changes

wpaulino requested a review from Roasbeef July 29, 2021 19:52

guggero approved these changes Jul 30, 2021

View reviewed changes

Roasbeef and others added 11 commits July 30, 2021 14:55

rpc: add new commitment_type field to OpenChannelRequest

268990a

This field will be examined later down the stack along with the set of feature bits to determine if explicit channel commitment type negotiation is possible or not.

lnwire: extend RawFeatureVector with helper methods

459d7dc

lntest: replace commitType type with rpc alternative

082c019

lntest: use explicit channel commitment negotiation for multi-hop itests

478f640

lncli: add channel type flag to openchannel command

450f2be

docs: add explicit channel negotiation text to release notes

0c56b07

wpaulino force-pushed the explicit-chan-negotiation branch from 9d0239f to 0c56b07 Compare July 30, 2021 22:15

t-bast mentioned this pull request Aug 17, 2021

Explicit channel type in channel open ACINQ/eclair#1867

Merged

Roasbeef mentioned this pull request Aug 26, 2021

lnwire+funding: introduce new protocol extension for explicit commitment type negotiation #5669

Merged

Roasbeef closed this Aug 26, 2021

Roasbeef mentioned this pull request Sep 1, 2021

BOLT-02+09: introduce feature bit to gate new channel_type feature - 44/45 lightning/bolts#906

Merged

wpaulino deleted the explicit-chan-negotiation branch October 27, 2021 19:31

lnwire+funding: introduce new protocol extension for explicit commitment type negotiation #5373

lnwire+funding: introduce new protocol extension for explicit commitment type negotiation #5373

Conversation

wpaulino commented Jun 10, 2021 • edited Loading

Crypt-iQ left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rustyrussell commented Jun 12, 2021

rustyrussell commented Jun 21, 2021

Roasbeef left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

guggero left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Roasbeef left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Roasbeef commented Jul 29, 2021

guggero left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

guggero commented Aug 5, 2021

Crypt-iQ commented Aug 5, 2021

t-bast commented Aug 17, 2021

Roasbeef commented Aug 26, 2021

Roasbeef commented Aug 30, 2021

t-bast commented Aug 31, 2021

Roasbeef commented Sep 1, 2021 • edited Loading

Roasbeef commented Sep 1, 2021

t-bast commented Sep 1, 2021

Roasbeef commented Sep 7, 2021

wpaulino commented Jun 10, 2021 •

edited

Loading

Roasbeef commented Sep 1, 2021 •

edited

Loading