-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
lnwire+funding: introduce new protocol extension for explicit commitment type negotiation #5373
lnwire+funding: introduce new protocol extension for explicit commitment type negotiation #5373
Conversation
9e0aac6
to
6ee6f48
Compare
6ee6f48
to
a1d49ad
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First-pass review through - still wrapping my head around it, but I think this patch takes a good direction, and paves the way for future improvements. Will review again after comments addressed!
// shutdown script. | ||
if r.Intn(2) == 0 { | ||
// TLV type 1 of length 2. | ||
req.ExtraData = []byte{1, 2, 0xff, 0xff} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test has changed a bit so that ReadMessage
will no longer encounter ExtraData
. This is fine for this test, but to preserve the coverage we could have testing/quick of ReadMessage
by itself instead of round-trip serialization + deserialization. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed to test to also include ExtraData
, though I think the implementation may be wrong. IIUC, when writing out ExtraData
, we should join whatever unknown records are there with the known records and create a new stream including both?
lnrpc/rpc.proto
Outdated
/* | ||
Returned when the commitment type isn't known or unavailable. | ||
*/ | ||
UNKNOWN_COMMITMENT_TYPE = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know too much about how end-users use lnrpc
, but is this a breaking API change? I would think that end-users would compare responses to the lnrpc
types rather than their numerical values
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, which is why I went with this approach. Also, before this change, the CommitmentType
message is only ever included in responses and not requests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a breaking change from the PoV of a client that has an older set of protos, and uses the existing enums/constants to make an assertion against these new values. However it was only used in listchannels
and pendingchannels
.
@wpaulino so the rationale w.r.t making the zero type "unknown" is that we can eventually enforce that users always set a type explicitly? How will a "default" value be added? We could instead go with a "this value is the default" approach similar to what we went w/ in Pool w.r.t selecting node tier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wpaulino so the rationale w.r.t making the zero type "unknown" is that we can eventually enforce that users always set a type explicitly? How will a "default" value be added? We could instead go with a "this value is the default" approach similar to what we went w/ in Pool w.r.t selecting node tier.
We want to have an "unknown" variant as the zero value so that we can easily detect when a user hasn't provided a channel type explicitly. This also allows us to set whatever default we deem reasonable from the RPC server without having to make proto changes.
We can achieve this in two ways:
- Rework the existing
CommitmentType
proto to have this unknown variant, and be comfortable with breaking it. - Add a new
CommitmentType
proto alternative and deprecate the old one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha, I think we're using unknown/default in the same manner, need to circle back on the diff, but I think the current approach will work, assuming we map "unknown" to w/e out default is at the time.
May I suggest you look at the channel types defined in lightning/bolts#868. And maybe propose a spec change to use them in open/accept? |
I've pulled the required changes just for open/accept into their own PR now: lightning/bolts#880 |
a1d49ad
to
d139ae8
Compare
93adef9
to
5563f72
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to finally move forward with this! Main blocker I see here is resolving the set up of the enums on the RPC level to allow us to default to the "safest" channel type when the user opts to not explicitly select their channel type.
lnrpc/rpc.proto
Outdated
/* | ||
Returned when the commitment type isn't known or unavailable. | ||
*/ | ||
UNKNOWN_COMMITMENT_TYPE = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a breaking change from the PoV of a client that has an older set of protos, and uses the existing enums/constants to make an assertion against these new values. However it was only used in listchannels
and pendingchannels
.
@wpaulino so the rationale w.r.t making the zero type "unknown" is that we can eventually enforce that users always set a type explicitly? How will a "default" value be added? We could instead go with a "this value is the default" approach similar to what we went w/ in Pool w.r.t selecting node tier.
|
||
// No features, use legacy commitment type. | ||
channelFeatures := lnwire.RawFeatureVector(channelType) | ||
if channelFeatures.OnlyContains() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't this always return false? No bits passed in as arg, so the length check fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think maybe we should go with a different name for this method also, not that clear IMO what it does at a glance w/o needing refer back to the implementation details.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we should put certain behavior behind a build flag also? Can't think of a scenario outside of the itests that we actually want to allow using the legacy channel type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't this always return false? No bits passed in as arg, so the length check fails.
The length check wouldn't fail since no bits are provided as arguments and a legacy channel has no features.
I think maybe we should go with a different name for this method also, not that clear IMO what it does at a glance w/o needing refer back to the implementation details.
I agree, renamed to IsEmpty
.
Perhaps we should put certain behavior behind a build flag also? Can't think of a scenario outside of the itests that we actually want to allow using the legacy channel type.
Sounds good to me. Without explicit negotiation though, we already allow legacy channels to be opened if we downgrade the static_remote_key
feature from required to optional to a peer we have an already opened legacy channel with. It seems this build flag would also need to handle that?
5563f72
to
4daf82e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did a drive-by review as a preparation to reviewing #5549.
lnwire/accept_channel.go
Outdated
if len(tlvRecords) == 0 { | ||
return nil, tlvRecords, nil | ||
// Set the corresponding TLV types if they were included in the stream. | ||
if _, ok := typeMap[ChannelTypeRecordType]; ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the godoc of the tlv.TypeMap
we should also check that the byte slice is nil:
If the byte slice is nil, the type was successfully parsed. Otherwise the value is byte slice containing the encoded data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. Seems we should fix these as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, would probably make sense to look into those cases, yes. Doesn't have to be this PR though IMO.
4daf82e
to
9d0239f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super close now! IMO the only lingering thing is how to handle the case where a user doesn't specify an explicit channel type (as most won't, since this field doesn't exist rn). As is, the PR falls back to implicit negotiation, but IMO it should attempt to propose our current preferred channel type. In the end, both routes seem to achieve the same thing other than when multiple feature bits are defined that each have a "default" level, but of which, one of them is more preferred from our PoV.
var channelType *lnwire.ChannelType | ||
switch in.CommitmentType { | ||
case lnrpc.CommitmentType_UNKNOWN_COMMITMENT_TYPE: | ||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this instead set the current "default" type, so we handle the case earlier vs needing to reoncile w/ a nil/unknown chan type by the time we get all the way into the funding manager.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current negotiation flow will fail if we try explicit negotiation and the remote peer doesn't advertise the feature bit, so setting a default here would prevent opening channels to any node that doesn't. If we allow the negotiation to fall back to implicit, then we'd need a distinct parameter to determine whether the explicit type being negotiated is a default set by us or set by the RPC user. In the latter case, we wouldn't want to fall back to implicit as that would go against what the user requested.
) | ||
} | ||
|
||
return implicitNegotiateCommitmentType(local, remote), nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re the existing comment, if a user doesn't set this new value when opening a channel, then we'll default to the old implicit negotiation. IMO instead, we want to map the unknown value (so them not setting a value at all), to our current "default" channel type, and still use the explicit chan negotiation.
One other "nice to have" here, would be adding an arg on the CLI to allow specifying a custom channel type (we wouldn't ask them to specify feature bits ofc, just the enum type). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work, LGTM 🎉
Just one nit and the PR needs an entry in the release notes.
// open/maintain types of several channels over its lifetime. | ||
// | ||
// TODO: Decide on actual feature bit value. | ||
ExplicitChannelTypeRequired = 2020 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this TODO meant to be resolved before merging the PR? Who can/should decide?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It needs to be agreed upon within the spec PR. I think @Roasbeef was fine with moving forward with this PR and updating the feature bit value once decided later on.
In this commit, we add a new TLV record that's intended to be used as an explicit channel commitment type for a new form of funding negotiation, and later on a dynamic commitment upgrade protocol. As defined, we have 3 channel types: base (the OG), tweakless, and anchors w/ zero fee HTLCs. We omit the original variant of anchors as it was never truly deployed from the PoV of lnd.
In this commit, we add a new ChannelType field as a new TLV record to the OpenChannel message. During this change, we make a few tweaks to the generic TLV encode/decode methods for the ExtraOpaqueData struct to have it work on the level of tlv.RecordProducer instead of tlv.Record, as this reduces line noise a bit. We also partially undo existing logic that would attempt to "prepend" any new TLV records to the end of the ExtraOpaqueData if one was already present within the struct. This is based on the assumption that if we've read a message from disk to order to re-send/transmit it, then the ExtraOpaqueData is fully populated so we'll write that as is. Otherwise, a message is being encoded for the first time, and we expect all fields that are known TLV fields to be specified within the struct itself. This change required the unit tests to be modified slightly, as we'll always encode a fresh set of TLV records if none was already specified within the struct.
If these bits are present, then both sides can examine the new CommitmentType TLV field that's present and use this in place of the existing implicit commiment type negotiation. With this change, it's now possible to actually deprecate old unsupported commitment types properly.
This field will be examined later down the stack along with the set of feature bits to determine if explicit channel commitment type negotiation is possible or not.
This commit adds the ability for a channel initiator/responder to determine whether the channel to be opened can use a specific commitment type through explicit negotiation. It also includes the existing implicit negotiation logic to fall back on if explicit negotiation is not supported.
In this commit, we modify the existing logic that defaults to implicit commitment type negotiation to support explicit negotiation if the new feature bit is set. This change allows us to ditch the notion of a "default" commitment type, as we'll now use feature bits to signal our understanding of a commiment type, but allow peers to select which commitment type they actually wish to use. In addition, this explicit negotiation removes the need for using the required bit of any commitment types. Instead, if an implementation wishes to no longer support a commitment type, they should simply stop advertising the optional bit.
9d0239f
to
0c56b07
Compare
I only took a superficial look at the PR, @Crypt-iQ I think it would be good if you could give this a full review still. Thanks! |
@guggero will review, have been a bit busy. Should one of us take over the pull now though? |
I've tested this branch against ACINQ/eclair#1867 and everything seems to run smoothly! Here is what I tested:
Here are some scenarios I would have liked to test but couldn't test:
I don't think any of these missing tests is blocking though, and all the mechanisms described in lightning/bolts#880 seem to work! |
Replaced by #5669 |
@t-bast missed this the first time around. If a feature isn't defined up front, then it isn't possible for a node to handle the case where one side sends the channel type, and the other doesn't know about it. Worst case, this causes an invalid commitment signature to be exchanged w/o resulting in a breakdown of channel negotiation. Instead, IMO a feature bit should be used here to explicitly guard this new logic, and not have to handle weird edge cases where one side is trying to pick a channel type, and the other is falling back to "default" behavior. |
See rusty's comment here.
I don't see why? The state machine is very clearly defined to be backwards-compatible, it will never lead to that (unless incorrectly implemented). What guards the behavior is the presence of the Similarly to the quick close feature, it stays simple mostly because there is a message ordering: the funder always starts. This is expected behavior for the funder (since they wanted a specific |
Totally disagree. Either way you need to be able to handle the situation where the remote party doesn't understand what you're trying to do. A feature bit lets you just not do that behavior. Without it, the possibility still exists, but now you need to be able to recover/handle it after they've already done the action.
Yes, that's the case when the other party even knows the new TLV exists. What if they don't? Wouldn't it be nice for them to signal they actually understand it in the first place? I mean, it's called a feature bit for a reason ;) I think this whole "actually we don't need feature bits" direction is ill advised, and seems to be motivated by shaving 3 lines of code (likely just an extra conditional but we're srsly couting straws here), when you'd need the extra logic to handle the entire state machine space anyway. I think the fundamental difference of opinion here is that:
The former here is 100% explicit and allows us to more precisely specific behavior, and imo actually saves on code since you don't need to be able to handle the scenario where things just bail out. Once again, there's literally zero cost here, only upside: things are fully defined and it's easy to gate new behavior (a config/build flag starts to set the feature bit). I don't think |
Above everything else, the existence of the feature bit means that nodes can seek out other nodes that actually are able to select which channel type they want to use. I find it confusing how this isn't seen as a valuable protocol feature... |
That's the fundamental disagreement here then. I simply don't see how those claims can be true, unless you decide to leave a lot of undefined behavior in your code (which I'm pretty sure you don't). Even with a feature bit, you need to handle the cases where your peer misbehaves and you need to bail out when they do.
That's the only argument about this that I agree with you on. If you want to be able to actively do this seek before everyone supports this on the network, then a feature bit makes sense. |
Correct, independent of anything we try to specify, peers can just do the wrong thing accidentally or purposefully.
Happy we agree on this! So from here, do you agree it's essentially zero cost on our end to add the bit, which'll allow the network to evolve more freely as peers are able to pre-filter those that support their desired feature set? The alternative is an interactive dance one needs to do with each peer in order to be able to suss out their core feature set. I don't think it's wise to always assume that each new thing added with instantly be adopted by all nodes. Even if it were, I don't think that argues against the feature bit system, as there're still nodes that don't quite stay on the latest and greatest or they maintain their own fork for their company/service so they lang behind somewhat. At this point, I don't think we'll ever again have 100% protocol feature parity across all the majro implementations like we once did before (people have diff resources, priorities, roadmaps, etc, etc). That's fine though, we should embrace that by ensuring we place the proper feature bits where applicable so those that do have an overlapping feature set can find each other at the peer discovery phase. |
In this PR, we put forth an implementation of the explicit channel commitment type negotiation feature. Explicit commitment type negotiation prepares us for a future where no "default" channel type actually exists. Compared to implicit commitment type negotiation, rather than using a set of overlapping feature bits to determine which channel type is being used during funding negotiation, we instead have both sides signal what their desired channel type is to be. This prepares us for a future of dynamic commitments as the same commitment type reference will be used to change the commitment type in-flight. Additionally, moving to an explicit commitment type negotiation gives us a better way to remove unsupported commitment types, as we'd simply stop advertising the optional feature bit and reject all attempts to use that commitment type.
The set of multi-hop integration tests have been modified to open the third hop using the different supported commitment types through this new protocol extension.
Notes for reviewers:
CommitmentType
field for theAcceptChannel
message. I decided to remove it from this PR as I deemed it not necessary, though maybe I'm missing some context on its use.Replaces #5071.