Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update messaging and serialization #163

Merged

Conversation

Tibo-lg
Copy link
Member

@Tibo-lg Tibo-lg commented Apr 23, 2021

This PR is a proposal to update the update the serialization to make it more efficient, concise and upgradable. The current message format specifies hard coded type prefixes for embedded type that don't require them, making it a maintenance burden and inefficient. It also lacks any extensibility or upgradability mechanism.

Changes include:

  • Add protocol version to offer_dlc
  • Add tlv streams to wire messages
  • Define three types of subtypes and specific serialization format for each of them (and update type definitions accordingly)

I hesitated to move the wire message definitions to Messaging.md as previously discussed but to keep the diff small I didn't.

@Tibo-lg Tibo-lg force-pushed the serialization-update-proposal branch from 7820462 to 690a335 Compare April 23, 2021 02:48
Comment on lines +87 to +90
[//]: # (TODO: `protocol_version` should be advertised by DLC nodes in the future)
The `protocol_version` value denotes the version of the protocol that the offering node is proposing.
If the receiving node does not support the requested version, it should ignore the message.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should define a advertisement message now and just use that as an initial message in current setups so we don't have to define a new offer_dlc_vx once we have this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let me know if I misunderstand you, but I feel having an advertisement message would not remove the need to have a prrotocol version field (if that's what you're proposing?). Nodes could potentially support multiple protocol version so announcing it in the offer message still seem necessary to me.

Messaging.md Outdated

1. type: 55342 (`contract_info_v0`)
1. type: 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused why we are removing these types

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before we used type tags in two cases:

  • for types that are embedded in others. In this case I couldn't see any use of the tag since if you know exactly what you want to parse a tag doesn't bring anything (happy to be corrected if I missed something).
  • for cases where there are multiple version of a field (this case). I couldn't see any reason to use type numbers that are assigned by LN for wire messages, as this is both less efficient and a waste of them AFAICT (again happy to be corrected if I'm wrong).

Messaging.md Outdated Show resolved Hide resolved
Messaging.md Outdated Show resolved Hide resolved
Messaging.md Outdated

Sibling sub-types are prefixed with a `bigsize` type identifier.
The type identifiers are specific to a set of type variants, and can thus be reused across different sub-types.
Type identifiers are defined within the type definitions, starting from `1` and increasing by `1`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we not want to support the "it's okay to be odd" design here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking was that the odd/even requirements in LN are used for extensibility, and to determine whether a TLV or message is part of the base protocol or part of a higher layer (application specific) protocol. The way sibling types are defined, there is no case where they should be ignored, so I didn't see a good reason to use the odd/even requirements. If I am missing something and you have in mind a use case for it I'm happy to change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of envisioning sibling type, we could define an even type for a field, servicing as reference and any future, alternative field as an odd one ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ariard sorry I completely missed your comments. I don't really get what you mean here though, the point of sibling type is not for extensibility or upgradeability but just for cases where within a data structures, a field can have different types.


#### Sibling sub-type

Sibling sub-types are prefixed with a `bigsize` type identifier.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no bigsize length requirement?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This follows somehow from the previous point. The length requirement is useful for parsing unknown types or things that should be ignored, but I couldn't think of how it would be useful here (again happy to change it if there is a good reason).

Copy link
Contributor

@Christewart Christewart Sep 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't you assuming everything is a static size then? An example of something that has non deterministic size is ECDSA signatures. IIRC they can very anywhere in between 68 - 72 bytes in size. If you don't explicitly encode the size, you can't parse the message if i understand this correctly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general for fields that have variable length we have length prefixes in the type definition itself, like your example of ECDSA signature

Messaging.md Outdated Show resolved Hide resolved
Messaging.md Outdated
1. type: 55332 (`oracle_announcement`)
2. data:
1. data:
* [`bigsize`: `version`]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand, are these simple sub-types then? If so I don't get the point of a simple sub-type with a version number, wouldn't it make more sense to have a sibling sub-type with just one version?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right updated

Protocol.md Outdated
Comment on lines 69 to 70
* [`protocol_version`: `bigsize`]
* [`byte`:`contract_flags`]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems a little redundant to have both of these

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally think the offer should signal the features it supports and requires which are two separate things, and then have a field in the accept with the feature set to be used, so for example you could have an offer that had a new feature enabled but not required where the accepter can choose whether or not to use it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally see these two fields as providing different information.
protocol_version indicates which language I speak, and enables the receiver to determine whether they will be able to understand the content of the messages that are exchanged. Ideally in the future nodes should advertise which version of the protocol they support so that other peers will know how to communicate with them.
contract_flags are used to signal features as you mention.
Does that make sense?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a Lightning/DLC world, I'm not sure if the protocol_version is making sense as new protocol subset can be introduced as new messages types and the question of whom to talk to is already solved with the feature bits space ?
I agree with the necessity of a contract_flags, see my other point mentioning the LN discussion on channel_type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me contract_flags ad protocol_version are two separate things. The protocol version is for breaking changes in the protocol specifications and enabling nodes to see whether they are compatible with a version or not. Contract flags is for a given protocol version whether a node supports a feature or not.
I guess we could use contract version for both but it doesn't feel optimal to me.
Happy to discuss though, I don't have that strong of an opinion about how things should be, just that it's practical to have some sort of upgradeability mechanism.

Protocol.md Outdated
@@ -66,6 +66,7 @@ the funding transaction and CETs.

1. type: 42778 (`offer_dlc_v0`)
2. data:
* [`protocol_version`: `bigsize`]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think this is backwards, should be [`bigsize`:`protocol_version`]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks fixed

@@ -81,6 +82,11 @@ the funding transaction and CETs.
* [`u64`:`feerate_per_vb`]
* [`u32`:`cet_locktime`]
* [`u32`:`refund_locktime`]
* [`offer_tlvs`: `tlvs`]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I take it none of offer_tlvs, accept_tlvs or sign_tlvs have any currently existing fields correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct

Messaging.md Outdated
#### Optional sub-type

Optional sub-types are also prefixed with `bigsize` type identifier, but the identifiers are defined in the containing structures using the `(Optional: x)` notation where `x` is a number denoting the identifier, starting at `1` and increasing by `1`.
Optional sub-types should be gathered as the last fields of a type.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the necessity for this requirement?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Especially seeing as simple sub-types can presumably have optional fields?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right removed.

@Tibo-lg Tibo-lg force-pushed the serialization-update-proposal branch from 690a335 to 34bb0e9 Compare April 28, 2021 01:24
Messaging.md Outdated
#### Optional sub-type

Optional sub-types are also prefixed with `bigsize` type identifier, but the identifiers are defined in the containing structures using the `Optional(type_name, x)` notation where `type_name` is the name of the type and `x` is a number denoting the identifier, starting at `1` and increasing by `1`.
In the absence of any optional sub-type in a structure, a zero byte `0x00` should be written to enable parsing.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this now a 0x00 for every absent optional field or for every empty group of consecutive fields or?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed replaced the type identifier with a single byte indicating absence/presence of an optional field.

@Tibo-lg Tibo-lg force-pushed the serialization-update-proposal branch from 34bb0e9 to 5bfa850 Compare April 28, 2021 06:41
Copy link
Contributor

@ariard ariard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reading this proposal, I'm leaning towards the LN-way for upgradeability/nesting.

If the message is flow is different, introduce a new message type. If the message flow is the same and we're extending message semantic, add a TLV field.

If we want nested data struct, I think we can achieve it the way onions (BOLT4) are done, versioning and new namespaces at each nesting entry ?


### Type-Length-Value

A TLV stream is defined for each wire message, and can be used to extend the protocol and for application specific features.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Browsing through the BOLTs again, tlv_stream are extensions to already defined messages, allowing to add optional fields in a backward compatible-way ?

Throughout the protocol, a TLV (Type-Length-Value) format is used to allow for the backwards-compatible addition of new fields to existing message types.

IIUC your proposal correctly it defines the content of any of our DLC messages as a TLV namespace itself ? Not sure if it's achieving it's purpose of bandwidth-saving as the type/length components of a tlv_record are redundant information for mandatory fields ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point was to have tlv_stream for the exact same purpose as LN, so maybe the way I've written things is confusing?

Messaging.md Outdated

Sibling sub-types are prefixed with a `bigsize` type identifier.
The type identifiers are specific to a set of type variants, and can thus be reused across different sub-types.
Type identifiers are defined within the type definitions, starting from `1` and increasing by `1`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of envisioning sibling type, we could define an even type for a field, servicing as reference and any future, alternative field as an odd one ?


#### Plain sub-type

Plain sub-types do not have any particular format, and their field can simply be replaced in place where they are used.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give an example of "plain sub-type" ? I'm not sure I'm following on the replacement part...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an example. I think the naming is probably not great. But basically the point is that before we had type prefixes for fields that were neither optional or sibling.


[//]: # (TODO: `protocol_version` should be advertised by DLC nodes in the future)
The `protocol_version` value denotes the version of the protocol that the offering node is proposing.
If the receiving node does not support the requested version, it should ignore the message.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should wait for the channel_type discussion to settle on the LN-side and piggy-back there, defining new DLC features as-yet-another-channel ?

See lightning/bolts#880 and this week LN meeting : https://gnusha.org/lightning-dev/2021-07-05.log

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify what you would like to do here?

@Tibo-lg Tibo-lg added the breaking The proposed change breaks backward compatibility label Sep 16, 2021
@@ -81,6 +82,11 @@ the funding transaction and CETs.
* [`u64`:`feerate_per_vb`]
* [`u32`:`cet_locktime`]
* [`u32`:`refund_locktime`]
* [`offer_tlvs`: `tlvs`]

[//]: # (TODO: `protocol_version` should be advertised by DLC nodes in the future)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there any precedent for this in other deployed cryptocurrency protocols? Could you reference some resources you used for this if so?

It seems like this isn't just a problem with our protocol, but every cryptocurrency protocol that has versioning.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sorry, this is a TODO nevermind.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having a protocol version is really common in any kind of protocol. It's for example used in the bitcoin p2p protocol: https://en.bitcoin.it/wiki/Protocol_documentation#version

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be nice to have a list of known protocol_versions. My understanding is the first one is introduced in this PRs test vectors. It would be nice to have it listed somewhere in this doc

@Tibo-lg Tibo-lg force-pushed the serialization-update-proposal branch from 5bfa850 to 0103cc5 Compare September 21, 2021 01:01
Messaging.md Outdated

Sibling sub-types are prefixed with a `bigsize` type identifier.
The type identifiers are specific to a set of type variants, and can thus be reused across different sub-types.
Type identifiers are defined within the type definitions, starting from `1` and increasing by `1`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not start at 0?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 Looks like #171 starts with 0

@Tibo-lg Tibo-lg force-pushed the serialization-update-proposal branch 2 times, most recently from d544073 to 927ce76 Compare October 6, 2021 01:43
Messaging.md Outdated
* [`u16`:`nb_nonces`]
* [`nb_nonces*x_point`:`oracle_nonces`]
* [`u32`:`event_maturity_epoch`]
* [`event_descriptor`:`event_descriptor`]
* [`string`:`event_id`]

### The `oracle_announcement` Type
### The `oracle_announcement` TLV
Copy link
Contributor

@matthewjablack matthewjablack Oct 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused about the purpose of the TLV here

Should there be a separate TLV and message format?

Currently OracleAnnouncementV0 and OracleAttestmentV0 are both TLVs (at least they're both implemented as TLVs in bitcoin-s and node-dlc)

(Also looks like TLV is not specified in #171 https://github.com/discreetlogcontracts/dlcspecs/pull/171/files#diff-b8a9c2888fe05c198758fbe393798cd0802b578f7eb07dbf82cc2c90616f97c3R402 )

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 if nothing else is changing about the announcements I don't see the reasoning of changing the type

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I confused myself. I think they should be wire messages as they can be broadcast directly (please see updated version). I guess we could define two formats but for the sake of simplicity I would personally prefer keeping a single one.

+1 if nothing else is changing about the announcements I don't see the reasoning of changing the type

I was thinking since we are going to have breaking changes on these anyway we might as well make it so that the wire message types follow each others (I left an open space for the close dlc. Though now thinking about it I wonder if close dlc should actually have an odd type number 🤔

Copy link
Contributor

@matthewjablack matthewjablack Oct 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I confused myself. I think they should be wire messages as they can be broadcast directly (please see updated version). I guess we could define two formats but for the sake of simplicity I would personally prefer keeping a single one.

+1 for having one format for sake of simplicity. Keeping it as a wire message makes sense to me

Though now thinking about it I wonder if close dlc should actually have an odd type number 🤔

What would be the rationale for this? I'm guessing because it's optional?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would be the rationale for this? I'm guessing because it's optional?

Yes exactly, that's at least the semantics specified in BOLT 1: https://github.com/lightningnetwork/lightning-rfc/blob/master/01-messaging.md#lightning-message-format

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that in the latest version of #163 OracleAnnouncement, OracleEvent, EnumEventDescriptor and DigitDecompositionEventDescriptor are all wire messages

This makes sense for OracleAnnouncement, since it's likely announcements and attestments would be communicated over the transport layer. However, shouldn't the wire messages inside OracleAnnouncement actually be a TLVs since it's unlikely for them to be communicated over the transport layer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason was that it was decided not to touch any oracle related message in this PR so that no update to oracle software is required for now, and then make all oracle related changes as part of #167 at once.

@Tibo-lg Tibo-lg force-pushed the serialization-update-proposal branch 3 times, most recently from 271be3f to d6a25a6 Compare October 6, 2021 05:29
@Tibo-lg Tibo-lg force-pushed the serialization-update-proposal branch from d6a25a6 to f413e79 Compare October 14, 2021 05:08
@Tibo-lg
Copy link
Member Author

Tibo-lg commented Oct 14, 2021

Fixed the sibling type numbers which where not consistent with the latest update.

@Tibo-lg
Copy link
Member Author

Tibo-lg commented Oct 14, 2021

Noticed that we were sometimes using BigSize and sometimes u16 for collection prefixes. Consolidated it to use BigSize everywhere as for the CetSignatures type it is probably safer than having a restriction on size.

@Tibo-lg Tibo-lg force-pushed the serialization-update-proposal branch from 588190c to a291f99 Compare October 15, 2021 12:53
Protocol.md Outdated
@@ -66,6 +66,7 @@ the funding transaction and CETs.

1. type: 42778 (`offer_dlc_v0`)
2. data:
* [`bigsize`: `protocol_version`]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe u16:num_funding_inputs is a bigsize in the test vectors (specifically enum_single_oracle_test.json)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks I had forgotten to update it, fixed.

Messaging.md Outdated

1. type: 42772 (`funding_input_v0`)
2. data:
1. data:
* [`u64`:`input_serial_id`]
* [`u16`:`prevtx_len`]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a bigsize now, at least according to the enum_single_oracle_test.json

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes thanks fixed.

@@ -185,7 +191,8 @@ and closing transactions.
* [`u64`:`change_serial_id`]
* [`cet_adaptor_signatures`:`cet_adaptor_signatures`]
* [`signature`:`refund_signature`]
* [`negotiation_fields`:`negotiation_fields`]
* [`negotiation_fields`:`negotiation_fields`] (Optional: 1)
* [`accept_tlvs`: `tlvs`]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry not sure which field you're referring to in this comment?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

num_funding_inputs

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@Tibo-lg Tibo-lg force-pushed the serialization-update-proposal branch 4 times, most recently from ea650db to 2147ee2 Compare November 2, 2021 02:00
@Tibo-lg
Copy link
Member Author

Tibo-lg commented Nov 2, 2021

  • Switched protocol version from BigSize to u32 as I find it cleaner.

  • Fixed the missing protocol version in the test vectors.

@Tibo-lg Tibo-lg force-pushed the serialization-update-proposal branch from 319dd78 to 4ee603b Compare April 18, 2022 04:41
PayoutCurve.md Outdated
@@ -69,7 +69,7 @@ In this section we detail the TLV serialization for a general `payout_function`.

1. type: 42790 (`payout_function_v0`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs to be updated as I don't believe its a TLV in the #163 test vectors?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Fixed everything in that file, as well as the RoundingIntervals that was also not updated.

@Tibo-lg Tibo-lg force-pushed the serialization-update-proposal branch from 5221dc7 to 8517c72 Compare April 26, 2022 23:38
@@ -66,8 +66,7 @@ up in the case of a tie.

#### Rounding Interval Serialization

1. type: 42788 (`rounding_intervals_v0`)
2. data:
1. data:
* [`u16`:`num_rounding_intervals`]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Tibo-lg Is this consistent with your implementation?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 70032e3

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is another bug here, I believe this is prefixed by a bigsize in your implementation @Tibo-lg rather than a u16.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes in that case I should have updated the spec to be consistent with other collection prefixes. Updated now.

@Tibo-lg Tibo-lg force-pushed the serialization-update-proposal branch from 9518a5a to 463dc59 Compare May 3, 2022 11:28
@@ -181,17 +194,18 @@ and closing transactions.
* [`point`:`funding_pubkey`]
* [`spk`:`payout_spk`]
* [`u64`:`payout_serial_id`]
* [`u16`:`num_funding_inputs`]
* [`bigsize`:`num_funding_inputs`]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to make this change we need to either

  1. Add a version to the top of the accept_dlc_tlv_v0
  2. revert this back to u16

The reason this is needed is to preserve backwards compatability with the previous version of the spec. This is how i've implemented the backwards compatible parse currently

    //parser for `offer_tlv`
    val fundingInputs = {
      if (protocolVersionOpt.isEmpty) {
        //means we are working with the old version of the spec
        iter.takeU16PrefixedList(() => iter.take(FundingInputV0TLV))
      } else {
        //means we are using version 1 of the spec
        iter.takeBigSizePrefixedList(() => iter.take(FundingInputV0TLV))
      }
    }

parser for the accept_tlv that is an attempt at backwards compatibility

      //accept_tlv
      //
      //we don't have a version # here so we don't know what version we are trying to parse?
      //I suspect this code in the try block will not always cause an exception
      //so this is truely a guess.
      try {
        iter.takeU16PrefixedList(() => iter.take(FundingInputV0TLV))
      } catch {
        case scala.util.control.NonFatal(_) =>
          iter.takeBigSizePrefixedList(() => iter.take(FundingInputV0TLV))
      }

The crux of the issue is that if funding inputs are prefixed with a u16 as required in the old version of the spec, a bigsize can be mistake as a u16. bigsize is required by the new version of the spec, which gives us ambiguity in the parser.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind adding a version at the top of the message, but can't you just keep the version from the offer message?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That doesn't work for a few reasons

  1. Its a layer violation to require a database read just to parse a tlv correctly.
  2. You cannot guarantee you will parse the accept_tlv since you don't know which serialization format it uses to obtain information to make the database read. I guess you might be able to parse the first 32 bytes (tempContractId) to parse just the 32 bytes, make the database read, and then determine which format the offer had, but this is very cumbersome.

Lets keep the messages -- at the binary serialization level -- context free so that you can parse the offer, accept, and sign tlvs independent of one another.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a protocol_version field to the accept message and updated the test vectors. I wonder if it wouldn't be better to also add one to the sign message, what do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it wouldn't be better to also add one to the sign message, what do you think?

I think its probably a good idea. AFAIK, the sign_tlv does not have the same sort of ambiguity (funding_inputs) that offer,accept have so it isn't strictly necessary now, but it seems prudent to future proof a bit.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems prudent to future proof a bit.

Yeah I feel like that too. Updated docs and test vectors accordingly.

@Tibo-lg Tibo-lg force-pushed the serialization-update-proposal branch from 85680d5 to b4264fc Compare May 31, 2022 00:08
@Tibo-lg Tibo-lg mentioned this pull request May 31, 2022
@Christewart
Copy link
Contributor

So this PR on bitcoin-s passes all test vectors in this PR:

I believe we have achieved the 2 impl requirement, which means we may want to merge this? 👀 👀

bitcoin-s/bitcoin-s#3859

@Christewart
Copy link
Contributor

As discussed at the June meeting, we have 2 implementations now so merging this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking The proposed change breaks backward compatibility needs 2 impl
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants