Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BOLT01: TLV proposal #607

Merged
merged 1 commit into from
Jul 8, 2019
Merged

BOLT01: TLV proposal #607

merged 1 commit into from
Jul 8, 2019

Conversation

cfromknecht
Copy link
Collaborator

@cfromknecht cfromknecht commented May 9, 2019

This proposal outlines a TLV format intended for wire messages. It contains additional considerations for gossip messages that would require maintaining the ability to verify signatures over unknown fields, and the ability to propagate those messages to peers.

This proposal makes different tradeoffs from #603 in terms of:

  • requiring strict canonical encoding
  • fixed-size length encoding
  • supporting reserialization of unknown records
  • using 255 as the sentinel type identifier
  • removing even/odd requirement from the parsing logic, favoring negotiation via feature bits or application of such constraints external to the fundamental parsing requirements

After significant feedback and discussion, the proposal now entails:

  • type and length both use the bitcoin varint format, minimally-encoded
  • no repeated fields
  • canonical ordering is defined by sorting type only
  • readers must enforce canonical order
  • includes even/odd rule (fail when reading an unknown even type)
  • no sentinel value

The proposal is meant to encompass a single TLV format for use in wire and onion payloads.

Reference implementation: lightningnetwork/lnd#3061

@Roasbeef Roasbeef added the Meeting Discussion Raise at next meeting label May 13, 2019
01-messaging.md Outdated Show resolved Hide resolved
@t-bast
Copy link
Collaborator

t-bast commented May 15, 2019

Thanks for this proposal @cfromknecht! It helps clarify some choices made in #603 by showing alternative choices.

After reading both proposals, I still think that we should decide on a single TLV encoding, enforced everywhere (wire, onion, etc), that combines your proposal with rusty's. Having distinct encodings is error-prone and confusing and I don't think the gains justify it (but it might be because some gains aren't obvious to me, feel free to detail them more if you disagree).

I'm in favor of keeping a varint for type and length: it can be more compact, it's more flexible and the performance impact is negligible (compared to the cryptographic operations that are done with every message).

I'm in favor of strict canonical encoding as you propose, but allowing multiple occurrences of the same type in a TLV stream.

I don't think the sentinel value is needed at all (and I think the result of the IRC discussion was a consensus on that, is that correct?).

I think it's still useful to keep the even/odd distinction to avoid feature flags bloat (I'm afraid we would need an explosion of feature flags which becomes more complex to track than the simple even/odd rule).

I agree with you that the optimization for integer values is a layering violation and I prefer your suggestion to let it be defined by the TLV type itself.

@cfromknecht
Copy link
Collaborator Author

cfromknecht commented May 16, 2019

Thanks for the feedback @t-bast! I sincerely apologize in advance for the brevity of this reply, but it felt like a good time to do a brain dump from the last 6 months or so of research and explain the rationale behind certain aspects of the proposal that maybe haven't been discussed as much.

I don't think the sentinel value is needed at all (and I think the result of the IRC discussion was a consensus on that, is that correct?).

Yes that was the consensus, thanks to @cdecker @niftynei for helpful discussion! it's been removed in the fixups, along with the portions about retaining unknown values (which they convinced me is more of an implementation detail).

I'm in favor of keeping a varint for type and length: it can be more compact, it's more flexible and the performance impact is negligible (compared to the cryptographic operations that are done with every message).

I'm not married to the fixed-size encoding for the length, using a varint was actually how this proposal was drafted a few months back :) It is nice that in the fixed size case, this limit is enforced directly rather than needing to enforce the constraint as an additional check on the parsed value, since we should only need to handle up to two-byte values for wire messages.

I'm much more skeptical of making type a varint, with a worst-case size of 9 bytes and no other considerations like max message size to constrain the possible values. I'm fairly convinced that 64-bit namespace is overkill. If 8-bits really isn't enough, we can always create extensions to the namespace by nesting another tlv_stream in a select tlv_record. I really don't think (or hope) we'll need to, but it does exist as a safety valve.

performance impact is negligible (compared to the cryptographic operations that are done with every message).

Serialization under the wire format doesn't necessarily mean that the message is going to be encrypted, messages can be written to databases, over rpc connections, into message digests, etc. I agree tho that the overhead in moving from 2-byte length to varint (with realistic max of 65KB) is not so bad and can save space, however I don't think the same argument holds for type.

Consider a tlv_stream with types 0-255 under both proposals. The 8-bit identifiers can be held in a few cache lines, while the 64-bit identifiers require half of a standard page size, even though the size of the encodings are almost identical. An 8x memory overhead from types isn't negligible IMO, especially considering the vast majority of it would be zeros and is never even serialized in the message contents. Additionally, unlike lengths which can be dynamically computed at the time of encoding, types I suspect are more likely to be long-lived in order to retain the mapping to their respective fields.

Keep in mind we want to remain friendly to low-power devices with more constrained resources ;)

I'm in favor of strict canonical encoding as you propose, but allowing multiple occurrences of the same type in a TLV stream.

I'm with you on the strict canonical encoding!

There are a multitude of reasons however why I'm strongly opposed to allowing duplicate fields, mostly performance related (i swear, this is the shortened list):

  • Introduces unnecessary overhead:
    • The signaling overhead of each redundant field is at least 2 or 3 additional bytes. It's more efficient to have a single field and concatenate their encodings, e.g. num_items || encode(item0) || encode(item1) || ...
  • Unclear semantics of duplicate fields:
    • If duplicate fields are meant to be appended:
      • Then all TLV fields are inherently vectors, and one can't define just a uint32, it must be []uint32
        • This can be somewhat skirted if a particular type could specify that it can't be repeated, but the receiver would be unable to determine whether an unknown field is allowed to have one or multiple values
      • Since the number of duplicate records isn't known when reading the stream sequentially, this adds additional overhead of double and copy to facilitate expansion of the underlying array (or doing multiple passes i suppose, eek). Placing everything in a single record allows the size of vectors to be accurately reserved up front
    • If duplicate fields are meant to replace the prior field:
      • then why allow duplicates?
  • Increases complexity of producing canonical ordering:
    • If I'm not mistaken, the only logical sorting the receiver can apply to values is lexicographic sorting, since it has no access to the underlying type of unknown fields, so it must compare raw bytes
      • This implies that the writer must encode all of the fields in memory before being able to apply the sort, which results in higher memory requirements and would prevent TLV streams from being written directly to the wire without this extra preprocessing phase
      • As an example, sorting the unit32 0x0100ffff and the compressed uint32 0x01ffff result in different orderings when applying lexicographic vs integer sorting. If the receiver can only apply lexicographic sorting, the values must be encoded by the writer before trying to sort or else the stream may not be canonical
    • The canonical order can't always be determined at compile-time, and instead happens at the time of encoding. If the values change in content or length, this sort must be applied upon every encoding. From my benchmarks, sorting on every encoding yields a significant performance penalty, even when only sorting on 8-bit types, let alone a more complex comparison operator. The final version of the PoC moved to having the ordering supplied at compile time and vastly improved performance
  • Requires more state in verifying canonical ordering:
    • Instead of keeping only the type from the prior field, the receiver will need keep the length and value, which increases the state from constant to potentially the length maximum possible message size. When only types are compared, the value bytes can be discarded as they are processed (in certain cases, possibly implementation dependent?)

Compressing all repeated values into a single, unique record solves all of above the issues, with no real downsides AFAICT. Either the field is defined as a vector or it isn't, but those details remain in type's specified encoding rather than leaking into the parsing requirements. I think unique records are our friend here, and greatly reduces the code complexity while maintaining comparable performance to static serialization.

I think it's still useful to keep the even/odd distinction to avoid feature flags bloat (I'm afraid we would need an explosion of feature flags which becomes more complex to track than the simple even/odd rule).

On the wire, the sender can't just arbitrarily make fields required without knowing if the receiver supports it, doing so will result in a disconnect (since it'd be equivalent to failing to the parse the message). The receiver needs to opt-in to such fields before the sender can safely send them; we already have a mechanism for signaling this, what is the purpose in having the redundancy?

For example, all new fields added to gossip messages I think need to be optional, otherwise they won't propagate through older nodes. Having the even/odd rule there, or any wire message really, would only serve to halve the range of valid type identifiers. This is more critical if we solidify on a 8-bit type of course.

Onion TLV is different, since we can't rely on connection-level feature bit negotiation, whether we have the node's latest announcement, or if the node is still running the same software/configuration as it was when the invoice or announcement was created. At the same time, failure to parse a required onion TLV record only results in a failed HTLC, and not connection-level instability. I support having even/odd rule in onion, but I don't think the rule needs to be inherited by all TLV namespaces. IMO it should be applied as an additional constraint in the onion parsing. In my mind, the distinction is purely:

    // Standard TLV parsing for virtually everything
    var parser tlv.Parser
    if err := parser.Decode(r); err != nil {
        return err
    }

    // For onion TLV, apply even/odd rule as additional constraint
    if parser.FoundUnknownRequiredFields() {
        return tlv.ErrUnknownRequiredFields
    }

I think this gets us towards a more generic, unified TLV implementation that allows certain applications to apply restrictions as necessary, rather than restricting all other use cases for a single known instance where the even/odd rule is indeed very helpful.

After reading both proposals, I still think that we should decide on a single TLV encoding, enforced everywhere (wire, onion, etc), that combines your proposal with rusty's. Having distinct encodings is error-prone and confusing and I don't think the gains justify it (but it might be because some gains aren't obvious to me, feel free to detail them more if you disagree).

Yes, I concur that it would be nice to agree on one unified approach, but want to be clear that I don't necessarily want distinct formats. Though if the tradeoffs can't be reconciled entirely, I still think it's possible to agree on a set of fundamental requirements that permit a high degree of code reuse between the two.

TL;DR

With all that in mind and attempting to incorporate the pertinent feedback, i've updated the proposal to:

  • single byte type
  • varint length
  • no sentinels
  • no weird record retaining
  • strict canonical encoding by sender and receiver
  • unique records
  • no encoding-dependent requirements in the parser
  • leave enforcement of even/odd (or any additional constraints) at a higher-layer as needed

I thinkkk this captures the majority of the feedback that's been given over the past few weeks. I would then propose that we define the onion tlv format solely by adding a check for unknown required fields to the wire tlv requirements.

Let me know if I've blatantly missed/overlooked anything, further feedback is greatly appreciated! Hopefully that also further elaborates on some distinctions I see in wire vs onion tlv, or perhaps shows my ignorance as to why they should be treated the same. To anyone who read the whole thing, I owe you a beer :)

Aside: is it worth having a separate BOLT just for the TLV spec? I'm starting to think it might be helpful to have a place where we define useful encodings, e.g. compressed integers, so we can reference a single location for common tlv_record encodings in other sections of the spec.

@t-bast
Copy link
Collaborator

t-bast commented May 16, 2019

That's great, thanks a lot for this detailed post, it's very helpful (especially for newcomers such as myself). I'll hold you to that beer when we meet IRL ;)

Your argument for removing duplicate occurrences of a TLV type makes a lot of sense. As you suggest, it's a lot cleaner (and simplifies a lot the implementation) to let the type itself define whether it's an array or not. I'm fully on board with that (and the strict canonical encoding).

I've thought more about the feature flags vs even/odd rule, and reading your comment I now agree with you that the wire encoding doesn't need the even/odd rule. We will always need feature flags to know that the other peer understand our required TLV types before sending a message. And once we have such feature flags, we should consider all the other TLV types in the message as optional. It's especially obvious when we think about gossip re-transmission (staggered broadcast): if Bob re-transmits a message from Alice to his peers, for example to Carol, Carol should ignore every TLV type she doesn't understand (otherwise Alice could indirectly force a connection-close between Bob and Carol which is undesirable). We don't want Bob to have to compare each message's inner TLV types to each of his peers' feature flags to know if it's safe or not to re-transmit the message.

I would even argue that onion TLV doesn't need the even/odd rule either. Since we want to maximize payment path success, we will use channel feature flags to decide whether to include a peer in a payment path or not (because it's risky and costly to blindly include a node that would fail to parse a TLV type and thus fail the HTLC). So we know that this node understands all our required TLV types, and we can safely treat all other TLV types as optional (for example optional routing hints for next trampoline hop).

One nice benefit of removing the even/odd rule from onion TLV is that there is now no differences between onion and wire TLV (yay single encoding!).

Do note though that I think that the even/odd rule is useful for top-level messages (as defined in Bolt #1). But for TLV types I think the even/odd rule is unnecessary because all unknown TLV types should be safely ignored (and never fail a connection).

So we should be in agreement with using feature flags instead of the even/odd rule for TLV. However I'm still not convinced about type not being a varint. I think a varint is more flexible and simple to use than nesting TLVs. And until you need more than 1 byte's worth of TLV types, it's still a single byte so it doesn't have any impact. And once you need more types, using a varint has a smaller footprint than nesting another TLV...I agree it might make the code slightly more complicated but I think it's worth it because it feel cleaner and easier to reason about than nesting TLVs. But this is definitely not a blocking point on my end, we can go with a 1-byte type if more people lean towards that.

I don't have a strong opinion on putting the TLV proposal in its own BOLT, I think it's a good idea but I'm most certainly not the most suited person to judge that.

I feel that we're making a lot of progress and we're getting close :)
I'd love to have a thorough review of our discussion by @rustyrussell though: he initiated the TLV discussion, has a lot of experience with this kind of stuff and can potentially point out things we might have overlooked that would justify the use of the even/odd rule.

Phew that ended up being a long comment too, I'll have to buy the second round of beers to our readers!

Copy link
Collaborator

@niftynei niftynei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few nits on wording. Otherwise looks good.

Here's a quick list of the differences between this and #603:

01-messaging.md Outdated Show resolved Hide resolved
01-messaging.md Outdated Show resolved Hide resolved
@cfromknecht
Copy link
Collaborator Author

cfromknecht commented May 16, 2019

Thank you @niftynei for the review and @t-bast for further input 🎉

I've thought more about the feature flags vs even/odd rule, and reading your comment I now agree with you that the wire encoding doesn't need the even/odd rule.

Indeed, though I should note that we don't need a feature bit if the field is truly optional, it only needs to be defined if we wish to make it required. My hope is that this wouldn't create feature bits like i_undestand_type_34_on_query_short_channel_id, but rather be bundled into a more logical "feature", e.g. extended_gossip_queries.

Since they are optional, the sender is free to write it, but everyone can save on bandwidth if the field is serialized conditionally on the peer's support for that feature.

I would even argue that onion TLV doesn't need the even/odd rule either.

Yeah I'm not totally certain in either direction, I suppose one could even make a case for all onion TLV fields to be required. But deferring additional constraints to a higher-level means we can retroactively add these constraints where necessary. Perhaps those will be identified as we move forward with defining the MPP onion types.

Do note though that I think that the even/odd rule is useful for top-level messages (as defined in Bolt #1). But for TLV types I think the even/odd rule is unnecessary because all unknown TLV types should be safely ignored (and never fail a connection).

Which ones exactly? Since init is transmitted first, couldn't those still be gated by feature bits?

However I'm still not convinced about type not being a varint.

A wise professor once told me that if you're designing a protocol and think a certain limit is enough, double it—in reference to IPv4 address space lol.

I left this out of the proposal, but my original idea was strip out the 4 and 8 bit encodings from the CompactType and use a varint for length, so:

 - if x < 0xff:
   - write x
 - else write 0xff || little-endian(x)

If we use this for both type and length, I think it bounds both quite nicely. I'm much more amenable to varints that don't require 64-bit types.

The implementation obviously is pretty trivial, just its mostly a copy-pasta of CompactSize. This gives us equivalent performance to a single byte type for values < 255, but also some breathing room if 256 types isn't enough. Thoughts?

I feel that we're making a lot of progress and we're getting close :)

Sure feels like it!

I'd love to have a thorough review of our discussion by @rustyrussell though: he initiated the TLV discussion, has a lot of experience with this kind of stuff and can potentially point out things we might have overlooked that would justify the use of the even/odd rule.

Same here, as I don't think we can really move forward conclusively without his input on the tradeoffs :)

I'll hold you to that beer when we meet IRL ;)

Phew that ended up being a long comment too, I'll have to buy the second round of beers to our readers!

Looking forward to it! I suppose the length of this message warrants a third round 🍻

@cfromknecht
Copy link
Collaborator Author

cfromknecht commented May 16, 2019

One other comment regarding using a varint type: encountering an EOF while parsing the type doesn't necessarily mean the stream is well-formed like it would with 1-byte types. We’d need to modify the language to consider an EOF well-formed only if zero bytes were read.

@t-bast
Copy link
Collaborator

t-bast commented May 17, 2019

Indeed, though I should note that we don't need a feature bit if the field is truly optional, it only needs to be defined if we wish to make it required. My hope is that this wouldn't create feature bits like i_undestand_type_34_on_query_short_channel_id, but rather be bundled into a more logical "feature", e.g. extended_gossip_queries.

I completely agree: a feature flag should indicate support for a bunch of required TLV types (in a given context) - for example option_trampoline_onion would indicate support for all the types necessary for trampoline onion routing. Then optional types can be added without defining any feature flag indicating their support (think of such types as optional hints that can be safely ignored, like routing hints in the case of onion) which gives more flexibility to quickly add new small optional optimizations.

A wise professor once told me that if you're designing a protocol and think a certain limit is enough, double it—in reference to IPv4 address space lol.

😆

I'm not very clear on your proposal for the tweaked varint (:tm:), maybe I'm missing something about the varint discussion in general. Bitcoin's varint use either 1, 3, 5 or 9 bytes. Your comments repeatedly mention preventing 64-bit integers (which I agree would be overkill), but that only happens in the case where we use 9 bytes, which I think would probably never happen. I expect our use of varint in TLV type to most of the time use 1 byte, and sometimes use 3 bytes.
To make sure I understand the motivation, are you proposing the tweaked varint (:tm:) variation because you'd like to be able to use only 2 bytes instead of 3 when we overflow the 1-byte limit (I agree that 3 bytes might not be necessary, 2 bytes would probably be enough)?

  • else write 0xff || little-endian(x)

How do you know you finished parsing x if the first byte doesn't explicitly tell you how many subsequent bytes are used?

Thanks for all those suggestions and the continued discussion :)

@cfromknecht
Copy link
Collaborator Author

cfromknecht commented May 17, 2019

In the process of integrating the current wire TLV with @Roasbeef’s multi-frame implementation of #593, we encountered some interesting events that reminded me of some other reasons for having a sentinel.

When the bytes containing a tlv_stream can be larger than the actual stream, this will result in a parsing failure. For example, extra appended zeros will fail when enforcing canonical ordering over sorted types.

This happens when parsing payloads in #593, since only complete frames are interpreted without specifying the number of used bytes in the payload.

This leaves us with two options:

  1. Use a sentinel type
  2. Write the tlv_stream’s length as a varint at the beginning of the payload

Writing the length will require prepending 1 or 3 bytes to the payload.

On the other hand, using a sentinel type will require at most 1 byte, let me explain.

The original proposal serialized the sentinel’s length, which I’ve realized now is not necessary. We can simply stop after parsing a sentinel type, making the sentinel smaller. For either 8-bit or varint types, we’d choose a value that only requires 1 byte.

The original proposal also made the sentinel record optional, which means that when the payload fits exactly in the provided frames, it can simply be omitted. Thus, when the tlv_stream fits exactly in a given number of frames, there is no overhead.

When it doesn’t fit exactly, we use one extra byte, but at that point the frame has already been consumed and the remainder of the bytes can’t be reclaimed. In either case, no additional frames will ever be consumed using a sentinel value.

We can simplify the encoding procedure even more, by defining the sentinel value as 0x00. Observe that after encoding the tlv_stream the remaining bytes in the frame will still be set to 0. At least in the case of onion TLV, that means the writer never even has to be aware of the sentinel (or specifically omit it), it will be done implicitly if there are more than 0 bytes remaining in the frame (otherwise the payload fits exactly). I’m beginning to think this may have part of the reason perhaps @rustyrussell and @niftynei may have chosen 0x00 as the sentinel value in the other proposal?

In addition to the above, using a sentinel value offers another significant performance benefit over explicitly writing the length, in that the tlv_stream can be written directly without needing to compute its total length. Doing so requires either serializing the stream beforehand or introducing some more complex method of computing the total length of the stream before encoding (which gets hairy if the values can change in size after encoding, e.g. integer compression).

Given that it’s optional, is it worth sharing this between wire and onion TLV? I don’t see how it hurts, and IMO would make the base TLV requirements more flexible as a whole

@rustyrussell
Copy link
Collaborator

  1. Please don't invent your own varint. Simply reuse CompactInt from bitcoin which everyone has already implemented.
  2. We have odd/even in the peer protocol for a reason; it's very useful as features explode and you end up talking to ancient implementations, which at least understand they don't understand. It easy to implement this in your TLV parser, too, rather than as a case-by-case (it already has to handle other malformed cases).
  3. varint for type costs nothing, since you need to be able to parse len already. But I feel less strongly about this since we can always make types 254 and 255 extension types.
  4. The sentinel was for inside the onion, which is fixed length. But it sucks because it makes other packets non-canonical, so we might be best to move that to an onion-specific TLV termination condition? Other option would be to spend a byte (or three) inside the onion for the total length (oops,
    you suggested that above, yes, let's do it).
  5. Agreed canonicalization is required, disagree that it needs to be verified by the reader, and disagree that types must be unique. We already have fields in bolt11 which can usefully be repeated (f fallback addresses, and r routehints). These are more naturally done as repetition, with the semantic defined by the field definition itself. Until we have a repeatable field, of course, you'll be ignoring them anyway.

@t-bast
Copy link
Collaborator

t-bast commented May 27, 2019

We have odd/even in the peer protocol for a reason; it's very useful as features explode and you end up talking to ancient implementations, which at least understand they don't understand. It easy to implement this in your TLV parser, too, rather than as a case-by-case (it already has to handle other malformed cases).

I agree that for non-onion messages even/odd could be valuable because it provides more flexibility and easier backwards-compatibility. But in the case of the onion I feel like this won't be used in practice. You really want to avoid seeing your payment fail because someone in the path didn't understand a required field in your onion. So in my opinion you will always check the feature bits of everyone on the payment path to make sure they understand the required types that you're including in your onion, which makes all other types inherently optional. Do you see a use-case where that wouldn't be true?

I can understand however that you might want to make this rule apply to the onion as well because it becomes messy to have this rule in some places but not everywhere. For the sake of clarity / avoiding spec complexity, I can agree to that (and if we have a varint type, I don't mind enforcing this rule inside the onion TLVs).

The sentinel was for inside the onion, which is fixed length. But it sucks because it makes other packets non-canonical, so we might be best to move that to an onion-specific TLV termination condition? Other option would be to spend a byte (or three) inside the onion for the total length (oops,
you suggested that above, yes, let's do it).

I think it's wasteful to add potentially 3 bytes (or more) for the total length of the TLV stream inside the onion. Couldn't we simply say that the end of an onion TLV stream needs to use padding with only 0x00 bytes? Maybe that's what you mean when you say an onion-specific TLV termination condition? If that's the case that sounds good to me.

@cfromknecht
Copy link
Collaborator Author

I've updated the proposal from the result of our last meeting and the discussion between rusty and I on irc.

A summary of the current proposal:

  • type and length both use the bitcoin varint format, minimally-encoded
  • no repeated fields
  • canonical ordering is defined by sorting type only
  • readers must enforce canonical order
  • includes even/odd rule
  • no sentinel value

@cfromknecht
Copy link
Collaborator Author

I also updated the PoC implementation here: lightningnetwork/lnd#3061

Depending on how your decoder is implemented, one gotcha with the latest variant is that you may need to handle the case where types wrap around a uint64.

01-messaging.md Outdated Show resolved Hide resolved
01-messaging.md Outdated Show resolved Hide resolved
@cfromknecht cfromknecht changed the title BOLT01: wire TLV proposal BOLT01: TLV proposal Jun 5, 2019
@cfromknecht
Copy link
Collaborator Author

@t-bast pushed a fixup which i think improves the clarity, lmk what you think

Copy link
Collaborator

@rustyrussell rustyrussell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent, and nice and canonical with the spec wording, too.

@rustyrussell rustyrussell mentioned this pull request Jun 10, 2019
@niftynei
Copy link
Collaborator

niftynei commented Jun 12, 2019

catching up after being out Monday, quick question/clarification about how the 'length of the TLV-stream' is represented in wire messages* -- will a tlv-stream be prepended by the total length of the TLV-stream portion of the message, or is there an implicit "stream-like" quality to it in that 'any additional bytes on this message are assumed to be TLV'?

in other words would wire messages have:

# Option A
last message field
length of tlv-stream
tlv-stream

or

# Option B
last message field
tlv-stream

*via the IRC meeting notes from Monday, it seems like onion formats will have to specify the frame length, so they're already doing Option A

Copy link
Collaborator

@niftynei niftynei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small nits; clarifications.

01-messaging.md Outdated
The sending node:
- MUST order `tlv_record`s in a `tlv_stream` by monotonically-increasing `type`.
- MUST minimally encode `type` and `length`.
- SHOULD NOT use the varbytes encoding for `value`s containing byte arrays.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the first time we're using varbytes in the spec, should we provide a definition (or at least a link to a definition)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ended up removing the mention of varbytes in favor of rusty's language from the irc meeting, which says not to use redundant, variable-length encodings

The receiving node:
- if zero bytes remain before parsing a `type`:
- MUST stop parsing the `tlv_stream`.
- if a `type` or `length` is not minimally encoded:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clarification: is minimally encoded referring to their being a varint type, i.e. you should encode '0x01' as '0x01', not '0xfe00000001'?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@niftynei correct, i would assume most impls of compact size already check this, tho i'm not positive

Copy link
Collaborator

@Roasbeef Roasbeef Jul 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an example, btcd's wire package includes checks to ensure encoded integers are canonical: https://github.com/btcsuite/btcd/blob/master/wire/common.go#L491

01-messaging.md Outdated Show resolved Hide resolved
01-messaging.md Outdated Show resolved Hide resolved
01-messaging.md Outdated Show resolved Hide resolved
@cfromknecht
Copy link
Collaborator Author

@niftynei

will a tlv-stream be prepended by the total length of the TLV-stream portion of the message, or is there an implicit "stream-like" quality to it in that 'any additional bytes on this message are assumed to be TLV'?

Yes, the assumption rn is that there is an implicit tlv_stream after all currently defined fields. Of your examples, the wire messages will use option B. The message header will include the length of all existing fields + a tlv_stream at the end, which is allowed to consume any remaining bytes after parsing the known fields.

The critical thing is that the decoder knows where it should stop reading. In the case of the onion, we prepend the length directly to the tlv_stream, and for wire messages the packet length covers any existing fields and the tlv_stream.

@niftynei
Copy link
Collaborator

@cfromknecht thanks for the clarification

@t-bast
Copy link
Collaborator

t-bast commented Jun 21, 2019

Hey @cfromknecht do you think we should add test vectors to this PR?
I think it's helpful for newcomers and it's a good safety net to verify that everyone correctly enforces the constraints described in this spec.
I have some available locally if you're interested, but I'm sure you have plenty in your own implementation ;)

@cfromknecht
Copy link
Collaborator Author

Hey @t-bast! Yes I think that’s a good idea, I’ve been meaning to convert some of the tests in the PoC to json, but haven’t had time. If you post yours here I can add them and extend with any others I think of

@t-bast
Copy link
Collaborator

t-bast commented Jun 21, 2019

I don't know what the expected JSON format would be, but here is the content of my test suite. If you can provide the JSON format, I can convert some of them to JSON and add a commit to this PR.

Some of the test cases might not be convertible to a JSON spec imo because they depend on context for the stream parser (for example to test that a stream containing an unknown even type should trigger a failure).

Because of the need to configure parsers with sample TLV types, I find it a bit hard to spec the test vectors...if you think of a better format let me know!

Test Vectors

Known namespaces and TLVs

The following tests assume that two separate TLV namespaces exist: n1 and n2.

The n1 namespace supports the following TLV types:

  • type: 0x01
    • [1:0x08] (length)
    • [8:amount_msat]
  • type: 0x02
    • [1:0x08] (length)
    • [8:short_channel_id]
  • type: 0x03
    • [1:0x31] (length)
    • [33:node_id]
    • [8:amount_msat_1]
    • [8:amount_msat_2]
  • type: 0x0d
    • [1:0x02] (length)
    • [2:cltv_delta]

The n2 namespace supports the following TLV types:

  • type: 0x0a
    • [1:0x08] (length)
    • [8:amount_msat]
  • type: 0x0b
    • [1:0x04] (length)
    • [4:cltv_expiry]

TLV decoding failure

The following TLV in namespace n1 should trigger a decoding failure:

  • 0xfd02 (type truncated)
  • 0xfd022a (truncated after type)
  • 0xfd0100 (not minimally encoded type)
  • 0x2afd02 (length truncated)
  • 0x2afd0226 (truncated after length)
  • 0x2afe01010000 (not minimally encoded length)
  • 0x2afd26020231 (value truncated)
  • 0x02012a (short channel id too short)
  • 0x0209010101010101010101 (short channel id length too big)

TLV decoding success

The following TLV in namespace n1 should correctly decode. Encoding their corresponding object should result in the same bytes as given here:

  • 0x0108000000000000002a (amount_msat = 42)
  • 0x02080000000000000226 (short_channel_id = 550)
  • 0x033102eec7245d6b7d2ccb30380bfbe2a3648cd7a942653f5aa340edcea1f28368661900000000000002310000000000000451 (node_id = 0x02eec7245d6b7d2ccb30380bfbe2a3648cd7a942653f5aa340edcea1f283686619, amount_msat_1 = 561, amount_msat_2 = 1105)

TLV stream decoding failure

The following TLV streams in namespace n1 should trigger a decoding failure:

  • 0x0108000000000000002a01 (valid tlv record followed by invalid tlv record whose value are missing)
  • 0x020800000000000002260108000000000000002a (valid tlv records but invalid ordering)
  • 0x0208000000000000023102080000000000000451 (duplicate tlv type)
  • 0x0108000000000000002a2a0101 (unknown even type)
  • 0x0a0800000000000002310b0400000451 (valid tlv records but from different namespace)

TLV stream decoding success

The following TLV streams in namespace n1 should correctly decode. Encoding their corresponding object array should result in the same bytes as given here:

  • 0x0108000000000000023102080000000000000451033102eec7245d6b7d2ccb30380bfbe2a3648cd7a942653f5aa340edcea1f28368661900000000000002310000000000000451 (amount_msat = 561, short_channel_id = 1105, node_id = 0x02eec7245d6b7d2ccb30380bfbe2a3648cd7a942653f5aa340edcea1f283686619, amount_msat_1 = 561, amount_msat_2 = 1105)
  • 0x010800000000000002310b04000004510d02002a (amount_msat = 561, unknown odd TLV type 0x0b, cltv_delta = 42)

t-bast added a commit to ACINQ/eclair that referenced this pull request Jul 2, 2019
TLV (tag-length-value) types and TLV streams have been defined in the following spec PR: lightning/bolts#607

New Lightning Messages should use TLV extensively instead of ad-hoc per-message encoding. This also allows ignoring unknown odd TLV types, which lets implementers safely test new features on mainnet without impacting legacy nodes. It also allows type re-use which speeds up new features development.

Also cleaned-up and refactored common codecs.
- MUST minimally encode `type` and `length`.
- SHOULD NOT use redundant, variable-length encodings in a `tlv_record`.

The receiving node:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about rejecting redundant types in the stream?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is covered by the "strictly increasing" requirement.

Copy link
Collaborator

@Roasbeef Roasbeef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🥨

@t-bast t-bast merged commit a82dd14 into lightning:master Jul 8, 2019
rustyrussell added a commit to rustyrussell/lightning-rfc that referenced this pull request Jul 9, 2019
These are based on @t-bast's vectors from lightning#607, with a few more
cases.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
rustyrussell added a commit to rustyrussell/lightning-rfc that referenced this pull request Jul 16, 2019
These are based on @t-bast's vectors from lightning#607, with a few more
cases.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
@niftynei niftynei removed the Meeting Discussion Raise at next meeting label Jul 22, 2019
rustyrussell added a commit to rustyrussell/lightning-rfc that referenced this pull request Jul 22, 2019
These are based on @t-bast's vectors from lightning#607, with a few more
cases.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
rustyrussell added a commit to rustyrussell/lightning-rfc that referenced this pull request Jul 22, 2019
These are based on @t-bast's vectors from lightning#607, with a few more
cases:

1. Explicitly test encodings for 253, 254 and 255.
2. Use BigSize and make sure tests break badly if endian parsing is wrong.'
3. Test wrap-around of type encodings in stream.

Many thanks to @t-bast and @cfromknecht for their contributions and testing

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
rustyrussell added a commit that referenced this pull request Jul 22, 2019
These are based on @t-bast's vectors from #607, with a few more
cases:

1. Explicitly test encodings for 253, 254 and 255.
2. Use BigSize and make sure tests break badly if endian parsing is wrong.'
3. Test wrap-around of type encodings in stream.

Many thanks to @t-bast and @cfromknecht for their contributions and testing

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants