Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Traits and wire format 2.0 #492

Closed
Dirbaio opened this issue May 27, 2021 · 10 comments
Closed

RFC: Traits and wire format 2.0 #492

Dirbaio opened this issue May 27, 2021 · 10 comments
Assignees
Labels
breaking change fix / feature / improvement involves a breaking change and needs to wait until next minor version priority: high High priority for the Knurling team status: needs PR Issue just needs a Pull Request implementing the changes type: enhancement Enhancement or feature request
Milestone

Comments

@Dirbaio
Copy link
Contributor

Dirbaio commented May 27, 2021

I'm opening this issue to collect discussions that are scattered across many issues in a single coherent proposal.

Goals

The goal is to improve code size (#456), while fixing #455 #457 #458 as a bonus. The other main metrics (wire data size, and speed) shouldn't be sacrificed.

Better code size requires simpler code. Less things to do means smaller code.

In particular, making formatting require no state leads to the biggest code size wins from my experiments in #456. Currently defmt stores some state in InternalFormatter. This state has to be stack-allocated, initialized and passed around at every log callsite, which requires a nontrivial amount of instructions.

Proposed changes

1. Use rzcobs to encode the stream.

The wire stream contains many zero bytes, rzCOBS can compress them very fast. This offsets wire size increase from removing LEB128 encoding (proposed below).

  • Advantage: Improved wire size. In a test it reduces wire size by 38%
  • Advantage: More resilience to corrupted stream. Since rzCOBS has builtin framing, if a frame corrupts the decoder can recover and keep decoding the next frames.
  • Disadvantage: Slower. In the same test it's 10% slower. Maybe it should be made optional so the user can choose.

2. Remove LEB128 encoding

  • Advantage: Faster.
  • Advantage: usize/isize will now be fixed width, which allows for interesting optimizations later.
  • Disadvantage: Almost no code size gain, since LEB128 encoding is not inlined.
  • Disadvantage: Bigger wire size, but it's offset by rzCOBS.

3. Remove bool compression

Unfortunately it's simply not possible to do without state. Bools will be encoded as a single
byte. false = 0x00, true = 0x01. At least 0x00s will be compressed by rzCOBS.

  • Advantage: Faster.
  • Advantage: Smaller code (because no state).
  • Disadvantage: Bigger wire size, but not that much (printing many bools is not that common, somewhat offset by rzCOBS).

4. Apply "needs_tag" optimization to "first level" only.

Naively, array/slice contents would be encoded as tag+data+tag+data+... This wastes
space for the redundant tags, since all array items are the same type, which should theoretically have the same tag.

defmt currently has a "needs_tag" optimization that skips writing tags for all items other than the first, so the encoding is tag+data+data+data.... This applies recursively to all nested tags.

However it has proven tricky to guarantee inner tags are really the same. Enums broke it in the past (#123), but it also breaks with dyn (#458) and with manual Format impls (#455).

Solution:

  • Explicitly require that a given type MUST always have the same tag (Format trait change, see below).
  • Apply the optimization only to the "first level": [T] is encoded as T_tag + data + data + data, but
    datas have their inner tags kept.

Manual Format impls are specially handled (see below).

5. New Logger trait.

pub unsafe trait Logger {
    /// Acquire the global logger.
    ///
    /// Never fails. Panics if already acquired.
    fn acquire();

    /// Releases the global logger.
    ///
    /// # Safety
    /// Must be called exactly once for each acquire()
    unsafe fn release();

    /// Writes `bytes` to the destination.
    ///
    /// This will be called by the defmt logging macros to transmit encoded data. The write
    /// operation must not fail.
    ///
    /// Note that a call to `write` does *not* correspond to a defmt logging macro invocation. A
    /// single `defmt::info!` call can result in an arbitrary number of `write` calls.
    ///
    /// # Safety
    /// May only be called after acquire(), before the corresponding release()
    unsafe fn write(bytes: &[u8]);
}

There are 2 main changes:

  • No dyn Writer anymore. The dyn Writer required being stored somewhere during the log call, which requires state, which causes bad code size. This has been PR'd in Replace Write trait with a write fn in Logger. #258.
  • acquire() can't fail, it now panics if called reentrantly. This is to avoid a conditional at every log callsite. Previous callsites had to do this:
if acquire() {
    log_stuff();
    release();
}

With this, callsites are now:

acquire();
log_stuff();
release();

which is somewhat smaller and easier to optimize.

Note that this still allows Logger impls that allow lockless logging at multiple "contexts" (priority levels, threads). They can simply allow acquire() to succeed multiple times, at most once for each "context". Some discussion in #258.

Rationale for completely disallowing reentrant acquires:

  • It's simpler.
  • It's rare: I haven't found any code in the wild that does reentrant acquires.
  • If you do a reentrant acquire by accident, you would like to know. With the previous design, the reentrant log frame is silently dropped, which is not fun to debug.

..

  • Advantage: Smaller code (because no state).
  • Disadvantage: the Logger trait is now more lowlevel, so harder to implement.

6. New Format trait.

trait Format {
    // Get the tag value. Must always return the same tag for the same type.
    fn format_tag() -> u16,
    // Write 
    // safety: the global logger must be acquired.
    unsafe fn format_data(&self);
}

Format is not supposed to be implemented by end users manually. Only by macros or in-crate impls. To allow changing Format again in the future without breaking changes, the functions could be #[doc(hidden)] with an "implementation detail, don't use" rustdoc.

Format is now not object-safe. This solves #458. This is somewhat unfortunate. It could be made object safe with fn format_tag(&self), but then the "Must always return the same tag for the same type." doesn't hold. I tried tricks like impl Format for dyn Format but thatt's not allowed since dyn Format auto-implements Format. If there's a way around it I'd love to hear it. Using dyn Format is very rare anyway.

Why not const TAG: u16? Because the pointer casting to assign IDs to tags doesn't work in const context.

The separate format_tag() allows impls for [T] to print the tag once and then all the datas (the "needs_tag-lite" optimization).

Now the tricky question: how to handle custom format impls?

1: attribute:

#[derive(defmt::Format)]
#[defmt(format="{a=u8}.{b=u8}.{c=u8}.{d=u8}")]
pub struct Ipv4Address {
    u8 a,
    u8 b,
    u8 c,
    u8 d,
}
  • Advantage: As wire-efficient as an automatically derived impl.
  • Disadvantage: Much less powerful: limited to printing fields, can't do arbitrary calculations to print summaries/lengths etc.
  • Disadvantage: Requires a not-yet-invented syntax to specify fields to print, unclear how would that work. Would it allow printing fields of fields? indexing arrays?

2: Macro'd impl:

pub struct Ipv4Address {
    u8 a,
    u8 b,
    u8 c,
    u8 d,
}

defmt::impl_format!(Ipv4Address, {
    // user can do arbitrary logic here

    // impl_format! macro enforces there's exactly one write! statement.
    defmt::write!("{=u8}.{=u8}.{=u8}.{=u8}", self.a, self.b, self.c, self.d);
});
  • Advantage: As wire-efficient as an automatically derived impl.
  • Advantage: Powerful: user can print result of any calculation.
  • Disadvantage: Very ugly IMO. The "enforce only one write!" could be surprising, although with good error messages it might not be that bad.

3: CustomFormat trait:

pub struct Ipv4Address {
    u8 a,
    u8 b,
    u8 c,
    u8 d,
}

impl defmt::CustomFormat for Ipv4Address{
    fn format(fmt: Formatter<'_>) {
        // it's possible to do multipe writes!
        defmt::write!(fmt, "Ipv4Address({=u8}.{=u8}.{=u8}.{=u8}", self.a, self.b, self.c, self.d);
        // and conditional writes!
        if self.is_broadcast() {
            defmt::write!(fmt, " (broadcast)");
        }
        defmt::write!(fmt, ")");
    }
}

This is powered by a special impl impl<T: CustomFormat+?Sized> Format for T. Wire format is a "special tag" indicating this is a CustomFormat, then multiple writes, then a terminator tag.

Trait names up for bikeshedding. Maybe it should be RawFormat+Format, instead of Format+CustomFormat.

  • Advantage: Most similar to the defmt 0.2 way. Upgrading code is trivial.
  • Advantage: Extremely powerful: user can print result of any calculation, do multiple writes, do conditional writes. Fully matches expectations of users coming from core::fmt.
  • Disavantage: Not as wire-efficient as an automatically derived impl.
  • Disavantage: Having 2 traits might be confusing..?

4: A combination of the above?: Maybe do CustomFormat impl now, and add a more optimized way like the attribute or the macro later?

@Urhengulas Urhengulas added the type: enhancement Enhancement or feature request label May 27, 2021
@japaric
Copy link
Member

japaric commented May 31, 2021

thanks for putting this together! this looks great.

Sections 1-4

these sound good

  1. New Logger trait.
    Never fails. Panics if already acquired.

this would be part of the trait contract and not a defmt guarantee, right?

that is the implementer of Logger::acquire should trigger a panic on a re-entrant acquire. they could as well choose to execute an abort instruction, which may result in less code size.

  1. New Logger trait.
    acquire() can't fail, it now panics if called reentrantly.

this sounds like it could turn out quite nasty depending on the chosen panicking behavior.
if the user uses a panic handler crate that uses defmt but doesn't handle re-entrancy gracefully they could end with infinite recursion, e.g.

#[panic_handler]
fn panic(info: &PanicInfo) {
    // this SHOULD use an `AtomicBool` to not call `error!` in presence of nested panics but doesn't

    // implicitly calls `acquire`
    defmt::error("{}", Debug2Format(info));
}

struct Bomb;

impl Format for Bomb {
    fn format(&self, _: Formatter) {
        let x = u32::MAX + 1; // unintentional panic with `dev` profile
        // ..
    }
}

defmt::info!("{}", Format);
// `info!` calls `acquire`
// then `info!` calls `Bomb::format`, which panics
// panic handler's `error!` calls `acquire` which is a "nested one" so that panics
// panic handler is invoked again; the new panic handler calls `error!` again
// another `acquire` = a new panic, repeat

I think this wouldn't be an issue with the current -> Option<dyn _> signature.

but to be fair the root of the problem is the panic handler not handling re-entrancy at all -- you can also shoot yourself in the foot with panic!("{:?}", Bomb) even if you don't use defmt at all.

  1. New Format trait.
    It could be made object safe with fn format_tag(&self), but then the "Must always return the same tag for the same type." doesn't hold.

I'm not sure I follow the conclusion. the &self-less version of format_tag also cannot guarantee 'must always return the same tag for the same type' because one can write:

impl Format for MyType {
    fn format_tag() -> u16 {
        static X: AtomicU16 = AtomicU16::new(0);
        X.fetch_add(1, Ordering::Relaxed)
    }
}

or do you mean that code generated by defmt proc macros can not uphold 'must always return the same tag for the same type' when trait objects are involved if format_tag takes &self but it can uphold that when format_tag is &self-less?

  1. New Format trait.
    The separate format_tag() allows impls for [T] to print the tag once and then all the datas (the "needs_tag-lite" optimization).

since omitting the tag for each element of a slice' falls under "optimization", and not "correctness", could we use the Hash::hash_slice "pseudo-specialization" trick for that optimization? Namely, have a single trait like this:

trait Format {
    fn format(&self, _: Formatter);
    // default implementation applies a tag per element
    fn format_slice(slice: &[Self], f: Formatter) {
        for element in slice {
            Self::format(element, f)
        }
    }
}
// user code
#[derive(Format)]
struct MyStruct { .. }

// expands into
impl Format for struct MyStruct {
    fn format(&self, f: Formatter) {
        // same generated code as today
    }

    // override the default implementation
    fn format_slice(&self, f: Formatter) {
        // use some unstable API to omit the tag of each element
    }
}

the downside is that manual implementations of Format (i.e. that do not use #[derive(Format)]) won't get the slice optimization

I think it'd be best if we can avoid the format_tag method in the API. If we can't then having 2 traits with one completely #[doc(hidden)] seems fine to me.

@japaric
Copy link
Member

japaric commented May 31, 2021

I think it'd be best if we can avoid the format_tag method in the API.

to clarify: my goal is not to make the trait object safe

given that we don't correctly handle dyn Format today (see #458) I would be OK with saying "we don't currently support trait objects" (by e.g. intentionally making the trait not-object-safe) for now if it proves tricky to get that working correctly

@Dirbaio
Copy link
Contributor Author

Dirbaio commented May 31, 2021

Never fails. Panics if already acquired.

this would be part of the trait contract and not a defmt guarantee, right?

that is the implementer of Logger::acquire should trigger a panic on a re-entrant acquire. they could as well choose to execute an abort instruction, which may result in less code size.

Yep, by "never fails" I meant it doesn't return Option or Result. It can "fail" by panicking, and impls should panic on reentrant acquires.

if the user uses a panic handler crate that uses defmt but doesn't handle re-entrancy gracefully they could end with infinite recursion, e.g.

True... In my project I have added a "force release" function that the panic handler calls to ensure the logger is released, then logs the panic message. If the wire has framing it results in a corrupted frame, but the panic message's frame should always be transmitted correctly. I don't think this should be part of the trait, it rules out impls that don't do framing. Or maybe it should, but be documented as "best effort" as in it may cause corruption but you may still want it to avoid recursive panic death.

or do you mean that code generated by defmt proc macros can not uphold 'must always return the same tag for the same type' when trait objects are involved if format_tag takes &self but it can uphold that when format_tag is &self-less?

Yeah. Even if format_tag's contract says 'must always return the same tag for the same type' and all impls uphold it, it doesn't hold with dyn Format, which impls Format. So the code generated by the macros can't rely on the contract for any T: Format

The self-less version is not object-safe, so dyn Format is not a thing and everything is fine.

So if dyn support is a non-goal, let's drop it.

could we use the Hash::hash_slice "pseudo-specialization" trick for that optimization? Namely, have a single trait

Interesting! I totally didn't know about that. However: if the end-user gets to manually-implement the main format method, how do we prevent #455 #457?

@Dirbaio
Copy link
Contributor Author

Dirbaio commented May 31, 2021

Hmm. Following the lines of "pseudo specialization with default trait methods", maybe this works!

  • Advantage: single trait
  • Advantage: slice optimization applies to all impls, derived or custom.
  • Disadvantage: the trait is ugly... has doc(hidden) methods that the user must not impl or call.
trait Format {
    // Only used for custom format impls
    fn format(fmt: Formatter<'_>);

    // Get the tag value. Must always return the same tag for the same type.
    #[doc(hidden)]
    fn _format_tag() -> u16 {
        return internp!("{_custom_format}")
    }

    // Write 
    // safety: the global logger must be acquired.
    #[doc(hidden)]
    unsafe fn _format_data(&self) {
        self.format();
        defmt::export::write_custom_format_terminator();
    }
}

impl<T: Format> Format for [T] {
    fn format(&self, f: Formatter) {
        // never called
    }

    fn _format_tag() -> u16 {
        return internp!("{[:?]}")
    }

    unsafe fn _format_data(&self) {
        defmt::export::write_tag(T::_format_tag());
        defmt::export::write_usize(self.len());
        for t in self [
            i._format_data();
        ]
    }
}

// =============== derived format example
// user code
#[derive(Format)]
struct MyStruct { .. }

// expands into
impl Format for struct MyStruct {
    fn format(&self, f: Formatter) {
        // never called
    }

    fn _format_tag() -> u16 {
        return 1234;
    }

    // Write 
    // safety: the global logger must be acquired.
    #[doc(hidden)]
    unsafe fn _format_data(&self) {
        defmt::export::write_whatever();
    }
}

// =============== custom format example
// user code

impl defmt::Format for Ipv4Address{
    fn format(fmt: Formatter<'_>) {
        // it's possible to do multipe writes!
        defmt::write!(fmt, "Ipv4Address({=u8}.{=u8}.{=u8}.{=u8}", self.a, self.b, self.c, self.d);
        // and conditional writes!
        if self.is_broadcast() {
            defmt::write!(fmt, " (broadcast)");
        }
        defmt::write!(fmt, ")");
    }
}

@japaric
Copy link
Member

japaric commented Jun 1, 2021

Yep, by "never fails" I meant it doesn't return Option or Result. It can "fail" by panicking, and impls should panic on reentrant acquires.

Got it. Let's make sure the contract ("should $X") is documented in the API docs / book.

In my project I have added a "force release" function that the panic handler calls to ensure the logger is released, then logs the panic message.

sound handy, can we have that in the panic-probe panic handler? :-)

Yeah. Even if format_tag's contract says 'must always return the same tag for the same type' and all impls uphold it, it doesn't hold with dyn Format, which impls Format.

ooh, I think I see what you mean. if you encode [dyn Format] and try to do the 'omit tag' optimization it would be wrong because each trait object could, behind the opaque trait object, be a different type and such each one should be tagged. some spit balling: perhaps always avoiding the 'omit tag' optimization on [dyn Format] would make supporting dyn Format possible -- don't know if the 'dyn Format MUST be tagged' requirement shows up in other places though.

So if dyn support is a non-goal, let's drop it.

yeah, I'd be OK with saying "we don't trait objects at the moment (but may in the future)". dyn Format trait objects don't seem like something that would have a lot of use right now (if dyn A + B was supported in the general case -- not limited to auto-traits -- perhaps that would be different) so I'd be OK with punting on that until there's more demand for it.

Hmm. Following the lines of "pseudo specialization with default trait methods", maybe this works!

I'm not sure this produces a different _format_tag per type when manually implemented

trait Format {
    // ..
    // Get the tag value. Must always return the same tag for the same type.
    #[doc(hidden)]
    fn _format_tag() -> u16 {
        return internp!("{_custom_format}")
    }
}

e.g. X and Y (below) would have the exact same tag that comes from the default Format::format_tag method, no?

struct X { /* fields */}
impl Format for X {
    fn format(&self, f: Formatter) {
        // log fields, etc.
    }
}

struct Y { /* fields */}
impl Format for Y {
    fn format(&self, f: Formatter) {
        // log fields, etc.
    }
}

would that be a problem?

(also I'm not quite sure I see the need for _format_tag given that you can always override _format_data in #[derive]-d implementations and the implementations for primitives that reside in the defmt crate)

@Dirbaio
Copy link
Contributor Author

Dirbaio commented Jun 1, 2021

perhaps always avoiding the 'omit tag' optimization on [dyn Format] would make supporting dyn Format possible

It seems that's impossible without specialization. For example, impl Format for [&dyn Format] conflicts with impl<T: Format> Format for [T]. I haven't found a way to make that work.

I'm not sure this produces a different _format_tag per type when manually implemented

The requirement is only same type -> same tag. It's OK to have different types have the same tag: that doesn't break the needs_tag optimization.

the need for _format_tag given that you can always override _format_data in #[derive]-d implementations and the implementations for primitives that reside in the defmt crate

It's to implement the needs_tag optimization without state. _format_tag returns the tag, _format_data writes the data without the tag. This allows the caller to choose when the tag is printed. For single formats, caller will write tag+data. For multiple formats, caller will write tag+data+data+data+...

To improve code size, the goal is to remove InternalFormatter. This means we can no longer store a needs_tag bool there.

@Dirbaio
Copy link
Contributor Author

Dirbaio commented Jun 3, 2021

Thinking more about it, the "needs_tag" can be a bool param. That'd get passed in in a register, so it wouldn't cause bloat at every log call for stack-allocating anything.

The trait would look like this. It feels somewhat cleaner. I'm not sure which is better for code size and for speed, I can try both.

trait Format {
    // Only used for custom format impls
    fn format(fmt: Formatter<'_>);

    // Write tag if with_tag, then write data.
    // safety: the global logger must be acquired.
    #[doc(hidden)]
    fn _raw_format(&self, with_tag: bool) { .. }
}

@japaric
Copy link
Member

japaric commented Jun 3, 2021

I like the single trait with doc(hidden) methods approach over the macro and 2 trait options. Given that the implementation-detail methods are hidden I don't have any preference over having 1 or 2 so I would vote for the option that performs better.

I would suggest naming the doc(hidden) method(s) e.g. _do_not_override_this_raw_format. Rust-Analyzer can auto-complete default methods, but by default it will only auto-complete the non-default methods. In case someone does one code-assist too many then the method name will them to not override the method.

@Urhengulas Urhengulas added the priority: high High priority for the Knurling team label Jun 16, 2021
bors bot added a commit that referenced this issue Jun 17, 2021
505: [1/n] - Logger trait v2. r=jonas-schievink a=Dirbaio

Part 1 of N of #492 

- Panics on reentrant acquire.
- Remove `Write` trait.

Co-authored-by: Dario Nieuwenhuis <dirbaio@dirbaio.net>
bors bot added a commit that referenced this issue Jun 17, 2021
507: [2/n] - Remove code-size-costly optimizations r=jonas-schievink a=Dirbaio

Part 2 of N of #492. Depends on #505 

- Remove bool compression
- Remove LEB128 compression. usize/isize are now 4 bytes, format tags are now 2 bytes.

Code size comparsion (only including gains from this PR, not #505):

```
                             before    after    change
debug:                       107232   101452    -5.3% 
release:                      40852    37396    -8.4%
release + flags:              21936    19416   -11.4%
release + flags + buildstd:   20524    18004   -12.3%

rustc 1.54.0-nightly (676ee1472 2021-05-06)
"flags" means these in Cargo.toml:
    codegen-units = 1
    debug = 2
    debug-assertions = false
    incremental = false
    lto = 'fat'
    opt-level = 'z'
    overflow-checks = false

"buildstd" means these in .cargo/config.toml:
    [unstable]
    build-std = ["core"]
    build-std-features = ["panic_immediate_abort"]
```

Co-authored-by: Dario Nieuwenhuis <dirbaio@dirbaio.net>
@Urhengulas Urhengulas added the breaking change fix / feature / improvement involves a breaking change and needs to wait until next minor version label Jun 18, 2021
@Urhengulas Urhengulas added this to the v0.3.0 milestone Jun 18, 2021
bors bot added a commit that referenced this issue Jun 29, 2021
521: [3/n] Remove u24 r=japaric a=Dirbaio

Part 3 of N of #492 

- Now that we're going to use rzCOBS for encoding the stream, extra zero bytes are not that expenseive. Using a u32 instead of a u24 adds one zero byte, which when encoded is just 1 extra bit.
- Users are unlikely to use u24, as it's quite obscure (I didn't know it existed until I found it while reading defmt's source).
- It makes the code more complicated, because it's not natively supported by Rust. In the code size optimizations of the macro codegen I'm working on, it really breaks the "symmetry" of the code.

Therefore I propose we remove it.

Co-authored-by: Dario Nieuwenhuis <dirbaio@dirbaio.net>
bors bot added a commit that referenced this issue Jun 30, 2021
508: [5/n] Format trait v2 r=japaric a=Dirbaio

Part of #492 . Depends on ~~#505 #507 #521~~ #524

- Implemented new Format trait according to #492.
- Adjusted everything else accordingly.

Co-authored-by: Dario Nieuwenhuis <dirbaio@dirbaio.net>
@japaric
Copy link
Member

japaric commented Jul 7, 2021

status update: by my count, 5 of the 6 proposed changes have been implemented 🎉

  1. Use rzcobs to encode the stream. Add optional rzCOBS encoding+framing #539
  2. Remove LEB128 encoding [2/n] - Remove code-size-costly optimizations #507
  3. Remove bool compression [2/n] - Remove code-size-costly optimizations #507
  4. Apply "needs_tag" optimization to "first level" only. [5/n] Format trait v2 #508
  5. New Logger trait. [1/n] - Logger trait v2. #505
  6. New Format trait. [5/n] Format trait v2 #508

bors bot added a commit that referenced this issue Jul 20, 2021
539: Add optional rzCOBS encoding+framing r=japaric a=Dirbaio

Part of #492 

Encoding is chosen with the `encoding-raw`, `encoding-rzcobs` features. If no encoding is chosen, `rzcobs` is used.

The used encoding is stored in the ELF symbol table, and the decoder automatically picks the correct version.

Co-authored-by: Dario Nieuwenhuis <dirbaio@dirbaio.net>
@japaric japaric added the status: needs PR Issue just needs a Pull Request implementing the changes label Jul 29, 2021
@jonas-schievink
Copy link
Contributor

rzCOBS encoding is now also implemented, so closing this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking change fix / feature / improvement involves a breaking change and needs to wait until next minor version priority: high High priority for the Knurling team status: needs PR Issue just needs a Pull Request implementing the changes type: enhancement Enhancement or feature request
Projects
None yet
Development

No branches or pull requests

4 participants