RFC: Traits and wire format 2.0 #492
Comments
thanks for putting this together! this looks great.
these sound good
this would be part of the trait contract and not a defmt guarantee, right? that is the implementer of
this sounds like it could turn out quite nasty depending on the chosen panicking behavior.

#[panic_handler]
fn panic(info: &PanicInfo) {
    // this SHOULD use an `AtomicBool` to not call `error!` in presence of nested panics but doesn't
    // implicitly calls `acquire`
    defmt::error!("{}", Debug2Format(info));
}

struct Bomb;

impl Format for Bomb {
    fn format(&self, _: Formatter) {
        let x = u32::MAX + 1; // unintentional panic with `dev` profile
        // ..
    }
}

defmt::info!("{}", Bomb);
// `info!` calls `acquire`
// then `info!` calls `Bomb::format`, which panics
// panic handler's `error!` calls `acquire` which is a "nested one" so that panics
// panic handler is invoked again; the new panic handler calls `error!` again
// another `acquire` = a new panic, repeat

I think this wouldn't be an issue with the current but to be fair the root of the problem is the panic handler not handling re-entrancy at all -- you can also shoot yourself in the foot with
I'm not sure I follow the conclusion. the

impl Format for MyType {
    fn format_tag() -> u16 {
        static X: AtomicU16 = AtomicU16::new(0);
        X.fetch_add(1, Ordering::Relaxed)
    }
}

or do you mean that code generated by
since omitting the tag for each element of a slice falls under "optimization", and not "correctness", could we use the

trait Format {
    fn format(&self, _: Formatter);

    // default implementation applies a tag per element
    fn format_slice(slice: &[Self], f: Formatter) where Self: Sized {
        for element in slice {
            Self::format(element, f)
        }
    }
}

// user code
#[derive(Format)]
struct MyStruct { .. }

// expands into
impl Format for MyStruct {
    fn format(&self, f: Formatter) {
        // same generated code as today
    }

    // override the default implementation
    fn format_slice(slice: &[Self], f: Formatter) {
        // use some unstable API to omit the tag of each element
    }
}

the downside is that manual implementations of

I think it'd be best if we can avoid the
to clarify: my goal is not to make the trait object safe given that we don't correctly handle
Yep, by "never fails" I meant it doesn't return Option or Result. It can "fail" by panicking, and impls should panic on reentrant acquires.
True... In my project I have added a "force release" function that the panic handler calls to ensure the logger is released, then logs the panic message. If the wire has framing it results in a corrupted frame, but the panic message's frame should always be transmitted correctly. I don't think this should be part of the trait, it rules out impls that don't do framing. Or maybe it should, but be documented as "best effort" as in it may cause corruption but you may still want it to avoid recursive panic death.
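A minimal sketch of the "panic on reentrant acquire" plus "best-effort force release" idea, assuming the logger's taken flag is a single `AtomicBool`. `LoggerLock` and its method names are hypothetical and are not defmt API; a real logger would also have to deal with interrupts and with whatever partial frame is already on the wire.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical stand-in for the "taken" flag of a global logger.
pub struct LoggerLock {
    taken: AtomicBool,
}

impl LoggerLock {
    pub const fn new() -> Self {
        Self { taken: AtomicBool::new(false) }
    }

    /// Acquire the logger. Never returns an error; a reentrant acquire panics.
    pub fn acquire(&self) {
        if self.taken.swap(true, Ordering::Acquire) {
            panic!("logger acquired reentrantly");
        }
    }

    /// Normal release at the end of a log call.
    pub fn release(&self) {
        self.taken.store(false, Ordering::Release);
    }

    /// Best-effort release intended for a panic handler.
    /// Returns `true` if the logger was held, meaning the frame in flight
    /// may have been cut short and can show up corrupted on the wire.
    pub fn force_release(&self) -> bool {
        self.taken.swap(false, Ordering::AcqRel)
    }
}
```

A panic handler built on this would call `force_release()` first and only then log the panic message, so the handler's own `acquire` cannot be the reentrant one.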
Yeah. Even if format_tag's contract says 'must always return the same tag for the same type' and all impls uphold it, it doesn't hold with
The self-less version is not object-safe, so
So if dyn support is a non-goal, let's drop it.
Interesting! I totally didn't know about that. However: if the end-user gets to manually-implement the main
Hmm. Following the lines of "pseudo specialization with default trait methods", maybe this works!
trait Format {
    // Only used for custom format impls
    fn format(&self, fmt: Formatter<'_>);

    // Get the tag value. Must always return the same tag for the same type.
    #[doc(hidden)]
    fn _format_tag() -> u16 {
        return internp!("{_custom_format}")
    }

    // Write
    // safety: the global logger must be acquired.
    #[doc(hidden)]
    unsafe fn _format_data(&self) {
        self.format(/* a Formatter value */);
        defmt::export::write_custom_format_terminator();
    }
}

impl<T: Format> Format for [T] {
    fn format(&self, f: Formatter) {
        // never called
    }

    fn _format_tag() -> u16 {
        return internp!("{[:?]}")
    }

    unsafe fn _format_data(&self) {
        defmt::export::write_tag(T::_format_tag());
        defmt::export::write_usize(self.len());
        for t in self {
            t._format_data();
        }
    }
}

// =============== derived format example
// user code
#[derive(Format)]
struct MyStruct { .. }

// expands into
impl Format for MyStruct {
    fn format(&self, f: Formatter) {
        // never called
    }

    fn _format_tag() -> u16 {
        return 1234;
    }

    // Write
    // safety: the global logger must be acquired.
    #[doc(hidden)]
    unsafe fn _format_data(&self) {
        defmt::export::write_whatever();
    }
}

// =============== custom format example
// user code
impl defmt::Format for Ipv4Address {
    fn format(&self, fmt: Formatter<'_>) {
        // it's possible to do multiple writes!
        defmt::write!(fmt, "Ipv4Address({=u8}.{=u8}.{=u8}.{=u8}", self.a, self.b, self.c, self.d);
        // and conditional writes!
        if self.is_broadcast() {
            defmt::write!(fmt, " (broadcast)");
        }
        defmt::write!(fmt, ")");
    }
}
Got it. Let's make sure the contract ("should $X") is documented in the API docs / book.
sounds handy, can we have that in the panic-probe panic handler? :-)
ooh, I think I see what you mean. if you encode
yeah, I'd be OK with saying "we don't support trait objects at the moment (but may in the future)".
I'm not sure this produces a different

trait Format {
    // ..
    // Get the tag value. Must always return the same tag for the same type.
    #[doc(hidden)]
    fn _format_tag() -> u16 {
        return internp!("{_custom_format}")
    }
}

e.g.

struct X { /* fields */ }

impl Format for X {
    fn format(&self, f: Formatter) {
        // log fields, etc.
    }
}

struct Y { /* fields */ }

impl Format for Y {
    fn format(&self, f: Formatter) {
        // log fields, etc.
    }
}

would that be a problem? (also I'm not quite sure I see the need for
It seems that's impossible without specialization. For example,
The requirement is only same type -> same tag. It's OK to have different types have the same tag: that doesn't break the needs_tag optimization.
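To make the "same type, same tag" requirement concrete, here is a simplified host-side model that contrasts the naive slice encoding with the tag-once encoding. The trait shape, the tag value `1`, the little-endian byte order and writing into a `Vec<u8>` are all assumptions made for this sketch; defmt itself writes to the global logger and interns its tags.

```rust
/// Simplified, hypothetical model of a tag-per-type trait.
pub trait Format {
    /// Must always return the same tag for the same type.
    fn format_tag() -> u16;
    /// Writes only the value's data, without its tag.
    fn format_data(&self, out: &mut Vec<u8>);
}

impl Format for u8 {
    fn format_tag() -> u16 {
        1 // arbitrary tag chosen for the sketch
    }
    fn format_data(&self, out: &mut Vec<u8>) {
        out.push(*self);
    }
}

/// Naive encoding: tag + data + tag + data + ...
pub fn write_slice_naive<T: Format>(items: &[T], out: &mut Vec<u8>) {
    for item in items {
        out.extend_from_slice(&T::format_tag().to_le_bytes());
        item.format_data(out);
    }
}

/// Tag-once encoding: T_tag + data + data + ...
/// Only valid because every element is a `T` and `T` always reports the same tag.
pub fn write_slice_tag_once<T: Format>(items: &[T], out: &mut Vec<u8>) {
    out.extend_from_slice(&T::format_tag().to_le_bytes());
    for item in items {
        item.format_data(out);
    }
}
```

Because the tag is looked up through `T` rather than through each element, two different types sharing a tag does not affect this: the decoder only relies on all elements of one slice agreeing with the single tag written up front.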
It's to implement the needs_tag optimization without state. To improve code size, the goal is to remove
Thinking more about it, the "needs_tag" can be a bool param. That'd get passed in a register, so it wouldn't cause bloat at every log call for stack-allocating anything. The trait would look like this. It feels somewhat cleaner. I'm not sure which is better for code size and for speed, I can try both.

trait Format {
    // Only used for custom format impls
    fn format(&self, fmt: Formatter<'_>);

    // Write tag if with_tag, then write data.
    // safety: the global logger must be acquired.
    #[doc(hidden)]
    fn _raw_format(&self, with_tag: bool) { .. }
}
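A rough host-side model of the `with_tag` variant, to show how a slice impl could pass `false` down to its elements. Everything here is an assumption for illustration: the method names `tag`, `write_data` and `raw_format`, the tag values, little-endian encoding, and the `Vec<u8>` sink in place of the global logger.

```rust
/// Hypothetical model of a Format trait with a `with_tag` flag.
pub trait Format {
    fn tag() -> u16;
    fn write_data(&self, out: &mut Vec<u8>);

    /// Write the tag if `with_tag`, then write the data.
    fn raw_format(&self, with_tag: bool, out: &mut Vec<u8>) {
        if with_tag {
            out.extend_from_slice(&Self::tag().to_le_bytes());
        }
        self.write_data(out);
    }
}

impl Format for u8 {
    fn tag() -> u16 {
        1 // arbitrary tag for the sketch
    }
    fn write_data(&self, out: &mut Vec<u8>) {
        out.push(*self);
    }
}

impl<T: Format> Format for [T] {
    fn tag() -> u16 {
        2 // arbitrary tag for the sketch
    }
    fn write_data(&self, out: &mut Vec<u8>) {
        // Element tag once, then a 4-byte length, then untagged elements.
        out.extend_from_slice(&T::tag().to_le_bytes());
        out.extend_from_slice(&(self.len() as u32).to_le_bytes());
        for item in self {
            item.raw_format(false, out);
        }
    }
}
```

The flag is a plain argument, so no formatter state has to live on the stack between calls; whether this or a separate tag function generates smaller code is exactly the open question in the comment above.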
I like the single trait with doc(hidden) methods approach over the macro and 2 trait options. Given that the implementation-detail methods are hidden I don't have any preference over having 1 or 2 so I would vote for the option that performs better. I would suggest naming the doc(hidden) method(s) e.g.
507: [2/n] - Remove code-size-costly optimizations r=jonas-schievink a=Dirbaio

Part 2 of N of #492. Depends on #505

- Remove bool compression
- Remove LEB128 compression. usize/isize are now 4 bytes, format tags are now 2 bytes.

Code size comparison (only including gains from this PR, not #505):

```
                              before   after   change
debug:                        107232  101452    -5.3%
release:                       40852   37396    -8.4%
release + flags:               21936   19416   -11.4%
release + flags + buildstd:    20524   18004   -12.3%

rustc 1.54.0-nightly (676ee1472 2021-05-06)

"flags" means these in Cargo.toml:

codegen-units = 1
debug = 2
debug-assertions = false
incremental = false
lto = 'fat'
opt-level = 'z'
overflow-checks = false

"buildstd" means these in .cargo/config.toml:

[unstable]
build-std = ["core"]
build-std-features = ["panic_immediate_abort"]
```

Co-authored-by: Dario Nieuwenhuis <dirbaio@dirbaio.net>
521: [3/n] Remove u24 r=japaric a=Dirbaio

Part 3 of N of #492

- Now that we're going to use rzCOBS for encoding the stream, extra zero bytes are not that expensive. Using a u32 instead of a u24 adds one zero byte, which when encoded is just 1 extra bit.
- Users are unlikely to use u24, as it's quite obscure (I didn't know it existed until I found it while reading defmt's source).
- It makes the code more complicated, because it's not natively supported by Rust. In the code size optimizations of the macro codegen I'm working on, it really breaks the "symmetry" of the code.

Therefore I propose we remove it.

Co-authored-by: Dario Nieuwenhuis <dirbaio@dirbaio.net>
status update: by my count, 5 of the 6 proposed changes have been implemented 🎉
539: Add optional rzCOBS encoding+framing r=japaric a=Dirbaio

Part of #492

Encoding is chosen with the `encoding-raw`, `encoding-rzcobs` features. If no encoding is chosen, `rzcobs` is used. The used encoding is stored in the ELF symbol table, and the decoder automatically picks the correct version.

Co-authored-by: Dario Nieuwenhuis <dirbaio@dirbaio.net>

rzCOBS encoding is now also implemented, so closing this!
I'm opening this issue to collect discussions that are scattered across many issues in a single coherent proposal.
Goals
The goal is to improve code size (#456), while fixing #455 #457 #458 as a bonus. The other main metrics (wire data size, and speed) shouldn't be sacrificed.
Better code size requires simpler code. Fewer things to do means smaller code.
In particular, making formatting require no state leads to the biggest code size wins from my experiments in #456. Currently `defmt` stores some state in `InternalFormatter`. This state has to be stack-allocated, initialized and passed around at every log callsite, which requires a nontrivial amount of instructions.
Proposed changes
1. Use `rzcobs` to encode the stream.
The wire stream contains many zero bytes, rzCOBS can compress them very fast. This offsets wire size increase from removing LEB128 encoding (proposed below).
2. Remove LEB128 encoding
3. Remove bool compression
Unfortunately it's simply not possible to do without state. Bools will be encoded as a single byte. false = 0x00, true = 0x01. At least 0x00s will be compressed by rzCOBS.
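As an illustration of items 2 and 3, here is a sketch of stateless, fixed-width writers. The widths (1-byte bools, 4-byte `usize`, 2-byte tags) come from this thread; the function names, the little-endian byte order and the `Vec<u8>` sink are assumptions made only for this example.

```rust
/// Append a bool as a single byte: false = 0x00, true = 0x01.
pub fn write_bool(out: &mut Vec<u8>, value: bool) {
    out.push(value as u8);
}

/// Append a usize as a fixed 4-byte value instead of LEB128.
/// Little-endian is an assumption; values above u32::MAX are truncated here.
pub fn write_usize(out: &mut Vec<u8>, value: usize) {
    out.extend_from_slice(&(value as u32).to_le_bytes());
}

/// Append a format tag as a fixed 2-byte value (byte order assumed).
pub fn write_tag(out: &mut Vec<u8>, tag: u16) {
    out.extend_from_slice(&tag.to_le_bytes());
}
```

None of these functions needs to remember anything between calls, which is the property the proposal is after. Small values leave the upper bytes at zero, and those are the zero bytes rzCOBS is expected to compress.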
4. Apply "needs_tag" optimization to "first level" only.
Naively, array/slice contents would be encoded as `tag+data+tag+data+..`. This wastes space for the redundant tags, since all array items are the same type, which should theoretically have the same tag.
`defmt` currently has a "needs_tag" optimization that skips writing tags for all items other than the first, so the encoding is `tag+data+data+data...`. This applies recursively to all nested tags.
However it has proven tricky to guarantee inner tags are really the same. Enums broke it in the past (#123), but it also breaks with `dyn` (#458) and with manual Format impls (#455).
Solution: `[T]` is encoded as `T_tag + data + data + data`, but `data`s have their inner tags kept.
Manual `Format` impls are specially handled (see below).
`write!` #455
Decode breaks when not calling `write!` #457
Wrong result using `&[&dyn Formatter]` #458.
`[u8]` have the same wire size.
5. New `Logger` trait.
There are 2 main changes:
`dyn Writer` anymore. The `dyn Writer` required being stored somewhere during the log call, which requires state, which causes bad code size. This has been PR'd in Replace Write trait with a `write` fn in Logger. #258.
`acquire()` can't fail, it now panics if called reentrantly. This is to avoid a conditional at every log callsite. Previous callsites had to do this:
With this, callsites are now:
which is somewhat smaller and easier to optimize.
Note that this still allows Logger impls that allow lockless logging at multiple "contexts" (priority levels, threads). They can simply allow acquire() to succeed multiple times, at most once for each "context". Some discussion in #258.
Rationale for completely disallowing reentrant acquires:
..
6. New `Format` trait.
`Format` is not supposed to be implemented by end users manually. Only by macros or in-crate impls. To allow changing `Format` again in the future without breaking changes, the functions could be `#[doc(hidden)]` with an "implementation detail, don't use" rustdoc.
`Format` is now not object-safe. This solves #458. This is somewhat unfortunate. It could be made object safe with `fn format_tag(&self)`, but then the "Must always return the same tag for the same type." doesn't hold. I tried tricks like `impl Format for dyn Format` but that's not allowed since `dyn Format` auto-implements `Format`. If there's a way around it I'd love to hear it. Using `dyn Format` is very rare anyway.
Why not `const TAG: u16`? Because the pointer casting to assign IDs to tags doesn't work in const context.
The separate `format_tag()` allows impls for `[T]` to print the tag once and then all the datas (the "needs_tag-lite" optimization).
Now the tricky question: how to handle custom format impls?
1: attribute:
2: Macro'd impl:
3: CustomFormat trait:
This is powered by a special impl `impl<T: CustomFormat+?Sized> Format for T`. Wire format is a "special tag" indicating this is a CustomFormat, then multiple writes, then a terminator tag.
Trait names up for bikeshedding. Maybe it should be RawFormat+Format, instead of Format+CustomFormat.
4: A combination of the above?: Maybe do CustomFormat impl now, and add a more optimized way like the attribute or the macro later?
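For option 3, the "special tag, then multiple writes, then a terminator tag" framing could look roughly like the sketch below. The tag values, the `Formatter::write` signature, the `write_custom` helper and the `Vec<u8>` sink are all hypothetical and chosen only to show the framing; they are not defmt's actual API or wire values.

```rust
/// Hypothetical tag values, chosen only for this sketch.
pub const CUSTOM_FORMAT_TAG: u16 = 0xFFFE;
pub const TERMINATOR_TAG: u16 = 0xFFFF;

/// Hypothetical formatter that appends to a byte buffer.
pub struct Formatter<'a> {
    out: &'a mut Vec<u8>,
}

impl Formatter<'_> {
    /// One `write!`-like call: the tag of a format string, then its raw argument bytes.
    pub fn write(&mut self, tag: u16, args: &[u8]) {
        self.out.extend_from_slice(&tag.to_le_bytes());
        self.out.extend_from_slice(args);
    }
}

pub trait CustomFormat {
    fn format(&self, f: &mut Formatter<'_>);
}

/// Frame a custom impl: special tag, any number of writes, terminator tag.
pub fn write_custom<T: CustomFormat + ?Sized>(value: &T, out: &mut Vec<u8>) {
    out.extend_from_slice(&CUSTOM_FORMAT_TAG.to_le_bytes());
    value.format(&mut Formatter { out: &mut *out });
    out.extend_from_slice(&TERMINATOR_TAG.to_le_bytes());
}

pub struct Ipv4Address {
    pub octets: [u8; 4],
    pub broadcast: bool,
}

impl CustomFormat for Ipv4Address {
    fn format(&self, f: &mut Formatter<'_>) {
        f.write(10, &self.octets); // stands for "Ipv4Address({=u8}.{=u8}.{=u8}.{=u8}"
        if self.broadcast {
            f.write(11, &[]); // stands for " (broadcast)"
        }
        f.write(12, &[]); // stands for ")"
    }
}
```

The terminator is what lets a decoder cope with a variable number of writes, including conditional ones, without the encoder having to count them up front.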