Making panic-never a requirement or convention for rust-embedded libraries where feasible #551
Comments
I'd like to add that another common reason for using However, it is actually possible to remove this bloat in nightly, putting this in

[unstable]
build-std = ["core"]
build-std-features = ["panic_immediate_abort"]

I'm pointing this out because I've seen people on Matrix try to get rid of panics due to the bloat, and the This gets rid of the bloat argument, the safety/correctness argument does remain.
Another important thing is that the concept of "panic" or "fatal error" is not only a Rust concept, it's an ARM architectural concept. Even if you completely get rid of Rust panics in your code, you still have HardFault and the other exceptions! They can be triggered by invalid memory accesses, invalid peripheral register reads/writes, division by 0, floating point exceptions, and lots of other chip-specific things. These are NOT caught by Therefore if you're writing safety-critical code (like stop a motor on crash), you still have to handle HardFaults, and you still have all the associated problems (need to make state global, etc), even if you use HardFaults can be even triggered by hardware issues causing bitflips or other glitches (eg EMI interference, severe power supply noise). Even if you're ultra 100% pinkie-promise sure your code can never ever HardFault, you should still handle HardFaults :)
If you still have to handle HardFaults, why bother going through the extreme pain of trying to get rid of Rust panics? |
Something of a sidebar, but: is anyone aware of a tracking issue for adding a first-class feature for this to I know there are a few options for this in crates, including |
Great point @Dirbaio - and slightly terrifying! Makes a note to look into HardFault handling on stm32f107 for the artwork. On a related note, maybe in our case RTIC could help here by allowing the user to declare a dedicated task for handling panic/fault events, the difference being RTIC could provide some access to the current context (i.e. While HardFaults do bring into question the argument that panicking encourages anti-patterns / global state, I think the benefit of allowing the user to know for sure that at least panicking is not possible is still a worthy one. I.e. having the option between having to consider unsuspecting, subtle logic errors leading to panics in dependencies as well as cosmic rays, vs only having to consider cosmic rays, I'd prefer the latter. I do acknowledge it's not at all a trivial ask though.
Another issue worth considering that was brought up on Matrix is that |
|
That's kind of whataboutism, isn't it? Typical panics are due to logic/programming errors, which I'd like the compiler's help with in avoiding, since they're my mistake (or of libraries I depend on). |
Hey there, I'm reading through your post and addressing specific points as I see fit. As some background, I've been a firmware developer for the last ~8 years, most of which has been in C/C++, and the last 2 years shifted more towards Rust. Oftentimes, my work finds me in safety-critical systems.
Custom panic handlers, which I would generally refer to as error handlers, are quite common practice in the industry and are often seen in embedded designs. Putting the device into a "known safe state" is a common approach. If this is difficult for your design, I would attribute this to the hardware that you are working on (e.g. no simple way to power off motors quickly) and not a fault of independently developed libraries or languages. Ideally, you should have a fail-free way to get the system into a safe, controlled state when fatal errors occur. This typically involves hooking up power supply shutdown/enable pins to GPIOs of the controller to quickly power off motors etc. I think it is unfair to claim that because of your underlying hardware design, you were forced to make unpleasant firmware compromises (e.g. moving UART drivers into global scope etc.), although I do agree that panic handlers in Rust are more complicated than their equivalents in other languages.
I would argue this is a false sense of relief, as it just offloads the error handling to your code, but that's a bit beside the point ;)
Based on my understanding, this is exactly how rustc generates panics for indexing. It internally generates bounds checks in the code and branches to panic if bounds are not met. How would you recommend addressing this otherwise? If you simply ignore bounds issues, you go back to languages like C and throw out memory safety. But if you don't panic, you have to somehow propagate every array index into an analyzeable
In regard to panics in libraries
I've written a number of external libraries at this point, and I want to point out that I tend to use the panic handler as a means of verifying that my code has been written correctly. For example, I will often
Summary
In summary, I honestly think things are not bad the way they are. Yes, I agree that panics definitely introduce some level of binary bloat due to formatting, and yes, accessing internal resources in the panic handler is oftentimes cumbersome. However, I don't think this is a reason that panicking shouldn't exist, as error handling is a problem that has been around since long before Rust. Ultimately, as @Dirbaio pointed out, there will always need to be some way of handling unexpected events in a system. I think the right approach here is to address:
While I agree that (1) is likely a problem that needs to be addressed at the language level, I am of the opinion that (2) should be entirely up to the developer, and we shouldn't impose any specific way of doing this on them. Writing panic handlers to get systems into a safe state when you have simple GPIO toggles to turn everything off actually turns a custom panic handler into a very reasonable, failure-free means of getting your device into a safe state. |
Thanks for your insight @ryan-summers! I just want to add/get a bit of clarification on some points: Moving
|
Just a thought: Wouldn't RTIC be in the perfect position to address this? If RTIC handled panics, then the application developer can provide a panic handler that works like any other RTIC task (i.e. specifies which resources it needs and gets them from RTIC). |
As much as I disdain the bloat, I don't think getting rid of panics or even Rather, I wish |
How? Most Rust embedded projects I've seen implement very simple panic handlers that do something like specify an |
I'm not quite sure why bound checks seem to be deemed the largest cause of undesirable panic causes since they're rather easy to avoid: You can use explicit length check or slice to a known size to get bounds check elision. Or you can use fallible accessors or any of the wide range of "functional" tools. From where I'm standing, the use of |
The Subslice patterns are also great for this if they fit your use cases. |
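As a rough illustration of the techniques named in the two comments above (fallible accessors, a single explicit slice, and subslice patterns), here is a minimal sketch. The function names and the "frame" scenario are made up for this example and do not come from any rust-embedded crate. None of these functions contains an indexing expression, so there is no bounds-check panic branch at the source level; whether the final binary is entirely panic-free still depends on the rest of the program.

```rust
/// Reads a big-endian length field from the first two bytes of a frame.
/// The subslice pattern only matches when at least two elements exist,
/// so no indexing (and no bounds-check panic branch) is involved.
fn header_len(frame: &[u8]) -> Option<u16> {
    match frame {
        [hi, lo, ..] => Some(u16::from_be_bytes([*hi, *lo])),
        _ => None,
    }
}

/// Fallible accessor: `get` returns `None` instead of panicking
/// when the index is out of range.
fn nth_or_zero(data: &[u8], i: usize) -> u8 {
    data.get(i).copied().unwrap_or(0)
}

/// One explicit, fallible slice up front; the loop afterwards
/// iterates instead of indexing.
fn checksum(data: &[u8], n: usize) -> Option<u8> {
    let head = data.get(..n)?;
    Some(head.iter().fold(0u8, |acc, b| acc.wrapping_add(*b)))
}
```

The out-of-range case becomes an ordinary `Option` that the caller has to handle, which is the trade-off discussed in this thread: the error path moves from the panic handler into regular control flow.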
I didn't mean to imply that it's difficult to work around them, more that indexing and slicing are rather common and difficult to search for compared to easily searchable keywords like
I realise there are lots of ways to help the compiler to optimise out bounds checks - my interest in const generics stemmed from aiming for a type-level solution that is safe yet doesn't rely on panic branches to allow consistent error behaviour regardless of the optimisation level.
Agreed, these are both very useful |
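To make the const-generics idea from this comment a little more concrete, here is one possible sketch of compile-time-checked indexing into fixed-size arrays. `InBounds` and `ConstIndex` are hypothetical names invented for this example; they are not part of `core` or of any rust-embedded crate, and this is not a proposal anyone in the thread has endorsed. An out-of-range index fails the build when the function is instantiated, independent of the optimisation level.

```rust
/// Carries a compile-time proof that index `I` is valid for length `N`.
struct InBounds<const I: usize, const N: usize>;

impl<const I: usize, const N: usize> InBounds<I, N> {
    // Evaluated when the generic code is instantiated; the build fails
    // with this message if `I >= N`.
    const OK: () = assert!(I < N, "const index out of bounds");
}

/// Hypothetical extension trait for indexing with a const index.
trait ConstIndex<T> {
    fn at<const I: usize>(&self) -> &T;
}

impl<T, const N: usize> ConstIndex<T> for [T; N] {
    fn at<const I: usize>(&self) -> &T {
        // Referencing the constant forces the compile-time check.
        let () = InBounds::<I, N>::OK;
        // SAFETY: `I < N` was asserted at compile time just above.
        unsafe { self.get_unchecked(I) }
    }
}
```

With this in scope, `[10u8, 20, 30, 40].at::<2>()` yields a reference to `30`, while `.at::<4>()` on the same array is rejected at compile time. The cost is a small `unsafe` block justified by the assertion, and the approach only helps when the index is known at compile time.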
Please note that I was replying to a comment about making it easier to access resources in panic handlers. I don't think it makes sense for RTIC to be involved, if the panic handling is very simple. I could imagine that RTIC could (optionally) generate the panic handler for me though, if I specify it:

#[task(binds = panic, resources = [motor])]
fn handle_panic(cx: handle_panic::Context) {
    cx.resources.motor.disable();
}

I haven't researched this, so it might not be practical. As I said, just a thought. |
I did some research. This has been discussed before: rtic-rs/rfcs#27 |
That's not possible because RTIC can't mask HardFault, and masking interrupts is what guarantees a unique &mut borrow |
@hannobraun The panic handler binding in RTIC would indeed be possible, I can add it to a discussion point in the next RTIC meeting and see how it fits into the current development plan. :) |
Note that you can use https://crates.io/crates/panic-persist or something similar to check at boot whether there was a panic, and stop the motor at that point, in a sane environment. |
Nirvana fallacy. I'm never going to wear a seatbelt because I could also be shot in the head. Just because it's impossible to prevent every panic, doesn't mean we can't stop the preventable ones. |
TL;DR
It would be great if rust-embedded adopted panic-never as a standard for libraries. I found it impossible to take advantage of panic-never while also taking advantage of the rust-embedded libraries necessary to build my first non-trivial Rust embedded project. This was due to frequent uses of panic! throughout libraries. This is totally understandable as Rust itself provides very little tooling to avoid this and almost encourages it (i.e. indexing, slicing). While panic! is very useful for quick iteration in software, it can be detrimental to firmware without significant tooling/logging in the panic-handler which isn't always feasible for embedded projects, especially due to severe lack of context in the panic handler compared to regular error handling. The newish no-panic crate may help with retrofitting panic-never, and const generics may help to avoid common causes of panicking code branches.
Context & Motivation
This suggestion comes from my experience writing firmware for a kinetic artwork over the past few months. I finally have had some time to reflect and thought I'd open an issue here to get others' thoughts on this :) It seems particularly good timing considering that avoiding Rust panics seems to be the hot topic for landing Rust support in the Linux kernel today https://lkml.org/lkml/2021/4/14/1099.
This was one of my first major firmware projects, involving a pretty tall stack of protocols including SPI for LEDs, I2C for time of flight, one-wire UART for motor driver control and Ethernet for real-time TCP/IP communication with the master software. Achieving this project within the deadline that we had would have been impossible without all the awesome existing work in the rust-embedded ecosystem. The fact that I could include an Ethernet bootloader, serialization between software and firmware, use a real-time scheduler and more thanks to existing work made the project possible! Naturally, this required leaning on quite a few dependencies, including postcard (and in turn serde), rtic, smoltcp and loads of others.
During early prototype testing, I quickly learned just how drastic panic!ing could be in firmware compared to my experience with writing software, particularly when controlling a large number of motors attached to expensive parts. This led me to search for solutions to ensure that I could avoid panicking entirely, which led me to the panic-never crate.
After a few days of commenting out the entire project and trying to add modules back one by one with panic-never included, I quickly realised that, while I could track down and address all of the panic! sites in my own code, it would be impossible for me to track down and address all panic! sites throughout all the dependencies that I required for the project to function - especially considering the limited, cryptic linker errors that panic-never could provide, resulting in an approach that consisted of commenting everything out and re-adding parts one at a time until the linker error showed up.
Following the realisation that I would have to accept the possibility of panic!s, I began work on a custom panic handler. Easily the largest problem with the custom panic handler was the lack of context, and not knowing what state the device was in when the panic occurred... This led to the need for moving parts of the application state into global state. This was necessary to 1. send some indication of an error back to the master via Ethernet (provided it was even possible to do so in the panicking state) and 2. disable the motor via UART! This was of utmost importance as the motor driver has its own step generator, and if the last thing it received was some high velocity before the panic, then there was nothing else stopping it from endlessly driving out the motors until someone freaks out and cuts the power 😱
Beyond the obvious reasons why moving state into a global context was unpleasant, I was using RTIC to handle scheduling. RTIC requires managing state in a certain way in order for its priority task system to function in a safe-yet-efficient manner. This meant lots of acrobatics with mutexes and critical sections in order to expose the necessary networking and motor state to the panic handler through a global context, much of which I'm still uncertain is actually safe to this day.
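For readers unfamiliar with the pattern described above, the following is a host-side sketch of "moving state into a global so the panic handler can reach it". It is a simulation only: it uses `std::sync::Mutex` so it can run on a desktop, whereas firmware would typically use a critical-section-based cell, and `Motor`, `MOTOR`, `install_motor`, `emergency_stop` and `motor_enabled` are all hypothetical names, not the actual code from the artwork or any RTIC API.

```rust
use std::sync::{Mutex, TryLockError};

/// Hypothetical stand-in for a motor driver handle.
struct Motor {
    enabled: bool,
}

impl Motor {
    fn disable(&mut self) {
        self.enabled = false;
    }
}

/// Global slot; empty until the application has initialised the driver.
static MOTOR: Mutex<Option<Motor>> = Mutex::new(None);

/// Called once during start-up to hand the driver to the global slot.
fn install_motor(motor: Motor) {
    if let Ok(mut slot) = MOTOR.lock() {
        *slot = Some(motor);
    }
}

/// What a panic handler would call. It must not block, so it uses
/// `try_lock`, and it still proceeds if the lock was poisoned.
/// Returns `true` only if a motor was found and disabled.
fn emergency_stop() -> bool {
    let mut slot = match MOTOR.try_lock() {
        Ok(guard) => guard,
        Err(TryLockError::Poisoned(poisoned)) => poisoned.into_inner(),
        Err(TryLockError::WouldBlock) => return false,
    };
    match slot.as_mut() {
        Some(motor) => {
            motor.disable();
            true
        }
        None => false,
    }
}

/// Helper for inspecting the simulated motor state.
fn motor_enabled() -> Option<bool> {
    MOTOR.lock().ok()?.as_ref().map(|m| m.enabled)
}
```

The `WouldBlock` branch shows the weak spot of this design: if the panic happens while the normal code path holds the lock, the handler cannot reach the motor at all, which is part of why the approach feels fragile.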
I want to acknowledge that all of these problems are ultimately our own fault. Specifically, for cornering ourselves by accepting a timeline for a project that meant I simply couldn't both 1. take advantage of many of the awesome existing crates throughout the Rust ecosystem that were necessary to make such a sophisticated project possible in a short amount of time and 2. actually review all of these dependencies and develop enough familiarity with their src to guarantee there could be no panic! conditions throughout. It is this choice that led to the need for the aforementioned hacks and awkward panic handling solution.
That said, I think it is at least worth checking whether or not it is possible to have our cake and eat it too by investigating the feasibility of having panic-never as a standard practice for embedded libraries. I cannot tell you how much of a relief it would be to know for certain that it simply wasn't possible to panic, particularly when the firmware is moving 100s of motors around on an artist's budget that provides very little room for repairs 😂 While custom panic handlers help, they provide almost no context about the state of the system during a panic by default and encourage some serious anti-patterns in order to handle those cases.
no-panic
I think perhaps this is more achievable now that no-panic exists, allowing for a more granular approach to narrowing down panic! sites, also with slightly better error messages. The function attribute approach allows for achieving a panic-less codebase one function at a time, without having to solve everything at once as is required with panic-never alone.
Indexing, slicing and const generics
I think const generics may also play a large role in making this possible. Perhaps the sneakiest and most prevalent culprit for introducing panicking code is Rust's core Indexing and slicing methods. This is especially frustrating when most embedded code works with fixed size arrays, where the author performing the indexing/slicing knows that it is safe to do so and that it is impossible for the panic to actually occur. I wonder if we can come up with some const-generic based approach to bounds checking for indexing and slicing of fixed size arrays that avoids the need for generating panicking branches.
rustc
Another approach might be to instead focus on landing support for avoiding panicking in rustc itself? While no-panic is already a big improvement over panic-never, it is still a long way from having a nicely formatted call-stack with line-numbered links to the source code of each function call that leads to each panic. I'm yet to investigate existing proposals for such a tool.
My aim with this issue is mostly to begin a discussion. I'm curious to hear others' thoughts, i.e. Have you had similar experiences? Is this a worthy/practical goal? Or perhaps infeasible for reasons I haven't touched on yet?