
Consider no_std support #15

Closed
bjoernQ opened this issue Feb 3, 2023 · 11 comments
Comments

@bjoernQ
Contributor

bjoernQ commented Feb 3, 2023

To run this on an MCU, it would be beneficial to support no_std environments.

I did a POC "from scratch" which is no_std and no_alloc here: https://github.com/bjoernQ/bare-matter (the code is very bad, but it might serve as inspiration).

I guess making this completely no_alloc will be hard, but at least no_std should be possible.

@M1cha

M1cha commented Mar 21, 2023

I would've really loved for this to be no_alloc, since the C++ implementation has the same issue.

I guess the devs are just not into the idea (which is fine, I guess). Making this crate support no_alloc probably wouldn't be hard in general, but it might result in a completely different API design and codebase, and thus essentially a different project.

@nevi-me
Contributor

nevi-me commented Mar 21, 2023

@M1cha I don't get the impression that it's not wanted/desired. There's a partial implementation at https://github.com/ivmarkov/matter-rs/tree/no-alloc. It's most probably a matter of someone not having enough time to implement it.

I'm working on making the crate work with ESP32 again, as some crypto functionality doesn't work. I don't have experience with no_std, but if I knew what I was doing, I'd also try that out.

@M1cha

M1cha commented Mar 22, 2023

@nevi-me
I only took a quick look, but I saw quite a few Boxes and Vecs in the matter-rs codebase and thought it might take quite some redesign work to change that. I'd be very happy to be proven wrong, and it's awesome that somebody else is/was working on it.

Quite off-topic, but:

> I don't get the impression that it's not wanted/desired

It's just my impression and it might very well be wrong (and hopefully so).

To me, not relying on memory allocations (in both the Rust and the C++ codebases) is not just a matter of "making it work" in certain environments. I find it careless not to pay attention to this, and there are now released products which perform these allocations and are thus prone to crash.

Coming from both the Zephyr RTOS and the current HomeKit SDK where this concept is embraced throughout the whole codebase, it just feels like a huge downgrade.

Additional context (e.g. why placement new is not a solution):
project-chip/connectedhomeip#2748 (comment)

@ivmarkov
Contributor

ivmarkov commented Jul 12, 2023

@M1cha Just to ping you - if you have not followed the development in this repo recently - to take a look at the updated README file.

The TL;DR is that what will soon become the main branch has zero heap allocations and is no_std compatible (i.e. only Rust core is used) - modulo a few small allocations in the rust-crypto upstream dependency, which we might address soon as well.

Otherwise - yeah - I'm in the same boat as you: no heap allocations (or minimal, well-thought-out ones).

@ivmarkov
Contributor

> Additional context (e.g. why placement new is not a solution): project-chip/connectedhomeip#2748 (comment)

@M1cha By the way, I really hope your concerns were with "placement new" into CHIPMem allocations - which is what the original comment you were referring to actually describes - and not with "placement new" in general.

I can't imagine what your arguments against "placement new" in general might be, but I can say I'm sorely missing this construct in Rust, where - due to un-optimized moves - you might end up with stack blowups which are just as dangerous as OOM due to heap allocs / memory fragmentation.

@ivmarkov
Contributor

As for the codebase - yeah, quite a bit had to be touched/changed, but in the end not as much as I'd expected.

@M1cha

M1cha commented Jul 12, 2023

> @M1cha By the way, I really hope your concerns were with "placement new" into CHIPMem allocations - which is what the original comment you were referring to actually describes - and not with "placement new" in general.

"Placement new" can be used in multiple ways, and yes, I only mean allocating data from CHIPMem and passing it to placement new, because that is functionally equivalent (and thus disliked by me) to a normal new. This concept is only useful if you need a different allocator, which is very common in game development to speed up allocations by removing the need for syscalls. The Rust equivalent of this is custom allocators, which would be just as bad in this situation.
TL;DR: My main concerns are OOM and memory leaks, and with CHIPMem you still have both.

> I can't imagine what your arguments against "placement new" in general might be, but I can say I'm sorely missing this construct in Rust, where - due to un-optimized moves - you might end up with stack blowups which are just as dangerous as OOM due to heap allocs / memory fragmentation.

You actually have multiple ways to reduce stack usage in Rust, but they probably all come down to having some piece of global, static memory that is allocated for you by the binary loader (on Linux, these are ELF sections). This is basically the same as C's "global" static variables. In Rust, you can have a variable of any type globally accessible, but of course you'll have to use unsafe blocks to access it unless you use a helper like lazy_static. You can also write your own allocator that hands out memory from one of these global buffers.
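(For illustration - a minimal sketch of that last idea, with made-up names and sizes, not code from matter-rs or the Matter SDK: a fixed static buffer handed out by a tiny bump allocator, so all memory comes from the binary's data segment rather than the heap.)

```rust
use core::cell::UnsafeCell;
use core::sync::atomic::{AtomicUsize, Ordering};

// Illustrative pool size; a real stack would size this per config.
const POOL_SIZE: usize = 1024;

// A fixed-size pool living in static memory (the data/bss segment),
// wrapped in UnsafeCell so we can hand out mutable regions.
struct Pool(UnsafeCell<[u8; POOL_SIZE]>);
// Safety: bump_alloc only ever hands out disjoint regions.
unsafe impl Sync for Pool {}

static POOL: Pool = Pool(UnsafeCell::new([0; POOL_SIZE]));
static NEXT: AtomicUsize = AtomicUsize::new(0);

// Hand out `len` bytes from the pool, or None when it is exhausted.
// A real allocator would also handle alignment and freeing.
fn bump_alloc(len: usize) -> Option<&'static mut [u8]> {
    let start = NEXT.fetch_add(len, Ordering::Relaxed);
    if start.checked_add(len)? > POOL_SIZE {
        return None;
    }
    let base = POOL.0.get() as *mut u8;
    // Safety: each call returns a disjoint, in-bounds region of POOL.
    unsafe { Some(core::slice::from_raw_parts_mut(base.add(start), len)) }
}

fn main() {
    let a = bump_alloc(16).unwrap();
    a[0] = 42;
    let b = bump_alloc(32).unwrap();
    assert_eq!(a[0], 42);
    assert_eq!(b.len(), 32);
    // The pool is finite: asking for more than remains fails cleanly
    // instead of OOM-crashing at some random later point.
    assert!(bump_alloc(POOL_SIZE).is_none());
}
```

There is no dynamic allocation anywhere: the worst case is a clean `None` at a known call site, which is exactly the property being argued for above.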

About unoptimized moves: IIRC this is a general conceptual issue with Rust, and it's unclear whether it's solvable in a generic way. Release-mode binaries should almost never have that problem, and if they do, rustc and/or the LLVM optimizer need to be improved.

@ivmarkov
Contributor

> I can't imagine what your arguments against "placement new" in general might be, but I can say I'm sorely missing this construct in Rust, where - due to un-optimized moves - you might end up with stack blowups which are just as dangerous as OOM due to heap allocs / memory fragmentation.

> You actually have multiple ways to reduce stack usage in Rust, but they probably all come down to having some piece of global, static memory that is allocated for you by the binary loader (on Linux, these are ELF sections). This is basically the same as C's "global" static variables. In Rust, you can have a variable of any type globally accessible, but of course you'll have to use unsafe blocks to access it unless you use a helper like lazy_static. You can also write your own allocator that hands out memory from one of these global buffers.

(Skipping this as this is common knowledge.)

> About unoptimized moves: IIRC this is a general conceptual issue with Rust, and it's unclear whether it's solvable in a generic way. Release-mode binaries should almost never have that problem, and if they do, rustc and/or the LLVM optimizer need to be improved.

I think you are underestimating the need for placement-new in Rust.

I don't know how conceptual and unclear it is, but I have seen people dismiss the need for placement new in Rust precisely because "release mode binaries should never have this problem" and "it is all about rustc/LLVM optimisations". Except that these optimisations never come.

The current status quo is that if you want to see moves (and thus stack blowups) optimised, you should indeed be running release builds - which is sometimes annoying. But even that is not enough:

  • You need to explicitly mark your constructor functions (all the way down) with #[inline(always)], as the optimisation does not trigger across function calls.
  • You cannot use opt-level "z" (optimise for size, which is important on MCUs), as it comes with a smaller inline depth, so not all your functions might be inlined. You need opt-level 3 instead, which hurts flash size.
  • You still need to use MaybeUninit quite a bit - and in case you have not used it before: it is ugly and unsafe.
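(For illustration - a sketch of the MaybeUninit pattern the last bullet refers to, with made-up names and sizes: initializing a large value directly inside caller-provided storage instead of constructing it locally and returning it by move.)

```rust
use core::mem::MaybeUninit;

// Illustrative size standing in for a large on-stack structure.
const N: usize = 4096;

// Fill a large buffer in place, byte by byte, inside storage the
// caller owns. The alternative - building `[u8; N]` locally and
// returning it - may copy all N bytes if the move is not optimised.
fn init_in_place(slot: &mut MaybeUninit<[u8; N]>) -> &mut [u8; N] {
    let p = slot.as_mut_ptr() as *mut u8;
    for i in 0..N {
        // Safety: p points to N bytes of valid (if uninitialized) storage.
        unsafe { p.add(i).write((i % 256) as u8) };
    }
    // Safety: every byte was written in the loop above.
    unsafe { slot.assume_init_mut() }
}

fn main() {
    let mut slot = MaybeUninit::uninit();
    let buf = init_in_place(&mut slot);
    assert_eq!(buf[0], 0);
    assert_eq!(buf[300], 44); // 300 % 256
}
```

The `unsafe` blocks and the manual "everything is initialized" bookkeeping are exactly the ugliness being complained about; a language-level placement new would make the compiler do this.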

The above ^^^ is not just about stack blowups, by the way. Your "Rust statics" idea also suffers from the lack of placement new: if the data you put in the static var is not const-new-able, its initialization cannot be reduced to a copy from flash/ELF rodata to the data segment. What happens instead is that your data is first allocated on the stack and only then moved into the static var.

(All of the above were actual issues in the no_std codebase that will land in main soon.)
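(To illustrate the const-new-able distinction above - a sketch with made-up names, not matter-rs code: a `const fn` constructor lets the static be materialized directly in the binary's data segment, while a non-const constructor forces a runtime build-then-move.)

```rust
use std::sync::OnceLock;

#[derive(Debug, PartialEq)]
struct Config {
    sessions: usize,
    fabrics: usize,
}

impl Config {
    // A `const fn` constructor: the static below is baked into the
    // binary at compile time - no runtime stack temporary, no move.
    const fn new() -> Self {
        Config { sessions: 16, fabrics: 3 }
    }
}

static CONST_CONFIG: Config = Config::new();

// If the constructor is NOT const (e.g. it computes something at
// runtime), the value must be built first - on the stack - and then
// moved into static storage: the exact move discussed above.
static RUNTIME_CONFIG: OnceLock<Config> = OnceLock::new();

fn main() {
    assert_eq!(CONST_CONFIG.sessions, 16);
    assert_eq!(CONST_CONFIG.fabrics, 3);
    let cfg = RUNTIME_CONFIG.get_or_init(|| Config { sessions: 16, fabrics: 3 });
    assert_eq!(*cfg, CONST_CONFIG);
}
```

For a small struct like this the move is harmless; the pain starts when the value is tens of kilobytes, as with the session/exchange tables mentioned below.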

By the way, the current RAM usage of the Rust no_std Matter stack is around 130K (all on-stack; max stack usage, though you can of course move stuff to statics, as you say) with 3 fabrics, 16 sessions and 8 exchanges (in release, of course; in debug it is 3-4x bigger). I still hope to optimize it below 100K, but the "placement new" issue does not make that easy. Contrast this with the Matter C++ SDK, where I've heard figures of ~70K to 90K.

@M1cha

M1cha commented Jul 12, 2023

@ivmarkov That comment is actually very insightful, but if you have the time, I'd need some more information to understand the core issue:

In which situations do moves not get optimized? As far as I know, this is mostly implemented as a special case for Box, and for that I haven't yet run into a situation where it doesn't work - even across function calls, without inline and with opt-level=z: https://godbolt.org/z/P8j1KTPen

If you're not using Box (which is likely, given the context of the discussion), you're probably simply not getting any of those optimizations. That would make a lot of sense to me.
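(The Box special case being discussed looks like this - an illustrative sketch, not code from the thread's godbolt link. Note that whether the intermediate on-stack array is actually elided depends on the optimizer and opt-level, so the test below only checks functional behavior, not the optimization itself.)

```rust
// A buffer large enough that an unelided on-stack copy would matter
// on an MCU. Size is illustrative.
const BIG: usize = 1 << 20;

// rustc/LLVM can often construct the array directly inside the heap
// allocation made by `Box::new` - the Box-specific optimization
// mentioned above. In debug builds, the array is typically built on
// the stack first and then copied into the allocation.
fn make_big() -> Box<[u8; BIG]> {
    Box::new([0u8; BIG])
}

fn main() {
    let b = make_big();
    assert_eq!(b.len(), BIG);
    assert!(b.iter().all(|&x| x == 0));
}
```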

@ivmarkov
Contributor

Yes, of course it is not Box-related. It is more about moves that happen within the stack, across nested function calls - and also moves that happen when you use something like static_cell, which runs non-const constructors that build values that then need to land in static memory. Without all of the magic described above, you often end up with excessive (albeit temporary) stack memory usage.

I'll pull out an example for you maybe later. A bit too busy atm.
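(For illustration - a minimal, core-only sketch of the static_cell pattern being discussed. This is a simplified stand-in, not the static_cell crate's actual implementation: uninitialized static storage that a non-const constructor fills exactly once at runtime.)

```rust
use core::cell::UnsafeCell;
use core::mem::MaybeUninit;
use core::sync::atomic::{AtomicBool, Ordering};

// A one-shot slot: static storage for a T that only exists after a
// runtime `init` call. Simplified sketch of the static_cell idea.
struct OnceSlot<T> {
    taken: AtomicBool,
    value: UnsafeCell<MaybeUninit<T>>,
}

// Safety: `taken` ensures the value is handed out exactly once.
unsafe impl<T> Sync for OnceSlot<T> {}

impl<T> OnceSlot<T> {
    const fn new() -> Self {
        OnceSlot {
            taken: AtomicBool::new(false),
            value: UnsafeCell::new(MaybeUninit::uninit()),
        }
    }

    // Move `val` into static storage and return a 'static reference.
    // Note: `val` is still constructed on the caller's stack first and
    // then moved here - exactly the move the discussion is about.
    fn init(&'static self, val: T) -> &'static mut T {
        assert!(!self.taken.swap(true, Ordering::AcqRel), "already initialized");
        // Safety: `taken` guarantees exclusive, one-time access.
        unsafe { (*self.value.get()).write(val) }
    }
}

static SLOT: OnceSlot<[u32; 8]> = OnceSlot::new();

fn main() {
    let arr = SLOT.init([7; 8]);
    arr[0] = 1;
    assert_eq!(arr[0], 1);
    assert_eq!(arr[7], 7);
}
```

The storage itself lives in .bss, so only the temporary constructor value touches the stack; with a true placement new even that temporary would go away.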

@ivmarkov
Contributor

ivmarkov commented Oct 6, 2023

@kedars I think we should close this.
