Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

overhaul thread stack specification #24714

Merged
merged 26 commits into from
Jul 31, 2020

Conversation

andrewboie
Copy link
Contributor

@andrewboie andrewboie commented Apr 26, 2020

This PR attempts to centralize stack management to common code, introduce kernel-only stacks which use less memory, and document best practices for new architecture ports.

Major changes in this PR:

  • arch_new_thread() API has been revised. Some extraneous parameters have been removed, and the kernel now pre-computes an initial stack pointer which uses as much space as possible and takes into account random offsets and thread-local storage. One change related to this credit goes to @ceolin. arch_new_thread() no longer needs to call back into the kernel to set stack boundaries.
  • The arch interface for thread stacks has changed. The macros for declaring stacks are all centralized. Arches now define the required stack pointer alignment, the alignment of stack objects, and a macro to convert a requested stack buffer size into the actual object size needed by HW. For any given arch the definitions are much simpler.
  • New macros introduced for declaring kernel-only stacks, all are prefixed with K_KERNEL_STACK. Stacks that are capable of hosting user mode are now placed in a designated section so that they can be recognized at runtime.
  • Improvements to the test case for thread stacks, now no longer part of the userspace test.
  • Additional thread stack memory reserved due to rounding up to size/alignment requirements is now fully used.
  • arches, drivers, subsystems, and library code updated to use kernel-only stacks where appropriate, potentially saving large amounts of memory if user mode is enabled.

Please see individual commit messages for more details.

Fixes: #13651
Fixes: #13637
Fixes: #25635

Note about naming: I chose K_KERNEL_STACK_* for the supervisor-only stack macros and left K_THREAD_STACK_* alone, as to not break existing code (you can use K_THREAD_STACK for everything at the expense of memory)

Other than avoiding breakages, I have no particular attachment to these names though. There is already unfortunately another type of kernel object called k_stack.

This also closes #14269 (making it obsolete, and removing relevant workaround for ARMv7-M)

@zephyrbot
Copy link
Collaborator

zephyrbot commented Apr 26, 2020

All checks are passing now.

Tip: The bot edits this comment instead of posting a new one, so you can check the comment's history to see earlier messages.

@andrewboie andrewboie force-pushed the stacks-phase-2 branch 8 times, most recently from 5786294 to 9bc36ff Compare May 14, 2020 23:58
@zephyrbot zephyrbot added area: native port Host native arch port (native_sim) area: X86 x86 Architecture (32-bit) area: ARM ARM (32-bit) Architecture area: NIOS2 NIOS2 Architecture area: Xtensa Xtensa Architecture area: ARC ARC Architecture area: Shell Shell subsystem area: API Changes to public APIs area: Tests Issues related to a particular existing or missing test area: Kernel labels May 15, 2020
@andrewboie andrewboie force-pushed the stacks-phase-2 branch 5 times, most recently from 71db270 to 8a87dda Compare May 15, 2020 22:16
Andrew Boie added 7 commits July 30, 2020 11:17
This now takes a stack pointer as an argument with TLS
and random offsets accounted for properly.

Based on zephyrproject-rtos#24467 authored by Flavio Ceolin.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Check that the base of every stack object is properly
defined. This can get messed up if K_THREAD_STACK_ARRAY_DEFINE
isn't specified properly.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This had been copy-pasted between linker scripts, create
a central header for it.

The linker scripts for xtensa and posix have very different
structure and have been left alone.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Most of these checks can be performed on non-userspace
supporting platforms.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
thread->stack_info is now much more well maintained. Make these
tests that validate that user mode has no access just outside
the bounds of it, instead of the entire object.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Purely for informational purposes.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Currently for informational purposes, although we do check that
the carveout is smaller than the stack_size.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
@andrewboie
Copy link
Contributor Author

Updated based on @ioannisg comments.
Also, based on an offline discussion with @dleach02, I've simplified ARM to always have K_THREAD_STACK_RESERVED be 0. It's simpler this way since ARM is unconditionally generating separate privilege stacks.

@andrewboie although, I am not against this simplification, I'd like to point out that now K_THREAD_STACK_RESERVED for ARMv8-M, will be 32 when CONFIG_USERSPACE=n and 0 when CONFIG_USERSPACE=y, while the HW does not impose any relevant constraints that justify this difference. This might be confusing for a developer that builds his code with / without USERSPACE enabled and sees different available stack for their supervisor threads (assuming that they use K_THREAD_STACK_DEFINE).

We had always been treating the (CONFIG_USERSPACE && CONFIG_MPU_REQUIRES_POWER_OF_TWO_ALIGNMENT) as the special case.

If you do this simplification, it means you favor consistency across the different cortex-m variants.
But maybe it is better to favor consistency across building with/without USERSPACE for each cortex-m variant.
In both cases you need to sacrifice something.

I need direction on what we need to do for this PR to get it merged.

If we reserve data, then there is a guard instantiated in two places, with the one in the stack object wasted when we drop to user mode. If we don't, we have the above situation.

The proper fix in my view is to convert ARM to not generate privilege stacks when it shouldn't, which I believe we agreed to do after this PR. If there's something you would like me to do for this PR to ease your concerns please share it.

@ioannisg
Copy link
Member

If we reserve data, then there is a guard instantiated in two places, with the one in the stack object wasted when we drop to user mode. If we don't, we have the above situation.

No we don't; the guard in the stack object becomes part of the user thread stack, for ARCHes that do not require power-of-two alignment. Just that the user stack will grow a bit more :)

But, I have been thinking more of it, and I think that your simplification is the best option, for one fundamental reason. Most of ARMv8-M platforms will be Mainline (Cortex-M33) and those ones won't use MPU guards, anyways. That's regardless of building with or without USERSPACE support. The only ones that will have the minor inconsistency that I described above are

  • the Kinetis SoCs with the "nxp" MPU
  • ARMv8-M Baseline SoCs that won't have SPLIM registers. Right now there are no such SoCs in Zephyr, and, in general, there are very few designs, so I assume we should favor the majority which is the Cortex-M33

The other reason is what you mention: we will move to not generating separate privilege stacks for ARMv8-M, hopefully soon.

Copy link
Member

@ioannisg ioannisg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving this, assuming @andrewboie will fix the last issue spotted.

For issues that might occur, there's plenty of time to fix until release.

Andrew Boie added 6 commits July 30, 2020 15:40
These stacks are appropriate for threads that run purely in
supervisor mode, and also as stacks for interrupt and exception
handling.

Two new arch defines are introduced:

- ARCH_KERNEL_STACK_GUARD_SIZE
- ARCH_KERNEL_STACK_OBJ_ALIGN

New public declaration macros:

- K_KERNEL_STACK_RESERVED
- K_KERNEL_STACK_EXTERN
- K_KERNEL_STACK_DEFINE
- K_KERNEL_STACK_ARRAY_DEFINE
- K_KERNEL_STACK_MEMBER
- K_KERNEL_STACK_SIZEOF

If user mode is not enabled, K_KERNEL_STACK_* and K_THREAD_STACK_*
are equivalent.

Separately generated privilege elevation stacks are now declared
like kernel stacks, removing the need for K_PRIVILEGE_STACK_ALIGN.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This will save memory on many platforms that enable
user mode.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
The idle thread(s) always run in supervisor mode.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
The system workqueue is a kernel thread that doesn't run
in user mode.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Explain best practices for defining stack object macros.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
There are predictable relationships between the actual size
of a stack object, the return value of K_*_STACK_SIZEOF() macros,
and the original size passed in when the stack was declared.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Need to test MPU guards on ARMv8.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
@maksimmasalski
Copy link
Collaborator

@andrewboie Does that fixes #26865 ?

@andrewboie
Copy link
Contributor Author

@maksimmasalski no it does not but I'll find some time for it this week, easy fix

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: API Changes to public APIs area: ARC ARC Architecture area: ARM ARM (32-bit) Architecture area: Documentation area: Kernel area: native port Host native arch port (native_sim) area: NIOS2 NIOS2 Architecture area: Shell Shell subsystem area: Tests Issues related to a particular existing or missing test area: X86 x86 Architecture (32-bit) area: Xtensa Xtensa Architecture
Projects
None yet