
arch/xtensa: arch_cpu_idle cleanup #64760

Merged 3 commits on Nov 20, 2023
7 changes: 7 additions & 0 deletions arch/Kconfig
@@ -976,3 +976,10 @@ config TOOLCHAIN_HAS_BUILTIN_FFS
default y if !(64BIT && RISCV)
help
Hidden option to signal that toolchain has __builtin_ffs*().

config ARCH_CPU_IDLE_CUSTOM
bool "Custom arch_cpu_idle implementation"
default n
help
This option allows applications to override the default arch idle implementation with
a custom one.

Collaborator:

Not sure about this. Why would this be an application-level override? It seems this is a board/SoC-specific requirement, and if you build for these targets, the custom function has to be used.

Contributor (author):

@kv2019i I tested cAVS 2.5 and didn't find any issue. I traced this issue, which was found on cAVS 1.8 platforms in 2017, and these patches were applied then. I don't think it still exists on cAVS 2.5, so I don't enable it for cAVS 2.5. I kept the option since it will take effect when cAVS 1.8 platforms are supported.
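With ARCH_CPU_IDLE_CUSTOM in place, a SoC that must supply its own idle routine would select the option from its own Kconfig rather than leaving it to the application, which is the reviewer's point above. A hypothetical sketch (the SOC_FOO symbol is illustrative only, not part of this PR):

```kconfig
config SOC_FOO
	bool "Foo SoC with a non-standard idle sequence"
	# The SoC provides arch_cpu_idle() itself, so opt out of the
	# arch-generic implementation at build time.
	select ARCH_CPU_IDLE_CUSTOM
```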
12 changes: 0 additions & 12 deletions arch/xtensa/Kconfig
@@ -50,18 +50,6 @@ config XTENSA_ENABLE_BACKTRACE
help
Enable this config option to print backtrace on panic exception

config XTENSA_CPU_IDLE_SPIN
bool "Use busy loop for k_cpu_idle"
help
Use a spin loop instead of WAITI for the CPU idle state.

config XTENSA_WAITI_BUG
bool "Workaround sequence for WAITI bug on LX6"
help
SOF traditionally contains this workaround on its ADSP
platforms which prefixes a WAITI entry with 128 NOP
instructions followed by an ISYNC and EXTW.

config XTENSA_SMALL_VECTOR_TABLE_ENTRY
bool "Workaround for small vector table entries"
help
39 changes: 2 additions & 37 deletions arch/xtensa/core/cpu_idle.c
@@ -6,48 +6,13 @@
#include <zephyr/toolchain.h>
#include <zephyr/tracing/tracing.h>

/* xt-clang removes any NOPs more than 8. So we need to set
* no optimization to avoid those NOPs from being removed.
*
* This function is simple enough and full of hand-written
* assembly that optimization is not really meaningful
* anyway. So we can skip optimization unconditionally.
* Re-evaluate its use and add #ifdef if this assumption
* is no longer valid.
*/
__no_optimization
#ifndef CONFIG_ARCH_CPU_IDLE_CUSTOM
void arch_cpu_idle(void)
{
sys_trace_idle();

/* Just spin forever with interrupts unmasked, for platforms
* where WAITI can't be used or where its behavior is
* complicated (Intel DSPs will power gate on idle entry under
* some circumstances)
*/
if (IS_ENABLED(CONFIG_XTENSA_CPU_IDLE_SPIN)) {
__asm__ volatile("rsil a0, 0");
__asm__ volatile("loop_forever: j loop_forever");
return;
}

/* Cribbed from SOF: workaround for a bug in some versions of
* the LX6 IP. Preprocessor ugliness avoids the need to
* figure out how to get the compiler to unroll a loop.
*/
if (IS_ENABLED(CONFIG_XTENSA_WAITI_BUG)) {
#define NOP4 __asm__ volatile("nop; nop; nop; nop");
#define NOP32 NOP4 NOP4 NOP4 NOP4 NOP4 NOP4 NOP4 NOP4
#define NOP128() NOP32 NOP32 NOP32 NOP32
NOP128();
#undef NOP128
#undef NOP32
#undef NOP4
__asm__ volatile("isync; extw");
}

__asm__ volatile ("waiti 0");
}
#endif

void arch_cpu_atomic_idle(unsigned int key)
{
12 changes: 12 additions & 0 deletions soc/xtensa/intel_adsp/Kconfig
@@ -113,4 +113,16 @@ config ADSP_IMR_CONTEXT_SAVE
entering D3 state. Later this context can be used for FW restore
when the Host powers up the DSP again.

config XTENSA_CPU_IDLE_SPIN
bool "Use busy loop for k_cpu_idle"
help
Use a spin loop instead of WAITI for the CPU idle state.

config XTENSA_WAITI_BUG
bool "Workaround sequence for WAITI bug on LX6"
help
SOF traditionally contains this workaround on its ADSP
platforms which prefixes a WAITI entry with 128 NOP
instructions followed by an ISYNC and EXTW.

endif # SOC_FAMILY_INTEL_ADSP
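Since the two symbols now sit inside the intel_adsp SoC family's Kconfig (closed by the `endif # SOC_FAMILY_INTEL_ADSP` above), they are only visible when building for that family. An application for an affected LX6-based ADSP target would then enable the workaround from its prj.conf (illustrative fragment; which boards actually need it is platform-specific):

```kconfig
CONFIG_XTENSA_WAITI_BUG=y
```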
45 changes: 45 additions & 0 deletions soc/xtensa/intel_adsp/cavs/power.c
@@ -180,6 +180,51 @@ void pm_state_exit_post_ops(enum pm_state state, uint8_t substate_id)
}
#endif /* CONFIG_PM */

#ifdef CONFIG_ARCH_CPU_IDLE_CUSTOM
/* xt-clang removes any NOPs more than 8. So we need to set
Collaborator:


Please provide at least one xt-clang version with a confirmed reproduction. This workaround should refer to a proper issue number, but for now the mere output of xt-clang --version will already be infinitely better than nothing.

If there was a past discussion in https://github.com/thesofproject/sof about this, then also add the link. Toolchain issues are incredibly time-consuming, so the last thing they need is incomplete information.

I realize you're "only" moving an existing comment but as noted by @andyross this PR does far more than just re-organizing code.

Contributor:

I think @dcpleung was the one who discovered that particular optimizer misfeature, likely on the MTL toolchain?

Member:

All the xt-clang versions I used exhibited the same behavior of "ignoring" NOPs beyond 8.

* no optimization to avoid those NOPs from being removed.
*
* This function is simple enough and full of hand-written
* assembly that optimization is not really meaningful
* anyway. So we can skip optimization unconditionally.
* Re-evaluate its use and add #ifdef if this assumption
* is no longer valid.
*/
__no_optimization
void arch_cpu_idle(void)
{
sys_trace_idle();

/* Just spin forever with interrupts unmasked, for platforms
* where WAITI can't be used or where its behavior is
* complicated (Intel DSPs will power gate on idle entry under
* some circumstances)
*/
if (IS_ENABLED(CONFIG_XTENSA_CPU_IDLE_SPIN)) {
__asm__ volatile("rsil a0, 0");
__asm__ volatile("loop_forever: j loop_forever");
return;
}

/* Cribbed from SOF: workaround for a bug in some versions of
* the LX6 IP. Preprocessor ugliness avoids the need to
* figure out how to get the compiler to unroll a loop.
*/
if (IS_ENABLED(CONFIG_XTENSA_WAITI_BUG)) {
#define NOP4 __asm__ volatile("nop; nop; nop; nop");
#define NOP32 NOP4 NOP4 NOP4 NOP4 NOP4 NOP4 NOP4 NOP4
#define NOP128() NOP32 NOP32 NOP32 NOP32
NOP128();
#undef NOP128
#undef NOP32
#undef NOP4
__asm__ volatile("isync; extw");
}

__asm__ volatile ("waiti 0");
}
#endif

__imr void power_init(void)
{
/* Request HP ring oscillator and