-
Notifications
You must be signed in to change notification settings - Fork 6.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add x86 TSC_DEADLINE mode timer #34875
Conversation
First cut to try to get this reviewed and merged before freeze. It's should be safe-ish in a CI sense as it doesn't impact any platforms beyond up_squared right now. ehl_crb should get it too, of course, and presumably it works in ACRN (this is exactly what they asked for) but I haven't tested. |
CONFIG_APIC_TSC_DEADLINE_TIMER=y | ||
CONFIG_HPET_TIMER=n | ||
CONFIG_APIC_TIMER=n | ||
CONFIG_SYS_CLOCK_HW_CYCLES_PER_SEC=1593600000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW: This rate was hand-measured on my board using some hackery in the HPET driver. I'm currently trying to turn that into a rig we can commit to samples somewhere that will detect and calibrate HPET, TSC and APIC clocks so we don't have to constantly guess at this stuff. Will add to this PR if there's time, otherwise submit separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so it is choice between using APIC_TIMER and APIC_TSC_DEADLINE_TIMER. I did not realize that at first. Something to think about, maybe we could simplify this need for 3 configs to choose a timer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to test this driver with acrn_ehl_crb and see some regressions in tests/kernel/timer. I added the same lines as this up2 defconfig, except using HW cycles for ACRN. 1 of 4 completes tests/kernel/timer/timer_error_case/kernel.timer, the other 3 configurations hang at these tests:
- tests/kernel/timer/timer_monotonic/kernel.timer.monotonic - test_monotomic_timer
- tests/kernel/timer/timer_api/kernel.timer - test_timer_duration_period
- tests/kernel/timer/timer_api/kernel.timer.tickless - test_timer_duration_period
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying the same exercise on ehl_crb (which typically uses HPET configuration), fails differently than acrn:
2 pass and 2 fail.
- tests/kernel/timer/timer_api/kernel.timer - regression on test_timer_remaining, and hangs on test_timeout_abs
- tests/kernel/timer/timer_api/kernel.timer.tickless - regression on test_timer_remaining, and hangs on test_timeout_abs
START - test_timer_remaining
Assertion failed at WEST_TOPDIR/zephyr/tests/kernel/timer/timer_api/src/main.c:718: test_timer_remaining: (abs(delta_ticks) <= MAX(slew_ticks, 1U) is false)
tick/busy slew -2 larger than test threshold 0
FAIL - test_timer_remaining in 1.978 seconds
===================================================================
START - test_timeout_abs
(hangs)
EDIT: The tests pass on ehl_crb if I also add CONFIG_SYS_CLOCK_HW_CYCLES_PER_SEC=1900000000 to the defconfig (similar to acrn_ehl_crb defconfig).
EDIT2: Spoorthy is helping to expand testing scope on EHL now that we have this initial success. I'll keep looking at ACRN.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So CONFIG_SYS_CLOCK_HW_CYCLES_PER_SEC needs to be the actual TSC counter rate for the hardware, not the value copied from up_squared (though my guess is that is probably close) and absolutely not a value left over from an HPET driver, which is 100x too low. The 1.9 GHz number you have there certainly sounds about right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I understand. I was not reusing the up2 value. What seems to be the issue is that config is required with the new driver?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update: All tests under kernel are passing except kernel/common/bootdelay with the below changes in defconfig on EHL baremetal.
CONFIG_APIC_TSC_DEADLINE_TIMER=y
CONFIG_HPET_TIMER=n
CONFIG_APIC_TIMER=n
CONFIG_SYS_CLOCK_HW_CYCLES_PER_SEC=1900000000
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This now works great on acrn_ehl_crb (spot tested with twister the tests/kernel/timer/ and tests/kernel/common/.
* doesn't know how to manage LVT interrupts for anything | ||
* other than the calling/initial CPU. Same fence needed to | ||
* prevent later MSR writes from reordering before the APIC | ||
* configuration write. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: this is one of the glitches with SMP support in the older APIC driver, and really it's a hole in our APIC support. The timer driver shouldn't be responsible for this, new CPUs should be automatically set up with the correctly mapped and configured interrupts by the arch layer. But it's not like it's difficult to do here.
return lo + (((uint64_t)hi) << 32); | ||
} | ||
|
||
static void isr(const void *arg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one showcase here: note that this interrupt handler contains basically no code. Compute elapsed ticks with a subtract and divide and announce them.
5d5c08d
to
fb7a0e5
Compare
757cced
to
6a96064
Compare
Add ehl_crb to the enabled boards now that it's known to be working, adding one workaround in the boot_delay test to handle devices like this. |
6a96064
to
404ea93
Compare
The board defconfig changes look fine to me, but do we also want to update this stuff in
|
404ea93
to
6b234be
Compare
I was going to defer playing with kconfig a big. Partly because it always blows up on me in weird ways, partly because I think we'll be able to deprecate and remove the one-shot APIC driver, and partly because I think doing this right should involve a kconfig choice or menu or something instead of every SoC having to enumerate all the compatible timer drivers. |
boards/x86/ehl_crb/ehl_crb_defconfig
Outdated
@@ -12,3 +12,7 @@ CONFIG_X2APIC=y | |||
CONFIG_SMP=y | |||
CONFIG_BUILD_OUTPUT_EFI=y | |||
CONFIG_BUILD_NO_GAP_FILL=y | |||
CONFIG_APIC_TSC_DEADLINE_TIMER=y | |||
CONFIG_HPET_TIMER=n |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need those? just drop them.
6b234be
to
fd5b3c9
Compare
Updated once more to work around an upstream ACRN bug (see projectacrn/acrn-hypervisor#5985) and to migrate acrn_ehl_crb to the new driver too. |
@@ -4,7 +4,7 @@ CONFIG_SOC_IA32=y | |||
CONFIG_BOARD_ACRN=y | |||
CONFIG_PIC_DISABLE=y | |||
CONFIG_LOAPIC=y | |||
CONFIG_APIC_TIMER=y | |||
CONFIG_APIC_TIMER=n |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we just remove this instead of setting it to NO?
boards/x86/ehl_crb/ehl_crb_defconfig
Outdated
@@ -12,3 +12,7 @@ CONFIG_X2APIC=y | |||
CONFIG_SMP=y | |||
CONFIG_BUILD_OUTPUT_EFI=y | |||
CONFIG_BUILD_NO_GAP_FILL=y | |||
CONFIG_APIC_TSC_DEADLINE_TIMER=y | |||
CONFIG_HPET_TIMER=n |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we just remove this instead of setting it to NO?
@@ -13,3 +13,7 @@ CONFIG_SMP=y | |||
CONFIG_MP_NUM_CPUS=2 | |||
CONFIG_BUILD_OUTPUT_EFI=y | |||
CONFIG_BUILD_NO_GAP_FILL=y | |||
CONFIG_APIC_TSC_DEADLINE_TIMER=y | |||
CONFIG_HPET_TIMER=n | |||
CONFIG_APIC_TIMER=n |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we just remove this instead of setting it to NO?
Modern hardware all supports a TSC_DEADLINE mode for the APIC timer, where the same GHz-scale 64 bit TSC used for performance monitoring becomes the free-running counter used for cpu-local timer interrupts. Being a free running counter that does not need to be reset, it will not lose time in an interrupt. Being 64 bit, it needs no rollover or clamping logic in the driver when presented with a 32 bit tick count. Being a proper comparator, it will correctly trigger interrupts for times set "in the past" and thus needs no minimum/clamping logic. The counter is synchronized across the system architecturally (modulo one burp where firmware likes to change the adjustment value) so usage is SMP-safe by default. Access to the 64 bit counter and comparator value are single-instruction atomics even on 32 bit systems, so it beats even the RISC-V machine timer in complexity (which was our reigning champ for "simplest timer driver"). Really this is just ideal for Zephyr. So rather than try to add support for it to the existing APIC driver and increase complexity, make this a new standalone driver instead. All modern hardware has what it needs. The sole gotcha is that it's not easily emulatable (qemu supports it only under kvm where they can freeload on the host TSC) so it can be exercised only on hardware platforms right now. Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Atom and Core CPUs want to be using this driver as they all have the needed APIC support. Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
New timer driver needs an entry in the hard-coded list of IRQs Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
First, this test is a little suspect. It's assuming that the value returned from k_cycle_get_32() represents the time since system power-on. While that's an obvious implementation choice and surely often true, it's definitely not the way we document this API to the arch layer. It's perfectly legal for a platform to return any value as long as the counter is increasing at the correct rate. Leaving for now as there's no other way to test CONFIG_BOOT_DELAY, but this will likely be coming back to confuse us at some point. Regardless, that convention holds for x86 devices using any of the existing drivers. But on an EFI PC using the TSC counter as the clock source: (1) the counter is running at 1-2 GHz and (2) the time to get through an EFI BIOS and into Zephyr is routinely 10+ seconds, especially on reference hardware. The poor 32 bit API will roll over several times, and effectively be a random number by the time it reaches this test. Just skip this test with fast counter. Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
8c56ba6
to
b4a4eee
Compare
Under duress, and at great risk to my personal wellbeing, I was forced by project management to clean up the kconfig layer for the affected platforms and their various timer drivers. Looks better now. Seems not to have broken anything. |
I retested with the kconfig changes and looks good on acrn_ehl_crb so far. Thanks Andy! |
b4a4eee
to
74d6f36
Compare
The addition of a timer driver made a messy situation worse. Move board-level configuration like clock rates and dividers into the board and don't try to default it in the soc. Make it clear which kconfig goes with which driver. Likewise don't try to do driver selection in the soc, the board (or app) is in a better position to choose. Also clean up and better unify the up_squared 32/64 bit settings. Really only CONFIG_BOARD_NAME needs to care about the difference between these devices. Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Cleanup along the same lines as the last change to APL/up_squared. Make sure all hardware configuration is at the board level where it belongs and not in the soc, don't play games with defaulting timer drivers. Unify the configuration where possible and make it clearer which setting goes with which driver. Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
74d6f36
to
ca87be3
Compare
Update to use this driver as this CPU has the needed APIC support. Signed-off-by: Jennifer Williams <jennifer.m.williams@intel.com>
ca87be3
to
aac5967
Compare
Sneak in one last change, a rebase ordering glitch left two "=n" lines in Jen's final patch in the series. Doesn't affect generated code. |
Modern hardware all supports a TSC_DEADLINE mode for the APIC timer, where the same GHz-scale 64 bit TSC used for performance monitoring becomes the free-running counter used for cpu-local timer interrupts. Being a free running counter that does not need to be reset, it will not lose time in an interrupt. Being 64 bit, it needs no rollover or clamping logic in the driver when presented with a 32 bit tick count. Being a proper comparator, it will correctly trigger interrupts for times set "in the past" and thus needs no minimum/clamping logic. The counter is synchronized across the system architecturally (modulo one burp where firmware likes to change the adjustment value) so usage is SMP-safe by default. Access to the 64 bit counter and comparator value are single-instruction atomics even on 32 bit systems, so it beats even the RISC-V machine timer in complexity (which was our reigning champ for "simplest timer driver").
Really this is just ideal for Zephyr. So rather than try to add support for it to the existing APIC driver and increase complexity, make this a new standalone driver instead. All modern hardware has what it needs. The sole gotcha is that it's not easily emulatable (qemu supports it only under kvm where they can freeload on the host TSC) so it can be exercised only on hardware platforms right now.
This is being added to address shortcomings with the existing APIC driver in #34788 . The older driver is annoying to synchronize in SMP and has some mild warts with configuration. This one is trivial to make work in SMP and works better in all ways possible (OK, except not working in qemu...)
Fixes #34788