Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add x86 TSC_DEADLINE mode timer #34875

Merged
merged 7 commits into from
May 7, 2021

Conversation

andyross
Copy link
Contributor

@andyross andyross commented May 5, 2021

Modern hardware all supports a TSC_DEADLINE mode for the APIC timer, where the same GHz-scale 64 bit TSC used for performance monitoring becomes the free-running counter used for cpu-local timer interrupts. Being a free running counter that does not need to be reset, it will not lose time in an interrupt. Being 64 bit, it needs no rollover or clamping logic in the driver when presented with a 32 bit tick count. Being a proper comparator, it will correctly trigger interrupts for times set "in the past" and thus needs no minimum/clamping logic. The counter is synchronized across the system architecturally (modulo one burp where firmware likes to change the adjustment value) so usage is SMP-safe by default. Access to the 64 bit counter and comparator value are single-instruction atomics even on 32 bit systems, so it beats even the RISC-V machine timer in complexity (which was our reigning champ for "simplest timer driver").

Really this is just ideal for Zephyr. So rather than try to add support for it to the existing APIC driver and increase complexity, make this a new standalone driver instead. All modern hardware has what it needs. The sole gotcha is that it's not easily emulatable (qemu supports it only under kvm where they can freeload on the host TSC) so it can be exercised only on hardware platforms right now.

This is being added to address shortcomings with the existing APIC driver in #34788 . The older driver is annoying to synchronize in SMP and has some mild warts with configuration. This one is trivial to make work in SMP and works better in all ways possible (OK, except not working in qemu...)

Fixes #34788

@andyross
Copy link
Contributor Author

andyross commented May 5, 2021

First cut to try to get this reviewed and merged before freeze. It's should be safe-ish in a CI sense as it doesn't impact any platforms beyond up_squared right now. ehl_crb should get it too, of course, and presumably it works in ACRN (this is exactly what they asked for) but I haven't tested.

CONFIG_APIC_TSC_DEADLINE_TIMER=y
CONFIG_HPET_TIMER=n
CONFIG_APIC_TIMER=n
CONFIG_SYS_CLOCK_HW_CYCLES_PER_SEC=1593600000
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW: This rate was hand-measured on my board using some hackery in the HPET driver. I'm currently trying to turn that into a rig we can commit to samples somewhere that will detect and calibrate HPET, TSC and APIC clocks so we don't have to constantly guess at this stuff. Will add to this PR if there's time, otherwise submit separately.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so it is choice between using APIC_TIMER and APIC_TSC_DEADLINE_TIMER. I did not realize that at first. Something to think about, maybe we could simplify this need for 3 configs to choose a timer.

Copy link
Collaborator

@jenmwms jenmwms May 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to test this driver with acrn_ehl_crb and see some regressions in tests/kernel/timer. I added the same lines as this up2 defconfig, except using HW cycles for ACRN. 1 of 4 completes tests/kernel/timer/timer_error_case/kernel.timer, the other 3 configurations hang at these tests:

  • tests/kernel/timer/timer_monotonic/kernel.timer.monotonic - test_monotomic_timer
  • tests/kernel/timer/timer_api/kernel.timer - test_timer_duration_period
  • tests/kernel/timer/timer_api/kernel.timer.tickless - test_timer_duration_period

Copy link
Collaborator

@jenmwms jenmwms May 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying the same exercise on ehl_crb (which typically uses HPET configuration), fails differently than acrn:
2 pass and 2 fail.

  • tests/kernel/timer/timer_api/kernel.timer - regression on test_timer_remaining, and hangs on test_timeout_abs
  • tests/kernel/timer/timer_api/kernel.timer.tickless - regression on test_timer_remaining, and hangs on test_timeout_abs
START - test_timer_remaining
Assertion failed at WEST_TOPDIR/zephyr/tests/kernel/timer/timer_api/src/main.c:718: test_timer_remaining: (abs(delta_ticks) <= MAX(slew_ticks, 1U) is false)
tick/busy slew -2 larger than test threshold 0
FAIL - test_timer_remaining in 1.978 seconds
===================================================================
START - test_timeout_abs

(hangs)

EDIT: The tests pass on ehl_crb if I also add CONFIG_SYS_CLOCK_HW_CYCLES_PER_SEC=1900000000 to the defconfig (similar to acrn_ehl_crb defconfig).
EDIT2: Spoorthy is helping to expand testing scope on EHL now that we have this initial success. I'll keep looking at ACRN.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So CONFIG_SYS_CLOCK_HW_CYCLES_PER_SEC needs to be the actual TSC counter rate for the hardware, not the value copied from up_squared (though my guess is that is probably close) and absolutely not a value left over from an HPET driver, which is 100x too low. The 1.9 GHz number you have there certainly sounds about right.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I understand. I was not reusing the up2 value. What seems to be the issue is that config is required with the new driver?

Copy link
Contributor

@yerabolu yerabolu May 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: All tests under kernel are passing except kernel/common/bootdelay with the below changes in defconfig on EHL baremetal.
CONFIG_APIC_TSC_DEADLINE_TIMER=y
CONFIG_HPET_TIMER=n
CONFIG_APIC_TIMER=n
CONFIG_SYS_CLOCK_HW_CYCLES_PER_SEC=1900000000

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This now works great on acrn_ehl_crb (spot tested with twister the tests/kernel/timer/ and tests/kernel/common/.

* doesn't know how to manage LVT interrupts for anything
* other than the calling/initial CPU. Same fence needed to
* prevent later MSR writes from reordering before the APIC
* configuration write.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: this is one of the glitches with SMP support in the older APIC driver, and really it's a hole in our APIC support. The timer driver shouldn't be responsible for this, new CPUs should be automatically set up with the correctly mapped and configured interrupts by the arch layer. But it's not like it's difficult to do here.

@zephyrbot zephyrbot added the platform: X86 x86 and x86-64 label May 5, 2021
@zephyrbot zephyrbot requested review from jhedberg and KangJianX May 5, 2021 18:19
return lo + (((uint64_t)hi) << 32);
}

static void isr(const void *arg)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one showcase here: note that this interrupt handler contains basically no code. Compute elapsed ticks with a subtract and divide and announce them.

@andyross andyross force-pushed the tsc-deadline-timer branch from 5d5c08d to fb7a0e5 Compare May 5, 2021 22:08
@github-actions github-actions bot added area: Kernel area: Tests Issues related to a particular existing or missing test labels May 5, 2021
@andyross andyross force-pushed the tsc-deadline-timer branch from 757cced to 6a96064 Compare May 6, 2021 14:34
@andyross
Copy link
Contributor Author

andyross commented May 6, 2021

Add ehl_crb to the enabled boards now that it's known to be working, adding one workaround in the boot_delay test to handle devices like this.

@andyross andyross force-pushed the tsc-deadline-timer branch from 6a96064 to 404ea93 Compare May 6, 2021 14:39
@jhedberg
Copy link
Member

jhedberg commented May 6, 2021

The board defconfig changes look fine to me, but do we also want to update this stuff in soc/x86/elkhart_lake/Kconfig.defconfig:

config HPET_TIMER
	default y

config APIC_TIMER
	default y if !HPET_TIMER

@jhedberg jhedberg added this to the v2.6.0 milestone May 6, 2021
@andyross andyross force-pushed the tsc-deadline-timer branch from 404ea93 to 6b234be Compare May 6, 2021 17:05
@andyross
Copy link
Contributor Author

andyross commented May 6, 2021

I was going to defer playing with kconfig a big. Partly because it always blows up on me in weird ways, partly because I think we'll be able to deprecate and remove the one-shot APIC driver, and partly because I think doing this right should involve a kconfig choice or menu or something instead of every SoC having to enumerate all the compatible timer drivers.

@@ -12,3 +12,7 @@ CONFIG_X2APIC=y
CONFIG_SMP=y
CONFIG_BUILD_OUTPUT_EFI=y
CONFIG_BUILD_NO_GAP_FILL=y
CONFIG_APIC_TSC_DEADLINE_TIMER=y
CONFIG_HPET_TIMER=n
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need those? just drop them.

@andyross
Copy link
Contributor Author

andyross commented May 6, 2021

Updated once more to work around an upstream ACRN bug (see projectacrn/acrn-hypervisor#5985) and to migrate acrn_ehl_crb to the new driver too.

@@ -4,7 +4,7 @@ CONFIG_SOC_IA32=y
CONFIG_BOARD_ACRN=y
CONFIG_PIC_DISABLE=y
CONFIG_LOAPIC=y
CONFIG_APIC_TIMER=y
CONFIG_APIC_TIMER=n
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we just remove this instead of setting it to NO?

@@ -12,3 +12,7 @@ CONFIG_X2APIC=y
CONFIG_SMP=y
CONFIG_BUILD_OUTPUT_EFI=y
CONFIG_BUILD_NO_GAP_FILL=y
CONFIG_APIC_TSC_DEADLINE_TIMER=y
CONFIG_HPET_TIMER=n
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we just remove this instead of setting it to NO?

@@ -13,3 +13,7 @@ CONFIG_SMP=y
CONFIG_MP_NUM_CPUS=2
CONFIG_BUILD_OUTPUT_EFI=y
CONFIG_BUILD_NO_GAP_FILL=y
CONFIG_APIC_TSC_DEADLINE_TIMER=y
CONFIG_HPET_TIMER=n
CONFIG_APIC_TIMER=n
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we just remove this instead of setting it to NO?

drivers/timer/apic_tsc.c Outdated Show resolved Hide resolved
Andy Ross added 4 commits May 6, 2021 15:42
Modern hardware all supports a TSC_DEADLINE mode for the APIC timer,
where the same GHz-scale 64 bit TSC used for performance monitoring
becomes the free-running counter used for cpu-local timer interrupts.
Being a free running counter that does not need to be reset, it will
not lose time in an interrupt.  Being 64 bit, it needs no rollover or
clamping logic in the driver when presented with a 32 bit tick count.
Being a proper comparator, it will correctly trigger interrupts for
times set "in the past" and thus needs no minimum/clamping logic.  The
counter is synchronized across the system architecturally (modulo one
burp where firmware likes to change the adjustment value) so usage is
SMP-safe by default.  Access to the 64 bit counter and comparator
value are single-instruction atomics even on 32 bit systems, so it
beats even the RISC-V machine timer in complexity (which was our
reigning champ for "simplest timer driver").

Really this is just ideal for Zephyr.  So rather than try to add
support for it to the existing APIC driver and increase complexity,
make this a new standalone driver instead.  All modern hardware has
what it needs.  The sole gotcha is that it's not easily emulatable
(qemu supports it only under kvm where they can freeload on the host
TSC) so it can be exercised only on hardware platforms right now.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Atom and Core CPUs want to be using this driver as they all have the
needed APIC support.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
New timer driver needs an entry in the hard-coded list of IRQs

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
First, this test is a little suspect.  It's assuming that the value
returned from k_cycle_get_32() represents the time since system
power-on.  While that's an obvious implementation choice and surely
often true, it's definitely not the way we document this API to the
arch layer.  It's perfectly legal for a platform to return any value
as long as the counter is increasing at the correct rate.  Leaving for
now as there's no other way to test CONFIG_BOOT_DELAY, but this will
likely be coming back to confuse us at some point.

Regardless, that convention holds for x86 devices using any of the
existing drivers.  But on an EFI PC using the TSC counter as the clock
source: (1) the counter is running at 1-2 GHz and (2) the time to get
through an EFI BIOS and into Zephyr is routinely 10+ seconds,
especially on reference hardware.  The poor 32 bit API will roll over
several times, and effectively be a random number by the time it
reaches this test.

Just skip this test with fast counter.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
@andyross andyross force-pushed the tsc-deadline-timer branch 2 times, most recently from 8c56ba6 to b4a4eee Compare May 7, 2021 00:06
@andyross
Copy link
Contributor Author

andyross commented May 7, 2021

Under duress, and at great risk to my personal wellbeing, I was forced by project management to clean up the kconfig layer for the affected platforms and their various timer drivers. Looks better now. Seems not to have broken anything.

@jenmwms
Copy link
Collaborator

jenmwms commented May 7, 2021

I retested with the kconfig changes and looks good on acrn_ehl_crb so far. Thanks Andy!

@andyross andyross force-pushed the tsc-deadline-timer branch from b4a4eee to 74d6f36 Compare May 7, 2021 01:01
Andy Ross added 2 commits May 6, 2021 18:13
The addition of a timer driver made a messy situation worse.  Move
board-level configuration like clock rates and dividers into the board
and don't try to default it in the soc.  Make it clear which kconfig
goes with which driver.  Likewise don't try to do driver selection in
the soc, the board (or app) is in a better position to choose.

Also clean up and better unify the up_squared 32/64 bit settings.
Really only CONFIG_BOARD_NAME needs to care about the difference
between these devices.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Cleanup along the same lines as the last change to APL/up_squared.
Make sure all hardware configuration is at the board level where it
belongs and not in the soc, don't play games with defaulting timer
drivers.  Unify the configuration where possible and make it clearer
which setting goes with which driver.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
@andyross andyross force-pushed the tsc-deadline-timer branch from 74d6f36 to ca87be3 Compare May 7, 2021 01:13
Update to use this driver as this CPU has the needed APIC support.

Signed-off-by: Jennifer Williams <jennifer.m.williams@intel.com>
@andyross andyross force-pushed the tsc-deadline-timer branch from ca87be3 to aac5967 Compare May 7, 2021 17:55
@andyross
Copy link
Contributor Author

andyross commented May 7, 2021

Sneak in one last change, a rebase ordering glitch left two "=n" lines in Jen's final patch in the series. Doesn't affect generated code.

@nashif nashif merged commit 05c5ebd into zephyrproject-rtos:master May 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Boards area: Kernel area: Tests Issues related to a particular existing or missing test area: Timer Timer platform: X86 x86 and x86-64
Projects
None yet
Development

Successfully merging this pull request may close these issues.

APIC timer does not support SMP
7 participants