Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests: can't busy_wait too long when interrupt is locked #24332

Conversation

vonhust
Copy link

@vonhust vonhust commented Apr 14, 2020

I made a fix of arc timer in #24445. This PR as suggested by @andyross

It's not needed and it spins on the CPU wasting precious cycles in CI.

@vonhust vonhust requested review from ruuddw and abrodkin April 14, 2020 09:04
@zephyrbot zephyrbot added area: Tests Issues related to a particular existing or missing test area: Kernel labels Apr 14, 2020
Copy link
Collaborator

@abrodkin abrodkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure DURATION has anything to do with real HW timer period?
I think it's a property of Zephyr's "software timer", no?

Copy link
Contributor

@andyross andyross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was confused by the description, because SysTick does not rely on the ISR to fire to return a value. But I think I see what you mean: the ISR is required to handle a systick wraparound condition, and as the counter is only 24 bits that happens comparatively fast. This was waiting more than one full "SysTick cycle" and would get stick. I'm guessing ARC has similar behavior?

Would be good to augment the commit message to explain this specifically.

@vonhust
Copy link
Author

vonhust commented Apr 14, 2020

Are you sure DURATION has anything to do with real HW timer period?
I think it's a property of Zephyr's "software timer", no?

For CONFIG_TICKLESS_KERNEL, DURATION will impact the real HW timer period. k_timer_start will finally call z_clock_set_timeout.

For non CONFIG_TICKLESS_KERNEL, as irq is locked, no periodic sys timer irq, the cycles will also be lost.

@vonhust
Copy link
Author

vonhust commented Apr 14, 2020

I was confused by the description, because SysTick does not rely on the ISR to fire to return a value. But I think I see what you mean: the ISR is required to handle a systick wraparound condition, and as the counter is only 24 bits that happens comparatively fast. This was waiting more than one full "SysTick cycle" and would get stick. I'm guessing ARC has similar behavior?

Would be good to augment the commit message to explain this specifically.

Here is more detailed description:

For TICKLESS kernel, k_timer_start will call z_clock_set_timeout to set the sys timer to time out
at DURATION and be aligned to next sys tick. ARC's sys timer is a 32 bit free-running counter, when it reaches the limit (DURATION), it will raise a irq request and wrap to 0. Because irq is locked ,
so no sys timer irq to accumulate the cycles. The 1st miss of irq (DURATION) is ok , because the following codes can still get the correct cycles

tatic u32_t elapsed(void)
{
	u32_t val, ov, ctrl;

	do {
		val =  timer0_count_register_get();
		ctrl = timer0_control_register_get();
	} while (timer0_count_register_get() < val);

	ov = (ctrl & _ARC_V2_TMR_CTRL_IP) ? last_load : 0;
	return val + ov;
}

But the 2nd miss of irq (DURATION), it will wrap to the same value of 1st miss of irq. So it means
to maintain the correct cycles, the sys timer irq must be handled before next sys timer irq comes.

@andyross
Copy link
Contributor

Yes, that's the right description, k_busy_wait() can naturally handle one wraparound cycle of the underlying timer, but not two.

Please put it in the commit message. :)

@andrewboie
Copy link
Contributor

Are you sure DURATION has anything to do with real HW timer period?

@abrodkin I agree, I'm not quite seeing the relationship between DURATION and the wraparound period.

In this test, we have

#define DURATION	5

Which is just some small arbitrary value.

What is the actual upper bound of a k_busy_wait() call on ARC before it stops working correctly? And do we need to document this somewhere?

@andyross
Copy link
Contributor

Actually, now that I'm reminding myself of this, systick is supposed to work, it detects the wraparound condition synchronously via a bit that gets set in the CTRL register on overflow. Is that not happening on ARM boards anymore?

Does ARC not have a way to solve this in the timer driver?

@andyross
Copy link
Contributor

If someone is seeing trouble on ARM, can you try reverting commit f16a403? I'm looking at that and think it might have regressed a case the original code handled.

@andrewboie
Copy link
Contributor

I haven't heard any reports about this test failing on ARM, FWIW.

@vonhust
Copy link
Author

vonhust commented Apr 15, 2020

I haven't heard any reports about this test failing on ARM, FWIW.

Current codes shows that ARM 's systick timer has similar issue. When systick irq is not handled, the accumulated cycles can't be updated.

@andyross , the up bound is two systick timeout , FOR TICKLESS_KERNEL , it depends on set_timeout, For NON-tickless kernel, it's two sys ticks.

I tried to have a fix in driver, did't find a solution, e.g, if nothing done, the cycles will be lost finally.

TWO possible fixes:

  • one global wall clock, e.g., ARC HS global free-running counter, which can't be stopped since reset
  • make k_busy_wait don't rely on sys tick, e.g., instruction loop, but it needs calibration (Linux's jitter?) to match us (micro second)

@andyross
Copy link
Contributor

Current codes shows that ARM 's systick timer has similar issue. When systick irq is not handled, the accumulated cycles can't be updated.

That is not correct. You can verify this by disabling interrupts, spinning in a loop on any platform with the systick driver and printing the results of k_uptime_ticks(). I just verified with this code:

void main(void)
{
	printk("Hello World! %s\n", CONFIG_BOARD);

	/* Unconditionally disable interrupts */
	(void) arch_irq_lock();

	/* Loop forever printing each tick as it happens */
	u64_t tlast = 0, t;

	while (true) {
		t = k_uptime_ticks();

		if (t != tlast) {
			printk("ticks now %d\n", (int)t);
		}
		tlast = t;
	}
}

Really, this is a bug with the ARC timer driver. Time needs to not freeze or burp when interrupts are disabled. If it's really not architecturally possible to detect a wraparound condition underneath z_clock_elapsed(), then we may need to augment the implementation of k_busy_wait() such that ARC can provide its own.

Copy link
Contributor

@andyross andyross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All that said, I'm actually OK with this patch as it stands, simply for the reason that a 1 second busy wait is a terrible design decision in the test. It's not needed and it spins on the CPU wasting precious cycles in CI.

But ARC really needs to fix its timer driver or find a solution to getting time correctly with interrupts disabled. Maybe we need to turn that loop above into a test case somewhere.

@andyross
Copy link
Contributor

(Oh, and ignore my comment above about a suspicious commit. I read through more carefully and convinced myself that the current code is correct.)

@vonhust
Copy link
Author

vonhust commented Apr 16, 2020

All that said, I'm actually OK with this patch as it stands, simply for the reason that a 1 second busy wait is a terrible design decision in the test. It's not needed and it spins on the CPU wasting precious cycles in CI.

But ARC really needs to fix its timer driver or find a solution to getting time correctly with interrupts disabled. Maybe we need to turn that loop above into a test case somewhere.

@andyross , I took a look at cortex-m systick driver. My understanding is:

  1. overflow_cyc is a good way to maintain the total cycles when irq is locked, it can be applied to arc's timer. I will have a different PR for this.

  2. But, as the comments said:

 * - no more than one counter-wrap has occurred between
 *     - the timer reset or the last time the function was called
 *     - and until the current call of the function is completed.

If elapsed() wasn't called when irq is locked, still cannot update the correct total cycles.

So we need to remember this case.

@vonhust
Copy link
Author

vonhust commented Apr 16, 2020

@andyross @ruuddw @abrodkin

I dived into the details, i found maybe we can't apply similar mechanism of ARM systick for ARC, because of the limitation of IP bit (interrupt pending) of ARC timer.

Because if i cleared the IP bit not in isr, the timer irq may be lost, e.g.

  1. locked irq
  2. call elapsed() and clear ip bit-> the timer irq signal may disapper.
  3. unlocked irq
  4. systick irq will be lost .

For ARM's SysTick_CTRL_COUNTFLAG, it looks it won't impact the irq pending.

@vonhust
Copy link
Author

vonhust commented Apr 16, 2020

@andyross one related question,
in the following codes

	cycle_count += overflow_cyc;
	overflow_cyc = 0;

	if (TICKLESS) {
		/* In TICKLESS mode, the SysTick.LOAD is re-programmed
		 * in z_clock_set_timeout(), followed by resetting of
		 * the counter (VAL = 0).
		 *
		 * If a timer wrap occurs right when we re-program LOAD,
		 * the ISR is triggered immediately after z_clock_set_timeout()
		 * returns; in that case we shall not increment the cycle_count
		 * because the value has been updated before LOAD re-program.
		 *
		 * We can assess if this is the case by inspecting COUNTFLAG.
		 */

		dticks = (cycle_count - announced_cycles) / CYC_PER_TICK;
		announced_cycles += dticks * CYC_PER_TICK;
		z_clock_announce(dticks);

As cyc_count is 32 bit , if cycle_count += overflow_cyc overflowed and make cyc_count < announced_cycles, i think an unexpected value of dticks will come out.

@andyross
Copy link
Contributor

I see what you're saying. Yes: the SysTick mechanism handles only one missed wraparound. If code (1) disables or masks the timer interrupt and (2) does not query the time for two or more full cycles (~300ms on faster systems), then you'll still lose time in an unavoidable way. And this can't be fixed.

But this patch is to a k_busy_wait(). That function, by definition, is polling the time in a tight loop.

@andyross
Copy link
Contributor

And as to the math you're quoting in the cortex driver: that looks correct to me. All those values are unsigned, so the delta computation still works if the result is less than 4G cycles.

@vonhust
Copy link
Author

vonhust commented Apr 16, 2020

@andyross 4G cycles will be reached soon, e.g. for 100 Mhz targe and systick timer is also 100 Mhz , less than one minute, then if 32 bit overflow happens, the result may be unexpected

@andyross
Copy link
Contributor

Not 4G cycles of uptime, 4G cycles between timer interrupts. :)

@andrewboie
Copy link
Contributor

All that said, I'm actually OK with this patch as it stands, simply for the reason that a 1 second busy wait is a terrible design decision in the test. It's not needed and it spins on the CPU wasting precious cycles in CI.

Well, without it we may never have known that there exists an upper bound on how long you can busy-wait on some CPUs with IRQs disabled before bad things happen.

Having said that, busylooping for this long with IRQs locked doesn't happen in practice. Is this something we could solve by adding an assertion in k_busy_wait() that either IRQs are disabled, or the wait value is less than something like 100ms?

@vonhust
Copy link
Author

vonhust commented Apr 17, 2020

I made a fix of arc timer in #24445. This PR as suggested by @andyross

It's not needed and it spins on the CPU wasting precious cycles in CI.

* no need long busy_wait to waste cycles, especially in
 CI.

Signed-off-by: Wayne Ren <wei.ren@synopsys.com>
vonhust pushed a commit to foss-for-synopsys-dwc-arc-processors/zephyr that referenced this pull request Apr 21, 2020
referring the ARM's Systick driver, we did the following improvements:

* use 31 bits of 32-bit counter to avoid the rare but possible
  overflow of elapsed(). If 32 bits val are used, elpased() may
  return a wrong value. then wrong HW cycles.
* two ways to update the correct cycles
  - through systick timer irq
  - when systick timer irq cann't be handled because of irq
    locked/disabled, call z_timer_cycle_get_32->elapsed to update
    the correct cylces. no more than one counter-wrap is allowed.
* if elapsed() is not called too long (more than one counter-wrap) from
  systick tiemr irq or from z_timer_cycle_get_32. The lost of HW cycles
  is unavoidable.

* some detailed discussion can be found in zephyrproject-rtos#24332

Signed-off-by: Wayne Ren <wei.ren@synopsys.com>
@andrewboie andrewboie merged commit 315cba1 into zephyrproject-rtos:master Apr 28, 2020
@galak
Copy link
Collaborator

galak commented Apr 28, 2020

This fails for me on qemu_cortex_m0:

./scripts/sanitycheck -s tests/kernel/interrupt/arch.interrupt -p qemu_cortex_m0

@galak galak requested a review from ioannisg April 28, 2020 22:00
@vonhust vonhust deleted the topic-fix-23476 branch May 1, 2020 09:24
carlescufi pushed a commit that referenced this pull request May 7, 2020
referring the ARM's Systick driver, we did the following improvements:

* use 31 bits of 32-bit counter to avoid the rare but possible
  overflow of elapsed(). If 32 bits val are used, elpased() may
  return a wrong value. then wrong HW cycles.
* two ways to update the correct cycles
  - through systick timer irq
  - when systick timer irq cann't be handled because of irq
    locked/disabled, call z_timer_cycle_get_32->elapsed to update
    the correct cylces. no more than one counter-wrap is allowed.
* if elapsed() is not called too long (more than one counter-wrap) from
  systick tiemr irq or from z_timer_cycle_get_32. The lost of HW cycles
  is unavoidable.

* some detailed discussion can be found in #24332

Signed-off-by: Wayne Ren <wei.ren@synopsys.com>
hakehuang pushed a commit to hakehuang/zephyr that referenced this pull request Jun 20, 2020
referring the ARM's Systick driver, we did the following improvements:

* use 31 bits of 32-bit counter to avoid the rare but possible
  overflow of elapsed(). If 32 bits val are used, elpased() may
  return a wrong value. then wrong HW cycles.
* two ways to update the correct cycles
  - through systick timer irq
  - when systick timer irq cann't be handled because of irq
    locked/disabled, call z_timer_cycle_get_32->elapsed to update
    the correct cylces. no more than one counter-wrap is allowed.
* if elapsed() is not called too long (more than one counter-wrap) from
  systick tiemr irq or from z_timer_cycle_get_32. The lost of HW cycles
  is unavoidable.

* some detailed discussion can be found in zephyrproject-rtos#24332

Signed-off-by: Wayne Ren <wei.ren@synopsys.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Kernel area: Tests Issues related to a particular existing or missing test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants