Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xtimer: Fix race condition in _xtimer_set_absolute #6937

Closed
wants to merge 1 commit into from

Conversation

samkumar
Copy link
Contributor

This pull request fixes a race condition in the xtimer module.

Copy link
Member

@jnohlgard jnohlgard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice find, the debug message needs an update as well

@kaspar030
Copy link
Contributor

Unfortunately xtimer_now() is not safe to use within iISRs or with IRQs disabled, when not running on 32bit timers, so this solution will cause problems.

@jnohlgard
Copy link
Member

Ah, never mind. We need to review this further

@kaspar030
Copy link
Contributor

@samkumar Could you elaborate on the race condition? I don't get it with my tired eyes. ;)

@samkumar
Copy link
Contributor Author

samkumar commented Apr 19, 2017

@kaspar030 This race condition is as follows. If an interrupt occurs after checking "(target >= now) && ((target - XTIMER_BACKOFF) < now)" but before the call to _lltimer_set, the timer could be set when "target" represents a time in the past.

@samkumar samkumar force-pushed the feature-xtimer-fix branch from d163e8a to f823c8d Compare April 19, 2017 19:19
@samkumar
Copy link
Contributor Author

@kaspar030 Thanks for mentioning that xtimer_now should not be called with interrupts disabled. I have updated the commit accordingly.

@samkumar
Copy link
Contributor Author

Maybe I'm mistaken about this, but shouldn't the check

if (_is_set(timer)) {
    _remove(timer);
}

come before checking if we should spin-wait?

@samkumar samkumar force-pushed the feature-xtimer-fix branch from f823c8d to 7c36ba8 Compare April 19, 2017 20:37
@samkumar
Copy link
Contributor Author

samkumar commented Apr 19, 2017

I suppose there is a more fundamental issue... this same race condition shows up in _xtimer_set. An interrupt may occur in between "target" being computed, and "_xtimer_set_absolute" being called, which could cause the timer to be set in the past.

I've fixed this issue as well (see updated commit) but now we are again calling xtimer_now in with interrupts disabled.

One solution would be for _xtimer_set_absolute to check:

if (((int32_t) (target - now)) <= 0) {
    irq_restore(state);
    /* fire timer immediately, since its target is in the past */
    _shoot(timer);
    return 0;
}

This would let us avoid calling xtimer_now with interrupts disabled, but it seems kind of hacky to me...

@miri64 miri64 requested review from jnohlgard and kaspar030 May 3, 2017 12:12
@miri64 miri64 added Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) Area: timers Area: timer subsystems labels May 3, 2017
@DipSwitch
Copy link
Member

Also fixed by: #7053

@immesys
Copy link
Contributor

immesys commented Jan 5, 2018

@kaspar030 can you elaborate a bit more on why xtimer_now cannot be called with IRQs disabled?

@kaspar030
Copy link
Contributor

kaspar030 commented Jan 6, 2018

@kaspar030 can you elaborate a bit more on why xtimer_now cannot be called with IRQs disabled?

On <32bit platforms timers, it might miss the _xtimer_high_cnt overflow.

@smlng
Copy link
Member

smlng commented Jan 18, 2018

discussion inconclusive, remove milestone

@smlng smlng removed this from the Release 2018.01 milestone Jan 18, 2018
@Hyungsin
Copy link

Hyungsin commented Jun 15, 2018

@kaspar030, @gebart.
We have used RIOT for our projects (multihop OpenThread network with UDP or FreeBSD-TCP) and this race condition makes it very unstable. In our experience, we cannot use xtimer as it is for practical, long term deployment where robust, unattended operation is necessary.

Especially because RIOT provides preemptive multithreading, any time-critical operation must be protected from an interrupt. Timer operation initiated from a low-priority thread can be preempted by a high-priority thread, which causes significant misbehavior ("target_time" becomes less than "now", which means that the timer expires one whole cycle later). Even a short interrupt handler execution can cause a problem theoretically.

Here is an issue. To this end, xtimer_now should be executed after IRQs disabled, since it is the basis of all timer operation. If 16-bit platforms do not support timer access with IRQs disabled, I'm not sure if these platforms fit with preemptive multithreading concurrency model, strictly saying. I think it is too much loss that RIOT becomes a "toy" OS without robustness due to lack of this very simple fix. Further discussion is welcome!

At least, it would be good to have #ifdef for 32-bit platforms. Is there any option for that?

@stale
Copy link

stale bot commented Aug 10, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want me to ignore this issue, please mark it with the "State: don't stale" label. Thank you for your contributions.

@stale stale bot added the State: stale State: The issue / PR has no activity for >185 days label Aug 10, 2019
@stale stale bot closed this Sep 10, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: timers Area: timer subsystems State: stale State: The issue / PR has no activity for >185 days Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants