Possible problem in scheduler #5769
Comments
The access to 'runqueue_bitcache' should be protected by disabling
Can you pinpoint where exactly? Is the thread "STATUS_RECEIVE_BLOCKED"? What platform are you working on?
Thanks for the info. It is true: atomic access is not enough. The hung thread (a network interface) has STATUS_PENDING, at least. The thread's msg_queue is full (no surprise), and another thread (ipv6) was blocked in msg_send_receive because it wanted to communicate with the hung network interface thread. I am using an STM32F1 variant with a Cortex-M3 on a custom PCB. Note that I am using release 2015.09, and I currently cannot rebase onto a newer RIOT release. When this problem occurs again, I will try to pinpoint more exactly where the thread hangs. Best
@kaspar030 ping?
@melshuber Did you encounter this again?
@kaspar030: not yet, but we are currently running on a patch which replaces the bit operations on runqueue_bitcache with atomic_[set|clr]_bit. I know that's not the correct solution, but I have not yet found the time to dig into this.
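For readers unfamiliar with this kind of workaround, here is a rough sketch of what atomic set/clear operations on a priority bitmask can look like, written with C11 `<stdatomic.h>`. It is not the patch mentioned above: `bitcache_sketch`, `bitcache_set`, `bitcache_clr` and `bitcache_get` are made-up names for illustration only.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a bitmask with one bit per priority level. */
static _Atomic uint32_t bitcache_sketch;

/* Atomically set the bit for the given priority (read-modify-write as one step). */
static inline void bitcache_set(unsigned prio)
{
    atomic_fetch_or(&bitcache_sketch, UINT32_C(1) << prio);
}

/* Atomically clear the bit for the given priority. */
static inline void bitcache_clr(unsigned prio)
{
    atomic_fetch_and(&bitcache_sketch, ~(UINT32_C(1) << prio));
}

/* Read the current bitmask. */
static inline uint32_t bitcache_get(void)
{
    return atomic_load(&bitcache_sketch);
}
```

This only makes each individual bit update indivisible. It does not tie the bit update to the matching run queue update, which is presumably why atomic access alone was described as not enough earlier in the thread.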
Any news here?
Not yet, but I am currently working only part-time on this project.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want me to ignore this issue, please mark it with the "State: don't stale" label. Thank you for your contributions.
Any progress on this?
Sorry, I am no longer working on this.
Then I would close, since you yourself weren't sure about the problem from the start, and as far as I interpret the discussion above, this wasn't reproducible. Shout if you disagree.
I am not totally sure, but I think there might be a problem in sched_set_status.
https://github.com/RIOT-OS/RIOT/blob/master/core/sched.c#L130-L150
runqueue_bitcache is not accessed atomically; I think it should be. E.g.:
runqueue_bitcache |= 1 << process->priority
I found that in our project one of our threads sometimes hung. I checked for deadlocks but did not find any, so I looked at the stacks of the threads and saw that the thread had stopped somewhere in msg_receive. The thread state was pending, but the thread was not scheduled anymore. Finally, I found that the corresponding bit in runqueue_bitcache was not set. I used gdb to set the bit in runqueue_bitcache manually, and the system was kick-started again. So something introduced an inconsistency between the runqueues and the bitcache.
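To illustrate how a plain `|=` on a shared bitmask could lose an update, here is a minimal host-side sketch. It is a simulation under stated assumptions, not RIOT code: the interrupt is faked with a function call between the load and the store, and `fake_irq_disable`, `fake_irq_restore`, `preemption_point`, `run_unprotected` and `run_protected` are hypothetical names. Whether this is what actually happened in the reported hang is not established here.

```c
#include <stdint.h>

/* Stand-in for the scheduler's bitmask of non-empty run queues. */
static uint32_t runqueue_bitcache;

static int irq_masked;   /* hypothetical interrupt-mask flag */
static int isr_pending;  /* a simulated interrupt is waiting */

/* Simulated ISR: wakes a thread with priority 7. */
static void fake_isr(void) { runqueue_bitcache |= 1u << 7; }

/* Point at which a real interrupt could preempt the running code. */
static void preemption_point(void)
{
    if (isr_pending && !irq_masked) {
        isr_pending = 0;
        fake_isr();
    }
}

static int fake_irq_disable(void) { int old = irq_masked; irq_masked = 1; return old; }
static void fake_irq_restore(int old) { irq_masked = old; preemption_point(); }

/* Read-modify-write with a window between load and store. */
static void set_bit(unsigned prio)
{
    uint32_t tmp = runqueue_bitcache; /* load */
    preemption_point();               /* the ISR may run here */
    tmp |= 1u << prio;                /* modify a stale copy */
    runqueue_bitcache = tmp;          /* store: may overwrite the ISR's bit */
}

/* Unprotected: the ISR's update to bit 7 is lost. */
static uint32_t run_unprotected(void)
{
    runqueue_bitcache = 0; irq_masked = 0; isr_pending = 1;
    set_bit(3);
    return runqueue_bitcache;
}

/* Protected: the ISR is deferred until the update is complete. */
static uint32_t run_protected(void)
{
    runqueue_bitcache = 0; irq_masked = 0; isr_pending = 1;
    int state = fake_irq_disable();
    set_bit(3);
    fake_irq_restore(state);
    return runqueue_bitcache;
}
```

In this simulation `run_unprotected()` ends with only bit 3 set, so the thread woken by the fake ISR would never be picked by a scheduler that consults the bitmask, while `run_protected()` ends with both bits 3 and 7 set.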
I think the reason is that runqueue_bitcache is not accessed atomically.
I already prepared a fix you can review: