-
Notifications
You must be signed in to change notification settings - Fork 13.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PX4 general work queue #11570
PX4 general work queue #11570
Conversation
(Repeating from #11261 (comment)) See the CI failure in Jenkins (http://ci.px4.io:8080/blue/organizations/jenkins/PX4%2FFirmware/detail/PR-11570/1/pipeline) or run |
|
8a32064
to
e97dfec
Compare
} | ||
|
||
bool empty() const { return _count == 0; } | ||
bool full() const { return _count == N; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should _count
be protected by _mutex
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe so, even if you called empty()
or full()
in the middle of another thread pushing or popping you still have to go through the locked methods to actually do anything with the queue.
Can you think of a problematic case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just thinking "writing and reading of shared data simultaneously can lead to undefined behaviour", right? And I'm assuming this blocking queue should be thread-safe.
e97dfec
to
375f84d
Compare
The tests shut down when you comment out |
It also works like this:
|
@dagar however, with the broadcast addition the |
375f84d
to
1bc2527
Compare
Oh, I wonder if it's because they're both static, and we're not really shutting things down properly yet.
I'll allocate them separately in WorkQueueManagerStart(). |
|
||
daemon_task = px4_task_spawn_cmd("wqueue", | ||
SCHED_DEFAULT, | ||
SCHED_PRIORITY_MAX - 5, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it supposed to use: PX4_WQ_HP_BASE now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit arbitrary here, and I don't think even matters. This task gets the work queues going then exits. It probably doesn't even need to exist now that I look at it.
} | ||
|
||
// priority | ||
param.sched_priority = SCHED_PRIORITY_MAX + wq->relative_priority; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it supposed to use PX4_WQ_HP_BASE ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I'd say no it's not supposed to use PX4_WQ_HP_BASE, but that might have been a better way to structure it.
I actually wanted to directly reference the task priority defines for each WQ, but on Linux those are calls (eg sched_get_priority_max(SCHED_FIFO)
) that don't seem to have accessible compile time constants.
Then combined with the initial use case (primary SPI IMU driver highest priority thing in the system) and differences like the priority range (min to max) being so different between NuttX and Linux, I ended up with this simplification of first using priorities relative to max for the wq, then everything else underneath.
A bit later on I'm imagining the need to take another pass at this and make the priority table runtime configurable based on sensor configuration and even current health.
@dagar I think I would appreciate if you make incremental commits with fixes after reviews. I have no idea what has changed whenever you force push and have to review everything again. The tests seem to pass now for CI and for me locally now though which is good :). |
This is a probably a bigger discussion than we should get into on a PR, but I would prefer that as well if we can move away from a rebase workflow, at least by default. We're still getting incremental commits in master that have negative value. Incremental history that changed something, then changed it back, doesn't fully work, etc. It's well intentioned, but often makes it harder or misleading when debugging (git blame), bisecting, backporting to an older branch, finding the original pull request (designs and discussion), or even just casually browsing. If I were going to propose something I'd say in most cases we structure development such that each pull request corresponds to a single change that makes sense to squash and merge on completion. We'd open a pull request, add commits in response to reviews, testing, etc, merge (yes merge!) in new changes from master, and then when we're completely done we press "Squash and merge" to end up with a single, clean, atomic commit in master that references the full, true development history in all its gory detail. |
Right, also true! |
Let me just complain for whatever role I'm doing at the moment 😄. When I'm reviewing I want it that way and when bisecting another! |
I don't mind complaining if it's accompanied by enough interest to push through the process of actually getting things changed on a wider scale. I only bothered with the rant because I think this might actually be a case where we can nearly please everyone (I'm sure someone will disagree). I dream of a day where we can leverage Jenkins and the hardware test rack to automatically track down a regression via bisect. |
Sorry, I was just annoyed trying to do the review again but given the reasons you said I agree with you and it's fine! Didn't mean to come across that ranty. |
1bc2527
to
990d8a0
Compare
990d8a0
to
ac6c467
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dagar - This looks ready (except for my signed question). Can you add a UML of the class hierarchy and a 1 sequence diagram as documentation?
dev->ScheduleNow(); | ||
} | ||
|
||
void ScheduledWorkItem::ScheduleDelayed(uint32_t delay_us) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add RT on these so we can have real time and non real time scheduling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was intentionally trying to keep that separate to prevent abuse. Let's talk about options separately.
} | ||
} | ||
|
||
bool WorkItem::Init(const wq_config_t &config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to start enforcing the constructor rules now and set a correct pattern of 2 phase construct as discussed on the devcall.
namespace px4 | ||
{ | ||
|
||
class WorkItem : public IntrusiveQueueNode<WorkItem *> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great yet another term that hides the meaning. :) Queue of copies and Queue of references says what need to be known.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does it hide the meaning? Isn't the template argument telling you what it is?
Open to specific suggestions.
Let's find a standard way to do this, I don't think there's ultimately much value if it ends up as another out of date document floating around google drive. I wonder if we can get acceptable results form doxygen with appropriate markup. |
Thanks for the reviews everyone. I'll bring this in with the corresponding driver changes in #11571. |
Continuation of #11261.
More background to come, but in short this general px4 work queue will allow us to finally unblock several important things.
Closes #8814
Design Notes
Each PX4 module that previously used hrt_calls, HPWORK, or LPWORK can now easily be moved into this framework. These classes inherit either WorkItem to be run as needed, or ScheduledWorkItem if a fixed interval is desired.
The reason for the (somewhat awkward) WorkQueueManager is to get every queue (a pthread) running within the same task group on NuttX. This significantly reduces overhead, and when combined with uORB changes (#11176) has no disadvantages. The cost of each work queue is little more than the stack (1-2 kB).
The other option for scheduling WorkItems is something I've bolted into uORB itself. The common pattern throughout most important PX4 modules is a task that polls a uORB topic for updates. Many of these processes are already fundamentally serialized, so the coexistence of all these tasks wastes quite a lot of memory. Any work pipeline (a collection of tasks) that's fundamentally serial can be moved to share the same WorkQueue. Then we can lean on the uORB publication to schedule appropriate work.
TODO: