Fix a deadlock that can occur when using scope() on ComputeTaskPool from within a system #892
Conversation
Also, while I'm thinking about it, this change (aclysma@e1f7619) would ensure that if there is only a single task, the calling thread immediately executes it. I think this would be better than having another thread pick up the work and execute it with a potentially cold cache, while the calling thread sits stuck until it completes. BTW, and this is true for the PR as well, the calling thread might not belong to the pool used by scope(), so you could end up with a thread outside the pool doing work on tasks that are in the pool. (Having scope() be async and awaiting it would avoid this and be more proper behavior, but I think in practice the tradeoffs here are fine.) I can PR this one if you'd like.
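A minimal sketch of that single-task shortcut, not the linked commit itself; the helper name and the boxed-future type here are illustrative, and bevy_tasks' internal representation may differ:

```rust
use std::future::Future;
use std::pin::Pin;

use futures_lite::future;

// Illustrative type alias, not bevy_tasks' actual storage for scope tasks.
type BoxedTask<T> = Pin<Box<dyn Future<Output = T> + Send + 'static>>;

// If the scope spawned exactly one task, run it inline on the calling
// thread (keeping its data cache-hot and avoiding a cross-thread handoff)
// and return the result; otherwise leave the list alone so the normal
// pool path can handle it.
fn try_run_single_inline<T>(tasks: &mut Vec<BoxedTask<T>>) -> Option<T> {
    if tasks.len() == 1 {
        Some(future::block_on(tasks.pop().unwrap()))
    } else {
        None
    }
}
```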
Preventing deadlocks is definitely desirable, but this change does appear to come at the cost of CPU usage (at least in dev builds, which is what I tested).

[CPU usage graphs: before (reported as a combined 15%), after (reported as a combined 17%), linked commit w/ single task opt (reported as a combined 17%)]

CPU usage is much "spikier", which sort of makes sense. I'll do release mode tests next.
Release mode results (sorry for the cut-off units ... the scale is the same in each image):

[CPU usage graphs: before (reported as a combined 1%), after (reported as a combined 1%), linked commit w/ single task opt (reported as a combined 1%)]

Looks like it matters a lot less in release mode. The average is still very low and all impls have "spikes".
The stats are very interesting! It looks like these measurements are based on CPU % (as opposed to throughput). If this change avoids waiting for another thread to step in and pick up the work, I would expect both a slight increase in CPU % and an increase in raw throughput.
Yeah, it's probably worth measuring frame time. Ideally in a non-graphical, compute-heavy example (with decent parallelism potential) to remove rendering from the equation. (The tests above were performed on the breakout example.)
Just measured frame time for the breakout example with and without this change. @aclysma's prediction was correct. Before this change, we get [frame time numbers not captured here].

I'm inclined to merge this as (1) it resolves a real bug and (2) it gives us a performance boost. The spikiness of the usage graph is something to think about from a power consumption perspective, but I'm not going to worry about that / I'm not even sure how we would measure it / if we really care about power consumption there are a million other things we should resolve first. I'm also very open to your other change if you're still interested in a PR.
I'm putting together a fix for the new clippy lint now. I'll rebase this PR when it's ready.
Fix a deadlock that can occur when using scope() on ComputeTaskPool from within a system: force-pushed from 5fc894f to bacd175.
This bevy app will deadlock 100% of the time (provided by @bonsairobo with minor adjustments):
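(The original snippet isn't included in this excerpt. As a rough illustration only, assuming the Bevy 0.3-era API this PR targets and a ComputeTaskPool that ends up with a single thread, the shape of the problem looks roughly like this:)

```rust
use bevy::prelude::*;
use bevy::tasks::ComputeTaskPool;

// A system that blocks inside scope(). If the ComputeTaskPool has only
// one thread, that thread is the one parked in block_on(), so the
// spawned task never gets picked up and the app hangs.
fn blocking_system(pool: Res<ComputeTaskPool>) {
    pool.scope(|scope| {
        scope.spawn(async {
            // Any work at all; it never gets a pool thread to run on.
        });
    });
}

fn main() {
    App::build()
        .add_plugins(DefaultPlugins)
        .add_system(blocking_system.system())
        .run();
}
```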
The root cause is that in this example, the ComputeTaskPool has only one thread. Calling scope() from within the system eventually calls future::block_on(...), blocking that thread until all of the spawned tasks complete. However, those tasks will never run, because the only thread in the ComputeTaskPool is the one that is blocked. This PR fixes that by having the calling thread participate in driving the tasks spawned in the scope forward.
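The pattern behind the fix, sketched here with async_executor and futures_lite used directly rather than the actual bevy_tasks internals, is to have the otherwise-blocked thread run the executor while it waits:

```rust
use async_executor::Executor;
use futures_lite::future;

// Sketch only: the thread that would otherwise just sit in block_on()
// also drives the executor, so spawned tasks make progress even if no
// other pool thread is free to pick them up.
fn scope_like_wait(executor: &Executor<'_>) -> Vec<i32> {
    // Spawn a handful of tasks onto the shared executor.
    let tasks: Vec<_> = (0..4)
        .map(|i| executor.spawn(async move { i * 2 }))
        .collect();

    // block_on(executor.run(...)) both waits for the results and keeps the
    // executor ticking from the calling thread. A plain block_on over the
    // tasks alone would deadlock if this is the pool's only thread.
    future::block_on(executor.run(async move {
        let mut results = Vec::new();
        for task in tasks {
            results.push(task.await);
        }
        results
    }))
}

fn main() {
    // Works even with zero dedicated worker threads, because the caller
    // drives the executor itself.
    let executor = Executor::new();
    assert_eq!(scope_like_wait(&executor), vec![0, 2, 4, 6]);
}
```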
Another way this could be resolved would be to await the scope() call. That would require making scope() an async function (which may complicate capturing local variables) and making systems themselves async. The solution in this PR is much simpler.