-
-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UI extraction performance improvements #9212
Conversation
Instead of a single list of `ExtractedUiNodes`, we have a separate list for each extraction function. Then we don't need to sort `ExtractedUiNodes` in `prepare_uinodes` as each sublist is already ordered.
prepare_uinodes
ExtractedUiNodes
from prepare_uinodes
ExtractedUiNodes
from prepare_uinodes
ExtractedUiNodes
in prepare_uinodes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is sound. Really nice to take advantage of the already-sorted nature of the UiStack though I think the code quality could be improved.
crates/bevy_ui/src/render/mod.rs
Outdated
let mut drains: Vec<_> = extracted_uinodes | ||
.uinodes | ||
.iter_mut() | ||
.map(|uinodes| { | ||
let mut d = uinodes.drain(..); | ||
let first = d.next(); | ||
(first, d) | ||
}) | ||
.collect(); | ||
|
||
let mut next_extracted_node = move || { | ||
let mut min_stack_index = usize::MAX; | ||
let mut n = usize::MAX; | ||
|
||
for (i, (node, _)) in drains.iter_mut().enumerate() { | ||
if let Some(node) = node { | ||
if node.stack_index < min_stack_index { | ||
n = i; | ||
min_stack_index = node.stack_index; | ||
} | ||
} | ||
} | ||
|
||
if n == usize::MAX { | ||
None | ||
} else { | ||
let (current, drain) = &mut drains[n]; | ||
let next = drain.next(); | ||
let out = current.take(); | ||
*current = next; | ||
out | ||
} | ||
}; | ||
|
||
while let Some(extracted_uinode) = next_extracted_node() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you mentioned in the PR description, this needs to be a method on ExtractedUiNode
. Ideally an impl Iterator
next
method.
crates/bevy_ui/src/render/mod.rs
Outdated
for (i, (node, _)) in drains.iter_mut().enumerate() { | ||
if let Some(node) = node { | ||
if node.stack_index < min_stack_index { | ||
n = i; | ||
min_stack_index = node.stack_index; | ||
} | ||
} | ||
} | ||
|
||
if n == usize::MAX { | ||
None | ||
} else { | ||
let (current, drain) = &mut drains[n]; | ||
let next = drain.next(); | ||
let out = current.take(); | ||
*current = next; | ||
out | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was at first extremely confused by how this works. So here is a summary for other reviewers:
- find the index (named
n
) indrains
wherenode.stack_index
is smallest (thefor
loop) - Pick the
(current, drain): (Option<T>, Drain<T>)
fromdrains
- Read the
current
- replace it with the
drain.next()
- return
current
I might suggest some improvement in a couple hours.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it's weird I just typed this in and it worked immediately, and then all the better-thought-out refactors I tried ruined the performance.
I opened two PRs to your branch, please check them out:
|
An alternative implementation is to create a The downside is that it might be fairly wasteful if most ui nodes are invisible (supposing we simply "hide" menus instead of despawning them) The current impl might consume at most twice as much memory as there are visible extracted UI nodes though the memory for extracted UI nodes is never reclaimed. |
I traced my PRs and they are a bit slower than your code. |
I did some more experimenting and I found we should also check the time on |
Using a sparse set ( |
Instead of sorting `ExtractedUiNodes`, add a lookup index. Then in prepare_uinodes sort the indices by `stack_index` then use the sorted indices to retrieve each `ExtractedUiNode` in the correct order. * Added a field `indices` to `ExtractedUiNodes`. * Added `ExtractedIndex` type.
* New component, holds the uinode entity's stack index. Updated in `ui_stack_system after the `UiStack` is generated. `bevy_ui::node_bundles` * Added the `UiStackIndex` to all UI node bundles. `bevy_ui::render` * Extract stack indices from the `UiNode` component.
…ck-index-component
Ready for review. Just going to run some final benchmarks. |
crates/bevy_ui/src/node_bundles.rs
Outdated
@@ -119,6 +119,8 @@ pub struct ImageBundle { | |||
pub z_index: ZIndex, | |||
/// The node's position in the UiStack. Nodes with lower stack indices are drawn first. | |||
/// Nodes with a higher stack index are drawn over nodes with a lower stack index. | |||
/// | |||
/// This field is automatically managed by `ui_stack_system`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a component, not a field. (well yes it's a field, but "component" is more in line with how to use bevy)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I copied it from the doc comments for the other fields without thinking. Yeah, it doesn't make sense. The bundle's field isn't the thing being managed by the system. Better change them all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you prefer, you could open an issue and keep this for a different PR. But I doubt anyone would object this change with this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just noticed as well for GlobalTransform
:
/// This component is automatically managed by the UI layout system.
I think I'll fix the field
thing here as it's within the focus of this PR as it affects the new comments I added, but fix the GlobalTransform comment in another PR.
…nent`. The bundle's field isn't the thing being managed by the system.
I relaunched CI, tests are still failing, could you take a look? |
/// The node's position in the UiStack. Nodes with lower stack indices are drawn first. | ||
/// Nodes with a higher stack index are drawn over nodes with a lower stack index. | ||
/// | ||
/// This component is automatically managed by `ui_stack_system`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Link would be nice here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto replicated doc comments before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, can't link to ui_stack_system
because it's private, that's why the CI was failing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code and docs quality is good, motivation is sound and benchmark results are excellent.
should be good now |
Test still appears to be failing 🤔 |
I'll have another look |
@ickshonpe I can merge this once the merge conflict is resolved :) Ping me if I'm slow! |
I might open a new PR and leave this one. I've had to rework everything to accommodate the recent rendering changes. |
Closing in favor of #9668 due to merge conflicts. |
Objective
UI extraction is really inefficient:
Query::get
for each entity. So if there are 10000 nodes it will do 10000 lookups, even if there are no entities that match the query.prepare_uinodes
does an in-place stable sort of theExtractedUiNodes::uinodes
vector. It's extremely expensive as eachExtractedUiNode
is quite large (192 bytes) and there can be a lot of them.Solution
Add a new component
UiStackIndex
to hold each UI node's stack index generated byui_stack_system
. Then during extraction we can query for each node's stack index and we no longer have to iterate the UiStack each time.ExtractedUiNodes
now has a second vec ofThese ranges partition the
ExtractedUinodes::uinodes
vector.Now instead of sorting
uinodes
we can sort theExtractedRange
s vector by stack index instead. This is faster becauseExtractedRange
s are smaller and there are fewer of them (a page of text using thousands ofExtractedUiNode
s only requires oneExtractedRange
).Benchmarks
Notes
Changelog
ExtractedUiNodes::uinodes
is no longer sorted inprepare_uinodes
, instead it has a second vec of ranges that partitionuinodes
. This partition is sorted by stack index instead.UiStackIndex
: holds each UI node's stack index as generated byui_stack_system
.UiStack
, instead the stack index for each node is retrieved from theirUiStackIndex
component.uinodes
field ofExtractedUiNodes
is now private. Use the newpush
andpush_nodes
to addExtractedUiNode
s to be rendered.Migration Guide
The
uinodes
field ofExtractedUiNodes
is now private. Use the newpush
andpush_nodes
to addExtractedUiNode
s to be rendered.