Occlusion culling causes CPU-related frametime spikes when OccluderInstance3D nodes' visibility is toggled #70373

Open
Tracked by #70533
Calinou opened this issue Dec 20, 2022 · 3 comments

Comments

@Calinou
Member

Calinou commented Dec 20, 2022

Related to godotengine/godot-proposals#5967.

Godot version

4.0.beta9

System information

Fedora 36, Vulkan Forward Plus, AMD Radeon RX 6900 XT

Issue description

Occlusion culling causes CPU frametime spikes when OccluderInstance3D nodes with BoxOccluder shapes are hidden and shown. This occurs both with V-Sync enabled and disabled, and is noticeable in both cases. A fully optimized engine binary (with LTO enabled) was used to reproduce this issue. I also tried setting the BVH Build Quality advanced project setting to Low, to no avail.

The MRP toggles the visibility of static OccluderInstance3D nodes: it disables them when the 4 doors start opening and re-enables them once the doors are done closing. This is done to avoid moving OccluderInstance3D nodes every frame, which would trigger unnecessary BVH rebuilds (on top of overocclusion).

All doors open/close at the same time in the MRP, which makes the issue more noticeable. However, it still occurs with a single door present in the scene (with the other 3 doors removed entirely, not just hidden). As a workaround, making sure multiple occluders are never toggled on the same frame can help.
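
The workaround can be sketched as a small queue that applies at most one visibility change per frame. This is plain Python rather than GDScript so that it runs on its own; `StaggeredToggler`, `request` and `process_frame` are hypothetical names, and `apply` stands in for whatever sets an occluder's `visible` property in a real project.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


class StaggeredToggler:
    """Applies at most one queued visibility change per frame.

    Hypothetical helper: `apply` stands in for setting an occluder's
    `visible` property in the engine.
    """

    def __init__(self, apply: Callable[[str, bool], None]) -> None:
        self._apply = apply
        self._pending: Deque[Tuple[str, bool]] = deque()

    def request(self, occluder: str, visible: bool) -> None:
        # Queue the change instead of applying it immediately.
        self._pending.append((occluder, visible))

    def process_frame(self) -> int:
        # Call once per frame; returns the number of changes applied (0 or 1).
        if not self._pending:
            return 0
        occluder, visible = self._pending.popleft()
        self._apply(occluder, visible)
        return 1


applied: List[Tuple[str, bool]] = []  # changes in the order they were applied
toggler = StaggeredToggler(lambda name, vis: applied.append((name, vis)))

# Four doors start opening on the same frame, so four occluders get hidden.
for door in ("door_a", "door_b", "door_c", "door_d"):
    toggler.request(door, False)

# Simulate six frames and record how many changes each frame applied.
per_frame_counts = [toggler.process_frame() for _ in range(6)]
```

The trade-off is that the last occluder in the queue is hidden a few frames late, which may cause brief overocclusion behind a door that has already started opening.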

According to the visual profiler, Cull Scene (highlighted in white) is the most expensive operation during those spikes, not Update Occlusion Buffer:

[Screenshot: visual profiler capture, 2022-12-20_22 52 01]

Steps to reproduce

  • Create or download a 3D scene with geometry. You can use https://github.com/Calinou/game-maps-obj as a reference.
  • Enable occlusion culling in the project settings.
  • Add an OccluderInstance3D node and bake it to match the level geometry.
  • Add a second OccluderInstance3D node with a BoxOccluder shape.
  • Toggle the OccluderInstance3D nodes' visibility.
  • Notice frametime spikes using the editor profilers, MangoHud, RTSS or similar.
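
For the last step, spikes can also be picked out numerically once frame times have been collected. The following is a minimal sketch, assuming a list of frame times in milliseconds is already available from whichever tool was used; `find_spikes`, the 2x-median threshold and the sample values are all illustrative assumptions, not something taken from the MRP.

```python
from statistics import median
from typing import List, Sequence


def find_spikes(frame_times_ms: Sequence[float], factor: float = 2.0) -> List[int]:
    """Return indices of frames whose time exceeds `factor` times the median."""
    if not frame_times_ms:
        return []
    threshold = factor * median(frame_times_ms)
    return [i for i, t in enumerate(frame_times_ms) if t > threshold]


# Made-up sample: steady frames of about 4 ms with two spikes.
sample = [4.0, 4.1, 3.9, 4.0, 21.5, 4.0, 4.2, 3.8, 18.0, 4.0]
spikes = find_spikes(sample)
print(spikes)  # indices of the frames flagged as spikes
```

Using the median rather than the mean keeps the baseline from being pulled upward by the spikes themselves.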

Minimal reproduction project

occlusion_culling_mesh_lod.zip (same as godotengine/godot-demo-projects#807)

@clayjohn
Member

Before implementing the Embree-based occlusion culling, we discussed using Intel's Masked Software Occlusion Culling, and now I can't remember why we decided not to use it. It certainly seems like it would improve performance on our target hardware.

@Calinou
Member Author

Calinou commented Feb 16, 2023

This part from the issue description makes me wonder if we could queue updates on a frame and ensure only one BVH rebuild occurs per frame:

"All doors open/close at the same time in the MRP, which makes the issue more noticeable. However, it still occurs with a single door present in the scene (with the other 3 doors removed entirely, not just hidden). As a workaround, making sure multiple occluders are never toggled on the same frame can help."

It's worth digging out a profiler and checking if the rebuild function is called more times than needed.
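
The queuing idea can be sketched with a dirty flag that is flushed once at the end of the frame. This is a toy model in Python, not Godot's renderer code: `OccluderScene`, `set_visible`, `end_of_frame` and `_rebuild` are hypothetical names, and `rebuild_count` plays the role of the call count a profiler would report.

```python
from typing import Dict


class OccluderScene:
    """Toy stand-in for whatever owns the occlusion BVH; not Godot's actual code."""

    def __init__(self) -> None:
        self.visible: Dict[str, bool] = {}
        self.dirty = False
        self.rebuild_count = 0

    def set_visible(self, occluder: str, visible: bool) -> None:
        # Only mark the scene dirty when the state actually changes.
        if self.visible.get(occluder) != visible:
            self.visible[occluder] = visible
            self.dirty = True

    def end_of_frame(self) -> None:
        # At most one rebuild per frame, however many toggles were queued.
        if self.dirty:
            self._rebuild()
            self.dirty = False

    def _rebuild(self) -> None:
        # Placeholder for the expensive BVH rebuild; counts calls for inspection.
        self.rebuild_count += 1


scene = OccluderScene()
for door in ("door_a", "door_b", "door_c", "door_d"):
    scene.set_visible(door, False)  # four toggles on the same frame
scene.end_of_frame()
scene.end_of_frame()  # nothing changed, so no second rebuild
print(scene.rebuild_count)
```

If profiling shows the real rebuild function running once per toggled occluder rather than once per frame, coalescing along these lines would be one option to evaluate.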

@mrjustaguy
Contributor

I'm running 4.2 dev3, and I've locked my CPU frequency to 3.7 GHz (i3-10105F) using power plans in Windows 11. I'm seeing a totally different result, which is totally fine IMHO.

While there are spikes, the delta between highest and lowest points on the graph is about 2x, not like 10x, and it looks like a flat brick with an occasional spike that is only sometimes created when toggling the doors.

I've tested with 512, 4096 and 16384 rays, and the behavior is fairly consistent:
1 ms avg with 2 ms spikes, 2.5 ms avg with 5 ms spikes, and 6 ms avg with 13 ms spikes, respectively.

I'm only getting a graph resembling this when allowing for dynamic clock speeds, however the spikes consistently remain under 16ms per frame, across all 3 ray counts, all of them reaching similar spike durations.
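
The "about 2x" figure can be checked against the locked-frequency numbers quoted above. This short calculation assumes the three measurements are listed in the same order as the ray counts; the 60 FPS frame budget is added here only for comparison with the 16 ms figure.

```python
# Figures quoted above, assumed to be listed in the same order as the ray counts.
measurements = {
    512: (1.0, 2.0),     # (average ms, spike ms)
    4096: (2.5, 5.0),
    16384: (6.0, 13.0),
}

BUDGET_60_FPS_MS = 1000.0 / 60.0  # about 16.67 ms per frame

# Spike-to-average ratio for each ray count.
ratios = {rays: spike / avg for rays, (avg, spike) in measurements.items()}

# Whether each spike still fits within a single 60 FPS frame.
within_budget = {
    rays: spike < BUDGET_60_FPS_MS for rays, (_, spike) in measurements.items()
}

for rays in measurements:
    print(f"{rays} rays: spike/avg = {ratios[rays]:.2f}, "
          f"under 60 FPS budget: {within_budget[rays]}")
```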
