Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decrease the default number of max clustered elements to 128 #56636

Conversation

Calinou
Copy link
Member

@Calinou Calinou commented Jan 8, 2022

Related to #55289 (can be merged independently).

This improves performance without affecting the visual appearance of scenes that have less than 128 OmniLights + SpotLights + Decals + ReflectionProbes present in the view frustum.

This setting will need to be increased by the user if rendering a more complex scene, but it's fair to say that a majority of scenes will not exceed 128 clustered elements.

Testing project: test_cluster_size.zip

Preview

OS: Fedora 34
CPU: Intel Core i7-6700K
GPU: NVIDIA GeForce GTX 1080

Before

2022-01-09_00 35 54

After

2022-01-09_00 36 25

This improves performance without affecting the visual appearance
of scenes that have less than 128 OmniLights + SpotLights + Decals
+ ReflectionProbes present in the view frustum.
@fire
Copy link
Member

fire commented Jan 9, 2022

Makes #56591 worse.

Makes performance better for a small gain and removes point of clustered forward of having very cheap thousands of lights.

@mrjustaguy
Copy link
Contributor

mrjustaguy commented Jan 9, 2022

I've found very little difference in performance between the min for the setting and the max (8k) on an i3 10105f (3.7GHz) and the potato antiquity that is the Quadro K2000, so there's really not much of a reason to decrease the setting unless you're trying to hyper-optimize the game.

In fact it seems that setting the thing too low can have a negative performance impact.
at 32 i'm getting 260 FPS in an empty scene. with 8192 I'm getting 303, with default i'm getting 310

Personally I think just maxing out the setting makes sense, as in real life scenarios, the tiny hit that you're getting from maxing it out is nothing compared to what is actually going on in the scenes, while setting it too low will have a very clear impact on scenes when you go over the limit.

In a Scene with dozens of lights and 3m primitives at 1mpx/frame i'm getting the identical render times (~44.7ms) with the 8k and default settings tested.

@Calinou
Copy link
Member Author

Calinou commented Jan 9, 2022

I've found very little difference in performance between the min for the setting and the max (8k) on an i3 10105f (3.7GHz) and the potato antiquity that is the Quadro K2000, so there's really not much of a reason to decrease the setting unless you're trying to hyper-optimize the game.

If I set Max Clustered Elements to 8192 on the test project, I get much worse performance:

2022-01-09_15 32 46

Are you checking performance with the editor's View Frame Time panel or a in-game FPS counter? I think the View Frame Time panel may have some discrepancy with actual FPS here.

The performance impact is likely resolution-dependent, and I'm testing this at 2560×1440. Make sure to maximize the project window when comparing performance.

@mrjustaguy
Copy link
Contributor

It seems to significantly increase in cost with resolution indeed, also used both in game and editor and they both gave roughly the same results. I also wasn't able to notice any differences in perf between the two.

I'm curious however, how is it possible that your powerful GPU is getting a 50% decrease in performance from 128 to 8k, while my decade old potato isn't feeling any changes...

I've also confirmed in Task Manager, The utilization in an empty scene is stable 40% when locking FPS, with both 512 and 8k (That is just Forward Clustered Renderer's base cost on my GPU 😞)

@Calinou
Copy link
Member Author

Calinou commented Jan 9, 2022

Makes #56591 worse.

I wonder if we can detect when the number of clustered elements is exceeded, and print a warning message in this case.

@mrjustaguy
Copy link
Contributor

Could the Limit be dynamically changed maybe?

@Calinou Calinou marked this pull request as draft January 22, 2022 21:31
@Calinou
Copy link
Member Author

Calinou commented Jan 22, 2022

Marking as draft since this will cause #56657 to happen more often.

@reduz
Copy link
Member

reduz commented Aug 2, 2022

This needs re-assesment once we have better benchmarks to understand why it may be a performance issue (it should not be).

@reduz
Copy link
Member

reduz commented Aug 2, 2022

Additionally, cluster size may be automatic and grow on demand.

@YuriSizov YuriSizov modified the milestones: 4.0, 4.1 Feb 9, 2023
@Calinou Calinou closed this Feb 23, 2023
@Calinou Calinou removed this from the 4.1 milestone Feb 23, 2023
@Calinou
Copy link
Member Author

Calinou commented Feb 23, 2023

Closing per reduz's comments above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants