Optimize usage of "prepare for use" in draw and dispatch commands. #91989

DarioSamo · 2024-05-15T17:33:11Z

"Prepare for use" is a command that was added because the D3D12 driver requires a step to transition resources to their required states, so the function had to be called unconditionally.

Because the function was written to share the loop used for validation (which is not compiled into release builds), the command was added and serialized into the render graph for every draw call, even when the API did not require it at all.

Upon testing Nuku Warriors, I found the command being called 72K times per frame on the Vulkan backend despite being a no-op in that API. Furthermore, this path will also be skipped when #91769 is merged and the driver reports support for enhanced barriers.
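The idea described above can be illustrated with a minimal sketch. Every name here (`DriverCaps`, `requires_prepare_for_use`, `CommandRecorder`, `simulate_frame`) is hypothetical and chosen for illustration; this is not the engine's actual code, and it assumes one prepare command per draw call. It only shows the difference between always serializing the command and skipping it when the API reports it is not needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration only; not the engine's real API.
enum class CommandType { PrepareForUse, Draw };

struct Command {
    CommandType type;
    uint32_t resource_id;
};

struct DriverCaps {
    // Assumed flag: true when the API needs an explicit state transition
    // before a resource is used (e.g. D3D12 without enhanced barriers).
    bool requires_prepare_for_use;
};

struct CommandRecorder {
    DriverCaps caps;
    std::vector<Command> commands;

    // Old behavior (sketch): the prepare command is serialized for every draw.
    void record_draw_always(uint32_t resource_id) {
        commands.push_back({CommandType::PrepareForUse, resource_id});
        commands.push_back({CommandType::Draw, resource_id});
    }

    // New behavior (sketch): the prepare command is recorded only when the
    // driver reports that the API requires it.
    void record_draw_skipping(uint32_t resource_id) {
        if (caps.requires_prepare_for_use) {
            commands.push_back({CommandType::PrepareForUse, resource_id});
        }
        commands.push_back({CommandType::Draw, resource_id});
    }
};

// Records `draw_calls` draws with the skipping path and returns how many
// prepare commands ended up in the command list.
std::size_t simulate_frame(bool requires_prepare, int draw_calls) {
    assert(draw_calls >= 0);
    CommandRecorder recorder{DriverCaps{requires_prepare}, {}};
    for (int i = 0; i < draw_calls; i++) {
        recorder.record_draw_skipping(static_cast<uint32_t>(i));
    }
    std::size_t prepares = 0;
    for (const Command &c : recorder.commands) {
        if (c.type == CommandType::PrepareForUse) {
            prepares++;
        }
    }
    return prepares;
}
```

Under these assumptions, `simulate_frame(false, 72000)` records no prepare commands at all, while `simulate_frame(true, 72000)` still records one per draw, which matches the intent of keeping the step only for APIs that need it.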

I've not measured the performance difference, but CPU times can likely only improve with this change, as it goes from doing unnecessary work to doing nothing.


clayjohn

Looks good to me

akien-mga · 2024-05-15T20:44:10Z

Thanks!

Calinou · 2024-05-15T23:45:06Z

For future reference, I benchmarked this change using the Bunnymark 2D benchmarks from https://github.com/godotengine/godot-benchmarks with --print-fps. These are notoriously CPU-bound on my machine:

| Benchmark | Before | After (this PR) |
| --- | --- | --- |
| `rendering/bunnymark/bunnymark_canvasitem_draw_api_5000` | 810 FPS (1.23 mspf) | 830 FPS (1.20 mspf) |
| `rendering/bunnymark/bunnymark_canvasitem_draw_api_10000` | 413 FPS (2.42 mspf) | 428 FPS (2.33 mspf) |
| `rendering/bunnymark/bunnymark_canvasitem_draw_api_20000` | 216 FPS (4.62 mspf) | 220 FPS (4.54 mspf) |
| `rendering/bunnymark/bunnymark_meshinstance2d_5000` | 527 FPS (1.89 mspf) | 532 FPS (1.87 mspf) |
| `rendering/bunnymark/bunnymark_meshinstance2d_10000` | 264 FPS (3.78 mspf) | 266 FPS (3.75 mspf) |
| `rendering/bunnymark/bunnymark_meshinstance2d_20000` | 133 FPS (7.51 mspf) | 134 FPS (7.46 mspf) |
| `rendering/bunnymark/bunnymark_sprite2d_5000` | 566 FPS (1.76 mspf) | 569 FPS (1.75 mspf) |
| `rendering/bunnymark/bunnymark_sprite2d_10000` | 278 FPS (3.59 mspf) | 285 FPS (3.50 mspf) |
| `rendering/bunnymark/bunnymark_sprite2d_20000` | 136 FPS (7.35 mspf) | 141 FPS (7.09 mspf) |
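To put the before/after numbers in relative terms, the frame-time reduction can be computed from the mspf columns. The helper below is a small sketch with a hypothetical name (`frame_time_reduction_percent`); it is not part of godot-benchmarks and only does the arithmetic on the reported values.

```cpp
#include <cassert>

// Relative frame-time reduction in percent, given milliseconds per frame
// before and after a change. Positive values mean the frame got cheaper.
double frame_time_reduction_percent(double before_mspf, double after_mspf) {
    assert(before_mspf > 0.0);
    return (before_mspf - after_mspf) / before_mspf * 100.0;
}
```

For example, `frame_time_reduction_percent(1.23, 1.20)` for the `canvasitem_draw_api_5000` row gives roughly 2.4%, and `frame_time_reduction_percent(7.35, 7.09)` for the `sprite2d_20000` row gives roughly 3.5%. Because the reported values are rounded to two decimals, these percentages are approximate.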

PC specifications

CPU: Intel Core i9-13900K
GPU: NVIDIA GeForce RTX 4090
RAM: 64 GB (2×32 GB DDR5-5800 C30)
SSD: Solidigm P44 Pro 2 TB
OS: Linux (Fedora 39)

DarioSamo requested a review from a team as a code owner May 15, 2024 17:33

Rewrite implementation for prepare for use commands to be skipped when not required by the API. (61cd007)

DarioSamo force-pushed the prepare_for_use_skip branch from 638dbc8 to 61cd007 Compare May 15, 2024 17:35

Calinou added enhancement topic:rendering performance labels May 15, 2024

Calinou added this to the 4.x milestone May 15, 2024

clayjohn approved these changes May 15, 2024


clayjohn modified the milestones: 4.x, 4.3 May 15, 2024

akien-mga merged commit 4bb8c06 into godotengine:master May 15, 2024
16 checks passed

DarioSamo mentioned this pull request May 20, 2024

Replace List with LocalVector on Skeleton3D's bone transform update. #92164

Merged
