Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement motion vectors and TAA for skinned meshes and meshes with morph targets. #13572

Merged
merged 4 commits into from
May 31, 2024

Conversation

pcwalton
Copy link
Contributor

@pcwalton pcwalton commented May 30, 2024

This is a revamped equivalent to #9902, though it shares none of the code. It handles all special cases that I've tested correctly.

The overall technique consists of double-buffering the joint matrix and morph weights buffers, as most of the previous attempts to solve this problem did. The process is generally straightforward. Note that, to avoid regressing the ability of mesh extraction, skin extraction, and morph target extraction to run in parallel, I had to add a new system to rendering, set_mesh_motion_vector_flags. The comment there explains the details; it generally runs very quickly.

I've tested this with modified versions of the animated_fox, morph_targets, and many_foxes examples that add TAA, and the patch works. To avoid bloating those examples, I didn't add switches for TAA to them.

Addresses points (1) and (2) of #8423.

Changelog

Fixed

  • Motion vectors, and therefore TAA, are now supported for meshes with skins and/or morph targets.

@pcwalton pcwalton added A-Rendering Drawing game state to the screen S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels May 30, 2024
morph targets.

This is a revamped version of bevyengine#9902. Instead of adding more bind group
layouts as that patch did, which created a combinatorial explosion of
layouts, this patch unconditionally adds `prev_joint_matrices` and
`prev_morph_weights` bindings to the shader bind groups. These add no
significant overhead if unused because we simply bind dummy buffers, and
the driver strips them out if unused. We already do this extensively
with the various `StandardMaterial` bindings as well as the mesh view
bindings, so this approach isn't anything new.

The overall technique consists of double-buffering the joint matrix and
morph weights buffers, as most of the previous attempts to solve this
problem did. The process is generally straightforward. Note that, to
avoid regressing the ability of mesh extraction, skin extraction, and
morph target extraction to run in parallel, I had to add a new system to
rendering, `set_mesh_motion_vector_flags`. The comment there explains
the details; it generally runs very quickly.

I've tested this with modified versions of the `animated_fox`,
`morph_targets`, and `many_foxes` examples that add TAA, and the patch
works. To avoid bloating those examples, I didn't add switches for TAA
to them.

Addresses points (1) and (2) of bevyengine#8423.
Copy link
Member

@aevyrie aevyrie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Closes an important gap in bevy, very little added code, this is largely documentation - the code quality here is great. The simplifying assumptions here w.r.t. the buffers seems well justified. All-in-all this should be pretty uncontroversial.

Copy link
Contributor

@JMS55 JMS55 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM minus some general nits, and pending tracy comparisons on a large many_foxes run on a lower end (Pascal, GCN, mobile) device.

EDIT: Please also update the doc comments in taa/mod.rs talking about skinned/morph support.

crates/bevy_pbr/src/prepass/prepass.wgsl Outdated Show resolved Hide resolved
#ifdef MOTION_VECTOR_PREPASS
// Use vertex_no_morph.instance_index instead of vertex.instance_index to work around a wgpu dx12 bug.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the vertex_no_morph.instance_index workaround no longer valid?

Copy link
Contributor Author

@pcwalton pcwalton May 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just tried animated_fox with TAA enabled on the Direct3D 12 backend and it works fine. I guess whatever the issue was is now fixed. Or my changes to the shaders perturbed the bug out of existence.

crates/bevy_pbr/src/render/mesh.rs Outdated Show resolved Hide resolved
crates/bevy_pbr/src/render/mesh.rs Outdated Show resolved Hide resolved
if let Some(skin) = skin {
groups.skinned = Some(layouts.skinned(&render_device, &model, skin));
let prev_skin = skins_uniform.prev_buffer.buffer().unwrap_or(skin);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can bind the same buffer twice? I would've thought you needed a dummy buffer. Maybe that rule is only for storage resources, and does not apply to uniform buffers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't think there was any kind of rule against binding a read-only buffer twice. I can see problems with read-write buffers, but read-only seems always fine. It's extremely common for textures, so I don't see why it wouldn't be for buffers...

// Borrow check workaround.
let (morph_indices, uniform) = (morph_indices.into_inner(), uniform.into_inner());

// Swap buffers. We need to keep the previous frame's buffer around for the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think your way is a bit cleaner, but this is how I did it in my PR https://github.com/JMS55/bevy/blob/3334d0bd5ff643f54a605672a70d05b6b271b455/crates/bevy_pbr/src/render/morph.rs#L93-L110. Can't remember why they look different off the top of my head.

@JMS55 JMS55 added this to the 0.14 milestone May 30, 2024
@pcwalton
Copy link
Contributor Author

pcwalton commented May 30, 2024

This is approximately a 2%-4% regression in CPU time of main_opaque_pass_3d of many_foxes with 10,000 foxes.

CPU 13th Gen Intel(R) Core(TM) i9-13900H
GPU NVIDIA GeForce RTX 4070 Laptop GPU
Screenshot 2024-05-29 225937

Mac mini, M2
Screenshot 2024-05-29 231740

I'd argue that this small regression is worth it. Rationale: This is a worst-case stress test, and real-world games will typically not have that many skinned meshes. There should be no regressions for non-skinned meshes. We can also fix the regressions in the future in a variety of ways. TAA for skinned meshes is such a commonly-requested feature that it seems worthwhile to have it.

@alice-i-cecile alice-i-cecile added D-Complex Quite challenging from either a design or technical perspective. Ask for help! X-Uncontroversial This work is generally agreed upon D-Modest A "normal" level of difficulty; suitable for simple features or challenging fixes and removed D-Complex Quite challenging from either a design or technical perspective. Ask for help! labels May 30, 2024
@pcwalton
Copy link
Contributor Author

The regression is somewhat worse--13% or so for main_opaque_pass_3d--on my slower box:
CPU: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
GPU: NVIDIA GeForce RTX 2060
Screenshot 2024-05-30 214741

Frame time is a 5.73% regression:
Screenshot 2024-05-30 215220

I still think we should take this regression as this is still not very large and this is a stress test. Most games won't see this kind of regression.

@pcwalton
Copy link
Contributor Author

This version of the patch should eliminate the regression by only including the bindings for views with motion vectors.

@pcwalton pcwalton requested a review from JMS55 May 31, 2024 07:08
Copy link
Contributor

@JMS55 JMS55 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're technically doing a double pipeline compile here, one the first frame you add a skinned entity when it doesn't yet have a previous skin buffer entry, and then a second pipeline the next frame when it does.

I'm ok with that though, we can always improve it later.

@alice-i-cecile alice-i-cecile added S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it and removed S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels May 31, 2024
@alice-i-cecile alice-i-cecile added this pull request to the merge queue May 31, 2024
Merged via the queue into bevyengine:main with commit be053b1 May 31, 2024
28 checks passed
@IceSentry IceSentry added the M-Needs-Release-Note Work that should be called out in the blog due to impact label May 31, 2024
@alice-i-cecile
Copy link
Member

Thank you to everyone involved with the authoring or reviewing of this PR! This work is relatively important and needs release notes! Head over to bevyengine/bevy-website#1326 if you'd like to help out.

@alice-i-cecile alice-i-cecile removed the M-Needs-Release-Note Work that should be called out in the blog due to impact label Jun 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Rendering Drawing game state to the screen D-Modest A "normal" level of difficulty; suitable for simple features or challenging fixes S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it X-Uncontroversial This work is generally agreed upon
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants