Directly copy data into uniform buffers #9865

Merged


james7132
Member

Objective

This is a minimally disruptive version of #8340. I attempted to update it, but failed due to the scope of the changes added in #8204.

Fixes #8307. Partially addresses #4642. As seen in #8284, we're actually copying data twice in Prepare stage systems: once into a CPU-side intermediate scratch buffer, and once again into a mapped buffer. This is inefficient and effectively doubles the time spent and memory allocated to run these systems.

Solution

Skip the scratch buffer entirely and use `wgpu::Queue::write_buffer_with` to directly write data into mapped buffers.
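To illustrate the difference between the two paths, here is a rough sketch in plain Rust. It is not the code from this PR: the mapped buffer is modeled as an ordinary `&mut [u8]` rather than the view returned by wgpu, and `write_via_scratch` and `write_direct` are hypothetical names chosen for this example.

```rust
/// Old path (sketch): serialize into a CPU-side scratch `Vec`, then copy the
/// scratch contents into the destination slice. Every byte is written twice
/// and the scratch allocation roughly doubles the memory used.
fn write_via_scratch(values: &[[f32; 4]], mapped: &mut [u8]) -> usize {
    let mut scratch = Vec::with_capacity(values.len() * 16);
    for value in values {
        for component in value {
            scratch.extend_from_slice(&component.to_le_bytes());
        }
    }
    mapped[..scratch.len()].copy_from_slice(&scratch);
    scratch.len()
}

/// New path (sketch): serialize straight into the destination slice.
/// `mapped` stands in for the mapped staging memory; here it is a plain slice.
fn write_direct(values: &[[f32; 4]], mapped: &mut [u8]) -> usize {
    let mut cursor = 0;
    for value in values {
        for component in value {
            mapped[cursor..cursor + 4].copy_from_slice(&component.to_le_bytes());
            cursor += 4;
        }
    }
    cursor
}
```

Both functions leave the same bytes in `mapped`; the second simply avoids the intermediate allocation and the extra copy.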

Separately, this also directly uses `wgpu::Limits::min_uniform_buffer_offset_alignment` to set up the alignment when writing to the buffers, partially addressing the issue raised in #4642.
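As a sketch of what the alignment means in practice: each element in a dynamic uniform buffer has to start at an offset that is a multiple of the device's reported alignment. The helpers below (`align_up` and `dynamic_offsets`, both hypothetical names) assume the alignment is a power of two; 256 is used in the usage note purely as an illustrative value, since the real one comes from the device limits.

```rust
/// Rounds `size` up to the next multiple of `alignment`.
/// Assumes `alignment` is a nonzero power of two.
fn align_up(size: u64, alignment: u64) -> u64 {
    debug_assert!(alignment.is_power_of_two());
    (size + alignment - 1) & !(alignment - 1)
}

/// Computes the byte offset of each of `count` elements of `element_size`
/// bytes when every element is placed on an `alignment` boundary.
fn dynamic_offsets(element_size: u64, count: u64, alignment: u64) -> Vec<u64> {
    let stride = align_up(element_size, alignment);
    (0..count).map(|i| i * stride).collect()
}
```

For example, three 80-byte elements with an alignment of 256 would land at offsets 0, 256, and 512.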

Storage buffers and the abstractions built on top of `DynamicUniformBuffer` will need to come in followup PRs.

This may not make a noticeable performance difference in this PR, as the only first-party systems affected are view related and are unlikely to be particularly heavy.


Changelog

Added: `DynamicUniformBuffer::get_writer`.
Added: `DynamicUniformBufferWriter`.
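
For readers unfamiliar with the writer pattern, the following is a simplified model of how such a writer could behave. It is an assumption-laden sketch, not Bevy's `DynamicUniformBufferWriter`: `SketchWriter` is a hypothetical type, the destination is a plain byte slice rather than mapped GPU memory, and values are passed as pre-encoded bytes.

```rust
/// Hypothetical stand-in for a writer over mapped uniform memory.
struct SketchWriter<'a> {
    mapped: &'a mut [u8],
    cursor: usize,
    alignment: usize,
}

impl<'a> SketchWriter<'a> {
    /// `alignment` is assumed to be a nonzero power of two.
    fn new(mapped: &'a mut [u8], alignment: usize) -> Self {
        assert!(alignment.is_power_of_two());
        Self { mapped, cursor: 0, alignment }
    }

    /// Copies `bytes` directly into the destination at the current aligned
    /// position and returns that position as a dynamic offset.
    /// Returns `None` if the bytes do not fit or the offset exceeds `u32`.
    fn write(&mut self, bytes: &[u8]) -> Option<u32> {
        let offset = self.cursor;
        let end = offset.checked_add(bytes.len())?;
        if end > self.mapped.len() || offset > u32::MAX as usize {
            return None;
        }
        self.mapped[offset..end].copy_from_slice(bytes);
        // Advance to the next aligned boundary for the following element.
        self.cursor = (end + self.alignment - 1) & !(self.alignment - 1);
        Some(offset as u32)
    }
}
```

In this model, each `write` call touches the destination memory exactly once, and the returned offset is what a caller would later pass as the dynamic offset when binding.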

@james7132 added the A-Rendering (Drawing game state to the screen) and C-Performance (A change motivated by improving speed, memory usage or compile times) labels Sep 20, 2023
Contributor

@superdump superdump left a comment


LGTM

@alice-i-cecile added the S-Ready-For-Final-Review label (This PR has been approved by the community. It's ready for a maintainer to consider merging it) Sep 20, 2023
@mockersf mockersf added this pull request to the merge queue Sep 25, 2023
Merged via the queue into bevyengine:main with commit 12032cd Sep 25, 2023
25 checks passed
@cart cart mentioned this pull request Oct 13, 2023
43 tasks
rdrpenguin04 pushed a commit to rdrpenguin04/bevy that referenced this pull request Jan 9, 2024
Development

Successfully merging this pull request may close these issues.

Direct copy API for Buffer wrappers
4 participants