Lifetimes on RenderPass make it difficult to use. #1453
Comments
This is indeed a feature that makes it more difficult to use. What we gain from it is very lightweight render pass recording. Explanation follows.
Most of the dependent resources are meant to outlive the pass anyway. Only a few, like index buffers you create dynamically, become problematic. Generally, unless you are creating many resources during recording, it's easy to work around. And if you are doing that, you aren't on a good path performance-wise anyway, and should consider creating one big buffer per frame instead. Another way to possibly address this is to have … Heads-up to @mitchmindtree, who is on 0.4 and will face this issue soon; it would be good to know how much this would affect their case.
I think the issue might be a bit more severe. I put the buffers into a …
Judging by gfx-rs/wgpu-rs#155 and gfx-rs/wgpu-rs#168, I don't imagine it should affect us a great deal; most of the abstractions in nannou take a …
These are just a few examples; generally all of these are submitted on a single … Anyway, I hope I'm not speaking too soon, as I haven't tried updating yet. There are some other things I'd like to address in nannou first, but I'll report back once I get around to it.
Yes. So the good news is that this is not the best pattern to follow as a use case in the first place: creating buffers as you are recording a pass. Would it be possible for you to refactor the code so that it first figures out how much space is needed for, say, all the indices in a pass, creates a single buffer, and then uses it throughout the pass?
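A minimal sketch of that pattern, for anyone hitting the same wall: figure out how much index data the whole pass needs, pack it into one buffer up front, and only borrow that single long-lived buffer during recording. This is not code from the thread; it assumes a fairly recent wgpu API shape (`create_buffer_init` from `wgpu::util`, slice-based `set_index_buffer`) plus the `bytemuck` crate for byte casting, so the exact calls will differ on older versions.

```rust
use wgpu::util::DeviceExt;

struct FrameIndexBuffer {
    buffer: wgpu::Buffer,
    // Byte range of each draw's indices inside the shared buffer.
    ranges: Vec<std::ops::Range<wgpu::BufferAddress>>,
}

fn build_frame_index_buffer(
    device: &wgpu::Device,
    per_draw_indices: &[Vec<u16>],
) -> FrameIndexBuffer {
    // 1. Figure out how much space the whole pass needs and pack it once.
    let mut bytes = Vec::new();
    let mut ranges = Vec::new();
    for indices in per_draw_indices {
        let start = bytes.len() as wgpu::BufferAddress;
        bytes.extend_from_slice(bytemuck::cast_slice(indices));
        ranges.push(start..bytes.len() as wgpu::BufferAddress);
    }
    // 2. One buffer for the whole frame; it easily outlives the pass.
    let buffer = device.create_buffer_init(&wgpu::util::BufferInitDescriptor {
        label: Some("frame indices"),
        contents: &bytes,
        usage: wgpu::BufferUsages::INDEX,
    });
    FrameIndexBuffer { buffer, ranges }
}

// 3. During recording, only the single long-lived buffer is borrowed.
fn record<'a>(pass: &mut wgpu::RenderPass<'a>, frame: &'a FrameIndexBuffer) {
    for range in &frame.ranges {
        pass.set_index_buffer(frame.buffer.slice(range.clone()), wgpu::IndexFormat::Uint16);
        // ... pass.draw_indexed(...) for this range
    }
}
```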
I also hit this issue in the game engine I've been building. It has a relatively complex "Render Graph"-style API, and I spent a solid ~15 hours over the last week refactoring everything to account for the new lifetime requirements. In general I solved the problem the same way @kvark outlined, although I really wish I had found this thread before coming to the same conclusion via trial and error 😄. My final solution was:
My takeaways from this (honestly very painful) experience:
That being said, I understand why these lifetime changes were made, and I think the wgpu-rs team made the right call here. I certainly don't want this to read as a complaint; I deeply appreciate the work @kvark and the wgpu team have done here. I just want to add my experience as a data point.
Thank you for the feedback, @cart! One option would be an alternative, `Arc`-based pass that keeps its resources alive itself, roughly:

```rust
struct ArcRenderPass<'a> {
    id: wgc::id::RenderPassId,
    _parent: &'a mut CommandEncoder,
    // Holds a strong reference to every buffer used, so the caller's
    // borrows do not need to outlive the pass.
    used_buffers: Vec<Arc<Buffer>>,
}

impl ArcRenderPass<'_> {
    fn set_vertex_buffer(&mut self, slot: u32, buffer: &Arc<Buffer>, offset: BufferOffset) {
        self.used_buffers.push(Arc::clone(buffer));
        unsafe {
            wgn::wgpu_render_pass_set_vertex_buffer(
                self.id.as_mut().unwrap(),
                slot,
                buffer.id,
                offset,
            )
        };
    }
}
```

These passes could be used interchangeably with the current ones, trading the lifetime restrictions for a bit of run-time overhead for the …
Ooh I think I like the "multiple pass variants" idea because it gives people the choice of "cognitive load vs runtime cost". The downsides I can see are:
On the other hand, the "zero cost abstraction" we have currently feels more in line with the Rust mindset, and I'm sure many people would prefer it. I'm also in the weird position where I'm over the "migration hump" and now I really want a zero-cost abstraction. It's hard for me to be objective here 😄 I think this could be solved with either:
If I had to pick one today, I think I would go for (2). Rather than complicating the API surface and being forced to support and document that forever, just see if additional docs and examples for the "zero cost lifetimes" solve the problem well enough for users. If this continues to be a problem, you can always add the variant(s) later. Removing APIs is harder on users than adding features, so I think it makes sense to bias toward a smaller API.
The other interesting aspect is that in this use case, we don't care about the …
While on the topic of lifetimes and safety: what happens if a …
Yep, we could do something like that as well. It would also involve a different signature for render pass functions though (since you'd be lifting the lifetime restriction we have today).
Generally, we have all the objects refcounted, and you don't lose the device just because you drop it. The only real exception is render/compute pass recording, where we only want to work with IDs and not go into the objects themselves (until the recording is finished) to bump the refcounts.
This would appear the simplest option to me. It can probably even be done without breaking changes:

```rust
pub enum BufferOwnedOrRef<'a> {
    Owned(Buffer),
    Ref(&'a Buffer),
}

impl<'a> From<Buffer> for BufferOwnedOrRef<'a> {
    fn from(b: Buffer) -> Self {
        BufferOwnedOrRef::Owned(b)
    }
}

impl<'a> From<&'a Buffer> for BufferOwnedOrRef<'a> {
    fn from(b: &'a Buffer) -> Self {
        BufferOwnedOrRef::Ref(b)
    }
}

// Proposed signature: accepts either an owned or a borrowed buffer.
pub fn set_vertex_buffer<'a, B: Into<BufferOwnedOrRef<'a>>>(
    &mut self,
    slot: u32,
    buffer: B,
    offset: BufferAddress,
    size: BufferAddress,
)
```
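To make the ergonomics of that proposal concrete, here is a small self-contained toy; `Buffer` and `Pass` below are stand-ins rather than wgpu types, so this only illustrates the `Into`-based API shape, not real behavior:

```rust
// Toy stand-ins for wgpu types; only the Into-based entry point matters here.
struct Buffer(u32);

enum BufferOwnedOrRef<'a> {
    Owned(Buffer),
    Ref(&'a Buffer),
}

impl<'a> From<Buffer> for BufferOwnedOrRef<'a> {
    fn from(b: Buffer) -> Self { BufferOwnedOrRef::Owned(b) }
}

impl<'a> From<&'a Buffer> for BufferOwnedOrRef<'a> {
    fn from(b: &'a Buffer) -> Self { BufferOwnedOrRef::Ref(b) }
}

struct Pass<'a> {
    // Owned buffers are kept alive by the pass itself; borrowed ones only
    // need to outlive it, as with today's API.
    bindings: Vec<BufferOwnedOrRef<'a>>,
}

impl<'a> Pass<'a> {
    fn set_vertex_buffer<B: Into<BufferOwnedOrRef<'a>>>(&mut self, _slot: u32, buffer: B) {
        self.bindings.push(buffer.into());
    }
}

fn main() {
    let persistent = Buffer(1);
    let mut pass = Pass { bindings: Vec::new() };
    pass.set_vertex_buffer(0, &persistent); // borrowed: zero extra cost
    pass.set_vertex_buffer(1, Buffer(2));   // owned: moved into the pass, no lifetime fight
}
```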
@dhardy yes, we could. I hesitate, however, because I see value in not promoting the code path where the user creates resources in the middle of a render pass. It's an anti-pattern. The only reason that path could look appealing today is that updating GPU data is hard. Here is what needs to happen (ideally) when you are creating a new vertex buffer with data:
Now, imagine you already have a buffer that is big enough(!). That would spare you (3), but otherwise you follow the same steps. Therefore, there is no reason for us to make it easy to create new buffers, even if you are replacing the entire contents of something; it's always more efficient to reuse an existing buffer. The only caveat: what if you need a bigger buffer? Let's see if this becomes a blocker. As for data uploads, the group is still talking about ways to do it. Hopefully soon...
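A minimal sketch of that "reuse an existing buffer, only grow when it no longer fits" approach. This is not code from the thread; it assumes a wgpu version with `Queue::write_buffer` (0.6 and later) and the newer `BufferUsages` spelling (older releases call it `BufferUsage`):

```rust
struct GrowableVertexBuffer {
    buffer: wgpu::Buffer,
    capacity: wgpu::BufferAddress,
}

impl GrowableVertexBuffer {
    /// Replace this frame's contents, reallocating only when the data no
    /// longer fits (the "what if you need a bigger buffer" caveat above).
    fn upload(&mut self, device: &wgpu::Device, queue: &wgpu::Queue, data: &[u8]) {
        let needed = data.len() as wgpu::BufferAddress;
        if needed > self.capacity {
            let new_capacity = needed.next_power_of_two();
            self.buffer = device.create_buffer(&wgpu::BufferDescriptor {
                label: Some("growable vertex buffer"),
                size: new_capacity,
                usage: wgpu::BufferUsages::VERTEX | wgpu::BufferUsages::COPY_DST,
                mapped_at_creation: false,
            });
            self.capacity = new_capacity;
        }
        // Overwrites the existing allocation: on the happy path no new
        // resource is created, so nothing short-lived gets borrowed by the pass.
        queue.write_buffer(&self.buffer, 0, data);
    }
}
```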
FYI, you can emulate the `ArcRenderPass` API using arenas in user space, and it should be basically just as efficient as the equivalent wgpu API (unless core implementation details change a lot to increment internal reference counts before the `RenderPass` is dropped).

```rust
struct ArcRenderPass<'a> {
    // Any arena that hands out references tied to its own lifetime works here,
    // e.g. the `typed-arena` crate.
    arena: &'a TypedArena<Arc<Buffer>>,
    render_pass: RenderPass<'a>,
}

impl<'a> ArcRenderPass<'a> {
    fn set_vertex_buffer(&mut self, slot: u32, buffer: Arc<Buffer>, offset: BufferOffset) {
        // The arena owns the Arc for the rest of 'a, so the reference we hand
        // to wgpu satisfies the pass lifetime.
        let buffer = self.arena.alloc(buffer);
        self.render_pass.set_vertex_buffer(slot, buffer, offset);
    }
}

fn blah<'a>(encoder: &'a mut CommandEncoder) {
    let arena = TypedArena::new();
    let arc_render_pass = ArcRenderPass {
        arena: &arena,
        render_pass: encoder.begin_render_pass(..),
    };
    // ... Do stuff; you can pass around &mut ArcRenderPass and call
    // set_vertex_buffer on owned `Arc`s.
}
```
@pythonesque it would be wonderful if we had that used by one of the examples. Would you mind doing a PR for this? We'd then be able to point users to working code instead of this snippet.
Just to provide another data point, I hit this issue as well. Consider that I want my user to be able to simply call an API to render high-level objects without worrying about details of which buffers to use. There are 2 options:
I'm not sure which is more performant. With option (1) it seems good, but we are blocked until the GPU has finished its job, effectively losing parallelism (unless I force the user to go full async and/or use double-buffering). With option (2) we're infinitely buffered, but we pay the cost of allocations. Ultimately, with the current lifetime constraints, option (2) is not possible, so we're forced to go with option (1). As a side point, it is a little clunky to use the current buffer-mapping API for option (1). I referred to gfx-rs/wgpu-rs#9 and saw this advice from @kvark:
which seemed to contradict that approach at its core.
@definitelynotrobot I don't think I understand your thoughts clearly. For example, this part seems to be unrelated to the issue at hand:
Also, this part:
This issue gfx-rs/wgpu-rs#9 is actually no longer a problem. The upstream WebGPU API went in this direction, and it's part of wgpu-0.6. Did you consider using the …
Sorry about that. I was trying to illustrate the two designs I could go with for my API, plus their pros and cons, and meant to say that option (2) was not even considerable because of …
I was hesitant because that would mean an allocation for …
What's the status of this? I think it would be very useful. I'm writing a renderer that would allow users to create custom pipelines, and they'd be provided with the state of the renderer through a trait. The issue is that I'm getting massive lifetime errors, caused by the fact that I'm using a lot of …
At this point, any improvement here should be blocked on @pythonesque's work on Arc-anizing the internals.
This is now blocked on #2710.
This is still pretty painful. The interactions with this lifetime mean that any object which embeds wgpu resources becomes untouchable the instant you record one of those resources into a render pass. And since we can't clone wgpu resource IDs in Rust today, there's no way to work around this with temporary handles on the application side. For a non-trivial application, I'm ending up having to wrap all the wgpu resource types in wrapper objects, define new handles for them, and pass those around in my code...
Adding onto this, I had to write a custom untyped arena that allocates Arcs onto the heap and returns a reference to the data, instead of being able to just pass an Arc into the render pass.
This is the reason I gave up on wgpu.
@griffi-gh I was working on a wgpu encapsulation layer to work around these issues, which may be useful as a reference: https://github.com/mikialex/rendiation/tree/master/platform/graphics/webgpu. Although this lifetime requirement imposes big constraints on render pass encoding (for performance and safety reasons), I still think wgpu is a very good graphics abstraction layer.
Y'all will be happy to know that this requirement can be removed once #3626 lands.
Now that #3626 has been merged, what's the next step for this issue? I might be interested in working on this.
The main task is to refactor …
Lifetimes on RenderPass make it difficult to use.
Good news! We've now landed almost everything needed to remove those lifetimes on `RenderPass`.
Related PRs:
gfx-rs/wgpu-rs#155
gfx-rs/wgpu-rs#168
For context, I was looking at updating imgui-wgpu to work with the current master.
Now that `set_index_buffer` (and similar methods) take a `&'a Buffer` instead of a `&Buffer`, there is a ripple effect of "infecting" the surrounding scope with this lifetime. While attempting to iterate and add draw calls, you end up with either "multiple mutable borrow" errors on wherever you're storing the `Buffer`, or similar lifetime errors like "data from `self` flows into `rpass` here".
I would think that calling `set_index_buffer` (and similar methods) would increment the ref-count on the buffer, so that this lifetime restriction isn't needed.
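As a minimal sketch of the kind of code that hits this (assuming a wgpu version from before the lifetimes were removed, with the slice-based `set_vertex_buffer`; the 0.4/0.5-era signatures the issue was filed against differ slightly):

```rust
struct Mesh {
    vertex_buffer: wgpu::Buffer,
}

impl Mesh {
    fn draw<'a>(&'a self, rpass: &mut wgpu::RenderPass<'a>) {
        // This borrow of `self` is pinned to the pass lifetime 'a.
        rpass.set_vertex_buffer(0, self.vertex_buffer.slice(..));
        rpass.draw(0..3, 0..1);
    }
}

fn record<'a>(meshes: &'a mut Vec<Mesh>, rpass: &mut wgpu::RenderPass<'a>) {
    for mesh in meshes.iter() {
        mesh.draw(rpass);
    }
    // Anything needing `&mut` access to `meshes` from here on -- for example
    // creating a buffer for one more draw -- is rejected until `rpass` is
    // dropped, because the borrows above must live as long as the pass:
    //
    //     meshes.push(make_mesh(..)); // error[E0502]: cannot borrow `*meshes`
    //                                 // as mutable because it is also
    //                                 // borrowed as immutable
}
```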