-
Notifications
You must be signed in to change notification settings - Fork 545
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rework of gl memory / buffer handling #2822
Conversation
This is still highly WIP, I just went ahead and created the PR to get feedback. Notably, this doesn't fix wgpu-rs with rendy yet, rendy errors (when trying the wgpu-rs cube example) with |
Okay, so the cube example in wgpu-rs is still broken, and it has something to do with the fact that you can't call In the cube example, this is coming from uploading texture data to a new buffer and calling I don't know how opengl mapping of a buffer range really works, I guess you can't unmap a buffer range, you can only unmap all of the mapped buffer ranges at once? Does opengl support mapping multiple different ranges of a single buffer simultaneously? Curiously, if you patch rendy memory to call gfx In any case, I think the best way forward might be not to use the rendy dynamic allocator with the opengl backend and instead substitute the dedicated one? I'm not really super sure! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks very well written! I got a few suggestions here, but most importantly we need the persistent mapping for the cpu-visible buffers to become useful.
src/backend/gl/src/command.rs
Outdated
// Active index type, set by the current index buffer. | ||
index_type: Option<hal::IndexType>, | ||
// Active index type and buffer range, set by the current index buffer. | ||
index_type_range: Option<(hal::IndexType, Range<u64>)>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could use hal::buffer::Offset
typedef here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, done!
src/backend/gl/src/command.rs
Outdated
@@ -1088,7 +1087,9 @@ impl command::RawCommandBuffer<Backend> for RawCommandBuffer { | |||
} | |||
|
|||
unsafe fn dispatch_indirect(&mut self, buffer: &n::Buffer, offset: buffer::Offset) { | |||
self.push_cmd(Command::DispatchIndirect(buffer.raw, offset)); | |||
let (raw_buffer, range) = buffer.borrow().as_bound(); | |||
assert_eq!(range.start, 0, "buffer offset unsupported in indirect draw"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we actually support indirect buffer offset, it's a part of DispatchIndirect
variant.
What we do not support is draw_indexed_indirect
when index buffer offset is non-zero
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, okay that makes sense, I was confused between DispatchIndirect
and draw_indirect
and draw_indexed_indirect
. OpenGL doesn't support draw_indexed_indirect
at all right now so that's not an issue.
gl.buffer_storage(target, buffer.requirements.size as _, None, flags); | ||
gl.bind_buffer(target, None); | ||
} else { | ||
let flags = if cpu_can_read && cpu_can_write { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing these flags appears to be a regression. It's fine to catch up to this after the fact, since your change is much more important, but need to keep it on the radar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way the gl backend was before, allocated memory had n::Memory
properties hard-coded to memory::Properties::CPU_VISIBLE | memory::Properties::CPU_CACHED
, which means that it was always really allocated with glow::MAP_READ_BIT | glow::MAP_WRITE_BIT
or glow::DYNAMIC_DRAW
, the rest of it seemed to be dead code?
Now, it's still hard-coded, but since the allocation takes place in the same place that the n::Memory
flags are set, there's nothing to check to to see what flags to use when allocating.
I agree it would probably be better if you could control the buffer flags more than this, but it should have the same behavior as it did before.
src/backend/gl/src/lib.rs
Outdated
}; | ||
}, | ||
// Memory type for uses other than images and INDEX | ||
hal::MemoryType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be good to also expose a device-local memory type for buffers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounded right to me too, but I didn't know enough to know how to do it correctly. If we have a device-local buffer type, what should the gl backend do when copying from a cpu-visible buffer (backed by a real native buffer) to a device-local buffer? I imagine the correct answer is nothing, but what happens then when the user tries to copy from a fake device-local buffer to e.g. an image buffer? This might be another case where if I knew more about how HAL works it would make more sense to me, but as of right now I need it explained a bit slower :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, now that I think about it I suppose just having an actual, real buffer without any memory mapping would be the right thing to do, even if it needlessly duplicates memory on platforms with emulated memory mapping?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I think I pushed something that works, it at least fixes the wgpu-rs shadow example
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copying from one buffer to another would be going through glCopyBufferSubData
.
what happens then when the user tries to copy from a fake device-local buffer to e.g. an image buffer
Device-local buffer is just a GL buffer (or a sub-range of it, if we have a buffer associated with Memory
) without MAP_XXX flags. Copying from it, to it, using as index/vertex, all those things are done the same way as usual.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that makes perfect sense, and that's where I ended up.
src/backend/gl/src/lib.rs
Outdated
properties: Properties::CPU_VISIBLE | ||
| Properties::COHERENT | ||
| Properties::CPU_CACHED, | ||
heap_index: 2, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this doesn't need to be a separate heap index. We should have 0 for device local types and 1 for cpu visible types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that makes perfect sense, done.
@@ -612,7 +612,7 @@ impl CommandQueue { | |||
r.image_extent.height as _, | |||
glow::RGBA, | |||
glow::UNSIGNED_BYTE, | |||
0, | |||
r.buffer_offset as i32, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should use the offset in CopyTextureToBuffer
as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops! I did in fact forget about CopyTextureToBuffer
, and it's now waiting on grovesNL/glow#13
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, but it requires depending on a git rev of glow again.
I think getting this to work on webgl will require mapping without |
13: Add support for pixel buffer offset in get_tex_image r=grovesNL a=kyren Necessary for this gfx PR: gfx-rs/gfx#2822 Co-authored-by: kyren <kerriganw@gmail.com>
I'm very unsure about whether the non-coherent gl buffer mapping is correct... |
Okay, I have tested this PR via wgpu-rs and can confirm the cube example works again! I have also tested this PR with several other required patches and can confirm this allows wgpu-rs to work on webgl again as it did before! Only simple translucent quad is tested, and this requires several additional in-flight patches to |
Now the wgpu-rs shadow example works again (to the same extent it used to work, with some graphical errors). |
src/backend/gl/Cargo.toml
Outdated
@@ -23,7 +23,7 @@ bitflags = "1" | |||
log = { version = "0.4" } | |||
gfx-hal = { path = "../../hal", version = "0.2" } | |||
smallvec = "0.6" | |||
glow = { version = "0.2.0" } | |||
glow = { git = "https://github.com/grovesNL/glow", rev = "054aabc1bfbad472e4a08bc55912552d8211daee" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@grovesNL what's the plan here? Are we waiting for more fixes to glow
before a patch can be released?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, 0.2.0 was published a week ago but there have been a few PRs since then. We can publish whenever this branch is ready to replace the git dependency
src/backend/gl/src/device.rs
Outdated
@@ -515,7 +514,6 @@ impl d::Device<B> for Device { | |||
|
|||
Ok(n::Memory { | |||
properties: memory::Properties::CPU_VISIBLE | |||
| memory::Properties::COHERENT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry to bring it up, but it looks like we should expose both coherent and non-coherent memory types for CPU visible buffers (where persistent mapping is available), given that GL has the corresponding MAP_FLUSH_EXPLICIT_BIT
bit that we can control.
It is totally fine to delay for later. My only concern would be that some of gfx-hal users may expect at least one coherent memory type exposed, since this is a guarantee of Vulkan.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can add that as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done (in personal branch), waiting on glow. (edit: merged into this PR now)
src/backend/gl/src/lib.rs
Outdated
}; | ||
}, | ||
// Memory type for uses other than images and INDEX | ||
hal::MemoryType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copying from one buffer to another would be going through glCopyBufferSubData
.
what happens then when the user tries to copy from a fake device-local buffer to e.g. an image buffer
Device-local buffer is just a GL buffer (or a sub-range of it, if we have a buffer associated with Memory
) without MAP_XXX flags. Copying from it, to it, using as index/vertex, all those things are done the same way as usual.
src/backend/gl/src/device.rs
Outdated
if self.share.private_caps.emulate_map { | ||
let ptr = mem.emulate_map_allocation.get().unwrap(); | ||
let slice = slice::from_raw_parts_mut(ptr.offset(offset as isize), size as usize);; | ||
gl.buffer_data_u8_slice(mem.target, slice, glow::DYNAMIC_DRAW); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should probably use buffer_sub_data
instead of rewriting the whole buffer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, that's an awful thing to miss, I'm really sorry!
Turns out, there is no buffer_sub_data in glow, I will add that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in personal branch, waiting on glow (edit: merged into this PR now)
|
||
if let Err(err) = self.share.check() { | ||
panic!("Error unmapping memory: {:?} for memory {:?}", err, memory); | ||
} | ||
} | ||
|
||
unsafe fn flush_mapped_memory_ranges<'a, I, R>(&self, _: I) -> Result<(), d::OutOfMemory> | ||
unsafe fn flush_mapped_memory_ranges<'a, I, R>(&self, ranges: I) -> Result<(), d::OutOfMemory> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we are also missing invalidate_mapped_memory_ranges
implementation, which is required to be sensible for non-coherent memory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added glInvalidateBufferSubData
to glow and used that (in personal branch), when using emulated memory maps I'm currently doing... nothing? I guess that's the right thing to do? (edit: merged now)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, flush() updates the data from mapping to GPU, while invalidate() should update the data from GPU to the mapping. It sounds like this still needs to be done with the emulated maps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay yeah that makes sense. This will also require a new glow method, I'll merge into this PR once that's merged into glow. (edit: done now)
I've added an implementation of |
I have probably added a lot of portability issues with this PR, I'm open to suggestions about what needs to be fixed right now, but I don't have a great way of testing everything I might do. I know that with the changes I'm going to add as soon as the glow PRs are merged, the gl backend will be using I'm open to suggestions here. I don't mind doing more work on this, but I do feel like the PR has gotten a bit more complex than I was really intending from the outset (and I feel a smidge unqualified to be doing some of the things I'm doing). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for driving this forward! I think we've narrowed down most of rough spots, there is just a few left (noted below)
src/backend/gl/src/device.rs
Outdated
size, | ||
emulate_map_allocation: RefCell::new(std::ptr::null_mut()), | ||
}) | ||
if (1 << mem_type.0) & IMAGE_MEM_TYPE_MASK != 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this just be a comparison 1 << mem_type.0 == IMAGE_MEM_TYPE_MASK
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this case, yes, but that assumes that there is only one type of image memory. For the other roles there are multiple types of memory that could work, so the intention was to see whether the given type was in the bit mask at all.
src/backend/gl/src/device.rs
Outdated
map_flags: 0, | ||
}) | ||
} else { | ||
assert!(self.share.private_caps.buffer_role_change); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wait, so this code can't allocate buffer memory at all if buffer_role_change
is false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I know it's bad, I figured you'd ask about this. We could either offer a unique type of memory for each role and work differently whether or not buffer_role_change
was allowed, OR we could force mmap emulation if buffer_role_change
is not there. I'm fine with doing either one of those now, this change was just getting to be larger than I expected it to be so I wasn't sure what the best way forward right now was.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OR we could force mmap emulation if buffer_role_change is not there.
Is this really an option though? If the user wants a buffer to support both being a vertex buffer and a uniform buffer, no amount of mmap emulation would give you that.
So it looks to me that we just need to expose this "no role change" policy in the exposed memory types and live with that, for now. Later, if we find that applications are too restricted with this, we could think of some workarounds involving copying data around to temporary buffers, but I don't want us to go there just now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, just to clarify, you're saying that the gl backend should have memory types based on no role changes at all, or that this should be dynamic depending on buffer_role_change
? Doing the first one is... relatively easy, but the second one is a more involved change. It's easier for me to do the first one and has a higher probability of me getting it right, but if you think the that the gl backend should definitely support multi purpose buffers if available, I can do that. (If it's dynamic, I might have to move a lot of the buffer type logic around, because most of those mask constants will no longer be constant. As a plus, this would also enable mixing index buffers when not on webgl).
Edit: And yeah, you're right that mmap emulation doesn't help. Native gl buffers are still be attached to memory rather than hal buffers, and hal buffers are attached to sub-ranges of native buffers, so no matter what you will hit the buffer role change limitation.
However, I remember now why I thought this: it DOES solve one of the buffer role change problems I ran into on webgl prior to this PR (but not the new ones related to buffer sub-ranges). Currently, gfx-hal binds native buffers to the PIXEL_PACK_BUFFER
target during Device::map_memory
, which runs afoul of the index role change limitations on webgl. Does this mean that the gl platform doesn't work currently on !buffer_role_change
platforms, or is the PIXEL_PACK_BUFFER
target special somehow on non-webgl and allowed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be dynamic depending on buffer_role_change
We definitely need the exposed memory types being different, based on the buffer_role_change
capability.
is the PIXEL_PACK_BUFFER target special somehow on non-webgl and allowed
We only have to use this binding when doing texture-buffer copies. Technically, this is subset of TRANSFER_DST capability (while PIXEL_UNPACK_BUFFER is a subset with TRANSFER_SRC). So with this in mind there is pretty much 1:1 correspondence between a buffer binding point ("role") and buffer::Usage
.
When the implementation just needs to bind a buffer, regardless of the particular bind point, it should pick the bind point that the buffer supports. For example, if the user created a buffer with only VERTEX usage, we'll use the GL_ARRAY_BUFFER binding point for it, etc.
src/backend/gl/src/device.rs
Outdated
} else { | ||
assert!(self.share.private_caps.buffer_role_change); | ||
|
||
let target = if (1 << mem_type.0) & INDEX_MEM_TYPE_MASK != 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similarly, could we compare directly instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's more than one type of index memory, so the intention here is to check whether the given memory type is any of the index memory types. Is there a clearer way to write this?
src/backend/gl/src/device.rs
Outdated
glow::ARRAY_BUFFER | ||
}; | ||
|
||
let is_device_local = (1 << mem_type.0) & DEVICE_LOCAL_MASK != 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: since this flag is needed in both branches, let's move this assignment up to the beginning of the function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, and I went ahead and gave all of the properties clearer names at the top of the function because I felt that was easier to follow.
src/backend/gl/src/device.rs
Outdated
|
||
Ok(n::Memory { | ||
properties: memory::Properties::CPU_VISIBLE | ||
| memory::Properties::CPU_CACHED, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the future, we'd want to expose "read" memory type independent of the "write", but it doesn't need to happen here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, agreed!
src/backend/gl/src/device.rs
Outdated
// Buffers are only allowed to be INDEX usage XOR another type of usage, they are not | ||
// allowed to have both INDEX and non-INDEX usage. | ||
if !(buffer::Usage::INDEX | buffer::Usage::TRANSFER_SRC | buffer::Usage::TRANSFER_DST).contains(usage) { | ||
(0, 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's error!
log here for the users to find out what happened earlier :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
src/backend/gl/src/device.rs
Outdated
// TODO: Access | ||
gl.buffer_data_u8_slice(target, mapped, glow::DYNAMIC_DRAW); | ||
let _ = *Box::from_raw(raw); | ||
gl.buffer_sub_data_u8_slice(memory.target, 0, &allocation); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we shouldn't be doing this on unmap
. In Vulkan and gfx-hal, mapping as an operation doesn't actually involve any synchronization. I.e. the only thing that synchronizes is:
- submit() for coherent memory (implicitly synchronizes all active coherent mappings)
- explicit flush() and invalidate() for non-coherent memory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense, thanks for being patient with me and helping me learn all of the different memory semantics for modern gfx apis!
You're very very welcome, thanks for being patient with me and teaching me a lot about modern gfx APIs in the process 😛 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, I suppose there is only getting glow
published and maybe reordering the memory types that's left to do here? Or do you think we should land this as is now and follow-up?
src/backend/gl/src/lib.rs
Outdated
// omitting the GL_MAP_FLUSH_EXPLICIT_BIT flag. | ||
if !self.0.private_caps.emulate_map { | ||
memory_types.push( | ||
// Coherent CPU_VISIBLE memory for INDEX buffers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's move these to the beginning of memory_types
. In Vulkan there is an interesting requirement that a memory type with more flags should be exposed first (i.e one with flags A | B should come before A).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes the memory type masks dynamic based on capabilities, darn.
Allocates native buffers to back memory as soon as allocate_memory is called, hal buffers become logical sub-ranges of native buffers. This is still WIP, there is a lot of room for improvement here. Separate INDEX memory could be reserved only for webgl, there could be separate upload / download memory, and the end of buffer sub-ranges could be checked with asserts in debug mode.
Currently requires master branch of `glow` from git
I believe this is necessary for emulated mapping to work properly, as emulated mapping cannot be coherent. Currently unimplemented for emulated mapping, will implement soon.
I guess it would be better to fix the memory types thing and the brokenness of |
* Handles the case where `buffer_role_change` capability is false by generating memory types for every buffer role. * Does not add separate INDEX memory on non-webgl platforms.
Okay, I have a new system for handling memory types that is dynamic based on capabilities rather than the old simplistic system. I've also fixed what I think are several bugs that I caught in review, so there's kind of a lot of new changes. This should fix handling the I haven't tested this with the webgl backend just yet because I need to rebase onto my personal branch to try it out and stash what I'm currently working on in the project that uses it. I'll report back once I've done that. Edit: Everything seems to work fine under webgl as well! |
Move type mask logic into gfx_backend_gl::Share so that it can be shared
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few last questions/concerns :)
src/backend/gl/src/device.rs
Outdated
if is_readable_memory { | ||
map_flags |= glow::MAP_READ_BIT; | ||
} | ||
if !is_coherent_memory { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the condition appears wrong
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh! It definitely is wrong, it used to control the explicit flush bit before I figured out that it should control the coherent flag instead, and I forgot to reverse it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And it was hiding another bug, which is now fixed 😥
src/backend/gl/src/device.rs
Outdated
properties: memory::Properties::DEVICE_LOCAL, | ||
buffer: None, | ||
size, | ||
target: 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's move this field behind the buffer
option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I almost did that before actually...
src/backend/gl/src/device.rs
Outdated
// Alignment of 4 covers indexes of type u16 and u32 | ||
(INDEX_MEM_TYPE_MASK, 4) | ||
} | ||
let alignment = if usage.contains(buffer::Usage::INDEX) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why don't we just return 4 unconditionally here? It's a reasonable alignment to adhere to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure thing, that seems reasonable.
properties: memory::Properties::CPU_VISIBLE | memory::Properties::CPU_CACHED | memory::Properties::COHERENT, | ||
heap_index: CPU_VISIBLE_HEAP, | ||
}); | ||
buffer_role_memory_types.push(hal::MemoryType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we missing the CPU-visible non-cached one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You know what, I actually did all this still with the assumption that there would only be R/W memory types this patch and not write-only ones (and leave that for a later patch). I guess though, with the way everything else is set up, the ONLY THING left to do is just add an entry here and it will work? I guess that's really all that's left to do so it's silly not to do it!
src/backend/gl/src/lib.rs
Outdated
memory_types.push(( | ||
buffer_role_memory_type, | ||
MemoryRole::Buffer { | ||
allowed_usage: buffer::Usage::TRANSFER_SRC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we use buffer::Usage::all()
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh sure thing I just forgot it existed :P
* Fix inverted condition around coherent memory, and fix the bug that it was hiding (also need MAP_COHERENT_BIT on storage flags!) * Move buffer target with raw buffer inside the Option in native::Memory * Hard-code buffer alignment of 4 * Actually go ahead and add cpu-visible non-cached coherent memory type, since it will now work by just adding it to the list! * Simplify using buffer::Usage flags a bit
Okay all of your new comments are addressed. Based on the discussion in gitter (I think I have this right?) I added a new memory type (set) that is cpu-visible, non-cached, and coherent, so there are now also non-cached memory types available (non-cached implies coherent, right?). |
Awesome!
Bors r+
… On Jun 19, 2019, at 23:27, kyren ***@***.***> wrote:
Okay all of your new comments are addressed. Based on the discussion in gitter (I think I have this right?) I added a new memory type that is cpu-visible, non-cached, and coherent, so there is now also a non-cached gl memory type available (non-cached implies coherent, right?).
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub, or mute the thread.
|
2822: Rework of gl memory / buffer handling r=kvark a=kyren Allocates native buffers to back memory as soon as allocate_memory is called, hal buffers become logical sub-ranges of native buffers. This is still WIP, there is a lot of room for improvement here. Separate INDEX memory could be reserved only for webgl, there could be separate upload / download memory, and the end of buffer sub-ranges could be checked with asserts in debug mode. Fixes #2812 PR checklist: - [x] `make` succeeds (on *nix) - [x] `make reftests` succeeds - [x] tested examples with the following backends: gl - [ ] `rustfmt` run on changed code Co-authored-by: kyren <kerriganw@gmail.com>
Build succeeded |
Allocates native buffers to back memory as soon as allocate_memory is called,
hal buffers become logical sub-ranges of native buffers.
This is still WIP, there is a lot of room for improvement here. Separate INDEX
memory could be reserved only for webgl, there could be separate upload /
download memory, and the end of buffer sub-ranges could be checked with asserts
in debug mode.
Fixes #2812
PR checklist:
make
succeeds (on *nix)make reftests
succeedsrustfmt
run on changed code