-
-
Notifications
You must be signed in to change notification settings - Fork 303
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issues with Vulkan image layout transitions #271
Comments
This link has a really good overview on image layouts. |
I'm still a little bit confused on exactly what's the bug is but I think one possible solution is setting up subpasses that that describe the beginning and end of the render pass. I'm just doing some more reading on it now.
Am I missing something here or isn't it impossible for vulkano to infer the initial and dest layout transitions because it would require vulkano to know what you were intending to use the attachments for? |
I'm just reading though this and I it states:
One theory is that however the transition is occurring it is not setting the correct Sorry if this what you have already said, I'm pretty sick today so might be repeating stuff |
It does the inference at the time of command buffer building/submission I believe - at that point it should be able to know everything as it has all the of the commands submitted by the user including render passes, image copies, etc.
Yeah this is what the bug is, and this is why my patch of changing the initial_layout for the image types we're using fixes the bug for us (even though it's not a proper solution as having the actual correct layout should be much more performant). It might be worth first making a big table of all the operations in which images require a specific layout, and then use this to make a list of all places in which layout transitions can occur and use this to dig in and see what's missing in vulkano? |
This seems to imply setting it to |
Yeah so I think is actually the case for images in general, not just the swapchain image. If they have just been initialized then the From my understanding, while It seems to me like tomaka's goal with vulkano was to take advantage of the fact that we can know what layout an image should be in from the way it is used (by looking at the commands) and infer the optimal transitions. Right now, it just seems to be missing a lot of cases which cause it to fail at this. You should be able to find the parts where transitions are handled in vulkano by finding the places in which image memory barriers are created (I'm pretty sure this is during command buffer building). |
From the docs:
I will have a dig through vulkano now and see if I can see where these are happening |
This is a little complex so I'm just going to put stuff here as I find it and hopefully it will become a bit clearer
|
Yeah in the original patch I did I removed that field altogether along with the check as I also noticed it was never getting set to I think you're right though that ensuring it actually does get set to Also when testing changes with nannou make sure to test at least:
as each of these require different combinations of images and layout transitions. |
Yeh I'm having trouble seeing the link between the Even more odd is why this doesn't happen in the vulkan examples. They don't seem to set that variable to true.
|
Yeah
I get the feeling this is probably the case as tomaka mentions here, but defs worth checking blame just in case 👍 |
I'm a bit stuck with where to go from here. The git blame doesn't shed any light and based on the issue it seems that its not an easy issue. I get the feeling that the current state of vulkano is that you aren't eve ment to specify an I think it's worth noting that this bug can also be solved by just not ever specifying |
Does this actually work for all the nannou examples I mentioned above though? If this was the case, I never would have run into the bug (when I first ran into the bug I didn't even know you could specify the |
It works for everything except |
I have another idea, although I'm not sure how easy it will be to do.
Subpasses can have What I'm thinking is that we add an explicit subpass at the start and the end of the renderpass and it should do the transitions for us without any memory barriers needed. Vulkano currently doesn't set these so I definitely think it's worth a try. Edit: |
Notes
Resources Plan
|
Yeah I'm still leaning towards the original idea of signaling to the images somehow that they've been transitioned out of the undefined state (I'm guessing this was the original plan and that is why they use Also, I forgot to mention that the |
Ok I can give this a go aswell.
|
So here is an attempt at implementing the above comment. It works with all the examples that are listed here but I'm not sure if it's safe or fast. Nothing is noticeably slower though. The next step might be to use some profiling tools to look for pipleline bubbles. |
This is the PR to master vulkano-rs/vulkano#1171 |
|
Here's a slightly simplified form of the table above:
I believe that's all the transitions for commands outside of the render pass. I'll try and put something together for the required layouts for inside the render pass, though it seems quite a bit more complicated. First I might just try and collect all image layout constraints locally in a massive big list, then see if there's a way to simplify it. You almost need graph resolution or a relational programming lang like prolog to work all of this out 🕵️♀️ |
I might be wrong but I think anything inside the renderpass is already handle by the subpass dependacies |
The following are all of the mentions of requirements and restraints on image layouts I could find throughout the spec along with links to header of the relevant section. I'll have a go at cleaning this up into a more digestible format tomorrow. Shading Rate Image Image Presentation
Attachment Aliases - If a set of attachments alias each other, then all except the first to be used in the render pass must use an initialLayout of VK_IMAGE_LAYOUT_UNDEFINED, since the earlier uses of the other aliases make their contents undefined. Once an alias has been used and a different alias has been used after it, the first alias must not be used in any later subpasses. However, an application can assign the same image view to multiple aliasing attachment indices, which allows that image view to be used multiple times even if other aliases are used in between.
An attachment used as both an input attachment and a color attachment must be in the VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR or VK_IMAGE_LAYOUT_GENERAL layout. An attachment used as an input attachment and depth/stencil attachment must be in the VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR, VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_STENCIL_ATTACHMENT_OPTIMAL, VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_STENCIL_READ_ONLY_OPTIMAL, VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL, or VK_IMAGE_LAYOUT_GENERAL layout. An attachment must not be used as both a depth/stencil attachment and a color attachment.
For depth/stencil attachments, each aspect can be used separately as attachments and non-attachments as long as the non-attachment accesses are also via an image subresource in either the VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_STENCIL_ATTACHMENT_OPTIMAL layout or the VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_STENCIL_READ_ONLY_OPTIMAL layout, and the attachment resource uses whichever of those two layouts the image accesses do not. Use of non-attachment aspects in this case is only well defined if the attachment is used in the subpass where the non-attachment access is being made, or the layout of the image subresource is constant throughout the entire render pass instance, including the initialLayout and finalLayout. Note - These restrictions mean that the render pass has full knowledge of all uses of all of the attachments, so that the implementation is able to make correct decisions about when and how to perform layout transitions, when to overlap execution of subpasses, etc. Valid Usage:
Storage Image
Sampled Image
Combined Image Sampler
Input Attachment
|
Here is everything above simplified and tabled: https://docs.google.com/spreadsheets/d/1GMIHZaktFYiGk6NBFJtHyWROZ0cycCGfXOzsebz8Lhs/edit?usp=sharing Perhaps we can add another column for checking what we have reviewed in vulkano and another for what we can currently check with our custom validation layer? |
This adds an example that runs each nannou example listed in the Cargo.toml for a few seconds each, one at a time. This is useful for finding `panic!`s that only occur at runtime, as while `cargo test` does compile the examples, it does not try and execute them. The test has the `#[ignore]` attribute so that `cargo test` does not run the example by default. This is because 1. the test takes a looong time to run and 2. travis is likely incapable of running a lot of the tests as it doesn't have a windowing system etc. As a result, to run the test locally use `cargo test -- --ignored`. We should try to ensure we run this at least once before publishing each major release. Specifically, this test was created to help with testing for vulkano image layout bugs (nannou-org#271). cc @freesig just pinging you as you might find this useful for this reason too!
Layout Transition Error 1
I can confirm (as @mitchmindtree suggested) this is caused by the intermediary frame image being created with the layout This only happens the first time because at the end of the pipeline the frame image is transitioned to Possible fixesThere should be a memory barrier transition form log with details: |
Layout Transition Error 2
and
This is happening in the osc_receiver example. I'm pretty sure this is linked to the UI not regular nannou. For some reason the descriptor set is being created with SolutionsI think we need to examine the UI code for this one. I think the descriptor set should probably not be Side NoteI noticed another possible issue. There is one point in this issue where a pipeline is submitted with 35 memory barriers. They are all look like this: pImageMemoryBarriers[3]: const VkImageMemoryBarrier = 0x556c355103d0:
sType: VkStructureType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER (45)
pNext: const void* = NULL
srcAccessMask: VkAccessFlags = 4096 (VK_ACCESS_TRANSFER_WRITE_BIT)
dstAccessMask: VkAccessFlags = 4096 (VK_ACCESS_TRANSFER_WRITE_BIT)
oldLayout: VkImageLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL (7)
newLayout: VkImageLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL (7)
srcQueueFamilyIndex: uint32_t = 4294967295
dstQueueFamilyIndex: uint32_t = 4294967295
image: VkImage = 0x556c355138d0
subresourceRange: VkImageSubresourceRange = 0x556c35510400:
aspectMask: VkImageAspectFlags = 1 (VK_IMAGE_ASPECT_COLOR_BIT)
baseMipLevel: uint32_t = 0
levelCount: uint32_t = 1
baseArrayLayer: uint32_t = 0
layerCount: uint32_t = 1 As you can see nothing is being changed. This is happening because there are 35 buffer transfer and vulkano is sticking a barrier between each one regardless of it's current layout. Log with details: |
Ok Re 1 |
I think instead of setting just checking Issues:
|
RE 2 The CauseThis is the line in conrod_vulkano that is causing the issue. let glyph_cache_tex = StorageImage::with_usage(
device.clone(),
Dimensions::Dim2d { width, height },
R8G8B8A8Unorm,
ImageUsage {
transfer_destination: true,
sampled: true,
..ImageUsage::none()
},
vec![graphics_queue_family],
)?; From the vulkano docs:
Ideas
I might need some help on this one. I think it's worth noting that nowhere in this code is |
Ok I think that another possible explanation is that vulkano is trying to batch the pImageMemoryBarriers[34]: const VkImageMemoryBarrier = 0x556c35510c88:
sType: VkStructureType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER (45)
pNext: const void* = NULL
srcAccessMask: VkAccessFlags = 4096 (VK_ACCESS_TRANSFER_WRITE_BIT)
dstAccessMask: VkAccessFlags = 32 (VK_ACCESS_SHADER_READ_BIT)
oldLayout: VkImageLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL (7)
newLayout: VkImageLayout = VK_IMAGE_LAYOUT_GENERAL (1)
srcQueueFamilyIndex: uint32_t = 4294967295
dstQueueFamilyIndex: uint32_t = 4294967295
image: VkImage = 0x556c355138d0
subresourceRange: VkImageSubresourceRange = 0x556c35510cb8:
aspectMask: VkImageAspectFlags = 1 (VK_IMAGE_ASPECT_COLOR_BIT)
baseMipLevel: uint32_t = 0
levelCount: uint32_t = 1
baseArrayLayer: uint32_t = 0
layerCount: uint32_t = 1 Shouldn't be in that batch and should be in a barrier that occurs after the copys I can't actually reproduce this by stepping through the code so it must be a timing thing. Maybe a bunch of copy's are being submitted faster then the gpu. I stills don't see how the last command back to |
I have spent the last few days walking through the debugger and I'm really struggling to gain any clarity on how this whole process works.
|
When a command is submitted to this above function it checks if there is a conflict.
The
Then CmdB will not be flushed before CmdC. Ideas
|
Ok I've solved this The last thing left to do is check if an image is set to PreinitializedI'll dig into this now but one thing I know is we can't have a |
Ok everything is fixed now on mitchmindtree/vulkano#6
|
I'm tagging in. My initial instinct is to say that all of the valid transitions can be described by an FSM and likely handled by a simple typestate system under the hood and validate at compile time. |
Hey I'm trying to sort out this image layout pr but I thought I'd post here so that I don't pollute the PR. if initial_layout_requirement != start_layout || unsafe { !img.is_layout_initialized() } {
actually_exclusive = true; I'm not sure why in the original code let is_layout_initialized = unsafe { img.is_layout_initialized() };
if initial_layout_requirement != start_layout || !is_layout_initialized {
actually_exclusive = is_layout_initialized; But I'm not sure if this is correct, although it doesn't trigger any problems that I can see |
I've pushed this version to the PR. Maybe @mitchmindtree you can run it on linux and check for errors in the validation layers. It seems to be working on MacOS. |
I'll add your branch to my local nannou repo as a patch, run cargo test with all examples and standard validation layers and record the output 👍 |
OK, I just ran all examples! There are no new validation layer errors, but the vk_image.rs and vk_images.rs seemed to present their first frame and then vk_image_panic.txt Do you get this too? Here's the full output from the run_all_examples test in case you wanna have a look: |
Ah dam I missed this because I was testing in release. |
Ok I have fixed this and tested it on all examples in the top level example folder and the vulkan examples. Feel free to run your more complete tests but I'm going to suggest the PR is merged now because I'm confident that its working |
Awesome work!! I'l give this another run locally now just in case. |
Ok the never ending issue continues. |
Ok fixed it lol |
Done! |
One of the final blockers before landing v0.9 is the state of automatic layout transitions in vulkano. Before explaining the issue, it's important to understand 1. image layouts, 2. image memory barriers and 3. image layout transitions.
Image Layouts
The
ImageLayout
type describes the layout of an image (e.g.AttachmentImage
,StorageImage
orSwapchainImage
- anything that implementsImageAccess
) in GPU memory at each stage within a vulkan program. The thing that determines whatImageLayout
an image should have at each stage is the way it is used. I believe this is specified byImageUsage
, but there could be other factors involved in determining required layout - I'm unclear on this. This is highlighted nicely by a passage from the Vulkan cookbook:You can find a list of all possible image layouts supported by the spec along with descriptions here. Side note: Vulkano currently seems to enumerate only a subset of these in its
ImageLayout
type.Image Memory Barriers and Layout Transitions
Let's say we have a
SwapchainImage
and we want to clear it with a single colour and then present it via the swapchain.To clear the image, we can use the
vkCmdClearColorImage
command (via theclear_color_image
commandbuffer builder method in vulkano). ThevkCmdClearColorImage
operation requires that the image is in theVK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL
layout (ImageLayout::TransferDstOptimal
in vulkano).To present an image via the swapchain, the image must be in the
VK_IMAGE_LAYOUT_PRESENT_SRC_KHR
image layout (ImageLayout::PresentSrc
in vulkano).To ensure that the image has the
TransferDstOptimal
layout while clearing and thePresentSrc
layout while presenting, we must use an Image Memory Barrier. On the barrier, theoldLayout
is the layout the image should have for commands precending the barrier (in our case,TransferDstOptimal
) andnewLayout
is the layout the image will have after the barrier (in our case,PresentSrc
). As far as I understand it is up to the vulkan implementation to actually perform this transition at runtime - as a result, transitions can incur a cost and unnecessary transitions should be avoided if possible.The Bug
Vulkano aims to make life easier by inferring the necessary image layout transitions (and in turn, insert the necessary image memory barriers) for the user (as noted in the
ImageLayout
docs) which I think is a highly worthy goal due to the tedious and error-prone nature of manually specifying the barriers. Unfortunately, it seems there are cases where vulkano does not (or can not?) have the necessary transition implemented just yet. E.g. tomaka mentions a missing transition here, a comment in clear_color_image has a TODO, etc. As a result, it is not uncommon to run into crypticassert_neq!(Undefined, Undefined)
panic!s orAccessError { ImageNotInitialized }
runtime errors. Some of these can be worked around by manually specifying theinitial_layout
andfinal_layout
fields for attachments withinrenderpass!
macros (I'm not 100% clear yet how exactly these fields are passed along to the barriers), however in other cases there is no clear option for manually specifying the necessary transition manually (e.g. transitioning for aclear_color_image
command).Here are some issues/PRs I believe are related:
vulkano-rs/vulkano#1117
vulkano-rs/vulkano#974
vulkano-rs/vulkano#977
vulkano-rs/vulkano#1061
The Temporary Patch
Currently in order to work with v0.9 we are using a branch of vulkano called
nannou-patches
which contains a small commit that changes theinitial_layout
of theSwapchainImage
andAttachmentImage
ImageAccess
implementations to use theUndefined
layout. The Vulkan spec indicates that it is OK to useUndefined
as theoldLayout
, though it is unclear what the performance cost is compared to specifying the image's actual layout.The docs on the
Undefined
layout:Solution?
While this branch serves as a temporary workaround, it is obviously not a correct, long-term solution. I think we can improve the situation by surveying cases where vulkano fails to infer the correct image layout transition and try and find the necessary fix within vulkano itself.
An alternative solution might be to ensure the user always has the ability to manually specify the layout transition themselves. However, I imagine this might lead to safety issues due to the potential for invalid transitions when vulkano has the potential to remove these issues altogether if it could only correctly infer all the necessary transitions itself.
To find a starting point, the bugs mentioned above can be easily reproduced by changing the
[replace]
vulkano instances in nannou's cargo.toml from my nannou-patches branch back to the vulkano-rsmaster
branch and removing all the instances in which we manually specifyinitial_layout
andfinal_layout
withinrenderpass!
macro invocations (frame/mod.rs, the draw vulkano.rs backend, ui.rs, the vk_*.rs examples).It might be worth filling in some of the knowledge gaps above and posting a similar issue about this to vulkano as discussion on the topic there is currently pretty scattered.
@freesig I'm going to stop digging for now and keep hacking on the mirrors, but thought I'd consolidate my thoughts/info here for when I come back or in case you wanted to have a dig yourself after the mac installer stuff.
The text was updated successfully, but these errors were encountered: