Wayland Vulkan swapchain creation fails #131

Closed
flibitijibibo opened this issue May 19, 2022 · 22 comments

@flibitijibibo

First, thanks so much for fixing #128! This did fix a number of things right away... but I've now come across another interesting Wayland Vulkan bug, and unlike the other one I'm not sure what's actually happening this time.

The short version: When you try to create a swapchain with a Wayland Vulkan surface, it will fail with VK_ERROR_INITIALIZATION_FAILED, with no real indication as to what that actually means. The validation layer is no help either! Just to be sure, I checked an installation with the blob installer and it runs okay. Problem is, unlike the symlink, there doesn't seem to be a clear issue with the Fedora packages at all! I even went and did crazy things like ls /usr/lib64 > [official|negativo].txt and diffed the two to be sure we didn't have another obscure file that needed to be there, and no luck.
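
The listing comparison described above can also be scripted. This is a minimal sketch, assuming each listing holds one file name per line (as `ls` writes when redirected to a file); the function name `listing_diff` is hypothetical and is not something used in this thread:

```python
def listing_diff(left_text: str, right_text: str) -> tuple[list[str], list[str]]:
    """Compare two directory listings given as text, one file name per line.

    Returns (only_in_left, only_in_right), each sorted alphabetically.
    """
    left = {line.strip() for line in left_text.splitlines() if line.strip()}
    right = {line.strip() for line in right_text.splitlines() if line.strip()}
    return sorted(left - right), sorted(right - left)
```

Feeding it the contents of official.txt and negativo.txt would show which names exist in only one of the two installations. It cannot reveal differences inside files that share a name, which is the kind of mismatch this issue turned out to be.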

Based on LD_DEBUG=libs and some strace-ing, I think we're now looking at something much weirder, possibly a version sync issue with something like the EGL/GBM libraries. I can do whatever digging you need on the lab boxes here; the test I'm using is vkcube-wayland:

dnf builddep vulkan-tools
git clone https://github.com/KhronosGroup/Vulkan-Tools -b sdk-1.3.204
mkdir Vulkan-Tools/flibitBuild
cd Vulkan-Tools/flibitBuild
cmake ..
make vkcube-wayland
./cube/vkcube-wayland

It'll currently crash on an assertion, which is their error check for vkCreateSwapchainKHR.
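
To compare what the two installations actually load, the LD_DEBUG=libs output mentioned earlier can be reduced to a list of initialized libraries. This is a rough sketch, assuming the log contains glibc's usual `calling init: <path>` lines; the function name `loaded_libraries` is hypothetical:

```python
import re

# Assumption: glibc's LD_DEBUG=libs output marks each initialized object
# with a line of the form "<pid>:<tab>calling init: /path/to/lib.so".
_INIT_LINE = re.compile(r"calling init: (\S+)")


def loaded_libraries(ld_debug_log: str) -> list[str]:
    """Return the initialized library paths in first-seen order, without repeats."""
    seen: list[str] = []
    for line in ld_debug_log.splitlines():
        match = _INIT_LINE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen
```

The debug output goes to stderr by default, so something like `LD_DEBUG=libs ./cube/vkcube-wayland 2> ld_debug.txt` would capture it. Running that on both installations and comparing the two lists shows whether they resolve to different library paths.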

@scaronni
Member

Regarding version mismatches, it's not clear which version of the supporting libraries Nvidia ship in the tarballs when releasing a new driver. My guess is that they just pick the latest commit. The "supporting" libraries are these:

https://github.com/NVIDIA/egl-gbm
https://github.com/NVIDIA/egl-wayland

For egl-gbm we're aligned with the latest release: https://github.com/NVIDIA/egl-gbm/commits/master
For egl-wayland there are 4 extra commits: https://github.com/NVIDIA/egl-wayland/commits/master

I assume you are on Fedora 36? Everything else is up to date. Will try the test you pasted above and let you know.

And last but not least the usual warning: https://forums.developer.nvidia.com/t/wayland-information-for-r515-beta-release/214275

scaronni self-assigned this May 21, 2022

@leigh123linux

> Regarding version mismatches, it's not clear which version of the supporting libraries Nvidia ship in the tarballs when releasing a new driver. My guess is that they just pick the latest commit. The "supporting" libraries are these:
>
> https://github.com/NVIDIA/egl-gbm
> https://github.com/NVIDIA/egl-wayland
>
> For egl-gbm we're aligned with the latest release: https://github.com/NVIDIA/egl-gbm/commits/master
> For egl-wayland there are 4 extra commits: https://github.com/NVIDIA/egl-wayland/commits/master
>
> I assume you are on Fedora 36? Everything else is up to date. Will try the test you pasted above and let you know.
>
> And last but not least the usual warning: https://forums.developer.nvidia.com/t/wayland-information-for-r515-beta-release/214275

The f36 package is only one commit behind nvidia master.

https://src.fedoraproject.org/rpms/egl-wayland/blob/rawhide/f/egl-wayland.spec#_11

@flibitijibibo
Author

Can confirm this is Fedora 36. I can also narrow it down and say the latest egl-wayland commit won't be it, that was something David found while looking at some specific threaded cases for SDL.

@scaronni
Member

scaronni commented May 22, 2022

Then I guess your only option is posting a bug in the Nvidia forum.

@scaronni
Member

scaronni commented Jun 1, 2022

Seems to be fixed now? https://forums.developer.nvidia.com/t/linux-solaris-and-freebsd-driver-515-48-07-production-branch-release/216112

I will package 515.48.07 in a moment, it will be online in a few hours.

@flibitijibibo
Author

Will check this on the lab boxes today - worst case the debug extension will tell us what's going on.

@flibitijibibo
Author

Updated to 515.48.07, still fails and VK_EXT_debug_utils doesn't tell us anything, despite the driver notes suggesting otherwise. I'll keep prodding the thread on the NVIDIA forums.

@flibitijibibo
Author

flibitijibibo commented Jun 8, 2022

Looks like I was on the right track - from Erik Kurzinger:

> There is an ABI incompatibility between the 515 driver and previous versions of our egl-wayland library which might be the cause of this. Updating to the latest version, 1.1.10, should resolve the issue if that is the case.
>
> Going forward this should never happen again as we’ve eliminated the driver <=> egl-wayland ABI which the Vulkan WSI depended on. However, doing so required one final breaking change, unfortunately.
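
Checking an installed egl-wayland version against 1.1.10 is easy to get wrong with plain string comparison, since "1.1.9" sorts after "1.1.10" as text. This is a small sketch of a numeric comparison, assuming purely dotted-numeric version strings; the names `version_tuple` and `meets_minimum` are hypothetical:

```python
def version_tuple(version: str) -> tuple[int, ...]:
    """Turn a dotted numeric version such as '1.1.10' into (1, 1, 10)."""
    return tuple(int(part) for part in version.split("."))


# The egl-wayland version named in the quoted reply above.
MIN_EGL_WAYLAND = version_tuple("1.1.10")


def meets_minimum(installed: str) -> bool:
    """True if the installed version is at least 1.1.10, compared numerically."""
    return version_tuple(installed) >= MIN_EGL_WAYLAND
```

On Fedora the installed version could be read with something like `rpm -q --queryformat '%{VERSION}' egl-wayland`, assuming the package is named egl-wayland as in the spec file linked earlier. This only checks the egl-wayland side and says nothing about whether the driver release matches it.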

EDIT: Relevant commit NVIDIA/egl-wayland@e7a2f70

@scaronni
Member

scaronni commented Jun 9, 2022

I've pushed a snapshot of 1.1.10 (it seems to be final, but the release has not been tagged). Please let me know how it goes.

Thanks.

@jp7677
Contributor

jp7677 commented Jun 9, 2022

Running vkcube-wayland no longer crashes for me, but fails now with

./vkcube-wayland: symbol lookup error: /lib64/libnvidia-vulkan-producer.so: undefined symbol: wlEglInitializeSurfaceExport

@leigh123linux

> Running vkcube-wayland no longer crashes for me, but fails now with
>
> ./vkcube-wayland: symbol lookup error: /lib64/libnvidia-vulkan-producer.so: undefined symbol: wlEglInitializeSurfaceExport

1.1.10 has removed the symbol

NVIDIA/egl-wayland@fee26e9
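
Whether a given shared library still exports a symbol can be checked locally without running vkcube-wayland. This is a minimal sketch using Python's ctypes; the function name `exports_symbol` is hypothetical, and the library name in the usage note is an assumption about what the egl-wayland package installs:

```python
import ctypes


def exports_symbol(library: str, symbol: str) -> bool:
    """Load a shared library and report whether it exports the named symbol.

    Raises OSError if the library itself cannot be loaded, so a missing
    library is not mistaken for a missing symbol.
    """
    handle = ctypes.CDLL(library)
    # ctypes resolves the symbol lazily on attribute access and raises
    # AttributeError when it is absent, which hasattr() turns into False.
    return hasattr(handle, symbol)
```

For example, `exports_symbol("libnvidia-egl-wayland.so.1", "wlEglInitializeSurfaceExport")` would be expected to return False with the 1.1.10 snapshot, if that is the library the package provides. Loading a library this way runs its initializers, so it is best done in a throwaway interpreter.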

@scaronni
Member

Sigh

@flibitijibibo
Author

Sigh x2, can reproduce locally.

(As a heads up I may be absent for the next week or so while I get TMNT out the door...)

@leigh123linux

It should be fixed in the next driver release.

NVIDIA/egl-wayland@fee26e9#commitcomment-75822177

@jp7677
Contributor

jp7677 commented Jun 28, 2022

A new driver version 515.57 has been released https://www.nvidia.com/download/driverResults.aspx/190422/
Let’s see what this version brings :)

PS: there are also a few new commits for egl-wayland: https://github.com/NVIDIA/egl-wayland/commits/master as a response to https://forums.developer.nvidia.com/t/properties-and-filters-windows-make-obs-hang-on-wayland-when-closed/213009/12

@scaronni
Member

Driver 515.57 is being pushed to the repositories now along with a new snapshot of egl-wayland.

@jp7677
Contributor

jp7677 commented Jun 29, 2022

I can see the rotating cube now!
Unlike vkcube, the vkcube-wayland window is transparent and has no window decoration, but that could just be something with my local vkcube-wayland build.

PS: The resize issue described here NVIDIA/egl-wayland#59 is actually very real...

@flibitijibibo
Author

That's probably accurate - it probably doesn't use decorations and it's using a swapchain with alpha. (Hopefully it's not set to opaque, otherwise I have to go look at my other NV report again...)

Will locally test later today!

@oranmehavi

> A new driver version 515.57 has been released https://www.nvidia.com/download/driverResults.aspx/190422/
> Let’s see what this version brings :)
>
> PS: there are also a few new commits for egl-wayland: https://github.com/NVIDIA/egl-wayland/commits/master as a response to https://forums.developer.nvidia.com/t/properties-and-filters-windows-make-obs-hang-on-wayland-when-closed/213009/12

I initially reported the OBS issue, and there is a weird segfault happening with their fix:

https://forums.developer.nvidia.com/t/properties-and-filters-windows-make-obs-hang-on-wayland-when-closed/213009/16?u=oranhero

@flibitijibibo
Author

Yayyy it works

[Image: Screenshot from 2022-06-29 15-57-11]

Interestingly EGL has opaque presentation (per their recent commits) but Vulkan does not... oh well, back to this issue for that: https://forums.developer.nvidia.com/t/wayland-vulkan-ignores-vkcompositealphaflagskhr/202227

Thanks again for all the help!

@scaronni
Member

scaronni commented Jul 1, 2022

FEZ ❤️
