
update to wgpu-native 22.1 #547

Merged (45 commits) on Sep 11, 2024

Conversation

Vipitis
Contributor

@Vipitis Vipitis commented Aug 4, 2024

I am really excited for better error handling, compilation info (not yet implemented in wgpu-native; I might try to do that myself to move it along), and glsl const built-ins, so I already started this.
Feel free to cherry-pick my changes or commit into this branch if it helps.

I have a lot of time to work on this, but I need to read up on error handling a bit.

(from the issue template:)

Preparations

  • Perhaps wrap up outstanding PRs.
  • Might be a good time to roll a release of wgpu-py.
  • Perhaps for pygfx too.

Upstream

pending gfx-rs/wgpu-native#402

  • wgpu-native is updated to a recent wgpu-core.
  • wgpu-native uses a recent webgpu.h.
  • wgpu-native has a release with these changes.

Update to latest IDL

Optional, can also only update wgpu-native.

The lines below can be copied into the PR's top post.

  • Download the latest webgpu.idl and place in the resources folder.
  • Run python codegen to apply the automatic patches to the code.
  • [?] It may be necessary to tweak the idlparser.py to adjust to new formatting.
  • Check the diff of flags.py, enums.py, structs.py for any changes that might need manual work.
  • Go through all FIXME comments that were added in _classes.py:
    * Apply any necessary changes.
    * Remove the FIXME comment if no further action is needed, or turn into a TODO for later.
    * Note that all new classes/methods/properties (except those marked as hidden) need a docstring.
  • Run python codegen again to validate that all is well. Repeat the step above if necessary.
  • Make sure that the tests run and provide full coverage.
  • Make sure that the examples all work.
  • Make sure that pygfx works (again).
  • Update release notes for API changes.

Update to latest wgpu-native

Optional, can also only update IDL

The lines below can be copied into the PR's top post.

  • Run python download-wgpu-native.py --version xx to download the latest webgpu.h and DLL.
  • Run python codegen to apply the automatic patches to the code.
  • [?] It may be necessary to tweak the hparser.py to adjust to new formatting.
  • Diff the report for new differences to take into account.
  • Diff wgpu_native/_api.py to get an idea of what structs and functions have changed.
  • Go through all FIXME comments that were added in _api.py:
    * Apply any necessary changes.
    * Remove the FIXME comment if no further action is needed, or turn into a TODO for later.
  • Run python codegen again to validate that all is well. Repeat the steps above if necessary.
  • Make sure that the tests run and provide full coverage.
  • Make sure that the examples all work.
  • Make sure that pygfx works (again).
  • Update release notes for changes in behavior.
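For reference, the first two checklist steps above can be scripted. This is a rough sketch, not part of the repo; the version string is a placeholder, and `dry_run` only builds the commands without executing anything:

```python
# Sketch of scripting the first two checklist steps. The commands mirror the
# checklist above; "v22.1.0.2" is a placeholder version string.
import subprocess
import sys


def update_commands(version):
    """Build (but do not run) the update commands from the checklist."""
    return [
        [sys.executable, "download-wgpu-native.py", "--version", version],
        [sys.executable, "codegen"],
    ]


def run_update(version):
    """Actually execute the commands, stopping on the first failure."""
    for cmd in update_commands(version):
        subprocess.run(cmd, check=True)


commands = update_commands("v22.1.0.2")
```

Running `run_update(...)` only makes sense from a wgpu-py checkout, where those scripts exist.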

Wrapping up

  • Update pygfx.
  • Release new wgpu-py.
  • Release new pygfx.
  • This can be a good moment to deal with certain issues, or perhaps some issues can be closed.

@Vipitis
Contributor Author

Vipitis commented Aug 10, 2024

So it seems like D3D12 and OpenGL cause issues with the tests in test_wgpu_native_compute_tex.py now: they always return all zeros. These do pass with Vulkan (but render tests fail). I might need to try this on different hardware next week.

The examples do pass on all backends (using offscreen rendering via pytest).

The remaining test errors are related to the specific error formatting, so I will try to find out what changes are needed.
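The per-backend checking described above can be sketched like this; it assumes wgpu-py's `WGPU_BACKEND_TYPE` environment variable for forcing a backend, which must be set before the adapter is requested:

```python
# Hypothetical sketch: forcing a specific backend, as done when comparing the
# failing compute-texture tests across D3D12, OpenGL, and Vulkan. wgpu-py
# reads WGPU_BACKEND_TYPE when requesting an adapter.
import os


def force_backend(name):
    """Set the backend override; names follow the WebGPU backend-type enum."""
    assert name in ("Vulkan", "D3D12", "OpenGL", "Metal")
    os.environ["WGPU_BACKEND_TYPE"] = name


force_backend("Vulkan")
# import wgpu  # import (and request the adapter) after setting the variable
```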

@Vipitis
Contributor Author

Vipitis commented Aug 10, 2024

So I got the error tests passing, but only when run on their own, not when running the whole suite. This reminds me of the gc issues from last time, but I don't think that is the case here. So maybe it's an issue with how pytest caches the default device (I had trouble with that in wgpu-shadertoy), or the changes to error handling need some further tweaks.


might look into enabling some new features next.

@Vipitis
Contributor Author

Vipitis commented Aug 12, 2024

> So it seems like D3D12 and OpenGL cause issues with the tests in test_wgpu_native_compute_tex.py now: they always return all zeros. These do pass with Vulkan (but render tests fail).

I have now tested this on my laptop with an Nvidia GPU and I am getting the same result:
Vulkan returns null pointers for getSurfaceCapabilities, and D3D12 fails the compute tests with all zeros.

@almarklein
Member

I'm on holiday right now with very little time (and very bad wifi), will review/help when I'm back.

@fyellin
Contributor

fyellin commented Aug 18, 2024

I downloaded this code and tried to use it on macOS. I'm interested in looking at override and push_constant, both of which appear to be (at least partly?) implemented in the newest version.

When I ran python download-wgpu-native.py, I got an error saying the downloaded file was not a zip file. It turns out the "zip file" contained the text "Not Found".

It seems that https://github.com/gfx-rs/wgpu-native/releases/download/v22.1.0/wgpu-macos-aarch64-release.zip does not exist.

  1. Is there something I need to do to get that file to exist?
  2. Is there any way we can get a slightly more informative error message if the URL doesn't in fact exist?
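On the second question, a hedged sketch of what a friendlier check could look like: validate that the downloaded bytes really are a zip archive before extracting, since a GitHub 404 body is the literal text "Not Found". The function name here is illustrative, not an actual wgpu-py helper.

```python
# Sketch: fail with a clear message (naming the URL and the server's reply)
# instead of an opaque BadZipFile when a release asset does not exist.
import io
import zipfile


def open_release_zip(data, url):
    """Open downloaded bytes as a zip, with a clear error for 404 bodies."""
    if not zipfile.is_zipfile(io.BytesIO(data)):
        snippet = data[:80].decode("utf-8", errors="replace")
        raise RuntimeError(
            f"Download from {url} is not a zip file (server said {snippet!r}); "
            "the release asset may not exist yet."
        )
    return zipfile.ZipFile(io.BytesIO(data))
```

With this, `open_release_zip(b"Not Found", url)` raises a RuntimeError that names the URL and the body, instead of a confusing zip error.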

@Vipitis
Contributor Author

Vipitis commented Aug 18, 2024

> Is there something I need to do to get that file to exist?

wgpu-native hasn't released yet. The commit hash points to the PR branch, not a release tag. So you need to clone the repo (and its submodules), build it yourself (you need a working Rust toolchain, among other things), and then either replace the lib file (.dylib on macOS) or use the WGPU_LIB_PATH env var to point to the new file.
The header files are already part of this PR branch.
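The last step can look like this in practice; the path is a placeholder for wherever your build put the library:

```python
# Point wgpu-py at a self-built wgpu-native library. The path below is a
# placeholder; WGPU_LIB_PATH must be set before wgpu is imported.
import os

os.environ["WGPU_LIB_PATH"] = "/path/to/wgpu-native/target/release/libwgpu_native.dylib"
# import wgpu  # now loads the library from WGPU_LIB_PATH
```

Setting the variable in the shell before launching Python works just as well.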

@Korijn
Collaborator

Korijn commented Aug 21, 2024

To expand on that, you can track upstream progress here: gfx-rs/wgpu-native#402

@fyellin
Contributor

fyellin commented Aug 21, 2024

I read the title of this and assumed 22.1 was further along than it actually is. I will wait.

I am also eager to see some of the features here, and I hope some of the bugs have been fixed.

@Vipitis
Contributor Author

Vipitis commented Aug 26, 2024

> I'm on holiday right now with very little time (and very bad wifi), will review/help when I'm back.

Now I am on holiday too, with absolutely no reception. The earliest I can continue working on this is Friday.

@Korijn
Collaborator

Korijn commented Aug 26, 2024

> To expand on that, you can track upstream progress here: gfx-rs/wgpu-native#402

Upstream release happened 10 hrs ago!

@Vipitis
Contributor Author

Vipitis commented Aug 30, 2024

So CI differs from my systems... But the failing case seems to be a device loss not raising the appropriate exception class. I am not sure if device-lost is the intended error there in the first place, but some adjustments are needed to raise it properly. I will work on that tomorrow and Saturday.

pytest showing different behavior when running tests in isolation versus in the whole suite is a common thread, but it makes this difficult to debug, as the cause could be caching or gc.

@rajveermalviya

rajveermalviya commented Sep 9, 2024

Is it possible it's hitting the error here? — https://github.com/gfx-rs/wgpu-native/blob/5bd6da3851be5a214c746cac691cd372e50f7333/src/lib.rs#L3836-L3838
(should have a warning log :( )
Here are the default values — https://docs.rs/wgpu-types/22.0.0/src/wgpu_types/lib.rs.html#5407

@almarklein
Member

Hi @rajveermalviya, thanks for chiming in!

The situation seems related to us having been very lazy about passing surface_id into WGPURequestAdapterOptions, simply setting it to NULL. This e.g. allows creating an adapter before a GUI backend is selected.

This has worked well until now. But with wgpu-native v22.1.0.2, this produces null pointers in the surface capabilities struct (i.e. zero supported formats and present modes), and when we force them anyway we get "Surface does not support the adapter's queue family".

I just tried to play nice and pass the canvas/surface, but now it selects DX12 instead of Vulkan, and produces warnings about downlevel flags ... 🤔

@almarklein
Member

Since the resulting struct has zero formats, not even bgra8unorm_srgb and fifo, it looks like we don't get the default, but something less than that.

@rajveermalviya

rajveermalviya commented Sep 9, 2024

> Since the resulting struct has zero formats, not even bgra8unorm_srgb and fifo, it looks like we don't get the default, but something less than that.

Looks like it is the default value to me. 🙂

@fyellin
Contributor

fyellin commented Sep 9, 2024

Just so you aren't caught by surprise: I had to completely rewrite the handling of limits to get push_constants to work. I needed some wgpu-specific limits, and there are some extra shenanigans involved in fetching and setting them. Also, as I mention in my comments on that pull request, I think it's inconsistent that features use hyphens while limits use underscores.

Essentially, I'm warning you that you may have a merge mess coming up.
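To illustrate the naming mismatch mentioned above: the conversion between the two spellings is mechanical, which is part of why the inconsistency is easy to trip over. The helper names here are hypothetical, not wgpu-py API.

```python
# Illustration of the hyphen-vs-underscore naming mismatch between
# feature-style names and Python-style limit names.
def limit_to_snake(name):
    """'max-push-constant-size' -> 'max_push_constant_size'."""
    return name.replace("-", "_")


def limit_to_hyphen(name):
    """Inverse of limit_to_snake."""
    return name.replace("_", "-")


snake = limit_to_snake("max-push-constant-size")
roundtrip = limit_to_hyphen(snake)
```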

@almarklein
Member

> Looks like it is the default value to me. 🙂

Ah, you're right. I was looking at the code below, which mentions certain guarantees, which I assumed to be part of the default.

@almarklein
Member

It looks like wgpu has become unworkable on Windows (at least on the machine I have at hand):

  • Not specifying a surface: Panics with "Surface does not support the adapter's queue family".
    • This worked fine before, what has changed that causes this to fail now?
  • Specifying the surface id: Selects D3D12 and gives a big warning about downlevel flags.
    • If D3D12 worked, this was ok, but that warning does not sound good. Is this because D3D12 is not yet fully supported, or because D3D12 cannot provide the full wgpu functionality?
  • Forcing Vulkan via WGPURequestAdapterOptions.backendType and specifying the surface id: Errors with "No suitable adapters found".
    • Why is Vulkan suddenly not supported anymore?

This is on an Intel machine with Integrated graphics and Windows 11.

@Vipitis
Contributor Author

Vipitis commented Sep 10, 2024

The warnings also show in wgpu-native when I force the backend to D3D12 there, but otherwise it does work, apart from compute shaders, which I don't think are meant to work on D3D12.

Vulkan works in wgpu-native, so it should work in wgpu-py too. The offscreen canvas works, but glfw and qt did not.

Were you able to try Vulkan on a non-Windows OS?

@almarklein
Member

> Vulkan works in wgpu-native, so it should work in wgpu-py too. Offscreen canvas works, but glfw and qt did not.

Oh, interesting ... I will try wgpu-native when I'm back at the office (where I have a Windows machine) tomorrow. What happens on your machine if you modify cube.py using:

adapter = wgpu.gpu.request_adapter(power_preference="high-performance", canvas=canvas)

> Were you able to try Vulkan on a non-Windows OS?

Yeah, worked fine on Linux.

@Vipitis
Contributor Author

Vipitis commented Sep 10, 2024

> Oh, interesting ... I will try wgpu-native when I'm back at the office (where I have a Windows machine) tomorrow. What happens on your machine if you modify cube.py using:
>
> adapter = wgpu.gpu.request_adapter(power_preference="high-performance", canvas=canvas)

Exactly the same as forcing D3D12: warnings, but it works. If I force Vulkan here, I do get the "No suitable adapters found" from wgpuInstanceRequestAdapter; the status returned by the callback is "Unavailable", which originates from a NotFound in wgpu-core.

I also tried the OpenGL backend, which works.

I poked around in the blame and couldn't find anything obvious. A few PRs in that region talked about backends in the instance, e.g. gfx-rs/wgpu#5535

Edit: the warning about downlevel flags is, for me, about this specific feature, which is noted as missing for D3D12.

@almarklein
Member

almarklein commented Sep 11, 2024

Found it! It was related to how the surface id was obtained; somehow the one we created was wrong. A patch is underway.

@almarklein
Member

almarklein commented Sep 11, 2024

@Vipitis can you confirm it works for you too now? If it does, I think this is about ready. There are no real API changes, but I'm going to have a go with pygfx to make sure no changes are needed to e.g. shaders.

I will update to the latest wgpu-native (which was released since this was started) in a new (small) PR.

@Vipitis
Contributor Author

Vipitis commented Sep 11, 2024

I will try this on both machines when I am home in ~90 minutes.

@Vipitis
Contributor Author

Vipitis commented Sep 11, 2024

Checked on both machines; it seems to be good now. The examples work with a GUI, and pytest passes. I also tried with forced Vulkan and D3D12.

I will collect the fatal errors and raise a separate issue/PR to figure out whether they can be handled and a panic avoided.

@Vipitis Vipitis marked this pull request as ready for review September 11, 2024 13:33
@Vipitis Vipitis requested a review from Korijn as a code owner September 11, 2024 13:33
@Vipitis Vipitis changed the title [WIP] update to wgpu-native 22.1 update to wgpu-native 22.1 Sep 11, 2024
@almarklein
Member

🥳

@almarklein almarklein merged commit e9e3672 into pygfx:main Sep 11, 2024
20 checks passed
@almarklein
Member

Thanks @Vipitis!

@Vipitis Vipitis deleted the wgpu22-1 branch September 11, 2024 13:48
@almarklein almarklein mentioned this pull request Sep 13, 2024
@Vipitis Vipitis mentioned this pull request Sep 26, 2024