Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Fragment density map as fallback for Vulkan VRS #85824

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

BastiaanOlij
Copy link
Contributor

@BastiaanOlij BastiaanOlij commented Dec 6, 2023

This PR introduces support for VK_EXT_fragment_density_map, this is a newer VRS extension that only implements density maps and works slightly different from VK_KHR_fragment_shading_rate, which we currently use for VRS.

I've split out the changes related to VRS_XR into #89894, this PR now only focusses on implementing VK_EXT_fragment_density_map.

The advantage of the fragment density map extension is that it is implemented on various Android devices including the Meta Quest, and possibly other VR devices.

Note Quest 3 now also supports VK_KHR_fragment_shading_rate

Note, to test change your start script to:

xr_interface = XRServer.find_interface("OpenXR")
if xr_interface and xr_interface.is_initialized():
	print("OpenXR instantiated successfully.")
	var vp : Viewport = get_viewport()

	# Enable XR on our viewport
	vp.use_xr = true

	# Make sure v-sync is off, v-sync is handled by OpenXR
	DisplayServer.window_set_vsync_mode(DisplayServer.VSYNC_DISABLED)

	# Enable VRS
	vp.vrs_mode = Viewport.VRS_XR
	vp.vrs_update_mode = Viewport.VRS_UPDATE_ONCE
	xr_interface.xr_vrs_strength = 0.5

Or use VRS demo project to test.

@DarioSamo
Copy link
Contributor

Right now I default to the fragment shading rate extension, and fall back on the fragment density map extension. There is an argument for reversing this.

I think the logic for this should be at a level above the context so each backend doesn't have to be aware of this logic and should just limit itself to reporting whether VRS and FD are supported separately on their own.

@BastiaanOlij
Copy link
Contributor Author

I think the logic for this should be at a level above the context so each backend doesn't have to be aware of this logic and should just limit itself to reporting whether VRS and FD are supported separately on their own.

Lets see if we can find some time on rocketchat/discord to throw some ideas around. I agree with this in principle but probably a few things need to be rejigged to accommodate it.

@BastiaanOlij
Copy link
Contributor Author

BastiaanOlij commented Feb 20, 2024

Mostly rebased this and added an update mode so we don't needlessly convert the VRS texture.

Still left todo:

  • update code in XRInterface to use the same input format as user supplied textures
  • update VRS conversion shader to assume this input format, and switch properly between different output modes.
  • Add control over strength, so we can adjust the VRS effect.
  • testing, lots of testing

To be considered:

  • Find a way to internalise the main Vulkan extension approach and fragment density extension approach into the Vulkan renderer (possibly by just exposing a format?) (see Dario's remark)
  • If XRInterface can supply a VRS density map in the correct native format, skip conversion. This one is dangerous because it assumes the user hasn't overwritten the render target size. Also feedback I've so far gotten talking to experts is that most engines are doing their own VRS implementation just like we are, there might be limited benefit getting this from the XR runtime. Currently no runtime has good support for this anyway.
  • If eye tracking is available, update the density map each frame based on eye focus point (I've already put a placeholder get_eye_focus method here, will likely implement something in the OpenXRInterface for this).
  • Make it possible to provide a shader fragment instead of texture so the VRS map can be calculated. Esp with eye tracking this could be much more efficient.
  • Add ability to update force update mode "always" if eye tracking is available and used.
  • Write documentation!!!!!!!

@BastiaanOlij BastiaanOlij force-pushed the enhance_vulkan_vrs branch 2 times, most recently from baf4561 to bb64a64 Compare February 20, 2024 06:42
@BastiaanOlij
Copy link
Contributor Author

Note, currently it is possible to get error spam when using this with the mobile renderer:

ERROR: Tracker can't have more than one type of usage in the same draw list. Draw list usage is 15 and the requested usage is 14.

I do not believe this is an issue with this PR (I need to double check) but an issue with the tonemapper running in a subpass. Will need to discuss with @DarioSamo

@DarioSamo
Copy link
Contributor

DarioSamo commented Feb 20, 2024

ERROR: Tracker can't have more than one type of usage in the same draw list. Draw list usage is 15 and the requested usage is 14.

Admittedly this is one area I did not get to test much as I was not aware which effect relied on using a color attachment that was written to inside one render pass. The original draft specified these as two different usages but Vulkan does not use a different layout for color attachments that are used for writing or reading, so I think it'd be perfectly valid to give it only one usage as "color attachment" and leave it at that.

We probably need a small extension to the logic of the graph to indicate usages can be compatible in this case as they map internally to the same layout. But there must probably be a distinction to indicate when the attachment is used for writing and when it's for reading... or we can just leave it as one usage, as it's pretty rare for this to be used like this anyway and it's usually within the same render pass.

EDIT: Probably worth pointing out that error only shows up in dev builds so it won't actually show in release builds yet, so it's mostly to aid us in finding these kind of subtle errors. But I don't think it actually translated to any real error in Vulkan as it maps to the same layout internally.

@BastiaanOlij
Copy link
Contributor Author

Admittedly this is one area I did not get to test much as I was not aware which effect relied on using a color attachment that was written to inside one render pass.

I think this is the problem @DarioSamo , this is not related to any effect. The problem is that we enable glow by default (which is BAD BAD BAD for VR, and for mobile in general), which disables a number of performance improvements. Turn glow off and the error appears. I raised an issue for this as its unrelated to VRS: #88606

@BastiaanOlij BastiaanOlij marked this pull request as ready for review February 22, 2024 02:39
@BastiaanOlij BastiaanOlij requested review from a team as code owners February 22, 2024 02:39
@BastiaanOlij
Copy link
Contributor Author

I've taken this out of draft, the rest of the changes I feel should be in follow up PRs. It now does what the desktop VRS mode already did. There is definitely room for improvement but also there, it makes sense to do this in follow up PRs.

For testing reviewing I suggest to use the sample project in the OP. Two remarks at this point for testers:

  1. Note the discussion with Dario up above, the error spam around draw list usage is not due to this PR and can be ignored.
  2. I'm currently having an issue when using remote debug with Quest that the Quest will hand until you quit the Godot IDE. Again not an issue with this PR, this behaviour seems to be a bug introduced in remote debug. When in doubt, export to APK and sideload!

Testing on Quest 3 VRS drops my frame time from 9-10ms down to 5-6ms. Sure that is a simple project but it's a major improvement. Right now the low/medium/high settings don't appear to have a big impact. There is room for improvement in how we're calculating the density map.

@BastiaanOlij BastiaanOlij force-pushed the enhance_vulkan_vrs branch 2 times, most recently from 322d962 to b5e065d Compare February 22, 2024 07:45
@BastiaanOlij
Copy link
Contributor Author

Fixed a few things in the way the density map was calculated, also increased the strength range. Still trying to find out where the sweet spots lie. More testing is needed here.

@@ -6094,6 +6099,7 @@ Error RenderingDeviceDriverD3D12::_check_capabilities() {
vrs_capabilities.primitive_in_multiviewport = options6.PerPrimitiveShadingRateSupportedWithViewportIndexing;
vrs_capabilities.ss_image_supported = true;
vrs_capabilities.ss_image_tile_size = options6.ShadingRateImageTileSize;
vrs_capabilities.ss_max_fragment_size = 8; // TODO figure out if this is supplied and/or needed
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RandomShaper I've just hardcoded this to 8, we need to have a chat on how VRS is actually setup for DX12 and what sort of density map it expects. Combined with Darios remarks I have some idea to clean this up a little more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See also my private message on rocket chat with something I can't share publicly :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not been able to find any way of using a density map with Direct3D 12. Only a tile-based shading rate image is supported. I think the vrs_capabilities structure could be upgraded with separate mentions to the classical VRS image and the new density map, both with flags advertising support, or lack thereof, as well as parameters. In D3D12, all the new members would be false/0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RandomShaper ok the way I made this work is that we supply a density map but we convert this internally to a shading rate image if that is what is supported (which is currently what we support for VRS). This is what vrs.glsl does.

This variable steers whether our maximum fragment size is a 4x4 or 8x8 (which differs per GPU)

@@ -970,7 +971,7 @@ class RenderingDeviceDriverD3D12 : public RenderingDeviceDriver {
virtual uint64_t get_total_memory_used() override final;
virtual uint64_t limit_get(Limit p_limit) override final;
virtual uint64_t api_trait_get(ApiTrait p_trait) override final;
virtual bool has_feature(Features p_feature) override final;
virtual bool has_feature(Features p_feature) const override final;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DarioSamo I couldn't get this compiled without making this a const, not sure if this somehow got overlooked due to me being on MSVC but I think this is part of your restructure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, the blame points to that line still being the same as the original commit by Pedro on both D3D12 and the original interface header.

virtual bool has_feature(Features p_feature) = 0;

It doesn't make any sense that you have to make it const. The original header has it as not const. Either way, both lines point to @RandomShaper so maybe you want to check with him what's up with that, but it doesn't sound right at all that you need to do this. What's the compiler error?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised this compiles without complaint on GCC (which is does in my testing). It's technically a different function with or without const, and I don't see the const version on any of the parents, and the override should fail if there isn't a parent with the method. Super weird! But compiles fine on GCC in any case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, well my compiler complains about it because it gets called by a const has_feature from the RD logic itself, and seeing a const function can't call a non const function, it doesn't compile. It's confusing...

@BastiaanOlij
Copy link
Contributor Author

OK, interesting, while the Quest 2 only had support for fragment density map, the Quest 3 has support for both fragment density map and fragment shader rate. So it was always running in fragment shader rate mode.

I've got a bit more puzzling to do.

Comment on lines 206 to 208
The strength with which our VRS density map is calculated. The higher this number, the stronger VRS is applied, the more performance but at a cost of quality.
[b]Note:[/b] Mobile and Forward+ renderers only. Requires [member Viewport.vrs_mode] to be set to [constant Viewport.VRS_XR].
[b]Note:[/b] On compatibility renderer use [member OpenXRInterface.foveation_level].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Petty tweaks

Suggested change
The strength with which our VRS density map is calculated. The higher this number, the stronger VRS is applied, the more performance but at a cost of quality.
[b]Note:[/b] Mobile and Forward+ renderers only. Requires [member Viewport.vrs_mode] to be set to [constant Viewport.VRS_XR].
[b]Note:[/b] On compatibility renderer use [member OpenXRInterface.foveation_level].
The strength used to calculate the VRS density map. The greater this value, the more noticeable VRS is. This improves performance at the cost of quality.
[b]Note:[/b] Mobile and Forward+ renderers only. Requires [member Viewport.vrs_mode] to be set to [constant Viewport.VRS_XR].
[b]Note:[/b] On the compatibility renderer use [member OpenXRInterface.foveation_level].

doc/classes/XRInterface.xml Show resolved Hide resolved
@dsnopek
Copy link
Contributor

dsnopek commented Feb 22, 2024

I tested on the Meta Quest 3 using the demo project from the description, and the I'm seeing very similar results to Bastiaan: I'm getting 8-9ms without VRS and 6-7ms with VRS Medium.

I also tested on Quest 2 and Quest Pro, where VRS doesn't seem to have any effect: I'm getting ~12ms regardless.

It's a pretty impressive result on the Quest 3 regardless!

Copy link
Contributor

@dsnopek dsnopek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not qualified to review the Vulkan stuff, but I took a look at the XRInterface changes.

Comment on lines +157 to +161
float get_vrs_min_radius() const;
void set_vrs_min_radius(float p_vrs_min_radius);
float get_vrs_strength() const;
void set_vrs_strength(float p_vrs_strength);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really like putting all this on XRInterface. I feel like everything on XRInterface should be virtual and overridden by the specific interfaces (ie. OpenXRInterface, WebXRInterface, etc). If we just need to store some data, perhaps this should be on XRServer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I put it here because we now have a fallback that works in all cases. All interfaces can piggy back on this. Some XR runtimes may have their own implementation but interestingly, that turns out to be widely missing.

Meta does have it's own density map implementation but their approach isn't compatible with our rendering engine.

Comment on lines +161 to +164
virtual Vector2 get_eye_focus(uint32_t p_view, float p_aspect); /* Obtain eye focus (if supported) in 2D space NDC coordinates to given view */
virtual RID get_vrs_texture(); /* obtain VRS texture */
Copy link
Contributor

@dsnopek dsnopek Feb 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two are OK, because they are virtual. I'm not really fond of the default implementation of get_vrs_texture() (which was already there before this PR, it's just updated here) because it's very RD specific, and not all XRInterface's are necessarily going to use RD. It's not the worst thing, because it's virtual and so can be overridden. But, still, it would have been better to have that implementation elsewhere and the XRInterface can call it if it supports VRS and uses RD, rather than having it be a part of all XRInterface's automatically.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually its not. I've standardised the VRS texture to always be an RG2 texture, whether supplied by the XR system or by the user (vrs_mode = VRS_TEXTURE), then inside the renderer it's converted to what is actually needed. Turns out Vulkan fragment rate, vulkan density map, DX12 and likely Metal all have different requirements here. So this approach should be compatible with the OpenGL renderer if we ever implement our own VRS solution there.

That is also the reason I put this default logic here, this implementation will work on any headset. It's virtual because certain back ends might have their own logic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is also the reason I put this default logic here, this implementation will work on any headset.

Well, except for the interfaces where VRS won't work at all. For example, this won't work on WebXR because it only works through the compatibility renderer and this default implementation uses RD. I should probably update WebXRInterface's get_vrs_texture() to just return a RID() because that function won't do anything useful there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, when it's not supported, it's not supported. It won't even ever call get_vrs_texture in that case as it's gated on the renderer having VRS support.

@BastiaanOlij
Copy link
Contributor Author

I tested on the Meta Quest 3 using the demo project from the description, and the I'm seeing very similar results to Bastiaan: I'm getting 8-9ms without VRS and 6-7ms with VRS Medium.

I also tested on Quest 2 and Quest Pro, where VRS doesn't seem to have any effect: I'm getting ~12ms regardless.

It's a pretty impressive result on the Quest 3 regardless!

I actually found out that Quest 3 support the old fragment rate solution that we're using on desktop and thats what we're measuring. The Quest 3 and older Quests all support the density map solution which arguably is better, but there is an issue in the init logic. Still working on fixing that.

@BastiaanOlij BastiaanOlij marked this pull request as draft February 23, 2024 05:35
@BastiaanOlij
Copy link
Contributor Author

BastiaanOlij commented Feb 23, 2024

Put this back to draft as I found issues and I'm going to round off reflection probes first.

@BastiaanOlij BastiaanOlij force-pushed the enhance_vulkan_vrs branch 2 times, most recently from f458480 to ff1cfd4 Compare March 11, 2024 03:29
@BastiaanOlij
Copy link
Contributor Author

Ok, fixed some init issues with using density map and now preferring that over fragment shader rate if both are supported.

Still issues actually running it on the Quest, it no longer crashes but even though it seems to be setup correctly, it's not actually getting used. Must be overlooking something...

@BastiaanOlij
Copy link
Contributor Author

Note that I've move a bunch of the VRS changes not related to introducing the fragment density map extension into a new PR:
#89894

@dsnopek @Mickeon can you check your feedback here and see what you consider resolved, or what should be raised on the new PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants