Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

video: Prefer Wayland over X11 #4306

Merged
merged 1 commit into from
Jan 26, 2022

Conversation

flibitijibibo
Copy link
Collaborator

@flibitijibibo flibitijibibo commented Apr 17, 2021

This is a draft patch that will (at last) prioritize the Wayland video driver over X11. This is here purely for internal testing only and is NOT meant for wider community testing; to test SDL Wayland support you should set SDL_VIDEODRIVER=wayland instead.

Click here for the SDL Wayland bug tracker.

The Big List of Annoying Things That Aren't SDL's Fault:

The Big List of EGL Stuff That Might Be a Problem:

This Big List of Things in X11 That Wayland Might Also Want but It's Not a Huge Deal:

Known Issues:

Fixes #2710. Fixes #4988.

@Cacodemon345
Copy link
Contributor

It looks like there is no way to programmatically set the icon of a Wayland window; I found no mentions of an icon API in the wayland-protocols repository.

Someone else will have to figure out a Wayland protocol extension for that API.

@icculus
Copy link
Collaborator

icculus commented Apr 18, 2021

* GetDisplayBounds

This should probably just be the display size (which the Wayland backend does report, afaik). Is the problem multi-monitor?

* GetDisplayUsableBounds

Just pass these requests to GetDisplayBounds and add a FIXME for needing a protocol for now.

I suspect this will also be a good candidate for libdecoration to supply, since we basically need to know what Gnome, etc specific reservations are being taken.

* HideWindow

* RaiseWindow

We can't do these? Really?!

* SetWindowGammaRamp - [wlr-gamma-control-unstable-v1](https://github.com/swaywm/wlr-protocols/blob/master/unstable/wlr-gamma-control-unstable-v1.xml)?

Fwiw, I don't think these work on X11 at the moment if you have an OpenGL window, which is about the same as saying I don't think these work on X11.

* Shape API?

My current desire is to remove the shape API in 2.1. Just leave it unimplemented for Wayland.

@flibitijibibo
Copy link
Collaborator Author

flibitijibibo commented Apr 18, 2021

This should probably just be the display size (which the Wayland backend does report, afaik). Is the problem multi-monitor?

Yeah, the only missing part here is providing x/y geometry to the VideoDisplay. As far as w/h are concerned this technically works already.

Just pass these requests to GetDisplayBounds and add a FIXME for needing a protocol for now.

SDL_video.c does this for us already, so nothing to worry about here!

We can't do these? Really?!

For RaiseWindow I thought I saw some murmurs about there being a protocol for process-local raising or some such thing, but yeah, I guess it's along the same lines as SetWindowInputFocus where that's Not Okay(TM), but like... yeah, come on, man.

HideWindow I think we just need to set the opaque region to empty, or something like that? I know this is doable somehow considering we can start with a hidden window and show it later (as FNA does), but I wasn't sure what the recommended technique was here (Wayland example code is frustratingly sparse).

Fwiw, I don't think these work on X11 at the moment if you have an OpenGL window, which is about the same as saying I don't think these work on X11.

Fair enough - I happened to come across the wlr protocol while looking at text input, but I have zero use cases for this so I wouldn't even know where to start with it. Maybe this will be useful for idTech-style brightness settings or something, I'm 100% clueless here.

My current desire is to remove the shape API in 2.1. Just leave it unimplemented for Wayland.

Got it, will cut that from the list!

@flibitijibibo
Copy link
Collaborator Author

Took care of GetDisplayBounds and SetWindowModalFor, and fixed up a couple other things while I was at it: #4309

We uh, need to talk about HideWindow though. I found this in the zxdg-shell spec:

    <request name="destroy" type="destructor">
      <description summary="destroy the xdg_toplevel">
	Unmap and destroy the window. The window will be effectively
	hidden from the user's point of view, and all state like
	maximization, fullscreen, and so on, will be lost.
      </description>
    </request>

You probably read that like I did and went "wait, to hide it we... destroy it?" And yes, the stable spec basically says the same thing:

    <request name="destroy" type="destructor">
      <description summary="destroy the xdg_toplevel">
	This request destroys the role surface and unmaps the surface;
	see "Unmapping" behavior in interface section for details.
      </description>
    </request>

EARLIER:

      Unmapping an xdg_toplevel means that the surface cannot be shown
      by the compositor until it is explicitly mapped again.
      All active operations (e.g., move, resize) are canceled and all
      attributes (e.g. title, state, stacking, ...) are discarded for
      an xdg_toplevel surface when it is unmapped.

Oh, okay, so it's just named kind of weird, it's an unmap call. But this is a problem because we create these objects in CreateWindow, and pretty much depend on its existence throughout waylandwindow.c (and twice in waylandevents.c). This means we pretty much have to fold all that Create work into ShowWindow, then fold all the Destroy work into HideWindow, then fix up the whole driver to consider the idea that yes, we have the protocol, but not the window object, at least for that moment.

Much like GammaRamp I don't have any active use case for this other than "I want to keep the window hidden until we've loaded and are ready to draw," so this may be a refactor for someone else to deal with.

@Cacodemon345
Copy link
Contributor

Does destroying the window object cause the EGL surface to be lost?

@flibitijibibo
Copy link
Collaborator Author

Does destroying the window object cause the EGL surface to be lost?

Not that I could tell, I believe the wl_surface and EGL surface are still valid, they just won't have an xdg-toplevel mapped to the compositor. I would definitely test this idea though.

@flibitijibibo
Copy link
Collaborator Author

Managed to figure out GetDisplayDPI as well, another really easy one that I should have seen the first time around.

@Cacodemon345
Copy link
Contributor

Cacodemon345 commented Apr 20, 2021

The Wayland gamma protocol depends on the gamma ramp size which is likely to be not equal to 256. X11's XVidMode gamma functions (which the X11 backend does not use) also uses different gamma ramp sizes depending on the display. Same goes for the XRandR gamma API. And the KMSDRM backend optionally can support gamma ramp sizes higher than 256 but uses the legacy gamma setting functions so it is hard-limited to 256.

SDL really needs a internal function to convert between gamma ramps of different sizes.

Edit: Cocoa backend converts the gamma ramp values to floats before passing them CGSetDisplayTransferByTable which is documented to internally interpolate the gamma ramp values to match the display hardware. That looks useful for a gamma ramp size convert function on SDL's side.

@flibitijibibo
Copy link
Collaborator Author

Figured out HideWindow in #4319. The rearranging of things was easier than I expected once I realized why the protocol was blowing up in my face when calling ShowWindow a second time.

@flibitijibibo
Copy link
Collaborator Author

The last feature that doesn't require a new protocol has been merged... I'm still a bit antsy about the bug found in #4319 but I'm hoping it's just that we're doing something new and not wrong, seeing that similar toolkits were doing similar techniques.

At this point unless we want to start pitching new protocols this is as feature-complete as it's going to get, we're now in the bugfixing stage while the libdecor developers do their thing.

@Cacodemon345
Copy link
Contributor

https://gitlab.freedesktop.org/wayland/wayland-protocols/-/blob/main/staging/xdg-activation/xdg-activation-v1.xml
This protocol should be used for implementing RaiseWindow. It is in staging by the way, so it is WIP.

@flibitijibibo
Copy link
Collaborator Author

Added to the OP. I assume this protocol doesn't set input focus though, right? I'm guessing SetWindowInputFocus will probably get thrown out with stuff like SetWindowPosition, but I wasn't 100% sure if both Raise and SetInputFocus were being looked at.

@Cacodemon345
Copy link
Contributor

@flibitijibibo
Copy link
Collaborator Author

Looks like it does what we want, but it also looks like it'll be a while before that's finalized. Will add it to the list just in case!

@flibitijibibo
Copy link
Collaborator Author

Just realized that "staging" took the place of "unstable" for Wayland protocols, and 1.21 includes the staged activation protocol. Implementations are early though, so if someone has early access to a working implementation you're probably the only one that can write and test this right now. Would be cool to have for 2.0.16 though!

@icculus
Copy link
Collaborator

icculus commented May 3, 2021

Would be cool to have for 2.0.16 though!

They still have 19 days. :)

@flibitijibibo
Copy link
Collaborator Author

flibitijibibo commented Jun 2, 2021

Quick update on this since 2.0.16 is still in the works: The KDE components for xdg-activation have been merged upstream, and the wlroots patch has been reviewed as of today and may be merged soon it's been merged now - if anyone is working off of Git revisions of either system, now's your chance to shine (or tell us how to test these things without having to dedicate an entire system for it, either one).

@flibitijibibo
Copy link
Collaborator Author

Opened #4413 in the hopes that someone with the right environment can test it - it's a very small patch compared to some of the other Wayland features but it requires the very latest KWin or wlroots to run (mutter's patch is still under review).

@emersion
Copy link

emersion commented Jun 2, 2021

SetWindowGammaRamp - wlr-gamma-control-unstable-v1?

Please don't. This protocol is privileged and shouldn't be used by regular clients. The main use-case is the "night light" feature.

@flibitijibibo
Copy link
Collaborator Author

SetWindowGammaRamp - wlr-gamma-control-unstable-v1?

Please don't. This protocol is privileged and shouldn't be used by regular clients. The main use-case is the "night light" feature.

Phew, that's what I figured - removing this from the list!

@flibitijibibo
Copy link
Collaborator Author

RaiseWindow support is in, thanks to the various desktop developers for helping us out!

At this point we now have implementations for every function where a protocol exists - any more features will require creating or contributing to new protocols. There is still the concern of #4360 not working, but we have our own built-in IME support working with the caveat that the text input rectangle won't work in windowed mode (it's fine in fullscreen mode thanks to convenient window placement).

This is about as good as it's going to get for 2.0.16, meaning we can focus on removing deprecated stuff and fixing bugs for the whole 2.0.18 cycle while libdecor support continues to develop.

@flibitijibibo
Copy link
Collaborator Author

I've been doing a lot of testing this week with games big and small, and so far everything has held up very well - for Vulkan in particular I would go as far as saying this driver is now production ready on desktops with server decorations! You can see some samples here and here.

OpenGL is still annoying because of some holes in EGL and that loathsome infinite wait in SwapBuffers. I decided to try and pitch some extensions to fix these problems:

https://gitlab.freedesktop.org/mesa/mesa/-/issues/4932

https://gitlab.freedesktop.org/mesa/mesa/-/issues/4933

Once these are solved I'd put the "production ready" stamp on OpenGL as well.

@flibitijibibo
Copy link
Collaborator Author

Another update... I'm helping out with a new Wayland protocol designed by Joshua Ashton and implemented in wlroots by Simon Ser to better support suspended surfaces:

https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/99

I don't expect SDL to have any new code for this, it will just make vkAcquireNextImageKHR and eglSwapBuffers work in a way that prevents it from deadlocking forever. A timeout is still a good idea but this will be much better for supporting surfaces that are obscured.

@flibitijibibo
Copy link
Collaborator Author

NVIDIA's 470 driver beta is now available:

https://www.nvidia.com/download/driverResults.aspx/176525/en-us

I ran it on my NV lab box and can confirm that it actually works very well already. While this opens the door for X applications working on Wayland NV, it does confirm my worries that a ton of people will be running SDL applications via Xwayland really really soon, which for first-person games is extremely bad news because the mousewarp emulation is, as you might expect, really crappy and stuttery (and I don't blame desktop developers for not caring about it). At least for my catalog, making our way to switching the default is now officially urgent rather than just maybe urgent.

In addition to finally getting to try Xwayland for the first time, NVIDIA testers are encouraged to try SDL_VIDEODRIVER=wayland to make sure that we're ready for NVIDIA's 470 release when using native Wayland applications.

@Cacodemon345
Copy link
Contributor

This sounds like a compositor bug, unless there's something in the protocol text saying otherwise.

Sway does not warp the cursor to the application surface at all when it is resized and relative mode is active, which allows mouse to click into the background and cause the cursor to be able to move again should the cursor reappear. This is a problem for applications like DosBox which do resize their windows.

@emersion
Copy link

Sway does not warp the cursor to the application surface at all when it is resized and relative mode is active

This is a compositor bug: swaywm/sway#6070

@flibitijibibo flibitijibibo changed the title [DONOTMERGE] video: Prefer Wayland over X11 video: Prefer Wayland over X11 Jan 13, 2022
@flibitijibibo flibitijibibo marked this pull request as ready for review January 13, 2022 22:19
@Cacodemon345
Copy link
Contributor

Sway does not warp the cursor to the application surface at all when it is resized and relative mode is active

This is a compositor bug: swaywm/sway#6070

I see it happening in LabWC 0.4.0 too.

@DanielGibson
Copy link
Contributor

It seems like even with EGL_EXT_present_opaque there's still something wrong with alpha and OpenGL.

The short version: dhewm3 (sometimes) blends to the default framebuffer with glBlendFunc(GL_DST_ALPHA, GL_ONE_MINUS_DST_ALPHA), so there's glitches if the OpenGL context (and thus its default framebuffer) doesn't have an alpha channel. It's pretty obvious in the main menu, see dhewm/dhewm3#426 (comment) for screenshots.
This issue also happens on X11 (and even Windows with nvidia's driver), when using SDL_GL_SetAttribute(SDL_GL_ALPHA_SIZE, 0);.

Now the weird thing is, that this also happens on Wayland even if the context should have 8 alpha bits (SDL_GL_SetAttribute(SDL_GL_ALPHA_SIZE, 8);).
Even though after creating the context SDL_GL_GetAttribute(SDL_GL_ALPHA_SIZE, &a); seems to confirm that there are 8 alpha bits. dhewm/dhewm3#426 (comment) has screenshots, and dhewm/dhewm3#426 (comment) has information on the exact configuration (Linux 5.16.1 Mesa 21.3.4; Radeon 5700XT; Gnome 41.3; latest SDL from git).

Do you have any idea if this is expected behavior (that would be bad)? If not, who is to blame - SDL or Mesa or Gnome's wayland implementation or who else?

@flibitijibibo
Copy link
Collaborator Author

I would expect this issue to be in Mesa, it should be picking an XRGB8 surface so those bits should exist somewhere...

@DanielGibson
Copy link
Contributor

Is the alpha chan supposed to contain valid bits that are used for alpha blending if it's indeed XRGB instead of ARGB?

@flibitijibibo
Copy link
Collaborator Author

It should - as the name suggests, we just want to present opaque, we still want the surface to be a 32-bit image with valid information in all channels.

@Yamagi
Copy link

Yamagi commented Jan 18, 2022

I'm pretty sure that there's something wrong on Mesas side. I've opened a bug report: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5886

@flibitijibibo
Copy link
Collaborator Author

On the subject of external reports, I've tried to report everything we've found throughout development to close all the NOTOURBUGs. The end result wasn't too much, thankfully:

The rest have already been mentioned above (i.e. dhewm3's present_opaque report, event queue overflow proposal) or they already got back to us and either fixed the problem or proved that it was really us. This leaves us with exactly two issues that aren't mentioned in the OP:

  1. An NVIDIA-specific resize issue that has been acknowledged by the vendor: wayland: Only call wl_egl_window_resize in SwapWindow #4821
  2. wayland: Buffers with old size can get commited already after ack_configure on window resize #4563

Unless you're Valve, NVIDIA, or a maintainer of the projects mentioned above, all eyes should be on the second issue. The sooner that's fixed, the sooner we can say "okay, this is done and locked in, desktops/drivers can look at the latest revision and know that it will likely be what ships as the default in the very near future."

@flibitijibibo
Copy link
Collaborator Author

A commit has been pushed that hopefully fixes the second issue, but we're still waiting on feedback... either way it was agreed that we should be able to iron out the final issues for 2.0.22, so here we go!

@flibitijibibo flibitijibibo merged commit 8ceba27 into libsdl-org:main Jan 26, 2022
@flibitijibibo flibitijibibo deleted the only-took-9-years branch January 26, 2022 17:32
@DanielGibson
Copy link
Contributor

Do you know if EGL_EXT_present_opaque at least works with other games (than Doom3) that need it, like Portal2?

@flibitijibibo
Copy link
Collaborator Author

Yes, my tests included Portal 2, Left 4 Dead 2, Unity, FNA, and a few custom engines from my catalog of games.

@DanielGibson
Copy link
Contributor

DanielGibson commented Jan 26, 2022

Ok, good to know, so Doom3 really is special..

I can work around the issue in dhewm3: https://github.com/dhewm/dhewm3/blob/test-alpha/neo/renderer/tr_backend.cpp#L532-L594
It's not pretty, but it seems to work..

I hope Mesa eventually fixes the issue; the way it's implemented now, SDL probably could've done the same without an extension (using a visual without alpha while pretending in SDL_GL_GetAttribute(SDL_GL_ALPHA_SIZE, &a); and maybe even glGetIntegerv(GL_ALPHA_BITS, &a); that an alpha channel exist)..

I even wonder if SDL could implement a workaround like I do in dhewm3 (blend a window-sized quad with color (0, 0, 0, 1) before SDL_GL_SwapWindow()) - but probably would be quite elaborate with handling legacy and core/es context + all the state that must be set and afterwards restored..

@flibitijibibo
Copy link
Collaborator Author

Yeah, the faux-backbuffer idea is something I threw around in FNA3D for a bit but it's not easy to wedge into call streams unless you control the whole stream (i.e. people like to glBindFramebuffer(0) a LOT).

I did try and fuss with the attrib flags as well and came up with nothing... you can trick the application most of the time, but like Doom3 there's always somebody who really expects to get what they asked for. Pretty much the only sure fix was to address it one layer below us (either in the EGL implementation or the compositors, and since Vulkan already has a composite alpha flag EGL seemed like the right idea).

@DanielGibson
Copy link
Contributor

Yeah, the faux-backbuffer idea is something I threw around in FNA3D for a bit but it's not easy to wedge into call streams unless you control the whole stream (i.e. people like to glBindFramebuffer(0) a LOT).

Ok, that's a problem of course

I did try and fuss with the attrib flags as well and came up with nothing... you can trick the application most of the time, but like Doom3 there's always somebody who really expects to get what they asked for

Yeah, applications that actually need the alpha chan are screwed with that approach, but so far, so they are with the EGL extension.

Also I wonder if my approach of making the alpha channel opaque at the end of the frame (with blending or whatever) can break applications, in case anyone reads from the front buffer in the next frame? No idea if that's a thing.

So we'll have to hope that Mesa fixes it eventually (and that they won't pretend it works like expected)

@flibitijibibo
Copy link
Collaborator Author

I would check the Mesa commit that introduces the extension; it should work similar to the Vulkan WSI implementation which explicitly has this as a feature, so if anything is different there should be a direct reference to show how it's supposed to be done. If there's a Vulkan renderer for Doom 3 somewhere I would check that as well.

@flibitijibibo
Copy link
Collaborator Author

flibitijibibo commented Feb 1, 2022

NVIDIA 510.47.03 fixes native Wayland Vulkan support.

Interestingly, NV Vulkan has the same opacity issues that EGL does. EGL I get since the extension isn't supported yet, but I'm surprised that Vulkan doesn't conform to VkCompositeAlphaFlagsKHR.

DanielGibson added a commit to dhewm/dhewm3 that referenced this pull request Feb 5, 2022
For some reason Wayland thought it would be clever to be the only
windowing system that (non-optionally) uses the alpha chan of the
window's default OpenGL framebuffer for window transparency.
This always caused glitches with dhewm3, as Doom3 uses that alpha-chan
for blending tricks (with GL_DST_ALPHA) - especially visible in the main
menu or when the flashlight is on.
So far the workaround has been r_waylandcompat which requests an OpenGL
context/visual without alpha chan (0 alpha bits), but that also causes
glitches.
There's an EGL extension that's supposed to fix this issue
(EGL_EXT_present_opaque), and newer SDL2 versions use it (when using
the wayland backend) - but unfortunately the Mesa implementation is
broken (seems to provide a visual without alpha channel even if one was
requested), see https://gitlab.freedesktop.org/mesa/mesa/-/issues/5886
and libsdl-org/SDL#4306 (comment)
for the corresponding SDL2 discussion

To work around this issue, dhewm3 now disables the use of that EGL
extension and (optionally) makes sure the alpha channel is opaque at
the end of the frame.
This behavior is controlled with the r_fillWindowAlphaChan CVar:
If it's 1, this always is done (regardless if wayland is used or not),
if it's 0 it's not done (even on wayland),
if it's -1 (the default) it's only done if the SDL "video driver" is
  wayland (this could be easily enhanced later in case other windowing
  systems have the same issue)

r_waylandcompat has been removed (it never worked properly anyway),
so now the window always has an alpha chan
@DanielGibson
Copy link
Contributor

I would check the Mesa commit that introduces the extension;

That was https://gitlab.freedesktop.org/mesa/mesa/-/commit/9aee7855d2ddf47169270d5d1e3e92ff6e5f65c2 and https://gitlab.freedesktop.org/mesa/mesa/-/commit/9aee7855d2ddf47169270d5d1e3e92ff6e5f65c2#b91f29c819033a507b2cac556bea4f5aecd8893d_178_178 indeed looks like they just use 8 alpha bits internally, if I interpret that code correctly.

it should work similar to the Vulkan WSI implementation which explicitly has this as a feature, so if anything is different there should be a direct reference to show how it's supposed to be done.

I wasn't able to find the code for that Vulkan extension in Mesa - the bit it set somewhere, but I didn't see it being read to act on it (I didn't spend terribly much time on it though).

If there's a Vulkan renderer for Doom 3 somewhere I would check that as well.

The only one I'm aware of is based on Doom3BFG which has a much more modern (OpenGL 3) renderer that IIRC uses FBOs internally so it shouldn't have this problem.

I committed a workaround to dhewm3 (that fills the alpha chan at the end of each frame) to make sure it continues working with SDL2 versions that use wayland by default: dhewm/dhewm3@699779e

rorgoroth pushed a commit to rorgoroth/dhewm3 that referenced this pull request Apr 8, 2023
For some reason Wayland thought it would be clever to be the only
windowing system that (non-optionally) uses the alpha chan of the
window's default OpenGL framebuffer for window transparency.
This always caused glitches with dhewm3, as Doom3 uses that alpha-chan
for blending tricks (with GL_DST_ALPHA) - especially visible in the main
menu or when the flashlight is on.
So far the workaround has been r_waylandcompat which requests an OpenGL
context/visual without alpha chan (0 alpha bits), but that also causes
glitches.
There's an EGL extension that's supposed to fix this issue
(EGL_EXT_present_opaque), and newer SDL2 versions use it (when using
the wayland backend) - but unfortunately the Mesa implementation is
broken (seems to provide a visual without alpha channel even if one was
requested), see https://gitlab.freedesktop.org/mesa/mesa/-/issues/5886
and libsdl-org/SDL#4306 (comment)
for the corresponding SDL2 discussion

To work around this issue, dhewm3 now disables the use of that EGL
extension and (optionally) makes sure the alpha channel is opaque at
the end of the frame.
This behavior is controlled with the r_fillWindowAlphaChan CVar:
If it's 1, this always is done (regardless if wayland is used or not),
if it's 0 it's not done (even on wayland),
if it's -1 (the default) it's only done if the SDL "video driver" is
  wayland (this could be easily enhanced later in case other windowing
  systems have the same issue)

r_waylandcompat has been removed (it never worked properly anyway),
so now the window always has an alpha chan
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

On Wayland, SDL2 seems to default to xwayland [patch] Wayland: SDL applications start in XWayland by default