Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UI: Fix crash on *nix caused by OBSQTDisplay destruction after draw surface invalidation #4576

Merged
merged 1 commit into from
Nov 21, 2021

Conversation

tt2468
Copy link
Member

@tt2468 tt2468 commented Apr 22, 2021

Description

Deletes the underlying obs_display_t object when the window draw surface is destroyed (X11 and Wayland), preventing OBS from attempting to switch to invalid GLX context on draw.

Closes #4185

Motivation and Context

Currently: This issue is mainly caused by the behavior of Qt's DeleteLater. When a window using OBSQTDisplay is closed, there is an indeterminate delay until the Qt object for it is deleted, allowing for a race.

With this: The OBSQTDisplay object is deleted before the draw surface is, preventing the race condition. No longer reports any X errors, let alone crashes.

How Has This Been Tested?

This issue can be most easily reproduced by doing:

  • Open OBS
  • Open the filters dialog of a playing source
  • Close the dialog via the x in the top right corner (Not the Close button)
    Expect X errors to be logged, and if lucky enough, a crash. It varies system to system, but if you have the bug (like me), it's usually very easy to reproduce.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • My code has been run through clang-format.
  • I have read the contributing document.
  • My code is not on the master branch.
  • The code has been tested.
  • All commit messages are properly formatted and commits squashed where appropriate.
  • I have included updates to all appropriate documentation.

@kkartaltepe
Copy link
Collaborator

Very cool find! I dont think this is the fix but it seems there is some issue with our sync between the UI and the libobs.

@WizardCM WizardCM added the Bug Fix Non-breaking change which fixes an issue label Apr 22, 2021
@tt2468 tt2468 force-pushed the glx_context_fix_x11 branch from 1c8f91b to 1f953e0 Compare April 23, 2021 12:12
@tt2468
Copy link
Member Author

tt2468 commented Apr 23, 2021

I've updated the PR with a more updated and direct patch to the issue. Testing on both of my PCs, I could not get any GLXBadDrawable errors or crashes. One PC is running nvidia-440 drivers and a tesla t4, one is running nvidia-450 and an rtx 4000.

On 440.100, the problem usually manifests itself as:

#0  __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:262
#1  0x00007ffff5dd10f1 in memcpy (__len=<optimized out>, __src=0x7fff29c8bb60, __dest=<optimized out>) at /usr/include/x86_64-linux-gnu/bits/string_fortified.h:34
#2  gs_texture_set_image (tex=0x7fffd01297c0, data=0x7fff29c8bb60 "X", linesize=<optimized out>, flip=flip@entry=false) at /home/tt2468/irltk/obs-studio/libobs/graphics/graphics.c:1173

@tt2468 tt2468 changed the title UI/libobs: Fix crash when switching to invalid glx context UI: Fix crash on linux when OBSQTDisplay is not deleted before its parent window is closed. Apr 23, 2021
@tt2468 tt2468 force-pushed the glx_context_fix_x11 branch from 1f953e0 to cf3644c Compare April 24, 2021 18:09
@ekordas
Copy link

ekordas commented May 4, 2021

On my System it used to crash every time after closing the v4l2 video properties Dialog.
Tested your pull and it fixes the Crash of the v4l2 video properties Dialog.

@jp9000
Copy link
Member

jp9000 commented Jul 2, 2021

Just wanted to make sure you didn't forget about this PR. Seems some people have been testing it.

@tt2468
Copy link
Member Author

tt2468 commented Jul 2, 2021

Just wanted to make sure you didn't forget about this PR. Seems some people have been testing it.

I haven't forgotten about it, but I'm unsure what direction to proceed in. This seems to fix the issue, but it feels more like a patch. The graphics thread is still susceptible to failed context switches. I don't remember which particular call was the problem, but my idea was to turn it into a boolean return, to be able to know if the GLX context is valid. Implementing my PR in its current form does not guard against this same issue cropping up in future features using OBSQTDisplay.

@leohoe
Copy link

leohoe commented Sep 16, 2021

Just making sure, is this still in the works? The problem is ongoing, and if this fixes it, it would be great to have available for production use.

@tt2468
Copy link
Member Author

tt2468 commented Sep 16, 2021

@leohoe Last time I brought it up to OBS project devs, I was told to wait to ask questions about the direction I should take until 27.1 is out, so that's the current status of this PR.

@leohoe
Copy link

leohoe commented Nov 10, 2021

Since we're on 27.1.3 now, how are things looking for this PR?

@tt2468 tt2468 force-pushed the glx_context_fix_x11 branch 2 times, most recently from a8c841e to deec83f Compare November 21, 2021 07:08
When a window with an OBSQTDisplay is closed, the surface can be
destroyed before the display is, opening a gap for OBS to attempt to
draw to the invalid surface.

This deletes the underlying OBSDisplay object when the actual surface
is destroyed, closing that gap.

Note: This appears to have been an issue previously with Wayland, as
hinted by the existing ifdefs.
@tt2468 tt2468 force-pushed the glx_context_fix_x11 branch from deec83f to b6b124c Compare November 21, 2021 07:11
@tt2468 tt2468 changed the title UI: Fix crash on linux when OBSQTDisplay is not deleted before its parent window is closed. UI: Fix crash on *nix caused by OBSQTDisplay destruction after draw surface invalidation Nov 21, 2021
@tt2468 tt2468 marked this pull request as ready for review November 21, 2021 07:25
@tt2468
Copy link
Member Author

tt2468 commented Nov 21, 2021

Update for those subscribed: This is slated to be released with 27.2. Thank you for your patience!

@jp9000 jp9000 merged commit 44f07f9 into obsproject:master Nov 21, 2021
@tt2468 tt2468 deleted the glx_context_fix_x11 branch November 21, 2021 07:51
@RytoEX RytoEX added this to the OBS Studio 27.2 milestone Nov 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Fix Non-breaking change which fixes an issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ubuntu 20.10 - Crash at VLC Media Source
7 participants