Incorrect use of SDL_GetError() to check whether calls failed #257

smcv · 2023-02-08T12:54:40Z

The SDL documentation says this:

The message is only applicable when an SDL function has signaled an error. You must check the return values of SDL function calls to determine when to appropriately call SDL_GetError(). You should not use the results of SDL_GetError() to decide if an error has occurred! Sometimes SDL will set an error string even when reporting success.

However, the pysdl2 test suite has a lot of this pattern, which the quoted documentation calls out as incorrect:

    ret = sdl2.SDL_Init(sdl2.SDL_INIT_VIDEO | sdl2.SDL_INIT_AUDIO)
    assert sdl2.SDL_GetError() == b""
    assert ret == 0

What doesn't work?
One concrete example is that when I run the test suite on Debian 12 alpha, I get a lot of failures like this:

    def test_SDL_VideoInitQuit():
        # Test with default driver
        assert sdl2.SDL_WasInit(0) & sdl2.SDL_INIT_VIDEO != sdl2.SDL_INIT_VIDEO
        ret = sdl2.SDL_VideoInit(None)
>       assert sdl2.SDL_GetError() == b""
E       AssertionError: assert b'Unable to o.../input/event8' == b''
E         Full diff:
E         - b''
E         + b'Unable to open /dev/input/event8'

How To Reproduce

SDL_VIDEODRIVER=dummy SDL_AUDIODRIVER=dummy SDL_RENDER_DRIVER=software pytest -v

I'm using a SDL 2.27.x git snapshot from 2023-01-03, in case it matters.

Platform (if relevant):

OS: Linux (Debian 12 alpha)
Python Version: 3.11
SDL2 Version: 2.27.x snapshot from 2023-01-03
Using pysdl2-dll: No, using my OS's copy of SDL

The text was updated successfully, but these errors were encountered:

SDL_GetError works like Standard C errno: if a call fails it will (generally) set the error indicator, but if a call succeeds there is no guarantee about whether the error indicator has been set, cleared, or left at its previous value. The SDL documentation says this: The message is only applicable when an SDL function has signaled an error. You must check the return values of SDL function calls to determine when to appropriately call SDL_GetError(). You should not use the results of SDL_GetError() to decide if an error has occurred! Sometimes SDL will set an error string even when reporting success. In particular, with a SDL 2.27.x git snapshot, a successful SDL_Init can leave the error indicator set to a message about inability to open an input device node, and this does not mean it has failed. In this commit I have not attempted to check or fix other uses of SDL_GetError, only the ones that follow SDL_Init. This is enough to make the test suite pass on Debian 12 alpha with `SDL_VIDEODRIVER=dummy SDL_AUDIODRIVER=dummy SDL_RENDER_DRIVER=software` for now, but the other SDL_GetError() calls in the test suite should ideally also be checked. Partial solution for: py-sdl#257 Signed-off-by: Simon McVittie <smcv@collabora.com>

smcv · 2023-02-08T13:00:12Z

#258 partially solves this. A complete solution for this would be to apply similar changes to all the other SDL_GetError() calls.

SDL_GetError works like Standard C errno: if a call fails it will (generally) set the error indicator, but if a call succeeds there is no guarantee about whether the error indicator has been set, cleared, or left at its previous value. The SDL documentation says this: The message is only applicable when an SDL function has signaled an error. You must check the return values of SDL function calls to determine when to appropriately call SDL_GetError(). You should not use the results of SDL_GetError() to decide if an error has occurred! Sometimes SDL will set an error string even when reporting success. In particular, with a SDL 2.27.x git snapshot, a successful SDL_Init can leave the error indicator set to a message about inability to open an input device node, and this does not mean it has failed. In this commit I have not attempted to check or fix other uses of SDL_GetError, only the ones that follow SDL_Init. This is enough to make the test suite pass on Debian 12 alpha with `SDL_VIDEODRIVER=dummy SDL_AUDIODRIVER=dummy SDL_RENDER_DRIVER=software` for now, but the other SDL_GetError() calls in the test suite should ideally also be checked. Partial solution for: #257 Signed-off-by: Simon McVittie <smcv@collabora.com>

a-hurst · 2023-02-12T19:32:42Z

Adding this to the to-do milestone for the next release! I should also make sure sdl2.ext doesn't make this assumption anywhere either.

smcv · 2023-08-19T10:46:06Z

Thanks for working on improving this, but this issue is unresolved, so please reopen. There are still lots of places in the tests that assert that some operation left the error indicator empty, for example:

$ git grep -B3 SDL_GetError
...
sdl2/test/joystick_test.py-    count = sdl2.SDL_NumJoysticks()
sdl2/test/joystick_test.py-    for i in range(count):
sdl2/test/joystick_test.py-        stick = sdl2.SDL_JoystickOpen(i)
sdl2/test/joystick_test.py:        assert sdl2.SDL_GetError() == b""
...
sdl2/test/keyboard_test.py-
sdl2/test/keyboard_test.py-def test_SDL_StartStopTextInput(with_sdl):
sdl2/test/keyboard_test.py-    sdl2.SDL_StopTextInput()
sdl2/test/keyboard_test.py:    assert SDL_GetError() == b""
sdl2/test/keyboard_test.py-    assert sdl2.SDL_IsTextInputActive() == SDL_FALSE
sdl2/test/keyboard_test.py-    sdl2.SDL_StartTextInput()
sdl2/test/keyboard_test.py:    assert SDL_GetError() == b""
...
sdl2/test/render_test.py-    sf = surface.SDL_CreateRGBSurface(
sdl2/test/render_test.py-        0, 100, 100, 32, 0xFF000000, 0x00FF0000, 0x0000FF00, 0x000000FF
sdl2/test/render_test.py-    )
sdl2/test/render_test.py:    assert SDL_GetError() == b""
sdl2/test/render_test.py-    pixfmt = sf.contents.format.contents
sdl2/test/render_test.py-    fill = pixels.SDL_MapRGBA(pixfmt, 0, 0, 0, 255)
sdl2/test/render_test.py-    surface.SDL_FillRect(sf, None, fill)
sdl2/test/render_test.py:    assert SDL_GetError() == b""
...
sdl2/test/rwops_test.py-def test_SDL_RWFromMem():
sdl2/test/rwops_test.py-    buf = ctypes.create_string_buffer(b"1234")
sdl2/test/rwops_test.py-    rw = sdl2.SDL_RWFromMem(buf, len(buf))
sdl2/test/rwops_test.py:    assert sdl2.SDL_GetError() == b""

On my Linux system, enumeration succeeds, but the error indicator gets set as a side-effect (which appears to be because the loader checks whether the symbol exists in SDL or a direct dependency before it dlopens libudev). The API of SDL_hid_enumerate does not make it possible to distinguish between successfully returning an empty list of devices (returns NULL with the error indicator in an undefined state) and a failure (returns NULL with the error indicator set), and systems that run automated tests usually don't have any HID game controllers connected, so we can't make any meaningful use of the error indicator here. Helps: py-sdl#257 Signed-off-by: Simon McVittie <smcv@debian.org>

Helps: py-sdl#257 Signed-off-by: Simon McVittie <smcv@debian.org>

…ilure SDL_GetError() is like errno: it's documented not to be suitable for detecting failure, only for getting more details if failure was already detected (its result is unspecified on success, because a successful API call might have been implemented by doing something that failed, detecting that, and falling back to doing something different). However, some functions in SDL2 return void, so we have no other way to tell whether they have failed (they do return a result in SDL3). To make it less likely that upgrading SDL2 will make these tests regress, clear the error indicator immediately before calling the function under test. It is still not guaranteed to be empty on success, but at least this way we make sure it doesn't already contain an error message from a previous function call. Helps: py-sdl#257 Signed-off-by: Simon McVittie <smcv@debian.org>

a-hurst · 2023-08-19T22:25:38Z

Oops. I thought I did a search through the codebase for incorrect SDL_GetError usage and dealt with it all, but apparently whatever search strategy I used wasn't thorough enough!

Also, thanks a ton for your continued efforts upstreaming fixes for bugs in the test suite (and elsewhere): I'm not a software developer by trade, so there's been a lot of learning-as-I-go in taking over maintenance of a project like PySDL2 (especially in terms of unit tests). I apologize that you've had to spend so much time fixing my failing tests as a result, but I sincerely appreciate your detailed bug reports and clean, well-documented patches.

On my Linux system, enumeration succeeds, but the error indicator gets set as a side-effect (which appears to be because the loader checks whether the symbol exists in SDL or a direct dependency before it dlopens libudev). The API of SDL_hid_enumerate does not make it possible to distinguish between successfully returning an empty list of devices (returns NULL with the error indicator in an undefined state) and a failure (returns NULL with the error indicator set), and systems that run automated tests usually don't have any HID game controllers connected, so we can't make any meaningful use of the error indicator here. Helps: #257 Signed-off-by: Simon McVittie <smcv@debian.org>

smcv · 2023-08-21T13:20:57Z

I should probably mention why I keep opening issues about breakage with new SDL versions, to give you some context.

In Debian, we have both SDL2 and pysdl2 available as packages. We regularly run pysdl2's test suite, and when there's a new version of SDL2 waiting to be integrated, one of the things our QA infrastructure does is to run the old pysdl2's test suite against the new SDL2. If it fails, there are two possible reasons for that:

the new SDL2 has a regression, and pysdl2's test suite catches that regression and correctly fails
pysdl2 was doing something wrong (most likely, making assumptions that aren't guaranteed to stay true), and the new SDL2 breaks its assumption, so pysdl2's test suite incorrectly fails

I'm a maintainer of our SDL2 package, and when I upload new SDL2 versions, it's relatively common for them to get stuck and not be integrated immediately, because the pysdl2 test suite started failing; so to get the new SDL2 out to users, I have to dig into pysdl2 and find out whether we're in scenario 1 or 2. If we're in scenario 1, great, a regression was successfully caught, and I need to get SDL2 fixed; but if we're in scenario 2, then I need to either fix pysdl2 or get it kicked out of the development branch of Debian, otherwise SDL2 can't progress.

Obviously I prefer to fix potentially-useful software rather than removing it, hence the merge requests.

I'm not actually entirely clear on why we have pysdl2 in Debian, because we don't seem to have any games or apps that depend on it - but its test coverage is currently broader than SDL's, so it's a good way to check that SDL hasn't obviously broken something, and if it occasionally detects a genuine regression in SDL2 then that's enough to earn its place.

a-hurst · 2023-08-21T20:46:18Z

@smcv Ah, that makes perfect sense. Interesting that PySDL2 got added as a package if it wasn't needed for anything... looks like someone just requested it back in 2013 and it's been around ever since!

I'm glad to hear the test suite has proved useful for catching upstream SDL bugs: I've spent a lot of time and effort cleaning up/expanding the test suite at this point, so I'm glad it's paying off for something other than just making sure my bindings are correct! Hopefully with the latest set of fixes it'll be less likely to give you false positives.

smcv mentioned this issue Feb 8, 2023

tests: Don't assume SDL_Init will leave SDL_GetError unset #258

Merged

4 tasks

a-hurst self-assigned this Feb 12, 2023

a-hurst added this to the PySDL2 0.9.16 milestone Feb 12, 2023

a-hurst mentioned this issue Jun 23, 2023

Improve unit tests and restore Python 2.7 CI runners #265

Merged

4 tasks

a-hurst closed this as completed in #265 Jun 24, 2023

smcv mentioned this issue Aug 19, 2023

test: Don't assert that SDL_hid_enumerate doesn't set error #269

Merged

4 tasks

smcv added a commit to smcv/py-sdl2 that referenced this issue Aug 19, 2023

video_test: Don't check error unless a function failed

e38c4ab

Helps: py-sdl#257 Signed-off-by: Simon McVittie <smcv@debian.org>

smcv mentioned this issue Aug 19, 2023

Partially resolve #257: video_test #270

Merged

4 tasks

a-hurst reopened this Aug 19, 2023

a-hurst closed this as completed in ed28ea6 Aug 19, 2023

a-hurst reopened this Aug 19, 2023

a-hurst mentioned this issue Aug 20, 2023

Remove more misuse of SDL_GetError in tests #271

Merged

4 tasks

a-hurst closed this as completed in #271 Aug 21, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Incorrect use of SDL_GetError() to check whether calls failed #257

Incorrect use of SDL_GetError() to check whether calls failed #257

smcv commented Feb 8, 2023

smcv commented Feb 8, 2023

a-hurst commented Feb 12, 2023

smcv commented Aug 19, 2023

a-hurst commented Aug 19, 2023

smcv commented Aug 21, 2023

a-hurst commented Aug 21, 2023

Incorrect use of SDL_GetError() to check whether calls failed #257

Incorrect use of SDL_GetError() to check whether calls failed #257

Comments

smcv commented Feb 8, 2023

smcv commented Feb 8, 2023

a-hurst commented Feb 12, 2023

smcv commented Aug 19, 2023

a-hurst commented Aug 19, 2023

smcv commented Aug 21, 2023

a-hurst commented Aug 21, 2023