fix: Replace bare `static exception<T>` with `gil_safe_call_once_and_store`. #4897
…and_store`. This is to ensure that `Py_DECREF()` is not called after the Python interpreter was finalized already: https://github.com/pybind/pybind11/blob/3414c56b6c7c521d868c9a137ca2ace2e26b5b2e/include/pybind11/gil_safe_call_once.h#L19
Google global testing ID (passed): OCL:575710443:BASE:575809221:1698070476323:98dd25a0 (using this PR @ ca7bdac)
@tkoeppe FYI, I never got around to systematically checking whether there are more bugs like this, BTW.
Looks good. Thanks for the fix!
I never got around to systematically checking whether there are more bugs like this, BTW.
I bet we could set up some global variables to track this / throw assertions. Not sure if it would be worth the effort though.
// directly in register_exception, but that makes clang <3.5 segfault - issue #1349).
template <typename CppException>
exception<CppException> &get_exception_object() {
    static exception<CppException> ex;
Does this default constructor call into CPython (and release/reacquire the GIL)?
The deadlock scenario can only arise if there's some non-trivial lock interaction in the initializer of the `static` variable.
I suppose this change would also deal with some end-of-program lifetime situations, though, e.g. if this object gets destroyed too early or too late.
Does this default constructor call into CPython (and release/reacquire the GIL)?
No, it just copies `nullptr` to `m_ptr` (pybind11/include/pybind11/pytypes.h, line 297 in 3414c56):
PyObject *m_ptr = nullptr;
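To make this answer concrete, here is a minimal, self-contained sketch of an owning handle whose default constructor only stores a null pointer, so constructing it touches no interpreter state, while destroying a non-null handle does. `FakePyObject`, `handle_sketch`, and `decref_calls` are hypothetical names invented for illustration; they are not pybind11 types, and the sketch assumes the destructor is the only place a reference is released.

```cpp
#include <cassert>

// Hypothetical stand-in for a reference-counted Python object.
struct FakePyObject {
    int refcount = 1;
};

// Counts simulated Py_DECREF calls so the behavior is observable.
int decref_calls = 0;

// Hypothetical owning handle, loosely modeled on a pointer member that
// defaults to nullptr (as in the pytypes.h line quoted above).
class handle_sketch {
public:
    handle_sketch() = default; // only stores nullptr, no interpreter calls
    explicit handle_sketch(FakePyObject *ptr) : m_ptr(ptr) {}
    handle_sketch(const handle_sketch &) = delete;
    handle_sketch &operator=(const handle_sketch &) = delete;

    // The destructor is where the interpreter would be touched: a non-null
    // pointer gets its reference count decremented.
    ~handle_sketch() {
        if (m_ptr != nullptr) {
            assert(m_ptr->refcount > 0);
            --m_ptr->refcount;
            ++decref_calls;
        }
    }

    bool is_null() const { return m_ptr == nullptr; }

private:
    FakePyObject *m_ptr = nullptr;
};
```

If such a handle were a function-local `static` that later holds a real object, its destructor would run during static destruction, which may be after the interpreter has been finalized.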
The deadlock scenario can only arise if there's some non-trivial lock interaction in the initializer of the `static` variable.
Yes, that was on my mind.
I suppose this change would also deal with some end-of-program lifetime situations, though, e.g. if this object gets destroyed too early or too late.
That was the only reason I got into this.
I considered simply changing this to use `new` (with a boilerplate "intentional leak" comment), but then decided: the universally safe solution
- is easy,
- inexpensive,
- expressive (call once is exactly what we want),
- reads nicely,
- future proof (in case the `exception<T>` implementation is changed),
- and a great pattern for people to remember and follow,
let's use it!
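The two options weighed above can be sketched side by side without pybind11. This is a simplified illustration under stated assumptions, not the library's implementation: `FakeHandle`, `leaked_handle`, and `call_once_and_store_sketch` are hypothetical names, a mutex-guarded flag stands in for a real call-once primitive, and all GIL handling is omitted. What both options share is that the stored object's destructor never runs.

```cpp
#include <cassert>
#include <mutex>
#include <new>

// Hypothetical stand-in for a type that owns a Python reference.
// Counts destructor runs so the "never destroyed" property is observable.
struct FakeHandle {
    static int destroyed;
    int value;
    explicit FakeHandle(int v) : value(v) {}
    ~FakeHandle() { ++destroyed; }
};
int FakeHandle::destroyed = 0;

// Option 1: heap-allocate once and never delete.
FakeHandle &leaked_handle() {
    static FakeHandle *ptr = new FakeHandle(7); // intentional leak
    return *ptr;
}

// Option 2: construct once into raw storage and never destroy.
template <typename T>
class call_once_and_store_sketch {
public:
    // Runs fn at most once and stores its result; later calls are no-ops.
    template <typename Callable>
    call_once_and_store_sketch &call_once_and_store_result(Callable &&fn) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!is_initialized_) {
            ::new (static_cast<void *>(storage_)) T(fn());
            is_initialized_ = true;
        }
        return *this;
    }

    // Must only be called after call_once_and_store_result has returned.
    T &get_stored() {
        assert(is_initialized_);
        return *reinterpret_cast<T *>(storage_);
    }

    // No destructor is declared on purpose: the stored T is never
    // destroyed, so it cannot release a reference at teardown.

private:
    alignas(T) unsigned char storage_[sizeof(T)] = {};
    std::mutex mutex_;
    bool is_initialized_ = false;
};
```

The second form keeps the "construct exactly once" intent visible at the call site and does not depend on how the stored type is implemented.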
Yes, makes sense! Using the new facility just for the lifetime is certainly also a good move.
Thanks @EthanSteinberg!
I don't know how we could do that (going completely blank tbh). Just disclosing & an indication of curiosity, but ...
... yeah, I'm assuming there is only very little left to be had, simply based on how widely used pybind11 is. I'm slightly surprised that nobody needed this fix before. I'm guessing this bug bites very rarely, and only during process teardown, so maybe it's just some occasional flakiness that's easy to ignore.
I don't know exactly how to set this up (would need to do more research), but my initial guess would be to have a boolean flag that is flipped true by https://docs.python.org/3/library/atexit.html#atexit.register. We could then flag on any calls to Py_DECREF or something of that nature ... The timing here is really tricky. I'm not sure atexit would be at the point we need.
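A rough sketch of that flag idea, written as plain C++ so it runs without an interpreter. Everything here is hypothetical: `interpreter_finalizing`, `mark_interpreter_finalizing`, `FakeObject`, and `checked_decref` are invented names, and the sketch simply assumes that something (for example a callback registered through `atexit.register`) flips the flag at the right moment, which is exactly the part the comment above is unsure about.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical global flag: true once teardown has started.
std::atomic<bool> interpreter_finalizing{false};

// Hypothetical counter of reference releases attempted after teardown began.
std::atomic<int> late_decref_count{0};

// In the proposed setup this would be invoked by an exit hook; the sketch
// leaves the trigger to the caller.
void mark_interpreter_finalizing() { interpreter_finalizing.store(true); }

// Hypothetical stand-in for a reference-counted Python object.
struct FakeObject {
    int refcount = 1;
};

// Hypothetical guarded release: records a violation instead of touching
// the object once teardown has started. Returns true if it decremented.
bool checked_decref(FakeObject &obj) {
    if (interpreter_finalizing.load()) {
        ++late_decref_count;
        return false;
    }
    assert(obj.refcount > 0);
    --obj.refcount;
    return true;
}
```

A debug build could assert on `late_decref_count` being zero at the end of a test run, which would surface teardown-order bugs of the kind fixed in this PR.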
I'm thinking, let's not go there (too uncertain, not a lot to gain for it). But something else crossed my mind (~randomly): How does this Is the stored
I don't know how the CPython API works for multiple interpreters, but just thinking about this on a high level, none of our
Thanks Thomas for confirming what I was suspecting! Sounds like we (someone...) should make this warning much stronger: pybind11/include/pybind11/embed.h, lines 235 to 242 in fa27d2f
Description
This is to ensure that `Py_DECREF()` is not called after the Python interpreter was finalized already: pybind11/include/pybind11/gil_safe_call_once.h, line 19 in 3414c56
(Bug noticed by chance while working on #4888.)
Suggested changelog entry: