Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Improve PyCapsule exception handling #4232

Merged
merged 3 commits into from
Oct 11, 2022

Conversation

Skylion007
Copy link
Collaborator

@Skylion007 Skylion007 commented Oct 10, 2022

Description

This further improves another error in exception handling pointed out by #4221 .
It makes the following changes

  • Only check if PyErrOccurred() only if the destructor is nullptr (the exceptions are only set in if the destructor is nullptr according the spec). This should be a bit faster and be more robust as it better follows the spec.
  • There is a pretty rare edge case that could occur when wrapping a capsule around a function ptr if the destructor was somehow changed to nullptr. I couldn't actually get this one to fire at all, however, it's better to have proper handling in case this corner case is every encountered since it's a valid return state according the CPython API. I checked the code and it is actually impossible to get to have GetPointer return a nullptr and NOT set up an exception
  • Change a cstyle cast to an explicit C++ reinterpret_cast for readability.

Template:

Suggested changelog entry:

* Further improve another error in exception handling.

@Skylion007 Skylion007 changed the title bugfix: Handle pycapsule exception handling corner case chore: Improve PyCapsule exception handling Oct 10, 2022
@Skylion007 Skylion007 requested a review from rwgk October 10, 2022 19:44
@Skylion007
Copy link
Collaborator Author

Okay, now this PR mainly just avoids a call to PyErrOccurred() if possible.

@@ -1829,7 +1829,7 @@ class capsule : public object {
// guard if destructor called while err indicator is set
error_scope error_guard;
auto destructor = reinterpret_cast<void (*)(void *)>(PyCapsule_GetContext(o));
if (PyErr_Occurred()) {
if (destructor == nullptr && PyErr_Occurred()) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Main thing here is PyErr_Occurred() is a much more expensive call, especially on the common path so might as well avoid doing so if possible

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm ... but what if the error indicator is set (for whatever reason)? We're leaving the process in an invalid state, follow-on Python C API calls may behave in confusing ways (painful debugging memories come back).

If the Python documentation guarantees that the error indicator is not set unless destructor == nullptr, this change is fine.

Looking: https://docs.python.org/3/c-api/capsule.html#c.PyCapsule_GetContext


Return the current context stored in the capsule. On failure, set an exception and return NULL.

It is legal for a capsule to have a NULL context. This makes a NULL return code somewhat ambiguous; use PyCapsule_IsValid() or PyErr_Occurred() to disambiguate.


To me it sounds like:

  • If the error indicator is set the return value is certainly nullptr.
  • But knowing that the return value is nullptr does not tell us anything about the state of the error indicator.

So checking the error indicator is mandatory, unless we want to risk leaving the process in an invalid state, which I don't think is a responsible thing to do, even it this was more than a minor optimization.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rwgk The documentation and the reference implementation seem to guarantee that the error indicator is not set unless it returns a nullptr.

A better way to state this is the following. Is it ever valid to set the ErrorIndicator and return a non-nullptr from this method? That is there any case where non-null and the error indicator is set (for the return to this function. Ignoring it could be set to an err due leakage from some other part of the code.

The answer is no, because if it returned any non-null ptr, then an error did not occur. The error only occurs if the capsule is PyCapsule_IsValid() returns false internally, it therefore unable to access the members of the pycapsule, and then return null. Therefore, this function only sets the error indicator IFF it returns a nullptr. The error guard guarantees that we the error indicator is not set.

Therefore, destructor being non-null strictly implies that no error has occurred.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm ... but what if the error indicator is set (for whatever reason)?

I mean that logic could be implied at any point in the code. At some point, you have to decide when ErrorIndicator is worth checking. This is a place where the flow is clearly defined and the ErrorIndicator cannot be set unless it's nullptr since we have an error guard currently in place.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, sorry for getting this wrong. I didn't look at the implementation yesterday, from that it's totally clear.

@@ -1829,7 +1829,7 @@ class capsule : public object {
// guard if destructor called while err indicator is set
error_scope error_guard;
auto destructor = reinterpret_cast<void (*)(void *)>(PyCapsule_GetContext(o));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line was introduced with PR #752:

https://github.com/pybind/pybind11/pull/752/files

@wjakob Why is this PyCapsule_GetContext() and not PyCapsule_GetDestructor()?

I see it's self-consistent within pybind11, PyCapsule_SetContext() is used here:

if (!m_ptr || PyCapsule_SetContext(m_ptr, (void *) destructor) != 0) {

But I'd expect that the rest of the world handles destructors via PyCapsule_SetDestructor() & PyCapsule_GetDestructor(). Could this lead to leaks, because we're never calling PyCapsule_GetDestructor() anywhere in pybind11?

I just looked, even Python 2.7 had PyCapsule_SetDestructor() & PyCapsule_GetDestructor() already.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turns out my worries here were unfounded. Based on feedback from @wjakob, I came to realize that my core misunderstanding was that py::capsule is controlling the lifetime (in a C++ dtor) but there is no C++ dtor here and the lifetime it is (of course) tied to the Python refcount of the capsule. This is connected to having misunderstood the nullptr for the 2nd argument of PyCapsule_New, that is the name, not the destructor.

I.e. everything is fine here. But having looked at this pretty closely now, I decided this minor cleanup will be useful: #4238

@Skylion007 Skylion007 requested a review from rwgk October 11, 2022 18:02
@Skylion007 Skylion007 merged commit 0927c4d into pybind:master Oct 11, 2022
@Skylion007 Skylion007 deleted the pycapsule-errcheck-only-if-null branch October 11, 2022 20:07
@github-actions github-actions bot added the needs changelog Possibly needs a changelog entry label Oct 11, 2022
@henryiii henryiii changed the title chore: Improve PyCapsule exception handling fix: Improve PyCapsule exception handling Oct 20, 2022
@henryiii henryiii removed the needs changelog Possibly needs a changelog entry label Oct 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants