Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: refcount bug involving trampoline functions with PyObject * return type. #5156

Merged
merged 9 commits into from
Jun 11, 2024

Conversation

rwgk
Copy link
Collaborator

@rwgk rwgk commented Jun 7, 2024

Description

The heap-use-after-free bug was discovered through PyCLIF-pybind11 testing with ASAN (see #5156 (comment)). The fix has two simple parts:

  1. Introduce a cast_safe() specialization for PyObject *.
  2. Preempt use of cast_ref in PYBIND11_OVERRIDE_IMPL (so that cast_safe() is used instead).

These changes ensure that

                return hfunc.f(std::forward<Args>(args)...).template cast<Return>();

in pybind11/functional.h does not incorrectly drop a Python reference count.

See also: PR #4601, which introduced type_caster<PyObject> but missed the corner case fixed by this PR.

For easy reference: Google-internal global testing ID: OCL:641800583:BASE:641923381:1718036131555:271433f2

Suggested changelog entry:

A refcount bug (leading to heap-use-after-free) involving trampoline functions with ``PyObject *`` return type was fixed.

Ralf W. Grosse-Kunstleve added 7 commits June 7, 2024 14:16
…cast to avoid duplicating the type name [modernize-use-auto,-warnings-as-errors]
```
/__w/pybind11/pybind11/tests/test_type_caster_pyobject_ptr.cpp:23:13: error: definition of implicit copy constructor for 'WithPyObjectPtrReturn' is deprecated because it has a user-declared destructor [-Werror,-Wdeprecated]
    virtual ~WithPyObjectPtrReturn() = default;
            ^
```
@rwgk
Copy link
Collaborator Author

rwgk commented Jun 9, 2024

For completeness, the original failure was (using the Google-internal toolchain):

blaze test --config=asan //third_party/clif/testing/python:virtual_funcs_test --//devtools/clif/python/codegen_mode=pybind11
...
==8010==ERROR: AddressSanitizer: heap-use-after-free on address 0x502000465750 at pc 0x560a2e7e0e08 bp 0x7ffff62b38b0 sp 0x7ffff62b38a8
READ of size 8 at 0x502000465750 thread T0
    #0 0x560a2e7e0e07 in Py_DECREF third_party/python_runtime/v3_11/Include/object.h:537:9
    #1 0x560a2e7e0e07 in Py_XDECREF third_party/python_runtime/v3_11/Include/object.h:602:9
    #2 0x560a2e7e0e07 in pybind11::handle::dec_ref() const & third_party/pybind11/include/pybind11/detail/../detail/../pytypes.h:283:9
    #3 0x560a2e818f3a in ~object third_party/pybind11/include/pybind11/detail/../detail/../pytypes.h:379:17
    #4 0x560a2e818f3a in pybind11::detail::type_caster<_object, void>::~type_caster() third_party/pybind11/include/pybind11/type_caster_pyobject_ptr.h:18:7
    #5 0x7fc5b7d0edf5 in __run_exit_handlers (/usr/grte/v5/lib64/libc.so.6+0x78df5) (BuildId: 9a996398ce14a94560b0c642eb4f6e94)
    #6 0x7fc5b7d0eef9 in exit (/usr/grte/v5/lib64/libc.so.6+0x78ef9) (BuildId: 9a996398ce14a94560b0c642eb4f6e94)
    #7 0x560a2f83591d in Py_Exit third_party/python_runtime/v3_11/Python/pylifecycle.c:2966:5
...

@rwgk
Copy link
Collaborator Author

rwgk commented Jun 9, 2024

To ensure that the added test reproduces the original failure:

Failure of the test added in this PR (without the production code changes; using the Google-internal toolchain):

blaze test --config=asan third_party/pybind11/tests:test_type_caster_pyobject_ptr
...
==2670==ERROR: AddressSanitizer: heap-use-after-free on address 0x50600260e4d0 at pc 0x7f38b2cc9bb8 bp 0x7ffd5eb55410 sp 0x7ffd5eb55408
READ of size 8 at 0x50600260e4d0 thread T0
    #0 0x7f38b2cc9bb7 in Py_DECREF third_party/python_runtime/v3_11/Include/object.h:537:9
    #1 0x7f38b2cc9bb7 in Py_XDECREF third_party/python_runtime/v3_11/Include/object.h:602:9
    #2 0x7f38b2cc9bb7 in pybind11::handle::dec_ref() const & third_party/pybind11/include/pybind11/detail/../detail/../pytypes.h:283:9
    #3 0x7f38b386b1aa in ~object third_party/pybind11/include/pybind11/detail/../detail/../pytypes.h:379:17
    #4 0x7f38b386b1aa in pybind11::detail::type_caster<_object, void>::~type_caster() third_party/pybind11/include/pybind11/type_caster_pyobject_ptr.h:18:7
...

rwgk pushed a commit to rwgk/pybind11clif that referenced this pull request Jun 10, 2024
@rwgk
Copy link
Collaborator Author

rwgk commented Jun 10, 2024

Ignoring the 3 failing Python 3.13 builds: The same failures appear under #3939.

@rwgk rwgk marked this pull request as ready for review June 10, 2024 08:13
@rwgk rwgk requested a review from Skylion007 June 10, 2024 08:13
@rwgk rwgk changed the title WIP: Fix refcount bug involving trampoline functions with PyObject * return type. Fix refcount bug involving trampoline functions with PyObject * return type. Jun 10, 2024
@rwgk
Copy link
Collaborator Author

rwgk commented Jun 11, 2024

Thanks @henryiii!

@rwgk rwgk merged commit ab955f1 into pybind:master Jun 11, 2024
83 checks passed
@rwgk rwgk deleted the trampoline_pyobject_ptr_return_fix_master branch June 11, 2024 21:01
@github-actions github-actions bot added the needs changelog Possibly needs a changelog entry label Jun 11, 2024
@henryiii henryiii changed the title Fix refcount bug involving trampoline functions with PyObject * return type. fix: refcount bug involving trampoline functions with PyObject * return type. Jun 23, 2024
@henryiii henryiii removed the needs changelog Possibly needs a changelog entry label Jun 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants