Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pybind11_select_caster #3862

Draft
wants to merge 16 commits into
base: master
Choose a base branch
from
Draft

Add pybind11_select_caster #3862

wants to merge 16 commits into from

Conversation

laramiel
Copy link
Contributor

@laramiel laramiel commented Apr 11, 2022

Description

Instead of only using type_caster template specialization, it is possible to register the conversions with a function declaration, which will be found with ADL.

namespace foo {

class BarCaster;

class Bar {
  friend BarCaster pybind11_select_caster(Bar*);
};

The parameter is a pointer to the C++ (intrinsic) type. The returned type is the caster class. This function should have no implementation!

  • The declarations need to be defined in the original namespace of the converted C++ type, ideally declared as a friend function in the type itself.
  • The default behavior did not change: type_caster will be used as a fallback.

Because type_caster not always used, load_caster has to accept a generic Caster argument.

This is manually adapted from #3643 and #3674

NOTE

https://en.cppreference.com/w/cpp/language/definition#One_Definition_Rule

ODR violations can be easy to introduce and hard to diagnose in pybind11. Essentially if there exist two types (perhaps specializations of pybind11::type_caster<T>), or two functions (perhaps pybind11::cast<T>) which use definitions that have "different sequences of tokens", then the code is in violation of the ODR.

While this doesn't solve that problem, it allows a classes to more easily declare custom casters which can help ensure that the correct caster is always used.

For more on ODR violations, see https://www.youtube.com/watch?v=IY8tHh2LSX4

Suggested changelog entry:

Allow type_caster<> selection via ADL.


struct inty {
...
friend inty_type_caster pybind11_select_caster(inty*);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like the feature is intrusive on the wrapped C++ code, it is not immediately obvious that that's a choice. I think the previous version of the doc was actually better, because it makes it clear that pybind11_select_caster is usually only in the pybind11 bindings code (not user code). Maybe adding a note could nice:

Note that pybind11_select_caster can also be implemented as a friend function if that is preferred.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that it's preferred to use it as a friend function. But it is permitted to be used as a non-friend function.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that it's preferred to use it as a friend function.

My main worry is that people walk away with the wrong impression that this feature is intrusive on their C++ code, and therefore don't use pybind11_select_caster at all. With the doc as is, I'd bet that's fairly likely to happen.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Maybe a small comment stating it is not required if you can use only public variables?

@rwgk rwgk requested a review from Skylion007 April 11, 2022 21:31
value.long_value = PyLong_AsLong(tmp);
Py_DECREF(tmp);
/* Ensure return code was OK (to avoid out-of-range errors etc) */
return !(value.long_value == -1 && !PyErr_Occurred());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This is actually a bad practice in our documentation. We should clear the python error or raise it to a C++ error if it makes sense Leaving the error state like this not ideal.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be a good followup, unless you want to suggest the update inline. It might be elsewhere?

Copy link
Collaborator

@henryiii henryiii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of very minor nits would be nice to address, but looks pretty safe and useful.

// Dummy implementation, purely to avoid compiler warnings (unused function).
// Easier than managing compiler-specific pragmas for warning suppression.
// MSVC happens to not generate any warnings. Leveraging that to prove that
// this test actually works without an implementation.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe mention [[maybe_unused]] for C++17? If we ever go C++17+ only, we can then just use that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion! We already have PYBIND11_MAYBE_UNUSED, trying that now in the CI.

@rwgk
Copy link
Collaborator

rwgk commented May 6, 2022

@laramiel
@amauryfa
@henryiii
@Skylion007

I revised custom.rst top to bottom. Could you please review?

I only changed custom.rst, the rest is as before.

I kept example code changes to a minimum, but did address @Skylion007's concern (PyErr_Clear()).


struct inty {
...
friend inty_type_caster pybind11_select_caster(inty*);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it work? the type inty_type_caster probably cannot be defined before this declaration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would have to be forward declared:

class inty_type_caster;

struct inty {
  ...
  friend inty_type_caster pybind11_select_caster(inty*);
};

When using custom type casters, it's important to declare them consistently
in every compilation unit of the Python extension module. Otherwise,
undefined behavior can ensue.
When using this method, it is important to declare the specializations
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, the ADL method does not solve the ODR issue.
For example, for a given T there can be only one definition of load_caster<T>(const handle&) (defined below), so make_caster<T> must be consistent across all modules.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the warning needs to cover both type casters.

.. warning::

For a given type, T, the pybind11 method make_caster<T> must be consistent across all compilation units
to satisfy the C++ One Definition Rule (ODR <https://en.cppreference.com/w/cpp/language/definition>_).

@rwgk rwgk changed the title Allow type_caster<> selection via ADL Add pybind11_select_caster May 6, 2022
@rwgk
Copy link
Collaborator

rwgk commented May 7, 2022

@laramiel @amauryfa @henryiii @Skylion007

Thanks Amaury and Laramie for the feedback! And Henry for the [[maybe_unused]] hint from 3 weeks ago. I think I addressed everything now, and beyond, by porting the rst code blocks to the new tests/test_docs_advanced_cast_custom.cpp, making a pass revising the source code comments, clang-formatting, then copying back from the test to the rst file.

I also renamed test_make_caster_adl to test_select_caster, to make it more clear what is tested. (I tried to use test_pybind11_select_caster but that leads to a name clash, because test_ is stripped off for the submodule name.)

Regrading test_docs_advanced_cast_custom.cpp, the three alternatives shown in the documentation are handled with a #define:

// 0: Use the older pybind11::detail::type_caster<T> specialization mechanism.
// 1: Use pybind11_select_caster declaration in user_space.
// 2: Use pybind11_select_caster friend function declaration.
#ifndef PYBIND11_TEST_DOCS_ADVANCED_CAST_CUSTOM_ALTERNATIVE
#    define PYBIND11_TEST_DOCS_ADVANCED_CAST_CUSTOM_ALTERNATIVE 1
#endif

It would be ideal to exercise all 3 alternatives, but I'm not sure how to best do that. I'm not even sure if it's worth the trouble, the extra coverage is just a few lines of C++. I'm thinking of leaving that for "later, maybe". What's in this PR is already clearly beyond the status quo (rst code blocks not tested at all; outdated formatting).

@rwgk
Copy link
Collaborator

rwgk commented May 13, 2022

Putting this PR into Draft mode. The rationale:

  • The primary motivation behind the original PR Function pybind11_select_caster uses ADL to select the caster. #3643 was to add support for translation-unit-specific casters.
  • But the idea that pybind11_select_caster could be a stepping stone towards TU-specific casters was conclusively laid to rest by what we learned from the experiments under DEAD END: Make make_caster unique_to_translation_unit #3931.
  • We are still looking into another idea, which is not very concrete at the moment (it involves leveraging the typename... Extra mechanism).
  • To not make the starting point for that work more complicated, we think it is best to hold off merging pybind11_select_caster for now.

Let's wait and see where the new idea leads, and reevaluate adding pybind11_select_caster when there is more certainty.

@rwgk rwgk marked this pull request as draft May 13, 2022 19:30
rwgk pushed a commit to rwgk/pybind11 that referenced this pull request Jun 21, 2022
rwgk pushed a commit to rwgk/pybind11 that referenced this pull request Jun 28, 2022
rwgk pushed a commit that referenced this pull request Jul 21, 2022
* Insert type_caster_odr_guard<> (an empty struct to start with).

* Add odr_guard_registry() used in type_caster_odr_guard() default constructor.

* Add minimal_real_caster (from PR #3862) to test_async, test_buffers

* VERY MESSY SNAPSHOT of WIP, this was the starting point for cl/454658864, which has more changes on top.

* Restore original test_async, test_buffers from current smart_holder HEAD

* Copy from cl/454991845 snapshot Jun 14, 5:08 PM

* Cleanup of tests. Systematically insert `if (make_caster<T>::translation_unit_local) {`

* Small simplification of odr_guard_impl()

* WIP

* Add PYBIND11_SOURCE_FILE_LINE macro.

* Replace PYBIND11_TYPE_CASTER_UNIQUE_IDENTIFIER with PYBIND11_TYPE_CASTER_SOURCE_FILE_LINE, baked into PYBIND11_TYPE_CASTER macro.

* Add more PYBIND11_DETAIL_TYPE_CASTER_ACCESS_TRANSLATION_UNIT_LOCAL; resolves "unused" warning when compiling test_custom_type_casters.cpp

* load_type fixes & follow-on cleanup

* Strip ./ from source_file_line

* Add new tests to CMakeLists.txt, disable PYBIND11_WERROR

* Replace C++17 syntax. Compiles with Debian clang 13 C++11 mode, but fails to link. Trying GitHub Actions anyway to see if there are any platforms that support https://en.cppreference.com/w/cpp/language/tu_local before C++20. Note that Debian clang 13 C++17 works locally.

* Show C++ version along with ODR VIOLATION DETECTED message.

* Add source_file_line_basename()

* Introduce PYBIND11_TYPE_CASTER_ODR_GUARD_ON (but not set automatically).

* Minor cleanup.

* Set PYBIND11_TYPE_CASTER_ODR_GUARD_ON automatically.

* Resolve clang-tidy error.

* Compatibility with old compilers.

* Fix off-by-one in source_file_line_basename()

* Report PYBIND11_INTERNALS_ID & C++ Version from pytest_configure()

* Restore use of PYBIND11_WERROR

* Move cpp_version_in_use() from cast.h to pybind11_tests.cpp

* define PYBIND11_DETAIL_ODR_GUARD_IMPL_THROW_DISABLED true in test_odr_guard_1,2.cpp

* IWYU cleanup of detail/type_caster_odr_guard.h

* Replace `throw err;` to resolve clang-tidy error.

* Add new header filename to CMakeLists.txt, test_files.py

* Experiment: Try any C++17 compiler.

* Fix ifdef for pragma GCC diagnostic.

* type_caster_odr_guard_impl() cleanup

* Move type_caster_odr_guard to type_caster_odr_guard.h

* Rename test_odr_guard* to test_type_caster_odr_guard*

* Remove comments that are (now) more distracting than helpful.

* Mark tu_local_no_data_always_false operator bool as explicit (clang-tidy). See also: https://stackoverflow.com/questions/39995573/when-can-i-use-explicit-operator-bool-without-a-cast

* New PYBIND11_TYPE_CASTER_ODR_GUARD_STRICT option (current on by default).

* Add test_type_caster_odr_registry_values(), test_type_caster_odr_violation_detected_counter()

* Report UNEXPECTED: test_type_caster_odr_guard_2.cpp prevailed (but do not fail).

* Apply clang-tidy suggestion.

* Attempt to handle valgrind behavior.

* Another attempt to handle valgrind behavior.

* Yet another attempt to handle valgrind behavior.

* Trying a new direction: show compiler info & std for UNEXPECTED: type_caster_odr_violation_detected_count() == 0

* compiler_info MSVC fix. num_violations == 0 condition.

* assert pybind11_tests.compiler_info is not None

* Introduce `make_caster_intrinsic<T>`, to be able to undo the 2 changes from `load_type` to `load_type<T>`. This is to avoid breaking 2 `pybind11::detail::load_type()` calls found in the wild (Google global testing).

One of the breakages in the wild was: https://github.com/google/tensorstore/blob/0f0f6007670a3588093acd9df77cce423e0de805/python/tensorstore/subscript_method.h#L61

* Add test for stl.h / stl_bind.h mix.

Manually verified that the ODR guard detects the ODR violation:

```
C++ Info: Debian Clang 13.0.1 C++17 __pybind11_internals_v4_clang_libstdcpp_cxxabi1002_sh_def__
=========================================================== test session starts ============================================================
platform linux -- Python 3.9.12, pytest-6.2.3, py-1.10.0, pluggy-0.13.1 -- /usr/bin/python3
...
================================================================= FAILURES =================================================================
_____________________________________________ test_type_caster_odr_violation_detected_counter ______________________________________________

    def test_type_caster_odr_violation_detected_counter():
        ...
        else:
>           assert num_violations == 1
E           assert 2 == 1
E             +2
E             -1

num_violations = 2

test_type_caster_odr_guard_1.py:51: AssertionError
========================================================= short test summary info ==========================================================
FAILED test_type_caster_odr_guard_1.py::test_type_caster_odr_violation_detected_counter - assert 2 == 1
======================================================= 1 failed, 5 passed in 0.08s ========================================================
```

* Eliminate need for `PYBIND11_DETAIL_TYPE_CASTER_ACCESS_TRANSLATION_UNIT_LOCAL` macro.

Copying code first developed by @amauryfa. I tried this at an earlier stage, but by itself this was insufficient. In the meantime I added in the TU-local mechanisms: trying again.

Passes local testing:
```
DISABLED std::system_error: ODR VIOLATION DETECTED: pybind11::detail::type_caster<mrc_ns::type_mrc>: SourceLocation1="/usr/local/google/home/rwgk/forked/pybind11/tests/test_type_caster_odr_guard_1.cpp:18", SourceLocation2="/usr/local/google/home/rwgk/forked/pybind11/tests/test_type_caster_odr_guard_2.cpp:19"
C++ Info: Debian Clang 13.0.1 C++17 __pybind11_internals_v4_clang_libstdcpp_cxxabi1002_sh_def__
=========================================================== test session starts ============================================================
platform linux -- Python 3.9.12, pytest-6.2.3, py-1.10.0, pluggy-0.13.1 -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /usr/local/google/home/rwgk/forked/pybind11/tests, configfile: pytest.ini
collected 6 items

test_type_caster_odr_guard_1.py::test_type_mrc_to_python PASSED
test_type_caster_odr_guard_1.py::test_type_mrc_from_python PASSED
test_type_caster_odr_guard_1.py::test_type_caster_odr_registry_values PASSED
test_type_caster_odr_guard_1.py::test_type_caster_odr_violation_detected_counter PASSED
test_type_caster_odr_guard_2.py::test_type_mrc_to_python PASSED
test_type_caster_odr_guard_2.py::test_type_mrc_from_python PASSED

============================================================ 6 passed in 0.01s =============================================================
```

* tu_local_descr with src_loc experiment

* clang-tidy suggested fixes

* Use source_file_line_from_sloc in type_caster_odr_guard_registry

* Disable type_caster ODR guard for __INTEL_COMPILER (see comment). Also turn off printf.

* Add missing include (discovered via google-internal testing).

* Work `scr_loc` into `descr`

* Use `TypeCasterType::name.sloc` instead of `source_file_line.sloc`

Manual re-verification:

```
+++ b/tests/test_type_caster_odr_guard_2.cpp
-    // m.def("pass_vector_type_mrc", mrc_ns::pass_vector_type_mrc);
+    m.def("pass_vector_type_mrc", mrc_ns::pass_vector_type_mrc);
```

```
>           assert num_violations == 1
E           assert 2 == 1

num_violations = 2

test_type_caster_odr_guard_1.py:51: AssertionError
```

* Fix small oversight (src_loc::here() -> src_loc{nullptr, 0}).

* Remove PYBIND11_DETAIL_TYPE_CASTER_ACCESS_TRANSLATION_UNIT_LOCAL macro completely.

* Remove PYBIND11_TYPE_CASTER_SOURCE_FILE_LINE macro completely. Some small extra cleanup.

* Minor tweaks looking at the PR with a fresh eye.

* src_loc comments

* Add new test_descr_src_loc & and fix descr.h `concat()` `src_loc` bug discovered while working on the test.

* Some more work on source code comments.

* Fully document the ODR violations in the ODR guard itself and introduce `PYBIND11_TYPE_CASTER_ODR_GUARD_ON_IF_AVAILABLE`

* Update comment (incl. mention of deadsnakes known to not work as intended).

* Use no-destructor idiom for type_caster_odr_guard_registry, as suggested by @laramiel

* Fix clang-tidy error: 'auto reg' can be declared as 'auto *reg' [readability-qualified-auto,-warnings-as-errors]

* WIP

* Revert "WIP" (tu_local_no_data_always_false_base experiment).

This reverts commit 31e8ac5.

* Change `PYBIND11_TYPE_CASTER_ODR_GUARD_ON` to `PYBIND11_ENABLE_TYPE_CASTER_ODR_GUARD`, based on a suggestion by @rainwoodman

* Improved `#if` determining `PYBIND11_ENABLE_TYPE_CASTER_ODR_GUARD`, based on suggestion by @laramiel

* Make `descr::sloc` `const`, as suggested by @rainwoodman

* Rename macro to `PYBIND11_DETAIL_TYPE_CASTER_ODR_GUARD_IMPL_DEBUG`, as suggested by @laramiel

* Tweak comments some more (add "white hat hacker" analogy).

* Bring back `PYBIND11_CPP17` in determining `PYBIND11_ENABLE_TYPE_CASTER_ODR_GUARD`, to hopefully resolve most if not all of the many CI failures (89 failing, 32 successful: https://github.com/pybind/pybind11/runs/7430295771).

* Try another workaround for `__has_builtin`-related breakages (https://github.com/pybind/pybind11/runs/7430720321).

* Remove `defined(__has_builtin)` and subconditions.

* Update "known to not work" expectation in test and comment.

* `pytest.skip` `num_violations == 0` only `#ifdef __NO_INLINE__` (irrespective of the compiler)

* Systematically change all new `#ifdef` to `#if defined` (review suggestion).

* Bring back MSVC comment that got lost while experimenting.
@rwgk
Copy link
Collaborator

rwgk commented Aug 31, 2024

2+ years have passed: Closing.

(It would be great to rescue the custom.rst edits.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants