Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP cross module exception ODR experiments #4283

Closed
wants to merge 8 commits into from

Conversation

rwgk
Copy link
Collaborator

@rwgk rwgk commented Oct 24, 2022

Description

Guessing what could be behind #4105

Local testing passes. Let's see what happens in the CI.

Suggested changelog entry:

@henryiii
Copy link
Collaborator

This has to be built with different versions of pybind11, right? - see #4282 where I'm setting up a way to do that.

@rwgk
Copy link
Collaborator Author

rwgk commented Oct 24, 2022

I saw the emails from your PR but haven't looked at it yet.

I'm trying to emulate ODR, so that the test is simpler. This is following what worked for #4022.

I don't know if and how cross-module exceptions are different or special, compared to the type_caster situation, but it's a starting point I'm familiar with.

I'm not certain this will become a permanent test, or only inform something else after I learn something here.

@rwgk
Copy link
Collaborator Author

rwgk commented Oct 24, 2022

The CI succeeds on all platforms. — I was hoping for at least some failures, because very clearly, the definitions of class evolving in cross_module_exception_odr_1.cpp & cross_module_exception_odr_2.cpp are very different (modeling #1895). The MSVC errors in the first attempt prove that that class is DLL-exported. Why does that obvious ODR violation not break anything? What am I missing or overlooking?

@henryiii
Copy link
Collaborator

I believe ODR is UB, and in this case, because everything's pretty much the same, UB is simply the (un)expected behavior?

@rwgk
Copy link
Collaborator Author

rwgk commented Oct 25, 2022

Quick note, I cannot explain properly at the moment, but at least on my Linux workstation the finding is conclusive:

When adding this to conftest.py

import sys
sys.setdlopenflags(0x100 | 0x2)  # RTLD_GLOBAL | RTLD_NOW
  • it still works if I run test_cross_module_exception_odr_1.py or test_cross_module_exception_odr_2.py separately,
  • but running the 2nd after the 1st, I'm seeing a Fatal Python error: Segmentation fault, or RuntimeError: D instead of v2 when running with gdb (i.e. the behavior of the same binaries is different, running normally, or with gdb).

I.e. classical UB.

Preliminary conclusion: RTLD_LOCAL is another layer of hiding the symbols that otherwise lead to ODR.

I still have to figure out what the situation is under macOS and Windows.

This is with pytest.skip() in test_raise_evolving_from_module_2.

@rwgk
Copy link
Collaborator Author

rwgk commented Oct 25, 2022

@henryiii @Skylion007 @lalaland @jbms does anyone know: what's the equivalent of RTLD_GLOBAL on Windows?

From looking around I'm thinking it doesn't exist, Windows only has behavior similar to RTLD_LOCAL, but I didn't find a source that clearly says that.

Based on the CI for 42c9554, it looks like RTLD_GLOBAL (nicely!) breaks Linux, but not macOS. — CAVEAT: I didn't go through with a comb yet.

I'm still not clear how exactly symbols are resolved when pybind11 extensions are dlopened, but it seem the default RTLD_LOCAL makes the combination of #2999 and #1895 safe.

Everything I'm saying here is preliminary and could use multiple sets of extra eyes.

I also started another super simple experiment under #4284.

@jbms
Copy link
Contributor

jbms commented Oct 25, 2022

I don't believe Windows has anything like rtld_global.

I think #2999 is relevant when you have one c++ shared library directly linking to another c++ shared library and a pybind11 exception type crosses the abi boundary between them. I don't think Python is really directly involved at all here.

@jbms
Copy link
Contributor

jbms commented Oct 25, 2022

Actually, I realized I hadn't looked sufficiently carefully at #3016. The intent of that test case indeed seems to be to allow pybind11 exceptions to be thrown by one pybind11 extension and caught/translated by another, which effectively means pybind11 abi compatibility has to also extend to the pybind11 exception types, which may not be desirable and I don't think was fully intended.

In fact there need not be any linking between the two c++ libraries, if the exception type is fully defined in a header. Assigning non-hidden visibility to the exception type actually has two effects:

  1. Allows its members to be linked globally
  2. When using libc++, allows it to be globally identified for rtti purposes (I think by string comparison of their mangled names)

1 is not relevant in this case because no linking happens anyway (due to rtld_local and lack of explicit delendency from one module to the other.

2 is not relevant when using libstdc++, because it appears that types are always globally identifiable when using libstdc++ regardless of visibility.

It seems like at this point some users may be relying on the exception types as part of the abi, as in the example, and therefore the abi version must be bumped whenever their abi changes.
This was the case even before #2999, since even without the visibility change, it worked in the common case of libstdc++.

If it is desired that the exception types not be a part of the abi, then there are some potential migration paths.

@rwgk
Copy link
Collaborator Author

rwgk commented Oct 25, 2022

To close the loop here, #4015, which triggered this PR, was closed. "Not an ABI issue" again.

I left a comment there: #4105 (comment)

@jbms I think I want to use #4284 to trim back what was done with #2999. I'll tag you there when I get to it.

rwgk pushed a commit to rwgk/pybind11 that referenced this pull request Oct 31, 2022
Background: pybind#2999, pybind#4105, pybind#4283, pybind#4284

In a nutshell:

* Only macOS actually needs `PYBIND11_EXPORT_EXCEPTION` (pybind#4284).

* Evidently (pybind#4283), under macOS `PYBIND11_EXPORT_EXCEPTION` does not run the risk of introducing ODR violations,

* but evidently (pybind#4283) under Linux it does, in the presumably rare/unusual situation that `RTLD_GLOBAL` is used.

* Windows does no have the equivalent of `RTLD_GLOBAL`, therefore `PYBIND11_EXPORT_EXCEPTION` has no practical benefit, on the contrary, noisy warning suppression pragmas are needed, therefore it is best left empty.
henryiii pushed a commit that referenced this pull request Oct 31, 2022
…#4298)

Background: #2999, #4105, #4283, #4284

In a nutshell:

* Only macOS actually needs `PYBIND11_EXPORT_EXCEPTION` (#4284).

* Evidently (#4283), under macOS `PYBIND11_EXPORT_EXCEPTION` does not run the risk of introducing ODR violations,

* but evidently (#4283) under Linux it does, in the presumably rare/unusual situation that `RTLD_GLOBAL` is used.

* Windows does no have the equivalent of `RTLD_GLOBAL`, therefore `PYBIND11_EXPORT_EXCEPTION` has no practical benefit, on the contrary, noisy warning suppression pragmas are needed, therefore it is best left empty.
@rwgk
Copy link
Collaborator Author

rwgk commented Oct 31, 2022

Archiving the CI summary here before closing the PR, JIC, for future reference.

2 skipped, 54 successful, and 21 failing checks

@github-actions
Upstream / 🐍 3.11 latest internals • ubuntu-latest • x64 (pull_request) Skipped
Details
@github-actions
CI / 🐍 3.6 • ubuntu-latest • x64 -DPYBIND11_FINDPYTHON=ON -DCMAKE_CXX_FLAGS="-D_=1" (pull_request) Successful in 12m
Details
@github-actions
Config / 🐍 3.7 • CMake 3.23 • ubuntu-latest (pull_request) Successful in 24s
Details
@github-actions
Format / Format (pull_request) Successful in 55s
Details
@github-actions
Pip / 🐍 3.6 • 📦 tests • windows-latest (pull_request) Successful in 4m
Details
@github-actions
CI / 🐍 3.9 • ubuntu-latest • x64 (pull_request) Successful in 13m
Details
@github-actions
Config / 🐍 3.7 • CMake 3.23 • macos-latest (pull_request) Successful in 1m
Details
@github-actions
CI / 🐍 3.10 • ubuntu-latest • x64 (pull_request) Successful in 10m
Details
@github-actions
Config / 🐍 3.7 • CMake 3.23 • windows-latest (pull_request) Successful in 1m
Details
@github-actions
CI / 🐍 3.11-dev • ubuntu-latest • x64 (pull_request) Successful in 10m
Details
@github-actions
Config / 🐍 3.7 • CMake 3.4 • ubuntu-latest (pull_request) Successful in 24s
Details
@github-actions
CI / 🐍 pypy-3.7 • ubuntu-latest • x64 (pull_request) Successful in 11m
Details
@github-actions
Config / 🐍 3.7 • CMake 3.7 • macos-latest (pull_request) Successful in 55s
Details
@github-actions
CI / 🐍 pypy-3.8 • ubuntu-latest • x64 -DPYBIND11_FINDPYTHON=ON (pull_request) Successful in 12m
Details
@github-actions
Config / 🐍 3.7 • CMake 3.18 • windows-2019 (pull_request) Successful in 2m
Details
@github-actions
CI / 🐍 pypy-3.9 • ubuntu-latest • x64 (pull_request) Successful in 12m
Details
@github-actions
CI / 🐍 3.6 • windows-2022 • x64 (pull_request) Successful in 16m
Details
@github-actions
CI / 🐍 3.9 • windows-2022 • x64 (pull_request) Successful in 14m
Details
@github-actions
CI / 🐍 3.10 • windows-2022 • x64 (pull_request) Successful in 16m
Details
@github-actions
CI / 🐍 3.11-dev • windows-2022 • x64 (pull_request) Successful in 14m
Details
@github-actions
CI / 🐍 pypy-3.7 • windows-2022 • x64 (pull_request) Successful in 19m
Details
@github-actions
CI / 🐍 pypy-3.8 • windows-2022 • x64 (pull_request) Successful in 18m
Details
@github-actions
CI / 🐍 pypy-3.9 • windows-2022 • x64 (pull_request) Successful in 18m
Details
@github-actions
CI / 🐍 3.6 • macos-latest • x64 (pull_request) Successful in 22m
Details
@github-actions
CI / 🐍 3.9 • macos-latest • x64 (pull_request) Successful in 18m
Details
@github-actions
CI / 🐍 3.10 • macos-latest • x64 (pull_request) Successful in 22m
Details
@github-actions
CI / 🐍 3.11-dev • macos-latest • x64 (pull_request) Successful in 23m
Details
@github-actions
CI / 🐍 pypy-3.7 • macos-latest • x64 (pull_request) Failing after 11m
Details
@github-actions
CI / 🐍 pypy-3.8 • macos-latest • x64 (pull_request) Failing after 8m
Details
@github-actions
CI / 🐍 pypy-3.9 • macos-latest • x64 (pull_request) Failing after 12m
Details
@github-actions
CI / 🐍 3.6 • windows-2019 • x64 -DPYBIND11_FINDPYTHON=ON (pull_request) Successful in 15m
Details
@github-actions
CI / 🐍 3.9 • windows-2019 • x64 (pull_request) Successful in 13m
Details
@github-actions
CI / 🐍 3.9-dbg (deadsnakes) • Valgrind • x64 (pull_request) Failing after 8m
Details
@github-actions
Format / Clang-Tidy (pull_request) Successful in 10m
Details
@github-actions
Pip / 🐍 3.8 • 📦 & 📦 tests • ubuntu-latest (pull_request) Successful in 1m
Details
@github-actions
CI / 🐍 3.11-dev (deadsnakes) • x64 (pull_request) Failing after 3m
Details
@github-actions
CI / 🐍 3 • Clang 3.6 • C++11 • x64 (pull_request) Failing after 2m
Details
@github-actions
Pip / Upload to PyPI (pull_request) Skipped
Details
@github-actions
CI / 🐍 3 • Clang 3.7 • C++11 • x64 (pull_request) Failing after 3m
Details
@github-actions
CI / 🐍 3 • Clang 3.9 • C++11 • x64 (pull_request) Failing after 4m
Details
@github-actions
CI / 🐍 3 • Clang 7 • C++11 • x64 (pull_request) Failing after 4m
Details
@github-actions
CI / 🐍 3 • Clang 9 • C++11 • x64 (pull_request) Failing after 4m
Details
@github-actions
CI / 🐍 3 • Clang dev • C++11 • x64 (pull_request) Successful in 5m
Details
@github-actions
CI / 🐍 3 • Clang 5 • C++14 • x64 (pull_request) Failing after 3m
Details
@github-actions
CI / 🐍 3 • Clang 10 • C++20 • x64 (pull_request) Failing after 6m
Details
@github-actions
CI / 🐍 3 • Clang 10 • C++17 • x64 (pull_request) Failing after 4m
Details
@github-actions
CI / 🐍 3 • Clang 11 • C++20 • x64 (pull_request) Failing after 5m
Details
@github-actions
CI / 🐍 3 • Clang 12 • C++20 • x64 (pull_request) Failing after 5m
Details
@github-actions
CI / 🐍 3 • Clang 13 • C++20 • x64 (pull_request) Failing after 4m
Details
@github-actions
CI / 🐍 3 • Clang 14 • C++20 • x64 (pull_request) Failing after 5m
Details
@github-actions
CI / 🐍 3.10 • CUDA 11.7 • Ubuntu 22.04 (pull_request) Failing after 6m
Details
@github-actions
CI / 🐍 3 • CentOS7 / PGI 22.9 • x64 (pull_request) Failing after 19m
Details
@github-actions
CI / 🐍 3 • GCC 7 • C++11• x64 (pull_request) Successful in 4m
Details
@github-actions
CI / 🐍 3 • GCC 7 • C++17• x64 (pull_request) Successful in 6m
Details
@github-actions
CI / 🐍 3 • GCC 8 • C++14• x64 (pull_request) Successful in 4m
Details
@github-actions
CI / 🐍 3 • GCC 8 • C++17• x64 (pull_request) Successful in 4m
Details
@github-actions
CI / 🐍 3 • GCC 10 • C++17• x64 (pull_request) Successful in 4m
Details
@github-actions
CI / 🐍 3 • GCC 11 • C++20• x64 (pull_request) Successful in 5m
Details
@github-actions
CI / 🐍 3 • GCC 12 • C++20• x64 (pull_request) Successful in 5m
Details
@github-actions
CI / 🐍 3 • ICC latest • x64 (pull_request) Failing after 9m
Details
@github-actions
CI / 🐍 3 • centos:7 • x64 (pull_request) Failing after 4m
Details
@github-actions
CI / 🐍 3 • almalinux:8 • x64 (pull_request) Successful in 4m
Details
@github-actions
CI / 🐍 3 • almalinux:9 • x64 (pull_request) Successful in 4m
Details
@github-actions
CI / 🐍 3.7 • Debian • x86 • Install (pull_request) Successful in 3m
Details
@github-actions
CI / Documentation build test (pull_request) Successful in 1m
Details
@github-actions
CI / 🐍 3.6 • MSVC 2019 • x86 (pull_request) Successful in 6m
Details
@github-actions
CI / 🐍 3.7 • MSVC 2019 • x86 -DCMAKE_CXX_STANDARD=14 (pull_request) Successful in 6m
Details
@github-actions
CI / 🐍 3.8 • MSVC 2019 • x86 -DCMAKE_CXX_STANDARD=17 (pull_request) Successful in 5m
Details
@github-actions
CI / 🐍 3.9 • MSVC 2019 • x86 -DCMAKE_CXX_STANDARD=20 (pull_request) Successful in 6m
Details
@github-actions
CI / 🐍 3.8 • MSVC 2019 (Debug) • x86 -DCMAKE_CXX_STANDARD=17 (pull_request) Successful in 6m
Details
@github-actions
CI / 🐍 3.9 • MSVC 2019 (Debug) • x86 -DCMAKE_CXX_STANDARD=20 (pull_request) Successful in 7m
Details
@github-actions
CI / 🐍 3.9 • MSVC 2022 C++20 • x64 (pull_request) Successful in 7m
Details
@github-actions
CI / 🐍 3 • windows-latest • mingw64 (pull_request) Successful in 18m
Details
@github-actions
CI / 🐍 3 • windows-latest • mingw32 (pull_request) Successful in 19m
Details

continuous-integration/appveyor/pr — AppVeyor build succeeded
Details

docs/readthedocs.org:pybind11 — Read the Docs build succeeded!
Details
@pre-commit-ci
pre-commit.ci - pr — checks completed successfully

@rwgk rwgk closed this Oct 31, 2022
rwgk pushed a commit that referenced this pull request Nov 18, 2022
…#4298)

Background: #2999, #4105, #4283, #4284

In a nutshell:

* Only macOS actually needs `PYBIND11_EXPORT_EXCEPTION` (#4284).

* Evidently (#4283), under macOS `PYBIND11_EXPORT_EXCEPTION` does not run the risk of introducing ODR violations,

* but evidently (#4283) under Linux it does, in the presumably rare/unusual situation that `RTLD_GLOBAL` is used.

* Windows does no have the equivalent of `RTLD_GLOBAL`, therefore `PYBIND11_EXPORT_EXCEPTION` has no practical benefit, on the contrary, noisy warning suppression pragmas are needed, therefore it is best left empty.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants