Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-100554: Add Py_tp_vectorcall slot to set PyTypeObject.tp_vectorcall using the PyType_FromSpec function family. #123332

Merged
merged 7 commits into from
Sep 13, 2024

Conversation

wjakob
Copy link
Contributor

@wjakob wjakob commented Aug 26, 2024

This commit adds the Py_tp_vectorcall slot to set PyTypeObject.tp_vectorcall using the PyType_FromSpec() function family.

For context, this field is used to intercept type instantiation (MyType(args..)) and implement it using the vector call protocol. It provides access to one of the last missing non-internal PyTypeObject fields that cannot be set using the PyType_FromSpec() API.

The rationale for this change is that it can significantly improve the efficiency of C extensions built using the limited API. The runtime of a microbenchmark that repeatedly constructs a type with keyword arguments goes from 2.017 sec to 1.030 sec, a ~2x change on an extremely hot code path. All of this wasted time is spent creating temporary objects to go from one calling convention to another, and it can be entirely avoided by enabling the extension to provide a vector call-based constructor.

Having access to this flag is especially useful for binding frameworks that declare custom types written in another programming language (e.g. C++ or Rust) because a large set of downstream projects immediately reap the benefits. The original issue #100554 was created by @davidhewitt, who develops PyO3. I am the developer of nanobind and a coauthor of pybind11.

I would like to propose this feature for Python 3.14. Likely more controversial: I would argue that there is a case even for including it in 3.13, which is technically in the feature-freeze phase.

Fixes issue #100554.


📚 Documentation preview 📚: https://cpython-previews--123332.org.readthedocs.build/

Comment on lines 495 to 497
* The field :c:member:`~PyTypeObject.tp_vectorcall` can be set since
Python 3.14. On older versions, use :c:member:`~PyTypeObject.tp_new`
and/or :c:member:`~PyTypeObject.tp_init`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use .. versionchanged, like the 3.9 & 3.11 PyBufferProcs entries a few lines down.

Comment on lines 1016 to 1019
static PyType_Slot HeapCTypeVectorcall_slots[] = {
{Py_tp_vectorcall, heapctype_vectorcall},
{0, 0},
};
Copy link
Member

@encukou encukou Aug 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks like a good example of how not to do it, since it leaves tp_call from the base's metaclass:

>>> import _testcapi
>>> _testcapi.HeapCTypeVectorcall.__call__(_testcapi.HeapCTypeVectorcall, 'Sub', (), {})
<class '__main__.Sub'>

As per the docs, a class supporting vectorcall must also implement tp_call with the same semantics. Here, the overridden vectorcall function would need to match the behaviour of PyType_Type->tp_call (including any changes in future versions).

So, you do need to override tp_new, and things get even more tricky if you allow subclasses/instances of HeapCTypeVectorcall.

Copy link
Contributor Author

@wjakob wjakob Aug 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, the point of this PR is to be able to do exactly the same as what tp_call does, just more efficiently, so this difference in behavior would not arise in real-world usage.

But that is less straightforward to test from the outside, hence the the contrived example of returning 123, which is not even an instance of the type.

(To keep the test reasonably compact, I would prefer not to deal with metaclasses on top)

I will override tp_new and make the type non-inheritable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm worried about leading users to a trap, and I'd prefer to see a full solution for the use case, to get an idea about where in the docs to put warnings, and what kinds of issues this opens.
Putting this in limited API means we should be extra careful about not exposing implementation details.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to implement a more representative version but wonder about the following: how should the vectorcall version communicate to the test suite that it was used? If the equivalence requirement you stated is interpreted strictly, it should not be possible to detect any difference.

For example, would it be OK I return two different instances from each mode of construction and then add a comment saying that these are only different for test-related purposes, and that the contract of these operations is that they should be interchangeable?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's fine for tests.

@wjakob
Copy link
Contributor Author

wjakob commented Aug 27, 2024

@encukou I incorporated the suggestions from your review, please let me know if this is okay now.

Copy link
Member

@encukou encukou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm convinced! :)
Thank you for the patch!

I sent a PR to your branch with suggested docs changes: wjakob#1

Additions to the limited API should get a C-API WG approval; I opened capi-workgroup/decisions#41

Comment on lines 499 to 500
:c:member:`~PyTypeObject.tp_new` and/or
:c:member:`~PyTypeObject.tp_init`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, you should use tp_new and/or tp_init on all versions; in 3.14 you can optimize after you have them.
(I removed this in my PR.)

@wjakob
Copy link
Contributor Author

wjakob commented Aug 28, 2024

What shall I do about the Misc/stable_abi.toml conflict?

(In the past, when I rebased PRs or changed history in other ways, it was mentioned to me that this breaks the review process that some authors use)

@encukou
Copy link
Member

encukou commented Aug 29, 2024

The C API WG vote is next.

To solve the conflict, merge in the main branch with a merge commit (or use the “resolve conflicts” button on GitHub).
Or leave it to me when I merge, it's a straightforward conflict.

@wjakob
Copy link
Contributor Author

wjakob commented Aug 29, 2024

To solve the conflict, merge in the main branch with a merge commit (or use the “resolve conflicts” button on GitHub).

Done.

Copy link
Contributor

@erlend-aasland erlend-aasland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this!

Modules/_testcapi/heaptype.c Show resolved Hide resolved
Copy link
Member

@encukou encukou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The vote is finally done!

The code is good to go in, let me just fix up some flaws in my suggestions for the docs.

Doc/c-api/typeobj.rst Outdated Show resolved Hide resolved
Doc/c-api/typeobj.rst Outdated Show resolved Hide resolved
Doc/c-api/typeobj.rst Outdated Show resolved Hide resolved
Doc/c-api/typeobj.rst Outdated Show resolved Hide resolved
@encukou encukou merged commit 74330d9 into python:main Sep 13, 2024
36 of 37 checks passed
@wjakob
Copy link
Contributor Author

wjakob commented Sep 13, 2024

Awesome, thank you!!

wjakob added a commit to wjakob/nanobind that referenced this pull request Sep 14, 2024
Following the merge of python/cpython#123332,
type slot 82 is reserved for type vector calls. This means that we can
safely intercept this value if provided to nanobind even on older
versions of Python where the feature isn't supported yet. In this case,
the metaclass-based implementation is used.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants