PEP 630: add disclaimers re. heap types & conversion, add PyType_GetModuleByDef #2319
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -21,9 +21,9 @@ describes problems of such per-process state and efforts to make | |
per-module state, a better default, possible and easy to use. | ||
|
||
The document also describes how to switch to per-module state where | ||
possible. The switch involves allocating space for that state, switching | ||
from static types to heap types, and—perhaps most importantly—accessing | ||
per-module state from code. | ||
possible. The switch involves allocating space for that state, potentially | ||
switching from static types to heap types, and—perhaps most | ||
importantly—accessing per-module state from code. | ||
|
||
About this document | ||
------------------- | ||
|
@@ -40,9 +40,15 @@ development, gaps identified in this text can show where to focus | |
the effort, and the text can be updated as new features are implemented | ||
|
||
Whenever this PEP mentions *extension modules*, the advice also | ||
applies to *built-in* modules, such as the C parts of the standard | ||
library. The standard library is expected to switch to per-module state | ||
early. | ||
applies to *built-in* modules. | ||
|
||
.. note:: | ||
This PEP contains generic advice. When following it, always take into | ||
account the specifics of your project. | ||
|
||
For example, while much of this advice applies to the C parts of | ||
Python's standard library, the PEP does not factor in stdlib specifics | ||
(unusual backward compatibility issues, access to private API, etc.). | ||
|
||
PEPs related to this effort are: | ||
|
||
|
@@ -101,7 +107,7 @@ testing the isolation, multiple module objects corresponding to a single | |
extension can even be loaded in a single interpreter. | ||
|
||
Per-module state provides an easy way to think about lifetime and | ||
resource ownership: the extension module author will set up when a | ||
resource ownership: the extension module will initialize when a | ||
module object is created, and clean up when it's freed. In this regard, | ||
a module is just like any other ``PyObject *``; there are no “on | ||
interpreter shutdown” hooks to think about—or forget about. | ||
|
@@ -235,7 +241,8 @@ Set ``PyModuleDef.m_size`` to a positive number to request that many | |
bytes of storage local to the module. Usually, this will be set to the | ||
size of some module-specific ``struct``, which can store all of the | ||
module's C-level state. In particular, it is where you should put | ||
pointers to classes (including exceptions) and settings (e.g. ``csv``'s | ||
pointers to classes (including exceptions, but excluding static types) | ||
and settings (e.g. ``csv``'s | ||
`field_size_limit <https://docs.python.org/3.8/library/csv.html#csv.field_size_limit>`__) | ||
which the C code needs to function. | ||
|
||
|
@@ -302,9 +309,9 @@ zero. In your own module, you're in control of ``m_size``, so this is | |
easy to prevent.) | ||
|
||
Heap types | ||
~~~~~~~~~~ | ||
---------- | ||
|
||
Traditionally, types defined in C code were *static*, that is, | ||
Traditionally, types defined in C code are *static*, that is, | ||
``static PyTypeObject`` structures defined directly in code and | ||
initialized using ``PyType_Ready()``. | ||
|
||
|
@@ -321,10 +328,27 @@ the Python level: for example, you can't set ``str.myattribute = 123``. | |
that shares any Python objects across interpreters implicitly depends | ||
on CPython's current, process-wide GIL. | ||
|
||
An alternative to static types is *heap-allocated types*, or heap types | ||
Because they are immutable and process-global, static types cannot access | ||
“their” module state. | ||
If any method of such a type requires access to module state, | ||
the type must be converted to a *heap-allocated type*, or *heap type* | ||
for short. These correspond more closely to classes created by Python’s | ||
``class`` statement. | ||
|
||
For new modules, using heap types by default is a good rule of thumb. | ||
|
||
Static types can be converted to heap types, but note that | ||
the heap type API was not designed for “lossless” conversion | ||
from static types -- that is, creating a type that works exactly like a given | ||
static type. Unlike static types, heap type objects are mutable by default. | ||
Also, when rewriting the class definition in a new API, | ||
you are likely to unintentionally change a few details (e.g. pickle-ability | ||
I would reconsider this wording. "Take care not to" IMO sounds nicer than "You are likely to".

Also, is the pickle issue still an issue? Wasn't that fixed by Serhiy long ago? The bpo issue that mentioned this issue is closed, marked as resolved.

The pickle issue was closed, because there's now a mechanism to use if preserving pickle-ability is important for you. In the stdlib, it pretty much always is important, but some third-party libraries won't care. To keep the PEP general, I don't want to include the specific fixes here.

To get back to the original comment, I think changing the behavior is a fact of life rather than something to avoid. At least right now.

Sounds good to me. |
||
or inherited slots). Always test the details that are important to you. | ||
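The mutability difference mentioned in this hunk can be checked from Python. The following is only an illustrative sketch and not part of the PEP text: it assumes that a class created by the ``class`` statement is a fair stand-in for a heap type, and the name ``HeapLike`` is hypothetical.

```python
# Static types such as str are immutable at the Python level:
# assigning a new attribute on the type raises TypeError.
try:
    str.myattribute = 123
    static_is_mutable = True
except TypeError:
    static_is_mutable = False


# Classes created by the class statement correspond to heap types
# and are mutable by default.
class HeapLike:  # hypothetical name, for illustration only
    pass


HeapLike.myattribute = 123
heap_is_mutable = HeapLike.myattribute == 123
```

A converted type that must stay immutable therefore needs to be checked explicitly, which is one of the details worth testing after conversion.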
|
||
|
||
Defining Heap Types | ||
~~~~~~~~~~~~~~~~~~~ | ||
|
||
Heap types can be created by filling a ``PyType_Spec`` structure, a | ||
description or “blueprint” of a class, and calling | ||
``PyType_FromModuleAndSpec()`` to construct a new class object. | ||
|
@@ -370,7 +394,7 @@ is called on a *subclass* of your type, ``Py_TYPE(self)`` will refer to | |
that subclass, which may be defined in different module than yours. | ||
|
||
.. note:: | ||
The following Python code. can illustrate the concept. | ||
The following Python code can illustrate the concept. | ||
``Base.get_defining_class`` returns ``Base`` even | ||
if ``type(self) == Sub``:: | ||
|
||
|
@@ -424,6 +448,52 @@ For example:: | |
{NULL}, | ||
} | ||
|
||
Module State Access from Slot Methods, Getters and Setters | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
.. note:: | ||
|
||
This is new in Python 3.11. | ||
|
||
.. After adding to limited API: | ||
|
||
|
||
If you use the `limited API <https://docs.python.org/3/c-api/stable.html>`__, | ||
you must update ``Py_LIMITED_API`` to ``0x030b0000``, losing ABI | ||
compatibility with earlier versions. | ||
|
||
Slot methods -- the fast C equivalents for special methods, such as | ||
`nb_add <https://docs.python.org/3/c-api/typeobj.html#c.PyNumberMethods.nb_add>`__ | ||
for ``__add__`` or `tp_new <https://docs.python.org/3/c-api/typeobj.html#c.PyTypeObject.tp_new>`__ | ||
for initialization -- have a very simple API that doesn't allow | ||
passing in the defining class as in ``PyCMethod``. | ||
The same goes for getters and setters defined with | ||
`PyGetSetDef <https://docs.python.org/3/c-api/structures.html#c.PyGetSetDef>`__. | ||
|
||
To access the module state in these cases, use the | ||
`PyType_GetModuleByDef <https://docs.python.org/3/c-api/type.html#c.PyType_GetModuleByDef>`__ | ||
function, and pass in the module definition. | ||
Once you have the module, call `PyModule_GetState <https://docs.python.org/3/c-api/module.html?highlight=pymodule_getstate#c.PyModule_GetState>`__ | ||
to get the state:: | ||
|
||
PyObject *module = PyType_GetModuleByDef(Py_TYPE(self), &module_def); | ||
my_struct *state = (my_struct*)PyModule_GetState(module); | ||
if (state == NULL) { | ||
return NULL; | ||
} | ||
|
||
``PyType_GetModuleByDef`` works by searching the `MRO <https://docs.python.org/3/glossary.html#term-method-resolution-order>`__ | ||
(i.e. all superclasses) for the first superclass that has a corresponding | ||
module. | ||
|
||
.. note:: | ||
|
||
In very exotic cases (inheritance chains spanning multiple modules | ||
created from the same definition), ``PyType_GetModuleByDef`` might not | ||
return the module of the true defining class. However, it will always | ||
return a module with the same definition, ensuring a compatible | ||
C memory layout. | ||
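The MRO search described above can be modelled in pure Python. This is a rough sketch of the lookup as the text describes it, not the actual C implementation: ``ModuleDef``, ``Module``, ``get_module_by_def`` and the ``_defining_module`` attribute are all hypothetical stand-ins invented for the illustration.

```python
class ModuleDef:
    """Hypothetical stand-in for a C-level PyModuleDef."""


class Module:
    """Hypothetical stand-in for a module object created from a definition."""

    def __init__(self, name, definition):
        self.name = name
        self.definition = definition


def get_module_by_def(cls, definition):
    """Return the module of the first class in the MRO that was created
    from `definition` (a model of the search described in the text)."""
    for klass in cls.__mro__:
        # Look only at the class's own namespace, not inherited attributes.
        module = vars(klass).get("_defining_module")
        if module is not None and module.definition is definition:
            return module
    raise TypeError(f"no superclass of {cls.__name__!r} has the given module")


module_def = ModuleDef()
mod_a = Module("mod_a", module_def)
mod_b = Module("mod_b", module_def)  # second module from the same definition


class Base:
    _defining_module = mod_a


class PySub(Base):
    """Plain subclass with no associated module."""


class ExoticSub(Base):
    """Subclass belonging to another module created from the same definition."""
    _defining_module = mod_b
```

In this model, ``get_module_by_def(PySub, module_def)`` finds ``mod_a`` through ``Base``, while ``get_module_by_def(ExoticSub, module_def)`` yields ``mod_b`` even for a method defined on ``Base``. That mirrors the exotic case in the note: the module returned is not the one of the defining class, but it shares the same definition.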
|
||
|
||
Open Issues | ||
----------- | ||
|
||
|
@@ -432,28 +502,18 @@ Several issues around per-module state and heap types are still open. | |
Discussions about improving the situation are best held on the `capi-sig | ||
mailing list <https://mail.python.org/mailman3/lists/capi-sig.python.org/>`__. | ||
|
||
Module State Access from Slot Methods, Getters and Setters | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
Currently (as of Python 3.9), there is no API to access the module state | ||
from: | ||
|
||
- slot methods (meaning type slots, such as ``tp_new``, ``nb_add`` or | ||
``tp_iternext``) | ||
- getters and setters defined with ``tp_getset`` | ||
|
||
Type Checking | ||
~~~~~~~~~~~~~ | ||
|
||
Currently (as of Python 3.9), heap types have no good API to write | ||
Currently (as of Python 3.10), heap types have no good API to write | ||
``Py*_Check`` functions (like ``PyUnicode_Check`` exists for ``str``, a | ||
static type), and so it is not easy to ensure whether instances have a | ||
particular C layout. | ||
|
||
Metaclasses | ||
~~~~~~~~~~~ | ||
|
||
Currently (as of Python 3.9), there is no good API to specify the | ||
Currently (as of Python 3.10), there is no good API to specify the | ||
*metaclass* of a heap type, that is, the ``ob_type`` field of the type | ||
object. | ||
|
||
|
@@ -466,6 +526,15 @@ its variable-size storage is currently consumed by slots. Fixing this | |
is complicated by the fact that several classes in an inheritance | ||
hierarchy may need to reserve some state. | ||
|
||
Lossless conversion to heap types | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
The heap type API was not designed for “lossless” conversion from static types, | ||
that is, creating a type that works exactly like a given static type. | ||
The best way to address it would probably be to write a guide that covers | ||
known “gotchas”. | ||
|
||
|
||
Copyright | ||
--------- | ||
|
||
|
"Lossless" is a little vague IMO. Could you reword this with a clearer language?