diff --git a/pep-0630.rst b/pep-0630.rst index 5ca9d02e6dc..27b4eb4e4a1 100644 --- a/pep-0630.rst +++ b/pep-0630.rst @@ -21,9 +21,9 @@ describes problems of such per-process state and efforts to make per-module state, a better default, possible and easy to use. The document also describes how to switch to per-module state where -possible. The switch involves allocating space for that state, switching -from static types to heap types, and—perhaps most importantly—accessing -per-module state from code. +possible. The switch involves allocating space for that state, potentially +switching from static types to heap types, and—perhaps most +importantly—accessing per-module state from code. About this document ------------------- @@ -40,9 +40,15 @@ development, gaps identified in this text can show where to focus the effort, and the text can be updated as new features are implemented Whenever this PEP mentions *extension modules*, the advice also -applies to *built-in* modules, such as the C parts of the standard -library. The standard library is expected to switch to per-module state -early. +applies to *built-in* modules. + +.. note:: + This PEP contains generic advice. When following it, always take into + account the specifics of your project. + + For example, while much of this advice applies to the C parts of + Python's standard library, the PEP does not factor in stdlib specifics + (unusual backward compatibility issues, access to private API, etc.). PEPs related to this effort are: @@ -101,7 +107,7 @@ testing the isolation, multiple module objects corresponding to a single extension can even be loaded in a single interpreter. Per-module state provides an easy way to think about lifetime and -resource ownership: the extension module author will set up when a +resource ownership: the extension module will initialize when a module object is created, and clean up when it's freed. In this regard, a module is just like any other ``PyObject *``; there are no “on interpreter shutdown” hooks to think about—or forget about. @@ -235,7 +241,8 @@ Set ``PyModuleDef.m_size`` to a positive number to request that many bytes of storage local to the module. Usually, this will be set to the size of some module-specific ``struct``, which can store all of the module's C-level state. In particular, it is where you should put -pointers to classes (including exceptions) and settings (e.g. ``csv``'s +pointers to classes (including exceptions, but excluding static types) +and settings (e.g. ``csv``'s `field_size_limit `__) which the C code needs to function. @@ -302,9 +309,9 @@ zero. In your own module, you're in control of ``m_size``, so this is easy to prevent.) Heap types -~~~~~~~~~~ +---------- -Traditionally, types defined in C code were *static*, that is, +Traditionally, types defined in C code are *static*, that is, ``static PyTypeObject`` structures defined directly in code and initialized using ``PyType_Ready()``. @@ -321,10 +328,27 @@ the Python level: for example, you can't set ``str.myattribute = 123``. that shares any Python objects across interpreters implicitly depends on CPython's current, process-wide GIL. -An alternative to static types is *heap-allocated types*, or heap types +Because they are immutable and process-global, static types cannot access +“their” module state. +If any method of such a type requires access to module state, +the type must be converted to a *heap-allocated type*, or *heap type* for short. These correspond more closely to classes created by Python’s ``class`` statement. +For new modules, using heap types by default is a good rule of thumb. + +Static types can be converted to heap types, but note that +the heap type API was not designed for “lossless” conversion +from static types -- that is, creating a type that works exactly like a given +static type. Unlike static types, heap type objects are mutable by default. +Also, when rewriting the class definition in a new API, +you are likely to unintentionally change a few details (e.g. pickle-ability +or inherited slots). Always test the details that are important to you. + + +Defining Heap Types +~~~~~~~~~~~~~~~~~~~ + Heap types can be created by filling a ``PyType_Spec`` structure, a description or “blueprint” of a class, and calling ``PyType_FromModuleAndSpec()`` to construct a new class object. @@ -370,7 +394,7 @@ is called on a *subclass* of your type, ``Py_TYPE(self)`` will refer to that subclass, which may be defined in different module than yours. .. note:: - The following Python code. can illustrate the concept. + The following Python code can illustrate the concept. ``Base.get_defining_class`` returns ``Base`` even if ``type(self) == Sub``:: @@ -424,6 +448,52 @@ For example:: {NULL}, } +Module State Access from Slot Methods, Getters and Setters +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. note:: + + This is new in Python 3.11. + + .. After adding to limited API: + + If you use the `limited API __, + you must update ``Py_LIMITED_API`` to ``0x030b0000``, losing ABI + compatibility with earlier versions. + +Slot methods -- the fast C equivalents for special methods, such as +`nb_add `__ +for ``__add__`` or `tp_new `__ +for initialization -- have a very simple API that doesn't allow +passing in the defining class as in ``PyCMethod``. +The same goes for getters and setters defined with +`PyGetSetDef `__. + +To access the module state in these cases, use the +`PyType_GetModuleByDef `__ +function, and pass in the module definition. +Once you have the module, call `PyModule_GetState `__ +to get the state:: + + PyObject *module = PyType_GetModuleByDef(Py_TYPE(self), &module_def); + my_struct *state = (my_struct*)PyModule_GetState(module); + if (state === NULL) { + return NULL; + } + +``PyType_GetModuleByDef`` works by searching the `MRO `__ +(i.e. all superclasses) for the first superclass that has a corresponding +module. + +.. note:: + + In very exotic cases (inheritance chains spanning multiple modules + created from the same definition), ``PyType_GetModuleByDef`` might not + return the module of the true defining class. However, it will always + return a module with the same definition, ensuring a compatible + C memory layout. + + Open Issues ----------- @@ -432,20 +502,10 @@ Several issues around per-module state and heap types are still open. Discussions about improving the situation are best held on the `capi-sig mailing list `__. -Module State Access from Slot Methods, Getters and Setters -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Currently (as of Python 3.9), there is no API to access the module state -from: - -- slot methods (meaning type slots, such as ``tp_new``, ``nb_add`` or - ``tp_iternext``) -- getters and setters defined with ``tp_getset`` - Type Checking ~~~~~~~~~~~~~ -Currently (as of Python 3.9), heap types have no good API to write +Currently (as of Python 3.10), heap types have no good API to write ``Py*_Check`` functions (like ``PyUnicode_Check`` exists for ``str``, a static type), and so it is not easy to ensure whether instances have a particular C layout. @@ -453,7 +513,7 @@ particular C layout. Metaclasses ~~~~~~~~~~~ -Currently (as of Python 3.9), there is no good API to specify the +Currently (as of Python 3.10), there is no good API to specify the *metaclass* of a heap type, that is, the ``ob_type`` field of the type object. @@ -466,6 +526,15 @@ its variable-size storage is currently consumed by slots. Fixing this is complicated by the fact that several classes in an inheritance hierarchy may need to reserve some state. +Lossless conversion to heap types +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The heap type API was not designed for “lossless” conversion from static types, +that is, creating a type that works exactly like a given static type. +The best way to address it would probably be to write a guide that covers +known “gotchas”. + + Copyright ---------