-
-
Notifications
You must be signed in to change notification settings - Fork 30.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PyType_GetModuleByDef
is really slow.
#119663
Comments
Another thing that could improve performance is to attach the module state to the end of the module object, and save a pointer hop when accessing data in the module state. |
Unfortunately
Avoiding the pointer hop would make it infeasible to revive PEP-547, which assumes you can add state to an existing (but fresh) |
Can you create several modules from the same I don't think there is any fundamental reason why
It would prevent the mechanism proposed in PEP 547, but we could achieve the goal of PEP 547 with a different mechanism. |
You can find an example in Modules/_testsinglephase.c. Another case: loading the same module more than once (which isn't the same thing as importing multiple times or reloading). Each time an extension module is directly loaded, a new module is created. (Naturally, that isn't strictly true for all modules. Single-phase init modules with |
That test was added recently, and seems to be testing a weird corner case that I suspect no one uses. For the vast majority of modules that want the sane behavior of one module per module def and want reasonable performance, we should provide that. So, for backwards compatibility, we will need to leave |
It's a new test, the behaviour is old. If we add a module option that ensures the module can't be initialized more than once, it would make sense to disable finalization as well (or, delay it until interpreter shutdown). That would get us immortal singleton modules (per-interpreter), which does sound like a reasonable feature. |
@mdboom If this issue is about micro-benchmarking, I would like to see the result of the main vs that with the diff (a) below applied, on the As to the macro-benchmarks, I'm not sure how impactful the (a) interpreter state accessdiff --git a/Include/internal/pycore_interp.h b/Include/internal/pycore_interp.h
index 86dada5..c20e59e 100644
--- a/Include/internal/pycore_interp.h
+++ b/Include/internal/pycore_interp.h
@@ -234,6 +234,7 @@ struct _is {
// more comments.
struct _obmalloc_state *obmalloc;
+ void *decimal_state;
PyObject *audit_hooks;
PyType_WatchCallback type_watchers[TYPE_MAX_WATCHERS];
PyCode_WatchCallback code_watchers[CODE_MAX_WATCHERS];
diff --git a/Modules/_decimal/_decimal.c b/Modules/_decimal/_decimal.c
index 94a2cc2..f1e5224 100644
--- a/Modules/_decimal/_decimal.c
+++ b/Modules/_decimal/_decimal.c
@@ -115,9 +115,7 @@ typedef struct {
static inline decimal_state *
get_module_state(PyObject *mod)
{
- decimal_state *state = _PyModule_GetState(mod);
- assert(state != NULL);
- return state;
+ return (decimal_state *)PyInterpreterState_Get()->decimal_state;
}
static struct PyModuleDef _decimal_module;
@@ -125,18 +123,13 @@ static struct PyModuleDef _decimal_module;
static inline decimal_state *
get_module_state_by_def(PyTypeObject *tp)
{
- PyObject *mod = PyType_GetModuleByDef(tp, &_decimal_module);
- assert(mod != NULL);
- return get_module_state(mod);
+ return (decimal_state *)PyInterpreterState_Get()->decimal_state;
}
static inline decimal_state *
find_state_left_or_right(PyObject *left, PyObject *right)
{
- PyObject *mod = _PyType_GetModuleByDef2(Py_TYPE(left), Py_TYPE(right),
- &_decimal_module);
- assert(mod != NULL);
- return get_module_state(mod);
+ return (decimal_state *)PyInterpreterState_Get()->decimal_state;
}
@@ -5836,6 +5829,7 @@ static int minalloc_is_set = 0;
static int
_decimal_exec(PyObject *m)
{
+ PyInterpreterState_Get()->decimal_state = _PyModule_GetState(m);
PyObject *numbers = NULL;
PyObject *Number = NULL;
PyObject *collections = NULL; (b) global state access (attached just in case)diff --git a/Modules/_decimal/_decimal.c b/Modules/_decimal/_decimal.c
index 94a2cc2..de10e73 100644
--- a/Modules/_decimal/_decimal.c
+++ b/Modules/_decimal/_decimal.c
@@ -112,12 +112,14 @@ typedef struct {
PyCFunction _py_float_as_integer_ratio;
} decimal_state;
+
+decimal_state global_state;
+
+
static inline decimal_state *
get_module_state(PyObject *mod)
{
- decimal_state *state = _PyModule_GetState(mod);
- assert(state != NULL);
- return state;
+ return &global_state;
}
static struct PyModuleDef _decimal_module;
@@ -125,18 +127,13 @@ static struct PyModuleDef _decimal_module;
static inline decimal_state *
get_module_state_by_def(PyTypeObject *tp)
{
- PyObject *mod = PyType_GetModuleByDef(tp, &_decimal_module);
- assert(mod != NULL);
- return get_module_state(mod);
+ return &global_state;
}
static inline decimal_state *
find_state_left_or_right(PyObject *left, PyObject *right)
{
- PyObject *mod = _PyType_GetModuleByDef2(Py_TYPE(left), Py_TYPE(right),
- &_decimal_module);
- assert(mod != NULL);
- return get_module_state(mod);
+ return &global_state;
}
@@ -6156,7 +6153,7 @@ decimal_free(void *module)
static struct PyModuleDef_Slot _decimal_slots[] = {
{Py_mod_exec, _decimal_exec},
- {Py_mod_multiple_interpreters, Py_MOD_PER_INTERPRETER_GIL_SUPPORTED},
+ {Py_mod_multiple_interpreters, Py_MOD_MULTIPLE_INTERPRETERS_NOT_SUPPORTED},
{Py_mod_gil, Py_MOD_GIL_NOT_USED},
{0, NULL},
};
@@ -6165,7 +6162,7 @@ static struct PyModuleDef _decimal_module = {
PyModuleDef_HEAD_INIT,
.m_name = "decimal",
.m_doc = doc__decimal,
- .m_size = sizeof(decimal_state),
+ .m_size = 0,
.m_methods = _decimal_methods,
.m_slots = _decimal_slots,
.m_traverse = decimal_traverse, |
Two |
Why not modifying |
PyType_GetModuleByDef
does a lot of work for what was a single memory read prior to sub-interpreters.For example, #114682.
We should provide a more efficient way to get to the module state from the module def.
AFAICT, the function to get the module state is a pure function of the interpreter and the module def.
Since the module def already has a
m_index
field that is described as "The module's index into its interpreter's modules_by_index cache.", then aPy_GetModuleFromDef()
function could be as simple as:The text was updated successfully, but these errors were encountered: