Make sure capsules stay alive when caching pointers into their interior #373
Conversation
Hmm, I see what you mean. I wonder if there's an alternative here to leaking and stashing in a static? With the prevailing winds in CPython seeming to head towards sub-interpreters and multi-phase initialization (PEP 489), I feel like we need to at least try not to make PyO3 / numpy's incompatibility with that world knowingly worse... 🤔
From my point of view, the open question is: what is the fastest way to get at any state managed by the Python interpreter? Accessing an attribute of a well-known module? Are there any alternatives for native code to at least stash a pointer-to-void somewhere, maybe in the Python thread state? If not, I guess we have to accept the additional overhead of the attribute access. What we could also do is try to amortize such an attribute access over many uses, e.g. PyO3 does the access once and stashes the result in a cache.
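The amortization idea can be sketched in Python (a hypothetical one-shot cache; the module and attribute names here are just placeholders standing in for whatever well-known module the extension would consult):

```python
import importlib

# Hypothetical cache: perform the module-attribute lookup once,
# then reuse the stashed result on every later access.
_cache = {}

def get_api_object(module_name, attr_name):
    key = (module_name, attr_name)
    if key not in _cache:
        mod = importlib.import_module(module_name)
        _cache[key] = getattr(mod, attr_name)
    return _cache[key]

# Every call after the first reuses the cached object.
first = get_api_object("array", "typecodes")
again = get_api_object("array", "typecodes")
assert first is again
```

The trade-off is the same one discussed in this thread: a cache like this is only sound as long as nothing replaces or deletes the attribute behind its back.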
Another idea would be to continue to use …
I think that in principle we could use module state. (That doc is actually pretty good. It also alludes to per-interpreter state, though it doesn't describe how to use it.) I haven't got a concrete design for how we actually make this nice to use; I feel like it might end up requiring a thread-local "context". Anyway, I think my comment here might actually have been unhelpful, because it's not clear to me whether this is feasible here.
Is it always obvious which module to use? For rust-numpy we generally reach for NumPy-provided modules, but I suspect PyO3 would rather use the "current" module, i.e. the top-level module of the extension we are implementing. But is that always well-defined? Are "free-standing functions" implemented using PyO3 and passed to pure Python code as context possible as well?
Sorry for complicating the discussion... but should the API capsules (like _ARRAY_API) also be allocated on the heap and made part of the module state? They are just immutable function tables.
I think so, in a world where the Python interpreter and all modules can be unloaded and possibly different versions loaded later into the same process: those function tables would refer to possibly non-existent addresses or functions with incompatible signatures after re-initialization of Python. (As @davidhewitt pointed out above, NumPy may not yet actually support this full de-initialization and re-initialization, but it is the direction the Python ecosystem seems to be moving in.)

(As an aside, the capsules themselves are already on the Python heap. What we are caching here in a static are raw pointers into their interior.)
Thank you. I think I understand now.
Indeed, I think it's a hard problem. I vaguely hope that by adding user-facing APIs to manipulate module state, as well as making the "current module" more obvious (e.g. via PyO3/pyo3#2367), there might just be enough flexibility to make module state work. Probably some experimentation ahead for us :)
I think we've all agreed that it's too early to start worrying about not using statics here!
While this started as a simple PyO3 API usage modernization, I think we actually need to make sure the capsules stay alive as long as we hold onto pointers into their interior: the modules, and hence their attributes, could theoretically be deleted, leaving us with a dangling pointer.
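As a rough illustration of that hazard, using a plain Python object as a stand-in for the capsule (the module and attribute names here are hypothetical):

```python
import gc
import types
import weakref

# Stand-in for the capsule object stored as a module attribute.
class FakeCapsule:
    pass

mod = types.ModuleType("fake_extension")
mod._API = FakeCapsule()
watcher = weakref.ref(mod._API)

# Caching only a raw pointer would not keep the capsule alive:
del mod._API  # attribute deleted (or the module itself torn down)
gc.collect()
assert watcher() is None  # last reference gone; a cached raw pointer would dangle
```

Holding a strong reference to the capsule for as long as the cached pointer exists (whether via an intentional leak or a stored owned reference) rules this scenario out.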
Leaking references to the capsules themselves seemed like the easiest way to avoid the overhead of calling `PyCapsule_Pointer` for each access. This should not be a functional limitation, as we never update the cached pointer in any case, i.e. storing `Py<PyCapsule>` would have effectively the same result as the intentional leak.