Py_NewInterpreterFromConfig end with hard crash #123488
Comments
The docs also say: "Note that the PyGILState_* functions assume there is only one global interpreter (created automatically by Py_Initialize()). Python supports the creation of additional interpreters (using Py_NewInterpreter()), but mixing multiple interpreters and the PyGILState_* API is unsupported." |
Right, mixing them is unsupported. You still need to have Python initialized, and then you can create subinterpreters with the main interpreter's GIL -- once the sub interpreter is initialized, it will have its own GIL. |
First success: if I use a "release" build (non-debug Python), then
|
I have an initialized main Python interpreter which loads my library, and in the library a thread is created, etc. This is the reason why it works, but only for 4 instances :-( Task → my current research is to make the type itself thread-local (I mean the new type I define in my extension), but now the thread startup does not understand the type template, because the original type is from the main thread and has no meaning in the new thread. :-(
The "MyServer" is the initial server from the main thread
|
Yeah, this looks like it's probably a duplicate of #123134. I haven't reproduced this on Linux yet, but on Windows, there seems to be a thread safety issue inside the allocator when using a per-interpreter GIL. I'll investigate this further later today. In the meantime, try adding some synchronization between your threads to prevent them from calling |
Even if I do one thread per step, my Python fails after the fourth thread was created (and closed) → there is no "concurrency" right now. |
You'll have to share more of the C code, but that's probably just a side-effect of a prior memory error. |
Question: do you use a TYPE per THREAD or a TYPE per process? I mean the C-Python type definition. I add the macro
|
New test case with parallel access. Looks like the "race" thing:
|
OK, now some success news (with a smell). In the docs about the
and my first question was: does the type object have to be isolated as well? I started researching thread-local type objects. After switching to thread-local types the interpreter does not crash anymore, BUT ...
analysis
race crash
|
After playing around with this, I was unable to reproduce the data race. Though, I did encounter a deadlock when trying to call |
Just to be clear what the flow is:
If I do…
then I get:
I think if I skip
|
The error is this line:
|
(FWIW, the call to |
If I skip the
→ block. In the "Non-Python created threads" docs they say:
|
Question: before I delete a thread I call → is this OK? |
Yeah, that's what you're supposed to do. I'm guessing we need to document subinterpreters a little better 😅 |
Why do I need to do a
|
You need to call it because it's a non-Python thread. |
and why the from the docu:
|
You need to release the GIL in the main thread, in a |
so I changed my thread creating function
and the result is still blocking :
|
Yeah, you're still calling |
Already changed. I updated the code. |
Oh wait, I see the problem. |
I call the
|
It's difficult for me to follow the code due to the use of macros, but if it's truly the main thread, then I think the problem is the Regardless, I don't think this is a CPython issue :( |
Found something special under "load" !! → Q: how to check if "ts" is still valid? The main problem is that every thread is a server that can terminate itself. So if a server terminates between
If I switch to
Update: even if I disable the |
AFAIK, |
the problem is that this is just an example → this lost
|
Ok, but it's not obvious what the clear "fix" would be here. |
I'm just testing around the |
In the above example, you're using
// I'm not totally sure why you're swapping here, you'll have to clarify.
// If MK_RT_REF.threadData *is* the current thread state, as I suspect, then
// the call to Py_EndInterpreter will also fail, since it will invalidate it.
PyThreadState* ts = PyThreadState_Swap(MK_RT_REF.threadData);
// At best, adding a check to turn PyThreadState_Swap into a no-op
// if the passed thread state matches the current thread state
// could be viable, but I'm not sure how useful it would be.
printV("ts start: %p → %p", ts, MK_RT_REF.threadData);
// MK_RT_REF.threadData is invalidated, as well as ts (since they're the same pointer)
// You can't dereference either of them past this point.
Py_EndInterpreter(MK_RT_REF.threadData);
printV("ts end : %p → %p", MK_RT_REF.threadData, ts);
// This call does nothing -- ts would already be the thread state.
// The segfault comes from ts being freed by Py_EndInterpreter.
PyThreadState_Swap(ts);
Again, I don't think there's anything here that's a CPython issue :( I'll submit a PR for some extra documentation on subinterpreters, but I suggest opening a help thread and closing this issue. |
I already deleted this test code because it was just a test → the original problem of a crash somewhere is still an issue. |
I can't help with that unless I see the code, but let's move this to discourse. If we find that this is indeed a CPython issue, we can reopen (or better yet, make a new one with a more solid reproducer). |
"Again, I don't think there's anything here that's a CPython issue :("
Well, I already use my library as an extension for many languages, and obviously I'm not smart enough to use the Python thread API. → I think this is a Python problem :-) After trying for the second time in my limited lifetime to do something with threads in Python, and ending up miserably wiser twice, I must now admit that Python is apparently incapable of providing any usable thread API. → The smartest thing to do with threads in Python… activate the ignoreThread switch
|
Not disagreeing that multithreading from the C API is difficult! We need to document it better. Though, if you want to get this working, I can help you on discourse, instead of polluting the issues page with relentless back-and-forth debugging. |
@encukou, this should be closed as not planned. I'll create a new issue sometime tomorrow to add some better documentation on using PEP 684. |
I've created #123672 to address the documentation problem. |
Crash report
What happened?
I started using Py_NewInterpreterFromConfig to add a PER-THREAD interpreter to my library.
The library is thread-safe and supports many languages, including other languages with thread support, which are working fine. Even Python works fine without thread support.
Task: now I have started to use thread support in Python.
I had to change my code to multi-phase initialization, isolate the Python data at a per-thread level, and finally got an easy thread example working -> ok
I am testing a client-server application. The server creates a thread in my library, and this thread is initialized with
I figured out that, always after 4 successful simple round trips client → server → client, the server crashes with the message below
MqDisasterSignal handler.
The server code (file sourced into the new interpreter) is quite simple.
The problem is that when running this example with the server under valgrind, nothing happens (no crash).
The same example using fork or spawn to start up the server instances works fine.
The problem occurs only with thread support.
Additional information: because my library only supports one server instance per thread, I chose PyInterpreterConfig_OWN_GIL to make clear that I don't want any interaction between different Python interpreters. All the communication (between processes and threads) is done by my library. My library uses thread-local storage to isolate the data per thread.
CPython versions tested on:
3.12
Operating systems tested on:
Linux
Output from running 'python -VV' on the command line:
Python 3.12.4 (main, Aug 29 2024, 18:00:55) [GCC 13.3.0]