mono_threads_state_poll_with_info crashes with error: Cannot transition thread from STATE_BLOCKING with STATE_POLL #7742
Comments
Possibly related to one of these? mono/mono#10800 @praeclarum - Can you reproduce this at all? @marek-safar - Thoughts?
@chamons It's happened 3 times to me now. It's always in that callback from SceneKit, so it's at least somewhat reproducible. I don't have any strict steps yet though, will keep working on it. I should also note that I didn't start getting this until upgrading macOS to 10.15.2, Xcode to 10.3 and whatever Xamarin Stable is right now. It had been stable for over a year prior. Unfortunately, I made all those changes at once, so I'm not sure which one is the culprit.
@chamons It's becoming more and more common. Still no repro steps other than "use the app for a while" :-(
@lambdageek do we have any better way to debug this?
@chamons is there any way to get symbols for those mono methods? I tried to make an AOT build but still didn't get symbols. Also, this may be anecdotal at the moment, but the crash has only happened with the debugger attached. I had 1 instance run all night without a crash.
Ah, it turns out that it's an assert triggering:
static gboolean
mono_threads_summarize_execute_internal (MonoContext *ctx, gchar **out, MonoStackHash *hashes, gboolean silent, gchar *working_mem, size_t provided_size, gboolean this_thread_controls)
{
    static SummarizerGlobalState state;
    int current_idx;
    MonoNativeThreadId current = mono_native_thread_id_get ();
    gboolean thread_given_control = summarizer_state_init (&state, current, &current_idx);
    g_assert (this_thread_controls == thread_given_control);
Not a super great way to debug, but you can try:
When you're done, I guess what I expect to see is that we're in some kind of wrapper calling from managed code to native code. Specifically we switched to GC Safe mode and we're about to call the actual native function, but now we called the GC polling function for some reason. You can probably get the crash to happen more often if you start a background thread that just does something like this:
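For anyone following along, the sequence described above (switch to GC Safe mode, then hit a GC poll before the native call) can be sketched with a toy C model. This is not the repro snippet and not Mono's actual code: the `ToyThreadState` enum, the `toy_*` function names and the reduced set of states are all assumptions made up for illustration. It only shows why a poll is rejected once the thread is already in the blocking (GC Safe) state.

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy subset of thread states. Hypothetical names, not Mono's real enum. */
typedef enum {
    TOY_STATE_RUNNING,                 /* GC Unsafe: running managed code */
    TOY_STATE_ASYNC_SUSPEND_REQUESTED, /* the GC asked this thread to stop */
    TOY_STATE_SELF_SUSPENDED,          /* thread parked itself at a poll */
    TOY_STATE_BLOCKING                 /* GC Safe: about to call native code */
} ToyThreadState;

typedef struct {
    ToyThreadState state;
} ToyThread;

static const char *toy_state_name(ToyThreadState s)
{
    switch (s) {
    case TOY_STATE_RUNNING:                 return "STATE_RUNNING";
    case TOY_STATE_ASYNC_SUSPEND_REQUESTED: return "STATE_ASYNC_SUSPEND_REQUESTED";
    case TOY_STATE_SELF_SUSPENDED:          return "STATE_SELF_SUSPENDED";
    case TOY_STATE_BLOCKING:                return "STATE_BLOCKING";
    }
    return "STATE_UNKNOWN";
}

/* Managed-to-native wrapper prologue: switch to GC Safe mode.
 * In this toy model it is only allowed from the running state. */
static bool toy_enter_gc_safe(ToyThread *t)
{
    if (t->state != TOY_STATE_RUNNING)
        return false;
    t->state = TOY_STATE_BLOCKING;
    return true;
}

/* GC poll: only meaningful while the thread is in a GC Unsafe state.
 * Returns false (where a real runtime would abort) on a bad transition. */
static bool toy_poll(ToyThread *t)
{
    switch (t->state) {
    case TOY_STATE_RUNNING:
        return true; /* nobody asked us to suspend, nothing to do */
    case TOY_STATE_ASYNC_SUSPEND_REQUESTED:
        t->state = TOY_STATE_SELF_SUSPENDED; /* park at the safepoint */
        return true;
    default:
        fprintf(stderr, "Cannot transition thread from %s with STATE_POLL\n",
                toy_state_name(t->state));
        return false;
    }
}
```

In this model, calling `toy_enter_gc_safe` and then `toy_poll` on the same thread takes the `default` branch, which is the shape of the message in the issue title. A poll from the running state, or after a suspend request, is accepted.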
@lambdageek Some good and bad news: Your It seems to only repro with the debugger attached. :-( Without the debugger, it seems to run fine even with your So it's good that it doesn't repro outside the dev environment, but bad because I can't use the debugger.
We have a private sample and @lambdageek is going to take a look, so no more need-info.
FWIW, I just ran into the error on a totally different app (with a similar name :-)). This time it was in OpenGL code from SkiaSharp. It makes me wonder if the "Cannot transition thread from STATE_BLOCKING with STATE_POLL" is a red herring and just occurs when there is a native crash.
This is a different bug. In this case we collected a crash report on a thread that's in the middle of a pinvoke, and thus in GC Safe mode. Then we called the debugger callback to send an event over the debug protocol, which tries to do another GC Safe transition. So in the above, the crash happens first, and then the bad state transition. In the original issue, the bad transition happens first, and then the runtime asserts/crashes.
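A rough way to picture that second failure is a C sketch where a thread already in GC Safe mode runs a callback that tries to enter GC Safe mode again. Everything here is hypothetical (the `DemoThread` type, the `demo_*` functions, and the rule that a nested enter is rejected are assumptions for illustration, not the runtime's real API).

```c
#include <stdbool.h>

/* Two-mode toy model. Hypothetical names, not Mono's real types. */
typedef enum { DEMO_GC_UNSAFE, DEMO_GC_SAFE } DemoGcMode;

typedef struct {
    DemoGcMode mode;
} DemoThread;

/* Enter GC Safe mode. In this sketch a nested enter is treated as a
 * bad transition and reported by returning false. */
static bool demo_enter_gc_safe(DemoThread *t)
{
    if (t->mode == DEMO_GC_SAFE)
        return false;
    t->mode = DEMO_GC_SAFE;
    return true;
}

static void demo_exit_gc_safe(DemoThread *t)
{
    t->mode = DEMO_GC_UNSAFE;
}

/* Hypothetical debugger callback: wraps a blocking send of an event in
 * its own GC Safe region, assuming the caller is in GC Unsafe mode. */
static bool demo_debugger_send_event(DemoThread *t)
{
    if (!demo_enter_gc_safe(t))
        return false; /* caller was already GC Safe, e.g. mid-pinvoke */
    /* ... blocking write of the event would happen here ... */
    demo_exit_gc_safe(t);
    return true;
}
```

Called from a thread in GC Unsafe mode, `demo_debugger_send_event` succeeds and leaves the mode unchanged. Called from a thread that already did `demo_enter_gc_safe` (the "middle of a pinvoke" case), the inner transition is rejected. In this model that stands in for the bad state transition that shows up after the crash.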
Ah yeah that makes sense, thanks for the explanation. I was going to mention that the debugger keeps deadlocking but it seems you know already :-)
@lambdageek @chamons Any updates on this? Been a few months since I've been able to use the debugger...
@lambdageek Was taking another stab today at reproducing.
Cool, thanks! I've found that most OpenGL/Metal (not sure which) apps trigger it on Mac (with the debugger).
Sorry about that, this fell off my radar a bit. I couldn't get the crash to happen with VS for Mac 8.5. (XM 6.14.1.39). Not sure if that means anything yet. Got a stack trace from VS for Mac 8.4 (XM 6.10.0.21), relevant frames below (with identifying info stripped). Not sure how the debugger plays into it yet. The debugger thread is idle.
So just taking this at face value, the call to the setter for SCNNode.Transform is where the bad things happen.
The setter ends up calling
it's difficult to reproduce in a smaller example because the GC actually needs to try and stop the world because that call to
Thanks for the info, that's very interesting. I updated to 8.5, but get the same crash:
Talked through this issue with @CoffeeFlux, and I have a guess for why this is only showing up when debugging. The We probably make different inlining decisions when we JIT this code with debugging on. Without debugging, since the size is known at JIT time, we probably unroll the loops in That said, the underlying issue is that the call to create a new array is in some marshalling code after we already switched to GC Safe mode, so we shouldn't be optimizing it to do a call to managed code. Update: Hm. Marshalling doesn't insert any calls to
I'm getting this error for a brand new watchOS app; it crashes on launch. I've not added any code to it, it is just what VS produces by default. Release build, VS 8.5.1 (build 43), watchOS 6.2 (17T529) [Edit: Seems like this is a known issue: https://github.com/mono/mono/issues/19372 ] These are the only messages produced by the app in the watchOS console. It crashes immediately.
@praeclarum I don't know if this is related but a fix for the same error message was merged here
The watchOS issue is separate – in both cases it's due to state transition issues, but in very different parts of the code. Also, Aleksey is out for the next few weeks, so I'm taking a look at this.
The watchOS fix won't apply, but I thought perhaps the underlying issue might be similar
I ran into this error today while running on the latest macOS (10.15.2). I continually run into this error and wonder if anyone knows what causes it and whether there is a way to recover. This error makes it impossible to debug since it just keeps crashing with these native exceptions.
Steps to Reproduce
Expected Behavior
No crash
Actual Behavior
Above mono error.
Environment
Build Logs
Build Log.txt