Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assert failure in sgen-mono.c while running System.Drawing tests #44165

Closed
stephentoub opened this issue Nov 2, 2020 · 9 comments
Closed

Assert failure in sgen-mono.c while running System.Drawing tests #44165

stephentoub opened this issue Nov 2, 2020 · 9 comments
Assignees
Milestone

Comments

@stephentoub
Copy link
Member

https://helixre8s23ayyeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-44136-merge-c86262a09b5b4ea99f/System.Drawing.Common.Tests/console.5aca01fc.log?sv=2019-07-07&se=2020-11-22T19%3A39%3A42Z&sr=c&sp=rl&sig=D2w6NnhCPSVf5cfFCGA%2BU%2By6nJ5llAgjBphBA5DUxMk%3D

    System.Drawing.Drawing2D.Tests.ColorBlendTests.Ctor_LargeCount_ThrowsOutOfMemoryException [SKIP]
      Condition(s) not met: "IsNotIntMaxValueArrayIndexSupported"
* Assertion at F:\workspace\_work\1\s\src\mono\mono\metadata\sgen-mono.c:2375, condition `stack == NULL || mono_handle_stack_is_empty (stack)' not met
@ghost
Copy link

ghost commented Nov 2, 2020

Tagging subscribers to this area: @CoffeeFlux
See info in area-owners.md if you want to be subscribed.

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Nov 2, 2020
@ghost
Copy link

ghost commented Nov 3, 2020

Tagging subscribers to this area: @BrzVlad
See info in area-owners.md if you want to be subscribed.

@CoffeeFlux
Copy link
Contributor

@BrzVlad can you take a look? If it doesn't seem like a GC issue feel free to bounce it back.

@BrzVlad
Copy link
Member

BrzVlad commented Nov 4, 2020

  1. This doesn't look like a GC issue, rather an icall not clearing its handle stack
  2. I ran this suite 3000 times on windows without getting any crashes
  3. This is the list of icalls called by the suite, with their call count https://gist.github.com/BrzVlad/d2f121bfffd3aa0cda6bf8443acf893b
  4. I remember seeing this assertion in the past. Maybe @lambdageek has some more ideas

@CoffeeFlux CoffeeFlux added area-VM-meta-mono and removed area-GC-mono untriaged New issue has not been triaged by the area owner labels Nov 4, 2020
@ghost
Copy link

ghost commented Nov 4, 2020

Tagging subscribers to this area: @CoffeeFlux
See info in area-owners.md if you want to be subscribed.

@lambdageek
Copy link
Member

bah we should change

				g_assert (stack == NULL || mono_handle_stack_is_empty (stack));

to

				g_assertf (stack == NULL || mono_handle_stack_is_empty (stack), "skipped thread %p (reason %d) has non-empty handle stack", info, skip_reason);

so that we get a better assert message.


So there are two things happening here: we decided to skip the thread from a stop-the-world suspend for some reason, and the thread has a non-empty handle stack.

On Windows one reason we might skip a thread is if SuspendThread failed - which can happen transiently, I think - in that case we abort the attempt to suspend and skip the thread during STW (mono/mono#15486).

If that's what's happening, it's not surprising that the thread had a non-empty handle stack. it could have been doing any arbitrary work when we tried to preempt it.


I think this is another piece of evidence that we need to be able to back out of the whole STW and start over. (Similar to the pthread_kill transient failures - #32377 (comment)). Skipping the thread just invites the GC to miss roots. /cc @lateralusX

@SamMonoRT SamMonoRT modified the milestones: 6.0.0, 7.0.0 Jul 16, 2021
@SamMonoRT
Copy link
Member

Moving to 7.0.0

@SamMonoRT
Copy link
Member

Moving to 8.0.0 and assigning to @lateralusX

@SamMonoRT SamMonoRT modified the milestones: 7.0.0, 8.0.0 Aug 4, 2022
@lateralusX
Copy link
Member

Closing this issue for a couple of different reasons:

  • The issue didn't reproduce over a large number of runs.
  • A number of fixes have been done to get the library tests running on Windows after this issue was filed, CoreCLR runtime tests on Mono Windows x64. #64281, one of them explicitly fixed a issue in System.Drawing related tests.
  • The component and tests have moved out of .net8 repository.

@ghost ghost locked as resolved and limited conversation to collaborators Aug 25, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

7 participants