Test failure: Interop/COM/ComWrappers/WeakReference/WeakReferenceTest #81362

Closed
AndyAyersMS opened this issue Jan 30, 2023 · 8 comments · Fixed by #88537
Labels: area-CodeGen-coreclr, disabled-test, GCStress, JitStress, Priority:2
Milestone: 8.0.0

Comments

@AndyAyersMS
Member

Fails under jit/gc stress on a variety of OS/Arch

https://dev.azure.com/dnceng-public/public/_build/results?buildId=152088&view=results

;; linux arm64

export DOTNET_TieredCompilation=0
export DOTNET_DbgEnableMiniDump=1
export DOTNET_EnableCrashReport=1
export DOTNET_DbgMiniDumpName=$HELIX_DUMP_FOLDER/coredump.%d.dmp
export DOTNET_GCStress=0xC
export DOTNET_JitStress=1

  Discovered:  Interop.COM.XUnitWrapper (found 4 test cases)
  Starting:    Interop.COM.XUnitWrapper (parallel test collections = on, max threads = 2)

    Interop/COM/ComWrappers/WeakReference/WeakReferenceTest/WeakReferenceTest.sh [FAIL]
      [createdump] waitpid() returned successfully (wstatus 00000000)
      /root/helix/work/workitem/e/Interop/COM/ComWrappers/WeakReference/WeakReferenceTest/WeakReferenceTest.sh: line 423:   729 Segmentation fault      (core dumped) $LAUNCHER $ExePath "${CLRTestExecutionArguments[@]}"
      
      Return code:      1
      Raw output file:      /root/helix/work/workitem/uploads/Reports/Interop.COM/ComWrappers/WeakReference/WeakReferenceTest/WeakReferenceTest.output.txt
      Raw output:
      BEGIN EXECUTION
      /root/helix/work/correlation/corerun -p System.Reflection.Metadata.MetadataUpdater.IsSupported=false WeakReferenceTest.dll ''
      Running ValidateGlobalInstanceTrackerSupport...
        -- Validate weak reference creation
          -- Validate RCW recreation
          -- Validate release
        -- Validate target reset
          -- Validate RCW recreation
          -- Validate release
      [createdump] Gathering state for process 729 corerun

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jan 30, 2023
@AndyAyersMS AndyAyersMS added GCStress JitStress CLR JIT issues involving JIT internal stress modes and removed area-Interop-coreclr untriaged New issue has not been triaged by the area owner labels Jan 30, 2023
@AndyAyersMS AndyAyersMS added this to the 8.0.0 milestone Jan 30, 2023
@AndyAyersMS AndyAyersMS added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jan 30, 2023
@ghost

ghost commented Jan 30, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
See info in area-owners.md if you want to be subscribed.

@BruceForstall
Member

New case reported first over in #60152:

Failed in Run: runtime-coreclr gcstress-extra 20230212.1

Failed tests:

coreclr windows arm64 Checked gcstress0xc_jitstress1 @ Windows.11.Arm64.Open
    - Interop\\COM\\ComWrappers\\WeakReference\\WeakReferenceTest\\WeakReferenceTest.cmd

Error message:


Assert failure(PID 8868 [0x000022a4], Thread: 168 [0x00a8]): !CREATE_CHECK_STRING(!"Detected use of a corrupted OBJECTREF. Possible GC hole.")

CORECLR! `Object::ValidateInner'::`1'::catch$12 + 0x138 (0x00007ffc`3c1d1438)
CORECLR! CallSettingFrame + 0x68 (0x00007ffc`3bc22e60)
CORECLR! _FrameHandler3::CxxCallCatchBlock + 0x134 (0x00007ffc`3c12e1f4)
NTDLL! RtlCaptureContext + 0x1B8 (0x00007ffc`a8462248)
CORECLR! Object::ValidateInner + 0x144 (0x00007ffc`3bd16a04)
CORECLR! OBJECTREF::OBJECTREF + 0x118 (0x00007ffc`3bd123e8)
CORECLR! MarshalNative::GCHandleInternalGet + 0x40 (0x00007ffc`3be20640)
<no module>! <no symbol> + 0x0 (0x00007ffb`dc7242d0)
   File: D:\a\_work\1\s\src\coreclr\vm\object.cpp Line: 600
   Image: C:\h\w\B6C109D3\p\corerun.exe


Return code:      1
Raw output file:      C:\h\w\B6C109D3\w\B057097C\uploads\Reports\Interop.COM\ComWrappers\WeakReference\WeakReferenceTest\WeakReferenceTest.output.txt
Raw output:
BEGIN EXECUTION
"C:\h\w\B6C109D3\p\corerun.exe" -p "System.Reflection.Metadata.MetadataUpdater.IsSupported=false"  WeakReferenceTest.dll 
Running ValidateNonComWrappers...
Running ValidateGlobalInstanceMarshalling...
 -- Validate weak reference creation
   -- Validate RCW recreation
   -- Validate release
 -- Validate target reset
   -- Validate RCW recreation
   -- Validate release
Expected: 100
Actual: -1073740286
END EXECUTION - FAILED
FAILED
Test Harness Exitcode is : 1
To run the test:
> set CORE_ROOT=C:\h\w\B6C109D3\p
> C:\h\w\B6C109D3\w\B057097C\e\Interop\COM\ComWrappers\WeakReference\WeakReferenceTest\WeakReferenceTest.cmd
Expected: True
Actual:   False

Stack trace:

  at Interop_COM._ComWrappers_WeakReference_WeakReferenceTest_WeakReferenceTest_._ComWrappers_WeakReference_WeakReferenceTest_WeakReferenceTest_cmd()
  at System.RuntimeMethodHandle.InvokeMethod(Object target, Void** arguments, Signature sig, Boolean isConstructor)
  at System.Reflection.MethodInvoker.Invoke(Object obj, IntPtr* args, BindingFlags invokeAttr)

@BruceForstall
Member

BruceForstall commented Feb 13, 2023

I reproduced this on win-arm64 with GCStress=C. It reproduces reliably with JitStress=1 set and passes without JitStress set.

BruceForstall added a commit to BruceForstall/runtime that referenced this issue Feb 13, 2023
BruceForstall added a commit that referenced this issue Feb 14, 2023
@BruceForstall BruceForstall added disabled-test The test is disabled in source code against the issue and removed blocking-clean-ci-optional Blocking optional rolling runs labels Feb 24, 2023
@bogdan-patraucean

I got a similar error for an app I have live in production.

EXCEPTION_RECORD:  (.exr -1)
ExceptionAddress: 00007ffd6556e6e2 (combase!_tlgWriteTransfer_EtwEventWriteTransfer+0x000000000000003a)
   ExceptionCode: c00000fd (Stack overflow)
  ExceptionFlags: 00000001
NumberParameters: 2
   Parameter[0]: 0000000000000001
   Parameter[1]: 000000f45b265ff8

PROCESS_NAME:  Wintoys.exe

ERROR_CODE: (NTSTATUS) 0xc00000fd - A new guard page for the stack cannot be created.

EXCEPTION_CODE_STR:  c00000fd

EXCEPTION_PARAMETER1:  0000000000000001

EXCEPTION_PARAMETER2:  000000f45b265ff8

IP_ON_HEAP:  000002fde81cd228
The fault address in not in any loaded module, please check your build's rebase
log at <releasedir>\bin\build_logs\timebuild\ntrebase.log for module which may
contain the address if it were loaded.

FRAME_ONE_INVALID: 1

STACK_TEXT:  
000000f4`5b2684d0 000002fd`e81cd228     : 000000f4`5b268580 00000000`80004003 00007ff6`65890fc0 00000000`00000000 : 0x00007ff6`0600eef3
000000f4`5b2684d8 000000f4`5b268580     : 00000000`80004003 00007ff6`65890fc0 00000000`00000000 000034a1`0b8e6e92 : 0x000002fd`e81cd228
000000f4`5b2684e0 00000000`80004003     : 00007ff6`65890fc0 00000000`00000000 000034a1`0b8e6e92 00007ff6`65b10108 : 0x000000f4`5b268580
000000f4`5b2684e8 00007ff6`65890fc0     : 00000000`00000000 000034a1`0b8e6e92 00007ff6`65b10108 000000f4`5b26aaf8 : 0x80004003
000000f4`5b2684f0 00000000`00000000     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : Wintoys_exe!`anonymous namespace'::ManagedObjectWrapper_Release


FAULTING_SOURCE_LINE:  D:\a\_work\1\s\src\coreclr\interop\comwrappers.cpp

FAULTING_SOURCE_FILE:  D:\a\_work\1\s\src\coreclr\interop\comwrappers.cpp

FAULTING_SOURCE_LINE_NUMBER:  205

FAULTING_SOURCE_SRV_COMMAND:  https://raw.githubusercontent.com/dotnet/runtime/8042d61b17540e49e53569e3728d2faa1c596583/src/coreclr/interop/comwrappers.cpp

FAULTING_SOURCE_CODE:  
No source found for 'D:\a\_work\1\s\src\coreclr\interop\comwrappers.cpp'


SYMBOL_NAME:  wintoys_exe!`anonymous namespace'::ManagedObjectWrapper_Release+f45b268580

MODULE_NAME: Wintoys_exe

IMAGE_NAME:  Wintoys.exe

STACK_COMMAND:  dt ntdll!LdrpLastDllInitializer BaseDllName ; dt ntdll!LdrpFailureData ; ~0s; .ecxr ; kb

FAILURE_BUCKET_ID:  STACK_OVERFLOW_c00000fd_Wintoys.exe!_anonymous_namespace_::ManagedObjectWrapper_Release

OS_VERSION:  10.0.19041.1

BUILDLAB_STR:  vb_release

OSPLATFORM_TYPE:  x64

OSNAME:  Windows 10

IMAGE_VERSION:  1.0.52.0

FAILURE_ID_HASH:  {f3afd990-bcaf-378c-2df8-31db50fc6300}

@markples markples added the Priority:2 Work that is important, but not critical for the release label Jun 8, 2023
@markples
Member

@bogdan-patraucean This issue is about an assertion indicating possible GC reporting corruption when we put the code generator in a stress mode. In your copied output, I see a stack overflow message. I suspect that these are not related. Do you have other information that suggests that they are? Thanks.

@markples
Member

This isn't reproing for me under win-x64. (Above references are to win-arm64 and linux-arm64, so perhaps it is architecture dependent but not OS dependent.)

@AustinWise
Contributor

AustinWise commented Jun 27, 2023

EDIT: I tried reproducing this failure on win-x64 and win-arm64 and did not have success, including adding extra GC.Collect calls and using GC stress. So maybe the following is irrelevant.

Perhaps the handle th is being freed by the finalizer before the call to ComAwareWeakReference.GetTarget here:

#if FEATURE_COMINTEROP || FEATURE_COMWRAPPERS
    if ((th & ComAwareBit) != 0)
        return ComAwareWeakReference.GetTarget(th);
#endif
    // unsafe cast is ok as the handle cannot be destroyed and recycled while we keep the instance alive
    object? target = GCHandle.InternalGet(th);
    // must keep the instance alive as long as we use the handle.
    GC.KeepAlive(this);

Maybe adding a call to GC.KeepAlive(this) after the call to ComAwareWeakReference.GetTarget but before returning its result will fix the problem?
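
A minimal sketch of the pattern being suggested, written against public APIs only. WeakHandleOwner is a hypothetical type invented for illustration; it is not the runtime's WeakReference implementation, and it uses GCHandle.FromIntPtr(...).Target in place of the internal GCHandle.InternalGet. The idea is that once the raw handle has been copied into a local, nothing else references `this`, so the GC may finalize the owner (and free the handle) while the handle is still being dereferenced. Storing the result in a local and calling GC.KeepAlive(this) before every return closes that window.

```csharp
#nullable enable
using System;
using System.Runtime.InteropServices;

// Hypothetical type for illustration only; not the runtime's WeakReference.
public sealed class WeakHandleOwner
{
    // Raw weak GC handle owned by this instance and freed by its finalizer.
    private IntPtr _handle;

    public WeakHandleOwner(object target)
    {
        _handle = GCHandle.ToIntPtr(GCHandle.Alloc(target, GCHandleType.Weak));
    }

    ~WeakHandleOwner()
    {
        // Once this runs, the handle slot may be recycled for another object.
        if (_handle != IntPtr.Zero)
        {
            GCHandle.FromIntPtr(_handle).Free();
            _handle = IntPtr.Zero;
        }
    }

    public object? Target
    {
        get
        {
            IntPtr h = _handle;

            // After the read above, 'this' is otherwise dead as far as the
            // JIT is concerned. Without the KeepAlive below, a GC at this
            // point could finalize the owner and free 'h' before it is read.
            object? target = GCHandle.FromIntPtr(h).Target;

            // Extend the lifetime of 'this' past the last use of the handle.
            // Every return path that touches the handle needs this call.
            GC.KeepAlive(this);
            return target;
        }
    }
}
```

In the snippet quoted above, the early return through ComAwareWeakReference.GetTarget(th) leaves the method before reaching GC.KeepAlive(this), which is the gap this suggestion is aimed at.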

@markples
Member

markples commented Jul 7, 2023

Thank you @AustinWise. I have confirmed that the failure occurred on this path and that KeepAlive fixes the test locally. I was confused for a bit after seeing "COM" and Linux failures, but it turns out that FEATURE_COMINTEROP is win-specific and FEATURE_COMWRAPPERS is set on Linux.

markples added a commit that referenced this issue Jul 12, 2023
Omission was noticed by @AustinWise. I reproed the failure and this fix for it on linux-x64.

Fixes #81362
@ghost ghost locked as resolved and limited conversation to collaborators Aug 14, 2023