Assert in DoubleAlign scenario with GCStress and TieredCompilation #36366

Closed
AaronRobinsonMSFT opened this issue May 13, 2020 · 12 comments · Fixed by #37116
Labels: arch-x86, area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI), GCStress, os-windows

@AaronRobinsonMSFT
Member

After the fix in #36357 is merged, a new assert is firing. This assert appears to be a problem with CodeGen on x86 when COMPlus_GCStress=0xC. If COMPlus_TieredCompilation=0 is set, the assert goes away.

Assert:

CORECLR! SKIP_PUSH_REG + 0x76 (0x5dd69de6)
CORECLR! UnwindEbpDoubleAlignFrameProlog + 0x1BD (0x5dd6a44d)
CORECLR! UnwindEbpDoubleAlignFrame + 0xE9 (0x5dd69ef9)
CORECLR! UnwindStackFrame + 0x394 (0x5dd6acf4)
CORECLR! DoGcStress + 0x2A4 (0x5dcb5386)
CORECLR! OnGcCoverageInterrupt + 0x12F (0x5dcb593a)
CORECLR! IsGcMarker + 0x93 (0x5d9fbab3)
CORECLR! CLRVectoredExceptionHandlerShim + 0xC5 (0x5d9ef1d5)
NTDLL! LdrSetDllManifestProber + 0xF6 (0x77e1bc46)
NTDLL! RtlUnwind + 0x1CB (0x77e1812b)
    File: D:\runtime\src\coreclr\src\vm\eetwain.cpp Line: 3021
    Image: D:\runtime\artifacts\tests\coreclr\Windows_NT.x86.Checked\Tests\Core_Root\corerun.exe

The function below fails when executed in the following loop:

for (int i = 0; i < 1000; ++i)
{
    // Create a native wrapper for the managed object.
    IntPtr testWrapper = cw.GetOrCreateComInterfaceForObject(new Test(), CreateComInterfaceFlags.TrackerSupport);

    // Pass the managed object to the native object.
    int id = trackerObj.AddObjectRef(testWrapper);

    // Retain the managed object wrapper ptr.
    testWrapperIds.Add(id);
}

Failing stack:

0:000> kb
 # ChildEBP RetAddr  Args to Child              
00 00afc97c 79b89de6 7a0793e8 00000bcd 7a07a540 coreclr!DbgAssertDialog+0x20f [D:\runtime\src\coreclr\src\utilcode\debug.cpp @ 698] 
01 00afc998 79b8a44d 0a834144 00000003 00afce44 coreclr!SKIP_PUSH_REG+0x76 [D:\runtime\src\coreclr\src\vm\eetwain.cpp @ 3022] 
02 00afc9c4 79b89ef9 00afcad4 00afce44 0a834144 coreclr!UnwindEbpDoubleAlignFrameProlog+0x1bd [D:\runtime\src\coreclr\src\vm\eetwain.cpp @ 3842] 
03 00afc9ec 79b8acf4 00afcad4 00afcb08 00afce44 coreclr!UnwindEbpDoubleAlignFrame+0xe9 [D:\runtime\src\coreclr\src\vm\eetwain.cpp @ 3984] 
04 00afca94 79ad5386 00afcad4 00afcb08 00000008 coreclr!UnwindStackFrame+0x394 [D:\runtime\src\coreclr\src\vm\eetwain.cpp @ 4101] 
05 00afd050 79ad593a 00afd2a4 00000001 02eb0cc0 coreclr!DoGcStress+0x2a4 [D:\runtime\src\coreclr\src\vm\gccover.cpp @ 1611] 
06 00afd0a0 7981bab3 00afd2a4 9555cb78 00afd170 coreclr!OnGcCoverageInterrupt+0x12f [D:\runtime\src\coreclr\src\vm\gccover.cpp @ 1431] 
07 00afd0ec 7980f1d5 00afd2a4 00afd254 9555cac0 coreclr!IsGcMarker+0x93 [D:\runtime\src\coreclr\src\vm\excep.cpp @ 6565] 
08 00afd154 77e1bc46 00afd170 00afd2a4 00afd254 coreclr!CLRVectoredExceptionHandlerShim+0xc5 [D:\runtime\src\coreclr\src\vm\excep.cpp @ 7949] 
09 00afd1a4 77e1812b 00000000 05261e68 02e76600 ntdll!RtlpCallVectoredHandlers+0xd5 [minkernel\ntdll\vectxcpt.c @ 206] 
0a (Inline) -------- -------- -------- -------- ntdll!RtlCallVectoredExceptionHandlers+0xa [minkernel\ntdll\vectxcpt.c @ 339] 
0b 00afd23c 77e242c6 00afd254 00afd2a4 00afd254 ntdll!RtlDispatchException+0x6f [minkernel\ntos\rtl\i386\exdsptch.c @ 881] 
0c 00afd704 08d0fcdb 12345678 7a00743c ffffffff ntdll!KiUserExceptionDispatcher+0x26 [minkernel\ntos\rtl\i386\userdisp.asm @ 604] 
0d 00afd704 08d08f4c 00000000 00000000 00000000 ComWrappersTests!ComWrappersTests.Common.ITrackerObjectWrapper.AddObjectRef(IntPtr)+0x65b
0e 00afd77c 08d05d0c 00000000 00000000 00000000 ComWrappersTests!ComWrappersTests.Program.ValidateRuntimeTrackerScenario()+0x12c
0f 00afd7a8 798821a1 02e76858 00afdbd4 79a7b53d ComWrappersTests!ComWrappersTests.Program.Main(System.String[])+0x1c
10 00afd7b4 79a7b53d 00afde7c 02e76600 00afde7c coreclr!CallDescrWorkerInternal+0x34 [D:\runtime\src\coreclr\src\vm\i386\asmhelpers.asm @ 607] 
11 00afdbd4 79a7b695 00afde7c abcdefab abcdefab coreclr!CallDescrWorker+0xd7 [D:\runtime\src\coreclr\src\vm\callhelpers.cpp @ 129] 
12 00afdc50 79a7bd1a 00afde7c 00000000 79fdd288 coreclr!CallDescrWorkerWithHandler+0xf6 [D:\runtime\src\coreclr\src\vm\callhelpers.cpp @ 72] 
13 00afdeac 79a7575e 00afdfac 00afdfbc 00000008 coreclr!MethodDescCallSite::CallTargetWorker+0x64f [D:\runtime\src\coreclr\src\vm\callhelpers.cpp @ 554] 
14 (Inline) -------- -------- -------- -------- coreclr!MethodDescCallSite::Call_RetArgSlot+0x99 [D:\runtime\src\coreclr\src\vm\callhelpers.h @ 459] 
15 00afdfd4 79a75348 00afe080 9555fb84 00afe1e0 coreclr!RunMainInternal+0x1b0 [D:\runtime\src\coreclr\src\vm\assembly.cpp @ 1491] 
16 00afe010 79a753f6 00afe080 9555fbf0 00afe1e0 coreclr!``RunMain'::`29'::__Body::Run'::`5'::__Body::Run+0x40 [D:\runtime\src\coreclr\src\vm\assembly.cpp @ 1561] 
17 00afe064 79a7552c 00afe080 9555fb50 0893bb34 coreclr!`RunMain'::`29'::__Body::Run+0x5d [D:\runtime\src\coreclr\src\vm\assembly.cpp @ 1561] 
18 00afe0c4 79a72484 0893bb34 00000001 00afe1e0 coreclr!RunMain+0xd1 [D:\runtime\src\coreclr\src\vm\assembly.cpp @ 1561] 
19 00afe420 798080d9 00afe570 00000001 9555fe1c coreclr!Assembly::ExecuteMainMethod+0x1ba [D:\runtime\src\coreclr\src\vm\assembly.cpp @ 1671] 
1a 00afe588 797cd8f2 02e11a70 00000001 02e92cf8 coreclr!CorHost2::ExecuteAssembly+0x4a9 [D:\runtime\src\coreclr\src\vm\corhost.cpp @ 390] 
1b 00afe5e8 00b2a69a 02e11a70 00000001 00000000 coreclr!coreclr_execute_assembly+0x92 [D:\runtime\src\coreclr\src\dlls\mscoree\unixinterface.cpp @ 397] 
1c 00afe60c 00b2a5e6 02e11a70 00000001 00000000 CoreRun!HostEnvironment::ExecuteAssembly+0x2a [D:\runtime\src\coreclr\src\hosts\corerun\corerun.cpp @ 354] 
1d 00afe9d8 00b2ed7e 00affadc 00aff964 00afea30 CoreRun!ExecuteAssembly+0x35a [D:\runtime\src\coreclr\src\hosts\corerun\corerun.cpp @ 644] 
1e 00affa94 00b2f966 00000001 02e0612c 00affadc CoreRun!TryRun+0x469 [D:\runtime\src\coreclr\src\hosts\corerun\corerun.cpp @ 792] 
1f 00affae4 00b4d4d3 00000002 02e06128 02e0c620 CoreRun!wmain+0x98 [D:\runtime\src\coreclr\src\hosts\corerun\corerun.cpp @ 873] 
20 00affb04 00b4d3a7 7e24ea5c 00b4d530 00b4d530 CoreRun!invoke_main+0x33 [d:\agent\_work\7\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 90] 
21 00affb60 00b4d24d 00affb70 00b4d538 00affb80 CoreRun!__scrt_common_main_seh+0x157 [d:\agent\_work\7\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 288] 
22 00affb68 00b4d538 00affb80 75906359 02c88000 CoreRun!__scrt_common_main+0xd [d:\agent\_work\7\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 331] 
23 00affb70 75906359 02c88000 75906340 00affbdc CoreRun!wmainCRTStartup+0x8 [d:\agent\_work\7\s\src\vctools\crt\vcstartup\src\startup\exe_wmain.cpp @ 17] 
24 00affb80 77e17c14 02c88000 48565845 00000000 kernel32!BaseThreadInitThunk+0x19 [base\win32\client\thread.c @ 64] 
25 00affbdc 77e17be4 ffffffff 77e38feb 00000000 ntdll!__RtlUserThreadStart+0x2f [minkernel\ntdll\rtlstrt.c @ 1153] 
26 00affbec 00000000 00000000 00000000 00000000 ntdll!_RtlUserThreadStart+0x1b [minkernel\ntdll\rtlstrt.c @ 1070] 

Method causing assert:

0:000> !ip2md 08d0fcdb 
MethodDesc:   08d5fdd0
Method Name:          ComWrappersTests.Common.ITrackerObjectWrapper.AddObjectRef(IntPtr)
Class:                08d9926c
MethodTable:          08d5ff50
mdToken:              06000020
Module:               0893a5f4
IsJitted:             yes
Current CodeAddr:     08d0fcd8
Version History:
  ILCodeVersion:      00000000
  ReJIT ID:           0
  IL Addr:            08cb271c
     CodeAddr:           08d0fcd8  (OptimizedTier1)
     NativeCodeVersion:  02EB0CC0
     CodeAddr:           08d0f680  (QuickJitted)
     NativeCodeVersion:  00000000

0:000> !DumpIL /i 08cb271c
ilAddr = 08CB271C
IL_0000: ldarg.0 
IL_0001: ldflda TOKEN 400000e
IL_0006: ldfld TOKEN 400001c
IL_000b: ldarg.0 
IL_000c: ldfld TOKEN 400000d
IL_0011: ldarg.1 
IL_0012: ldloca.s VAR OR ARG 0
IL_0014: callvirt TOKEN 6000042
IL_0019: stloc.1 
IL_001a: ldloc.1 
IL_001b: brfalse.s IL_0029
IL_001d: ldstr TOKEN 700002b5
IL_0022: ldloc.1 
IL_0023: newobj TOKEN a00002c
IL_0028: throw 
IL_0029: ldloc.0 
IL_002a: ret 

0:000> !U /d 08d0f680
Normal JIT generated code
ComWrappersTests.Common.ITrackerObjectWrapper.AddObjectRef(IntPtr)
ilAddr is 08CB271C pImport is 1C5C7778
Begin 08D0F680, size 95
>>> 08d0f680 55              push    ebp
08d0f681 8bec            mov     ebp,esp
08d0f683 83ec24          sub     esp,24h
08d0f686 c5d857e4        vxorps  xmm4,xmm4,xmm4
08d0f68a c5fa7f65dc      vmovdqu xmmword ptr [ebp-24h],xmm4
08d0f68f 33c0            xor     eax,eax
08d0f691 8945ec          mov     dword ptr [ebp-14h],eax
08d0f694 8945f0          mov     dword ptr [ebp-10h],eax
08d0f697 8945f4          mov     dword ptr [ebp-0Ch],eax
08d0f69a 894dfc          mov     dword ptr [ebp-4],ecx
08d0f69d 8955f8          mov     dword ptr [ebp-8],edx
08d0f6a0 8b4dfc          mov     ecx,dword ptr [ebp-4]
08d0f6a3 8b4910          mov     ecx,dword ptr [ecx+10h]
08d0f6a6 894de8          mov     dword ptr [ebp-18h],ecx
08d0f6a9 8b4dfc          mov     ecx,dword ptr [ebp-4]
08d0f6ac 8b4904          mov     ecx,dword ptr [ecx+4]
08d0f6af 894de4          mov     dword ptr [ebp-1Ch],ecx
08d0f6b2 8b4df8          mov     ecx,dword ptr [ebp-8]
08d0f6b5 51              push    ecx
08d0f6b6 8d4df4          lea     ecx,[ebp-0Ch]
08d0f6b9 51              push    ecx
08d0f6ba 8b4de8          mov     ecx,dword ptr [ebp-18h]
08d0f6bd 894ddc          mov     dword ptr [ebp-24h],ecx
08d0f6c0 8b4ddc          mov     ecx,dword ptr [ebp-24h]
08d0f6c3 8b4904          mov     ecx,dword ptr [ecx+4]
08d0f6c6 8b55e4          mov     edx,dword ptr [ebp-1Ch]
08d0f6c9 8b45dc          mov     eax,dword ptr [ebp-24h]
08d0f6cc ff500c          call    dword ptr [eax+0Ch]
08d0f6cf 8945f0          mov     dword ptr [ebp-10h],eax
08d0f6d2 837df000        cmp     dword ptr [ebp-10h],0
08d0f6d6 7436            je      ComWrappersTests!ComWrappersTests.Common.ITrackerObjectWrapper.AddObjectRef(IntPtr)+0x8e (08d0f70e)
08d0f6d8 b9ac3d830a      mov     ecx,0A833DACh (MT: System.Runtime.InteropServices.COMException)
08d0f6dd e87edebf70      call    coreclr!JIT_New (7990d560)
08d0f6e2 8945ec          mov     dword ptr [ebp-14h],eax
08d0f6e5 b9b5020000      mov     ecx,2B5h
08d0f6ea baf4a59308      mov     edx,893A5F4h
08d0f6ef e8ecb3bf70      call    coreclr!JIT_StrCns (7990aae0)
08d0f6f4 8945e0          mov     dword ptr [ebp-20h],eax
08d0f6f7 8b55f0          mov     edx,dword ptr [ebp-10h]
08d0f6fa 52              push    edx
08d0f6fb 8b55e0          mov     edx,dword ptr [ebp-20h]
08d0f6fe 8b4dec          mov     ecx,dword ptr [ebp-14h]
08d0f701 e882f6ffff      call    CLRStub[MethodDescPrestub]@ba904f0308d0ed88 (08d0ed88)
08d0f706 8b4dec          mov     ecx,dword ptr [ebp-14h]
08d0f709 e8720dc070      call    coreclr!IL_Throw (79910480)
08d0f70e 8b45f4          mov     eax,dword ptr [ebp-0Ch]
08d0f711 8be5            mov     esp,ebp
08d0f713 5d              pop     ebp
08d0f714 c3              ret

0:000> !U /d 08d0fcd8
Normal JIT generated code
ComWrappersTests.Common.ITrackerObjectWrapper.AddObjectRef(IntPtr)
ilAddr is 08CB271C pImport is 08F18A70
Begin 08D0FCD8, size 63
>>> 08d0fcd8 55              push    ebp
08d0fcd9 8bec            mov     ebp,esp
08d0fcdb 57              push    edi
08d0fcdc 56              push    esi (gcstress)
08d0fcdd 83ec08          sub     esp,8 (gcstress)
08d0fce0 33c0            xor     eax,eax (gcstress)
08d0fce2 8945f4          mov     dword ptr [ebp-0Ch],eax (gcstress)
08d0fce5 8b4110          mov     eax,dword ptr [ecx+10h]
08d0fce8 8b4904          mov     ecx,dword ptr [ecx+4]
08d0fceb 894df0          mov     dword ptr [ebp-10h],ecx
08d0fcee 52              push    edx
08d0fcef 8d55f4          lea     edx,[ebp-0Ch]
08d0fcf2 52              push    edx
08d0fcf3 8b4804          mov     ecx,dword ptr [eax+4]
08d0fcf6 8b55f0          mov     edx,dword ptr [ebp-10h]
08d0fcf9 ff500c          call    dword ptr [eax+0Ch] (gcstress)
08d0fcfc 8bf0            mov     esi,eax
08d0fcfe 85f6            test    esi,esi
08d0fd00 750a            jne     ComWrappersTests!ComWrappersTests.Common.ITrackerObjectWrapper.AddObjectRef(IntPtr)+0x68c (08d0fd0c)
08d0fd02 8b45f4          mov     eax,dword ptr [ebp-0Ch]
08d0fd05 8d65f8          lea     esp,[ebp-8]
08d0fd08 5e              pop     esi (gcstress)
08d0fd09 5f              pop     edi (gcstress)
08d0fd0a 5d              pop     ebp (gcstress)
08d0fd0b c3              ret (gcstress)
08d0fd0c b9ac3d830a      mov     ecx,0A833DACh (MT: System.Runtime.InteropServices.COMException)
08d0fd11 e84ad8bf70      call    coreclr!JIT_New (7990d560)
08d0fd16 8bf8            mov     edi,eax
08d0fd18 b9b5020000      mov     ecx,2B5h
08d0fd1d baf4a59308      mov     edx,893A5F4h
08d0fd22 e8b9adbf70      call    coreclr!JIT_StrCns (7990aae0)
08d0fd27 8bd0            mov     edx,eax
08d0fd29 8bcf            mov     ecx,edi
08d0fd2b e808f0ffff      call    CLRStub[MethodDescPrestub]@ba904f0308d0ed38 (08d0ed38)
08d0fd30 89773c          mov     dword ptr [edi+3Ch],esi
08d0fd33 8bcf            mov     ecx,edi
08d0fd35 e84607c070      call    coreclr!IL_Throw (79910480)
08d0fd3a cc              int     3

/cc @jkotas @AndyAyersMS

AaronRobinsonMSFT added the arch-x86, os-windows, GCStress, and area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) labels May 13, 2020
AaronRobinsonMSFT added this to the 5.0 milestone May 13, 2020
Dotnet-GitSync-Bot added the untriaged (New issue has not been triaged by the area owner) label May 13, 2020
AaronRobinsonMSFT removed this from the 5.0 milestone May 13, 2020
@AndyAyersMS
Member

From a quick look, either we need to update eetwain to understand the new prolog zeroing sequences, or the jit is not properly reporting the callee saved registers. Latter seems more likely. I'll investigate.

AndyAyersMS removed the untriaged (New issue has not been triaged by the area owner) label May 14, 2020
AndyAyersMS added this to the 5.0 milestone May 14, 2020
AndyAyersMS self-assigned this May 27, 2020
@AndyAyersMS
Member

I don't see the same failure mode. @AaronRobinsonMSFT can you verify that I am running the correct test in the steps below?

C:\bugs\r36366>set COMPlus_GCStress=0xC

C:\bugs\r36366>c:\repos\runtime0\artifacts\tests\coreclr\Windows_NT.x86.Checked\Tests\Core_Root\corerun.exe C:\repos\runtime0\artifacts\tests\coreclr\Windows_NT.x86.Release\interop\com\ComWrappers\api\ComWrappersTests\ComWrappersTests.dll
Running ValidateComInterfaceCreation...
Running ValidateFallbackQueryInterface...
Running ValidateCreateObjectCachingScenario...

Assert failure(PID 16868 [0x000041e4], Thread: 45300 [0xb0f4]): Thread::IsObjRefValid(&objref)

CORECLR! OBJECTREF::operator= + 0x3D (0x7a7829b2)
CORECLR! `anonymous namespace'::CallCreateObject + 0x4A5 (0x7a93b74e)
CORECLR! `anonymous namespace'::TryGetOrCreateObjectForComInstanceInternal + 0x47A (0x7a943fe8)
CORECLR! ComWrappersNative::TryGetOrCreateObjectForComInstance + 0x1D2 (0x7a9436e2)

@AaronRobinsonMSFT
Member Author

@AndyAyersMS Your steps above seem correct. I am still able to reproduce this issue at 0c043d5. My steps and output are below.

D:\runtime\artifacts\tests\coreclr\Windows_NT.x86.Release\Interop\COM\ComWrappers\API\ComWrappersTests
>set COMPlus_GCStress=0xC

D:\runtime\artifacts\tests\coreclr\Windows_NT.x86.Release\Interop\COM\ComWrappers\API\ComWrappersTests
>ComWrappersTests.cmd -coreroot D:\runtime\artifacts\tests\coreclr\Windows_NT.x86.Checked\Tests\Core_Root
BEGIN EXECUTION
 "D:\runtime\artifacts\tests\coreclr\Windows_NT.x86.Checked\Tests\Core_Root\corerun.exe" ComWrappersTests.dll
Running ValidateComInterfaceCreation...
Running ValidateFallbackQueryInterface...
Running ValidateCreateObjectCachingScenario...
Running ValidatePrecreatedExternalWrapper...
Running ValidateIUnknownImpls...
Running ValidateBadComWrapperImpl...
Running ValidateRuntimeTrackerScenario...

Assert failure(PID 17140 [0x000042f4], Thread: 5940 [0x1734]): CheckInstrBytePattern(base[offset] & 0xF8, 0x50, base[offset])

CORECLR! SKIP_PUSH_REG + 0x76 (0x7b9b0996)
CORECLR! UnwindEbpDoubleAlignFrameProlog + 0x1BD (0x7b9b0ffd)
CORECLR! UnwindEbpDoubleAlignFrame + 0xE9 (0x7b9b0aa9)
CORECLR! UnwindStackFrame + 0x394 (0x7b9b18a4)
CORECLR! DoGcStress + 0x2A4 (0x7b8fbe36)
CORECLR! OnGcCoverageInterrupt + 0x12F (0x7b8fc3ea)
CORECLR! IsGcMarker + 0x93 (0x7b64bf03)
CORECLR! CLRVectoredExceptionHandlerShim + 0xC5 (0x7b63f625)
NTDLL! LdrSetDllManifestProber + 0xF6 (0x7731bc56)
NTDLL! RtlUnwind + 0x1CB (0x7731813b)
    File: D:\runtime\src\coreclr\src\vm\eetwain.cpp Line: 3021
    Image: D:\runtime\artifacts\tests\coreclr\Windows_NT.x86.Checked\Tests\Core_Root\CoreRun.exe

Expected: 100
Actual: -1073740286
END EXECUTION - FAILED
FAILED

@AndyAyersMS
Member

I'm on a fairly old build, let me move up and retry.

@AndyAyersMS
Member

Moved up to ef72b95 and now can repro, intermittently.

@AndyAyersMS
Member

AndyAyersMS commented May 28, 2020

Have a theory about what's going on. The key managed method here is ComputeVtables.

When run normally, this test case doesn't run long enough for ComputeVtables to tier up. But when run under GCStress things slow down enough that the method gets rejitted.

The Tier0 code has the following prolog and GC info:

0845c628 55              push    ebp
0845c629 8bec            mov     ebp,esp
0845c62b 83ec74          sub     esp,74h
0845c62e c5f877          vzeroupper
0845c631 c5d857e4        vxorps  xmm4,xmm4,xmm4
0845c635 b8a0ffffff      mov     eax,0FFFFFFA0h

GC Info for method TestComWrappers:ComputeVtables(System.Object,int,byref):int:this
GC info size =  26
Method info block:
    method      size   = 01A8
    prolog      size   = 50
    epilog      size   =  6
    epilog     count   =  1
    epilog      end    = yes
    callee-saved regs  = EBP

The Tier1 method has:

0a368750 55              push    ebp
0a368751 8bec            mov     ebp,esp
0a368753 57              push    edi
0a368754 56              push    esi (gcstress)
0a368755 83ec3c          sub     esp,3Ch (gcstress)
0a368758 c5f877          vzeroupper (gcstress)
0a36875b c5d857e4        vxorps  xmm4,xmm4,xmm4 (gcstress)

Method info block:
    method      size   = 0101
    prolog      size   = 41
    epilog      size   =  9
    epilog     count   =  1
    epilog      end    = yes
    callee-saved regs  = EDI ESI EBP
    ebp frame          = yes

and as you can see we've just stress interrupted at the push edi.

When GC stress triggers and we discover we're in the prolog, on x86 only, we unwind the stack via UnwindStackFrame. Here we need to switch the unwinder over to the un-instrumented version of the code, because x86 unwinding does disassembly and GC stress has potentially munged the code. GC stress saves off the original version, so we just need to find that version; to do this, UnwindStackFrame calls GetSavedMethodCode(). However, this latter method was never adapted for tiering: it picks up the Tier0 version of the code from the method desc, rather than figuring out which native code version is in play and using that to find the appropriate version of code for unwinding.

So when we unwind we are using the Tier1 gc info and the Tier0 code, and this is what leads to the assert.

Should be fairly simple to fix, we just need to update GetSavedMethodCode() to be tiering-aware. Affected path is x86 only.

@AaronRobinsonMSFT
Member Author

@AndyAyersMS Thanks for the analysis. I do have a question about updating GetSavedMethodCode(). That method always seems to call GetStartAddress(), except under GC stress on non-64-bit hosts. Is that the branch you are talking about?

TADDR EECodeInfo::GetSavedMethodCode()
{
    CONTRACTL {
        // All EECodeInfo methods must be NOTHROW/GC_NOTRIGGER since they can
        // be used during GC.
        NOTHROW;
        GC_NOTRIGGER;
        HOST_NOCALLS;
        SUPPORTS_DAC;
    } CONTRACTL_END;
#ifndef HOST_64BIT
#if defined(HAVE_GCCOVER)
    _ASSERTE (!m_pMD->m_GcCover || GCStress<cfg_instr>::IsEnabled());
    if (GCStress<cfg_instr>::IsEnabled()
        && m_pMD->m_GcCover)
    {
        _ASSERTE(m_pMD->m_GcCover->savedCode);

        // Make sure we return the TADDR of savedCode here. The byte array is not marshaled automatically.
        // The caller is responsible for any necessary marshaling.
        return PTR_TO_MEMBER_TADDR(GCCoverageInfo, m_pMD->m_GcCover, savedCode);
    }
#endif //defined(HAVE_GCCOVER)
#endif
    return GetStartAddress();
}

TADDR EECodeInfo::GetStartAddress()
{
    CONTRACTL {
        NOTHROW;
        GC_NOTRIGGER;
        HOST_NOCALLS;
        SUPPORTS_DAC;
    } CONTRACTL_END;
    return m_pJM->JitTokenToStartAddress(m_methodToken);
}

@AndyAyersMS
Member

Yes, exactly that bit. It grabs the GC coverage info from the MethodDesc, but (typically?) that will be the coverage info for the initially jitted version of the code.

Simple fix attempt hits a lock level violation, so might need to wrangle things around a bit.

@AndyAyersMS
Member

@kouvel any suggestions on how best to fix this?

@AaronRobinsonMSFT
Member Author

@kouvel The lock issue I am seeing is below. I am unsure why CrstReadyToRunEntryPointToMethodDescMap is held in this code path.

Assert failure(PID 16860 [0x000041dc], Thread: 10772 [0x2a14]): Consistency check failed: Crst Level violation: Can't take level 10 lock CrstCodeVersioning because you already holding level 4 lock CrstReadyToRunEntryPointToMethodDescMap
FAILED: false

CORECLR! CHECK::Trigger + 0x310 (0x502c85d3)
CORECLR! CrstBase::IsSafeToTake + 0x30A (0x50323af9)
CORECLR! CrstBase::Enter + 0x152 (0x503235cf)
CORECLR! CrstBase::AcquireLock + 0xD (0x502f18c0)
CORECLR! CodeVersionManager::LockHolder::LockHolder + 0x25 (0x5032c20f)
CORECLR! EECodeInfo::GetNativeCodeVersion + 0xC7 (0x50417d62)
CORECLR! EECodeInfo::GetSavedMethodCode + 0x8C (0x50418223)
CORECLR! UnwindStackFrame + 0xD3 (0x506715e3)
CORECLR! StackFrameIterator::NextRaw + 0x5BD (0x50388adb)
CORECLR! StackFrameIterator::Next + 0x46 (0x503884da)
    File: D:\runtime\src\coreclr\src\vm\crst.cpp Line: 766
    Image: D:\runtime\artifacts\tests\coreclr\Windows_NT.x86.Checked\Tests\Core_Root\CoreRun.exe

@AaronRobinsonMSFT
Member Author

The following is taking the CrstReadyToRunEntryPointToMethodDescMap lock.

void ReadyToRunInfo::SetMethodDescForEntryPointInNativeImage(PCODE entryPoint, MethodDesc *methodDesc)
{
    CONTRACTL
    {
        PRECONDITION(!m_isComponentAssembly);
    }
    CONTRACTL_END;

    CrstHolder ch(&m_Crst);

    if ((TADDR)m_entryPointToMethodDescMap.LookupValue(PCODEToPINSTR(entryPoint), (LPVOID)PCODEToPINSTR(entryPoint)) == (TADDR)INVALIDENTRY)
    {
        m_entryPointToMethodDescMap.InsertValue(PCODEToPINSTR(entryPoint), methodDesc);
    }
}

@kouvel
Member

kouvel commented May 28, 2020

Hmm, I'm not sure why CrstReadyToRunEntryPointToMethodDescMap would be held during the unwind. The code only seems to initialize or modify the hash map under that lock.

ghost locked as resolved and limited conversation to collaborators Dec 9, 2020