Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test failure: Microsoft.Extensions.Hosting.WindowsServiceLifetimeTests.ExceptionOnStartIsPropagated #107671

Closed
v-wenyuxu opened this issue Sep 11, 2024 · 21 comments · Fixed by #107820
Assignees
Labels
arch-arm64 area-Extensions-Hosting blocking-clean-ci-optional Blocking optional rolling runs in-pr There is an active PR which will close this issue when it is merged JitStress CLR JIT issues involving JIT internal stress modes os-windows
Milestone

Comments

@v-wenyuxu
Copy link

Failed in: runtime-coreclr libraries-jitstress 20240910.1

Failed tests:

net9.0-windows-Release-arm64-jitstress1_tiered-Windows.11.Arm64.Open
    - Microsoft.Extensions.Hosting.WindowsServiceLifetimeTests.ExceptionOnStartIsPropagated
net9.0-windows-Release-arm64-jitminopts-Windows.11.Arm64.Open
    - Microsoft.Extensions.Hosting.WindowsServiceLifetimeTests.ExceptionOnStartIsPropagated

Error message:

 Microsoft.DotNet.RemoteExecutor.RemoteExecutionException : Half-way through waiting for remote process.
Memory load: 38

Image Name                     PID Services                                    
========================= ======== ============================================
svchost.exe                    928 BrokerInfrastructure, DcomLaunch, Power,    
                                   SystemEventsBroker                          
svchost.exe                    476 RpcEptMapper, RpcSs                         
svchost.exe                    528 LSM                                         
svchost.exe                   1060 TermService                                 
svchost.exe                   1068 CryptSvc                                    
svchost.exe                   1120 lmhosts                                     
svchost.exe                   1240 NcbService                                  
svchost.exe                   1248 TimeBrokerSvc                               
svchost.exe                   1308 AppIDSvc                                    
svchost.exe                   1444 nsi                                         
svchost.exe                   1500 Schedule                                    
svchost.exe                   1520 netprofm                                    
svchost.exe                   1548 CoreMessagingRegistrar                      
svchost.exe                   1568 ProfSvc                                     
svchost.exe                   1612 UmRdpService                                
svchost.exe                   1684 UserManager                                 
svchost.exe                   1752 vmicheartbeat                               
svchost.exe                   1852 vmickvpexchange                             
svchost.exe                   1900 vmicshutdown                                
svchost.exe                   1916 LanmanWorkstation                           
svchost.exe                   1968 vmictimesync                                
svchost.exe                   2000 DispBrokerDesktopSvc                        
svchost.exe                   2088 Dnscache                                    
svchost.exe                   2172 EventLog                                    
svchost.exe                   2204 SessionEnv                                  
svchost.exe                   2272 CertPropSvc                                 
svchost.exe                   2324 EventSystem                                 
svchost.exe                   2336 SysMain                                     
svchost.exe                   2364 Themes                                      
svchost.exe                   2464 Dhcp                                        
svchost.exe                   2548 SENS                                        
svchost.exe                   2632 AudioEndpointBuilder                        
svchost.exe                   2664 FontCache                                   
svchost.exe                   2720 Audiosrv                                    
svchost.exe                   2768 WinHttpAutoProxySvc                         
svchost.exe                   2844 TextInputManagementService                  
svchost.exe                   2972 DusmSvc                                     
svchost.exe                   3008 Wcmsvc                                      
svchost.exe                   3048 ShellHWDetection                            
svchost.exe                   3088 BFE, mpssvc                                 
svchost.exe                   3184 DiagTrack                                   
svchost.exe                   3196 DPS                                         
svchost.exe                   3204 iphlpsvc                                    
svchost.exe                   3288 StateRepository                             
svchost.exe                   3312 LanmanServer                      

Stack trace:

   at Microsoft.DotNet.RemoteExecutor.RemoteInvokeHandle.Dispose(Boolean disposing) in /_/src/Microsoft.DotNet.RemoteExecutor/src/RemoteInvokeHandle.cs:line 225
   at Microsoft.DotNet.RemoteExecutor.RemoteInvokeHandle.Dispose() in /_/src/Microsoft.DotNet.RemoteExecutor/src/RemoteInvokeHandle.cs:line 55
   at Microsoft.Extensions.Hosting.WindowsServiceTester.Dispose(Boolean disposing) in /_/src/libraries/Microsoft.Extensions.Hosting.WindowsServices/tests/WindowsServiceTester.cs:line 147
   at System.ComponentModel.Component.Dispose() in /_/src/libraries/System.ComponentModel.Primitives/src/System/ComponentModel/Component.cs:line 73
   at Microsoft.Extensions.Hosting.WindowsServiceLifetimeTests.ExceptionOnStartIsPropagated() in /_/src/libraries/Microsoft.Extensions.Hosting.WindowsServices/tests/WindowsServiceLifetimeTests.cs:line 85
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Void** arguments, Signature sig, Boolean isConstructor)
   at System.Reflection.MethodBaseInvoker.InvokeWithNoArgs(Object obj, BindingFlags invokeAttr) in /_/src/libraries/System.Private.CoreLib/src/System/Reflection/MethodBaseInvoker.cs:line 57
@v-wenyuxu v-wenyuxu added arch-arm64 os-windows JitStress CLR JIT issues involving JIT internal stress modes blocking-clean-ci-optional Blocking optional rolling runs labels Sep 11, 2024
@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Sep 11, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Sep 11, 2024
@jeffschwMSFT jeffschwMSFT added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Sep 11, 2024
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@JulieLeeMSFT JulieLeeMSFT removed the untriaged New issue has not been triaged by the area owner label Sep 11, 2024
@JulieLeeMSFT JulieLeeMSFT added this to the 9.0.0 milestone Sep 11, 2024
@JulieLeeMSFT JulieLeeMSFT removed the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Sep 11, 2024
@amanasifkhalid
Copy link
Member

Looks like the same issue as #107589

@kunalspathak
Copy link
Member

It seems this was recently in the past #105377, but was closed for some reason. @buyaa-n do you remember?

Don't know what team owns it, it could be just network failure, whatever it is not runtime issue, so closing

@kunalspathak
Copy link
Member

#88027 also seems to be related which was getting reported since last year and the specific test case was disabled, but the underlying issue is not fixed.

@kunalspathak
Copy link
Member

Alright, #106494 reported the real issue, but even that was closed and is getting tracked in #106371, that @kouvel said was fixed. I am assigning it to Kount, because the underlying test failure is identical to other issues.

@kouvel
Copy link
Member

kouvel commented Sep 11, 2024

I don't think this is the same as #106494 or #106371. The error appears to be a timeout in a test that uses RemoteExecutor, but it looks like a different test. The test associated with #106371 was a new test and was disabled.

@buyaa-n
Copy link
Contributor

buyaa-n commented Sep 11, 2024

It seems this was recently in the past #105377, but was closed for some reason. @buyaa-n do you remember?

Don't know what team owns it, it could be just network failure, whatever it is not runtime issue, so closing

Basically, closed because the WindowsServiceLifetimeTests test was passed and the failure happened in RemoteExecutor which might be just a network issue

@kouvel kouvel removed their assignment Sep 11, 2024
@kunalspathak
Copy link
Member

not able to repro this locally.

image

@v-wenyuxu
Copy link
Author

Failed in: runtime-coreclr libraries-jitstress 20240911.1

Failed tests:

net9.0-windows-Release-arm64-jitminopts-Windows.11.Arm64.Open
    - Microsoft.Extensions.Hosting.WindowsServiceLifetimeTests.ExceptionOnStartIsPropagated

Error message:

 Microsoft.DotNet.RemoteExecutor.RemoteExecutionException : Half-way through waiting for remote process.
Memory load: 38

Image Name                     PID Services                                    
========================= ======== ============================================
svchost.exe                    932 BrokerInfrastructure, DcomLaunch, Power,    
                                   SystemEventsBroker                          
svchost.exe                    468 RpcEptMapper, RpcSs                         
svchost.exe                    548 LSM                                         
svchost.exe                   1072 CryptSvc                                    
svchost.exe                   1080 TermService                                 
svchost.exe                   1136 lmhosts                                     
svchost.exe                   1268 NcbService                                  
svchost.exe                   1316 AppIDSvc                                    
svchost.exe                   1340 TimeBrokerSvc                               
svchost.exe                   1424 nsi                                         
svchost.exe                   1508 CoreMessagingRegistrar                      
svchost.exe                   1516 netprofm                                    
svchost.exe                   1540 Schedule                                    
svchost.exe                   1564 UmRdpService                                
svchost.exe                   1600 ProfSvc                                     
svchost.exe                   1772 UserManager                                 
svchost.exe                   1824 LanmanWorkstation                           
svchost.exe                   1880 vmicheartbeat                               
svchost.exe                   1920 vmickvpexchange                             
svchost.exe                   1968 vmicshutdown                                
svchost.exe                   1988 DispBrokerDesktopSvc                        
svchost.exe                   2028 vmictimesync                                
svchost.exe                   1580 SessionEnv                                  
svchost.exe                   2132 Dnscache                                    
svchost.exe                   2184 CertPropSvc                                 
svchost.exe                   2224 EventLog                                    
svchost.exe                   2304 SysMain                                     
svchost.exe                   2312 EventSystem                                 
svchost.exe                   2332 Themes                                      
svchost.exe                   2548 SENS                                        
svchost.exe                   2576 Dhcp                                        
svchost.exe                   2644 AudioEndpointBuilder                        
svchost.exe                   2664 FontCache                                   
svchost.exe                   2740 WinHttpAutoProxySvc                         
svchost.exe                   2788 Audiosrv                                    
svchost.exe                   2872 TextInputManagementService                  
svchost.exe                   2940 DusmSvc                                     
svchost.exe                   2948 Wcmsvc                                      
svchost.exe                   3036 ShellHWDetection                            
svchost.exe                   2468 BFE, mpssvc                                 
svchost.exe                   3176 DiagTrack                                   
svchost.exe                   3184 DPS                                         
svchost.exe                   3192 iphlpsvc                                    
svchost.exe                   3252 LanmanServer                                
svchost.exe                   3292 StateRepository                   

Stack trace:

   at Microsoft.DotNet.RemoteExecutor.RemoteInvokeHandle.Dispose(Boolean disposing) in /_/src/Microsoft.DotNet.RemoteExecutor/src/RemoteInvokeHandle.cs:line 225
   at Microsoft.DotNet.RemoteExecutor.RemoteInvokeHandle.Dispose() in /_/src/Microsoft.DotNet.RemoteExecutor/src/RemoteInvokeHandle.cs:line 55
   at Microsoft.Extensions.Hosting.WindowsServiceTester.Dispose(Boolean disposing) in /_/src/libraries/Microsoft.Extensions.Hosting.WindowsServices/tests/WindowsServiceTester.cs:line 147
   at System.ComponentModel.Component.Dispose() in /_/src/libraries/System.ComponentModel.Primitives/src/System/ComponentModel/Component.cs:line 73
   at Microsoft.Extensions.Hosting.WindowsServiceLifetimeTests.ExceptionOnStartIsPropagated() in /_/src/libraries/Microsoft.Extensions.Hosting.WindowsServices/tests/WindowsServiceLifetimeTests.cs:line 85
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Void** arguments, Signature sig, Boolean isConstructor)
   at System.Reflection.MethodBaseInvoker.InvokeWithNoArgs(Object obj, BindingFlags invokeAttr) in /_/src/libraries/System.Private.CoreLib/src/System/Reflection/MethodBaseInvoker.cs:line 57

@kunalspathak
Copy link
Member

@buyaa-n - do you think we need similar fix as in #105666 for ExceptionOnStartIsPropagated because the failure is similar to #84438 ?

@buyaa-n
Copy link
Contributor

buyaa-n commented Sep 12, 2024

Yeah, the failure looks similar #84438, I think it might need fix similar to #84447 CC @ericstj

@kunalspathak
Copy link
Member

This test started failing very recently. The test ran on 9/6 without problem https://dev.azure.com/dnceng-public/public/_build/results?buildId=800541&view=results. So definitely do not need a backport.

@kunalspathak kunalspathak modified the milestones: 9.0.0, 10.0.0 Sep 12, 2024
@ericstj
Copy link
Member

ericstj commented Sep 12, 2024

Here remote executor is actually tracking the service process running. I've been seeing failures of these tests from JIT stress Arm64 pipelines (only) that don't reproduce on local x64 machine, nor arm64 lab machines.

I suspect something about the queue or machine setup for these JIT stress runs is busted.

The stack you're seeing here is just the remote executor stack that's waiting for the service process to exit. To understand what's actually going on you need to see what the service process is doing.

If I look at https://helixre8s23ayyeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-heads-main-212737db1d264eb29f/Microsoft.Extensions.Hosting.WindowsServices.Tests/1/console.534963dd.log?helixlogtype=result I see a dump was created. I'm having trouble getting from helix though.

@ericstj
Copy link
Member

ericstj commented Sep 12, 2024

So the next action would be to look at that dump and answer the question "Why is this process hung?"

You can use runfo to get all the files from the test:

dotnet tool install --global --prerelease runfo
runfo get-helix-payload -j 212737db-1d26-4eb2-9f07-e787360695d7 -w Microsoft.Extensions.Hosting.WindowsServices.Tests -o .\

I had a look and none of the tests seem to be running any user code. I can't see any of the managed stack info (how can I load SOS when debugging ARM64 on amd64?)

Here's all the native stacks
 # RetAddr               : Args to Child                                                           : Call Site
00 00007ffe`71424270     : 00000001`7637ce00 00000001`08207004 00000198`c1827850 00007ffe`08a1a320 : ntdll!NtWaitForMultipleObjects+0x4
01 00007ffe`082ed644     : 00000001`7637ce00 00000001`08207004 00000198`c1827850 00007ffe`08a1a320 : KERNELBASE!WaitForMultipleObjectsEx+0xd0
02 00007ffe`082edd38     : 00000021`7637d1d0 00007ffe`082edd38 00000000`00000001 00000198`c18244f0 : coreclr!Thread::DoAppropriateAptStateWait+0x10c [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 3187] 
03 00007ffe`082ed7c8     : 00000001`c1750300 00000000`00000001 00000198`c18244f0 00000001`00000000 : coreclr!Thread::DoAppropriateWaitWorker+0x508 [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 3367] 
04 (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : coreclr!Thread::DoAppropriateWait::__l11::__Body::Run+0x1c [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 3036] 
05 00007ffe`0843a38c     : 00000021`7637d2e0 00007ffe`0843a38c 00000021`00000001 00000021`7637d3c0 : coreclr!Thread::DoAppropriateWait+0x150 [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 3038] 
06 (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : coreclr!CLREventBase::WaitEx+0xdc [D:\a\_work\1\s\src\coreclr\vm\synch.cpp @ 459] 
07 00007ffe`082eae28     : 00000021`7637d330 00007ffe`082eae28 00000021`7637d3c0 00007ffe`0883f000 : coreclr!CLREventBase::Wait+0xf4 [D:\a\_work\1\s\src\coreclr\vm\synch.cpp @ 412] 
08 (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : coreclr!Thread::Wait+0x3c [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 3850] 
09 00007ffe`082e8ef4     : 00000021`7637d360 00007ffe`082e8ef4 000001d9`577be288 000001d9`577be2a0 : coreclr!Thread::Block+0xb0 [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 3807] 
0a 00007ffe`082e8a24     : 00000021`7637d4b0 00007ffe`082e8a24 00000198`ffffffff 00000198`c1824400 : coreclr!SyncBlock::Wait+0x4b4 [D:\a\_work\1\s\src\coreclr\vm\syncblk.cpp @ 2879] 
0b 00007ffe`08686ba4     : 00000021`7637d4e0 00007ffe`08686ba4 00007ffe`088eb900 00000198`c1827850 : coreclr!ObjHeader::Wait+0x84 [D:\a\_work\1\s\src\coreclr\vm\syncblk.cpp @ 2265] 
0c (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : coreclr!Object::Wait+0xc [D:\a\_work\1\s\src\coreclr\vm\object.h @ 338] 
0d 00007ffe`05b29dec     : 00000021`7637d590 00007ffe`05b29dec 00000021`7637d5c8 00000021`7637d428 : coreclr!Monitor_Wait+0x144 [D:\a\_work\1\s\src\coreclr\classlibnative\bcltype\objectnative.cpp @ 149] 
0e 00007ffe`05b422fc     : 00000021`7637d670 00007ffe`05b422fc 00000000`00000000 068d99f8`00000000 : System_Private_CoreLib+0x4d9dec
0f 00007ffe`05b6f308     : 00000021`7637d6f0 00007ffe`05b6f308 00000198`c4c30288 00000198`c4c30298 : System_Private_CoreLib+0x4f22fc
10 00007ffe`05b6efdc     : 00000021`7637d750 00007ffe`05b6efdc 00000198`c4c30270 00000198`c4c301e0 : System_Private_CoreLib+0x51f308
11 00007ffe`05c00698     : 00000021`7637d7a0 00007ffe`05c00698 00000198`c4c301e0 00000000`00000000 : System_Private_CoreLib+0x51efdc
12 00007ffd`a8cfd6b8     : 00000021`7637d7d0 00007ffd`a8cfd6b8 00000021`7637d478 00000021`7637dc50 : System_Private_CoreLib+0x5b0698
13 00000000`00000000     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x00007ffd`a8cfd6b8
 # RetAddr               : Args to Child                                                           : Call Site
00 00007ffe`71424270     : 00000001`00000020 00000000`c17f0140 00000000`00000021 00000000`00000001 : ntdll!NtWaitForMultipleObjects+0x4
01 00007ffe`0860a4c4     : 00000001`00000020 00000000`c17f0140 00000000`00000021 00000000`00000001 : KERNELBASE!WaitForMultipleObjectsEx+0xd0
02 00007ffe`085743d4     : 00000000`00008018 00000000`ffffffff 00000198`c1847630 00000000`00000190 : coreclr!ds_ipc_poll+0x184 [D:\a\_work\1\s\src\native\eventpipe\ds-ipc-pal-namedpipe.c @ 270] 
03 00007ffe`08587098     : 00000000`00000000 00000000`00000000 00000000`ffffffff 00000198`c1847630 : coreclr!ds_ipc_stream_factory_get_next_available_stream+0x344 [D:\a\_work\1\s\src\native\eventpipe\ds-ipc.c @ 402] 
04 00007ffe`74142310     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : coreclr!server_thread+0xd8 [D:\a\_work\1\s\src\native\eventpipe\ds-server.c @ 129] 
05 00007ffe`75715b2c     : 00000021`767ffbe0 00007ffe`75715b2c 00000000`00000000 00000000`00000000 : kernel32!BaseThreadInitThunk+0x30
06 00000000`00000000     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x3c
 # RetAddr               : Args to Child                                                           : Call Site
00 00007ffe`71424270     : 00000003`00000000 00000000`00000021 00000198`c17f0140 00000198`c193d000 : ntdll!NtWaitForMultipleObjects+0x4
01 00007ffe`08603400     : 00000003`00000000 00000000`00000021 00000198`c17f0140 00000198`c193d000 : KERNELBASE!WaitForMultipleObjectsEx+0xd0
02 00007ffe`08604a24     : 00000021`7697fc40 00000198`c1822f20 00000021`7697fbd8 00007ffe`00000000 : coreclr!DebuggerRCThread::MainLoop+0x188 [D:\a\_work\1\s\src\coreclr\debug\ee\rcthread.cpp @ 927] 
03 00007ffe`08604b5c     : 00000021`7697fd40 00007ffe`08604b5c ffffffff`fffffffe 00000198`c1822f20 : coreclr!DebuggerRCThread::ThreadProc+0x34c [D:\a\_work\1\s\src\coreclr\debug\ee\rcthread.cpp @ 730] 
04 00007ffe`74142310     : 00000021`7697fd60 00007ffe`74142310 00000000`00000000 00000000`00000000 : coreclr!DebuggerRCThread::ThreadProcStatic+0x7c [D:\a\_work\1\s\src\coreclr\debug\ee\rcthread.cpp @ 1321] 
05 00007ffe`75715b2c     : 00000021`7697fda0 00007ffe`75715b2c 00000000`00000000 00000000`00000000 : kernel32!BaseThreadInitThunk+0x30
06 00000000`00000000     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x3c
 # RetAddr               : Args to Child                                                           : Call Site
00 00007ffe`71424270     : 00000002`087ff6b0 00000000`c18453a0 00000021`76aff6c0 00007ffe`08349fd4 : ntdll!NtWaitForMultipleObjects+0x4
01 00007ffe`0837a8c8     : 00000002`087ff6b0 00000000`c18453a0 00000021`76aff6c0 00007ffe`08349fd4 : KERNELBASE!WaitForMultipleObjectsEx+0xd0
02 00007ffe`08379c2c     : 00000021`76affc30 00007ffe`08379c2c 00007ffe`08800000 00000198`c18453a0 : coreclr!FinalizerThread::WaitForFinalizerEvent+0xc0 [D:\a\_work\1\s\src\coreclr\vm\finalizerthread.cpp @ 232] 
03 00007ffe`082f192c     : 00000000`00000000 00000198`c1934000 00007ffe`759ea2c0 00000198`c17f02c0 : coreclr!FinalizerThread::FinalizerThreadWorker+0x1ec [D:\a\_work\1\s\src\coreclr\vm\finalizerthread.cpp @ 321] 
04 (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : coreclr!ManagedThreadBase_DispatchInner+0x18 [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 7100] 
05 00007ffe`082f3440     : 00000021`76affd90 00007ffe`082f3440 00000198`c18a7700 00000198`c18453a0 : coreclr!ManagedThreadBase_DispatchMiddle+0x9c [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 7144] 
06 (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : coreclr!ManagedThreadBase_DispatchOuter::__l11::__Body::Run::__l5::__Body::Run+0x8 [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 7302] 
07 00007ffe`082f1a0c     : 00000021`76affde0 00007ffe`082f1a0c 00000198`00000000 00000021`76affdf8 : coreclr!`ManagedThreadBase_DispatchOuter'::`11'::__Body::Run+0x28 [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 7304] 
08 00007ffe`082ef070     : 00000021`76affe60 00007ffe`082ef070 00007ffe`00000001 00000021`76affe10 : coreclr!ManagedThreadBase_DispatchOuter+0xb4 [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 7326] 
09 (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : coreclr!ManagedThreadBase_NoADTransition+0x18 [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 7372] 
0a 00007ffe`0837964c     : 00000021`76affe90 00007ffe`0837964c 00007ffe`08379a40 00000021`76affe70 : coreclr!ManagedThreadBase::FinalizerBase+0x20 [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 7391] 
0b 00007ffe`74142310     : 00000021`76affef0 00007ffe`74142310 00000000`00000000 00000000`00000000 : coreclr!FinalizerThread::FinalizerThreadStart+0x18c [D:\a\_work\1\s\src\coreclr\vm\finalizerthread.cpp @ 464] 
0c 00007ffe`75715b2c     : 00000021`76afff30 00007ffe`75715b2c 00000000`00000000 00000000`00000000 : kernel32!BaseThreadInitThunk+0x30
0d 00000000`00000000     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x3c
 # RetAddr               : Args to Child                                                           : Call Site
00 00007ffe`71401e48     : 00000021`76dff3f0 00007ffe`7353a464 00000000`00000360 00000000`00000360 : ntdll!NtWaitForSingleObject+0x4
01 00007ffe`7353a464     : 00000021`76dff3f0 00007ffe`7353a464 00000000`00000360 00000000`00000360 : KERNELBASE!WaitForSingleObjectEx+0x88
02 00007ffe`73539c04     : 00000000`00000000 00000000`00000360 00000000`00000000 00000000`00000c09 : sechost!RegisterServiceCtrlHandlerExW+0xc04
03 00007ffe`73539830     : 00000021`76dff468 00007ffe`00000001 000001d8`00000000 00000000`00000000 : sechost!RegisterServiceCtrlHandlerExW+0x3a4
04 00007ffd`a8d008b8     : 00000021`76dff610 00007ffd`a8d008b8 00000198`c188d700 00000000`00000000 : sechost!StartServiceCtrlDispatcherW+0x80
05 00000000`00000000     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x00007ffd`a8d008b8
 # RetAddr               : Args to Child                                                           : Call Site
00 00007ffe`71424270     : 00000001`ffffffff 00000001`ffffffff ffffff00`00000000 00000003`00000000 : ntdll!NtWaitForMultipleObjects+0x4
01 00007ffe`082ed644     : 00000001`ffffffff 00000001`ffffffff ffffff00`00000000 00000003`00000000 : KERNELBASE!WaitForMultipleObjectsEx+0xd0
02 00007ffe`082edd38     : 00000021`772bf740 00007ffe`082edd38 00000000`00000001 00000021`772bf860 : coreclr!Thread::DoAppropriateAptStateWait+0x10c [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 3187] 
03 00007ffe`082ed7c8     : 00000001`c1750300 00000000`00000001 00000021`772bf860 00000003`00000060 : coreclr!Thread::DoAppropriateWaitWorker+0x508 [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 3367] 
04 (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : coreclr!Thread::DoAppropriateWait::__l11::__Body::Run+0x1c [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 3036] 
05 00007ffe`0834b998     : 00000021`772bf850 00007ffe`0834b998 00000021`00000001 00000000`00000000 : coreclr!Thread::DoAppropriateWait+0x150 [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 3038] 
06 00007ffe`05b31594     : 00000021`772bf8e0 00007ffe`05b31594 00000000`000003d0 00000021`772bf920 : coreclr!WaitHandle_WaitOneCore+0x128 [D:\a\_work\1\s\src\coreclr\vm\comwaithandle.cpp @ 29] 
07 00007ffe`05b5fd88     : 00000021`772bf9d0 00007ffe`05b5fd88 00000021`772bfe60 00007ffe`068d9cc8 : System_Private_CoreLib+0x4e1594
08 00007ffe`081b1e64     : 00000021`772bfb10 00007ffe`081b1e64 00000000`00000000 00000000`00000000 : System_Private_CoreLib+0x50fd88
09 00007ffe`0832c98c     : 00000021`772bfb30 00007ffe`0832c98c 000001d9`57678cb0 00000000`00000000 : coreclr!CallDescrWorkerInternal+0x84 [D:\a\_work\1\s\artifacts\obj\coreclr\windows.arm64.Checked\vm\wks\CallDescrWorkerARM64.asm @ 5766] 
0a 00007ffe`0832d1d8     : 00000021`772bfc20 00007ffe`0832d1d8 00000000`00000000 00000021`772bfc70 : coreclr!CallDescrWorkerWithHandler+0x124 [D:\a\_work\1\s\src\coreclr\vm\callhelpers.cpp @ 63] 
0b 00007ffe`083439f4     : 00000000`00000000 00007ffe`745f4228 00000021`772bfcb0 00007ffe`00000000 : coreclr!DispatchCallSimple+0xf8 [D:\a\_work\1\s\src\coreclr\vm\callhelpers.cpp @ 248] 
0c 00007ffe`082f192c     : 00000198`c4c34750 00000198`c4c34750 00000198`c4c34750 000001d9`57678cb0 : coreclr!ThreadNative::KickOffThread_Worker+0x124 [D:\a\_work\1\s\src\coreclr\vm\comsynchronizable.cpp @ 148] 
0d (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : coreclr!ManagedThreadBase_DispatchInner+0x18 [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 7100] 
0e 00007ffe`082f3440     : 00000021`772bfde0 00007ffe`082f3440 00000198`c18a7700 000001d9`57678cb0 : coreclr!ManagedThreadBase_DispatchMiddle+0x9c [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 7144] 
0f (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : coreclr!ManagedThreadBase_DispatchOuter::__l11::__Body::Run::__l5::__Body::Run+0x8 [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 7302] 
10 00007ffe`082f1a0c     : 00000021`772bfe30 00007ffe`082f1a0c 000001d9`00000000 00000021`772bfe48 : coreclr!`ManagedThreadBase_DispatchOuter'::`11'::__Body::Run+0x28 [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 7304] 
11 00007ffe`082f1604     : 00000021`772bfeb0 00007ffe`082f1604 00000000`00000001 00000021`772bfe60 : coreclr!ManagedThreadBase_DispatchOuter+0xb4 [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 7326] 
12 (Inline Function)     : --------`-------- --------`-------- --------`-------- --------`-------- : coreclr!ManagedThreadBase_FullTransition+0x14 [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 7348] 
13 00007ffe`08343888     : 00000021`772bff60 00007ffe`08343888 00007ffe`083438d0 00000021`772bff10 : coreclr!ManagedThreadBase::KickOff+0x1c [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 7383] 
14 00007ffe`74142310     : 00000000`00000000 00000000`00000000 00007ffe`08a12000 00007ffe`08864a60 : coreclr!ThreadNative::KickOffThread+0x188 [D:\a\_work\1\s\src\coreclr\vm\comsynchronizable.cpp @ 221] 
15 00007ffe`75715b2c     : 00000021`772bffd0 00007ffe`75715b2c 00000000`00000000 00000000`00000000 : kernel32!BaseThreadInitThunk+0x30
16 00000000`00000000     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x3c

@kunalspathak
Copy link
Member

I had a look and none of the tests seem to be running any user code. I can't see any of the managed stack info (how can I load SOS when debugging ARM64 on amd64?)

Yes, I was at similar conclusion when looking at the trace of this one and #107589. Additionally, I couldn't get SOS to load in my windbg to confirm what is going on.

@v-wenyuxu
Copy link
Author

Failed in: runtime-coreclr libraries-jitstress 20240912.1

Failed tests:

net9.0-windows-Release-arm64-jitminopts-Windows.11.Arm64.Open
    - Microsoft.Extensions.Hosting.WindowsServiceLifetimeTests.ExceptionOnStartIsPropagated

Error message:

 Microsoft.DotNet.RemoteExecutor.RemoteExecutionException : Half-way through waiting for remote process.
Memory load: 35

Image Name                     PID Services                                    
========================= ======== ============================================
svchost.exe                    928 BrokerInfrastructure, DcomLaunch, Power,    
                                   SystemEventsBroker                          
svchost.exe                    488 RpcEptMapper, RpcSs                         
svchost.exe                    548 LSM                                         
svchost.exe                   1048 CryptSvc                                    
svchost.exe                   1064 TermService                                 
svchost.exe                   1132 lmhosts                                     
svchost.exe                   1160 NcbService                                  
svchost.exe                   1288 nsi                                         
svchost.exe                   1296 AppIDSvc                                    
svchost.exe                   1344 TimeBrokerSvc                               
svchost.exe                   1416 netprofm                                    
svchost.exe                   1516 Schedule                                    
svchost.exe                   1552 ProfSvc                                     
svchost.exe                   1560 vmicheartbeat                               
svchost.exe                   1580 vmickvpexchange                             
svchost.exe                   1596 vmicshutdown                                
svchost.exe                   1644 UmRdpService                                
svchost.exe                   1656 vmictimesync                                
svchost.exe                   1712 CoreMessagingRegistrar                      
svchost.exe                   1856 UserManager                                 
svchost.exe                   2016 EventLog                                    
svchost.exe                   1764 LanmanWorkstation                           
svchost.exe                   1852 SysMain                                     
svchost.exe                   1788 EventSystem                                 
svchost.exe                   2068 Themes                                      
svchost.exe                   2136 Dnscache                                    
svchost.exe                   2236 DispBrokerDesktopSvc                        
svchost.exe                   2252 SENS                                        
svchost.exe                   2340 AudioEndpointBuilder                        
svchost.exe                   2360 FontCache                                   
svchost.exe                   2388 SessionEnv                                  
svchost.exe                   2480 CertPropSvc                                 
svchost.exe                   2508 Audiosrv                                    
svchost.exe                   2588 Dhcp                                        
svchost.exe                   2652 TextInputManagementService                  
svchost.exe                   2752 DusmSvc                                     
svchost.exe                   2776 Wcmsvc                                      
svchost.exe                   2852 ShellHWDetection                            
svchost.exe                   2904 WinHttpAutoProxySvc                         
svchost.exe                   1196 BFE, mpssvc                                 
svchost.exe                   3160 DiagTrack                                   
svchost.exe                   3172 iphlpsvc                                    
svchost.exe                   3168 DPS                                         
svchost.exe                   3248 StateRepository                             
svchost.exe                   3288 LanmanServer                      

Stack trace:

   at Microsoft.DotNet.RemoteExecutor.RemoteInvokeHandle.Dispose(Boolean disposing) in /_/src/Microsoft.DotNet.RemoteExecutor/src/RemoteInvokeHandle.cs:line 225
   at Microsoft.DotNet.RemoteExecutor.RemoteInvokeHandle.Dispose() in /_/src/Microsoft.DotNet.RemoteExecutor/src/RemoteInvokeHandle.cs:line 55
   at Microsoft.Extensions.Hosting.WindowsServiceTester.Dispose(Boolean disposing) in /_/src/libraries/Microsoft.Extensions.Hosting.WindowsServices/tests/WindowsServiceTester.cs:line 147
   at System.ComponentModel.Component.Dispose() in /_/src/libraries/System.ComponentModel.Primitives/src/System/ComponentModel/Component.cs:line 73
   at Microsoft.Extensions.Hosting.WindowsServiceLifetimeTests.ExceptionOnStartIsPropagated() in /_/src/libraries/Microsoft.Extensions.Hosting.WindowsServices/tests/WindowsServiceLifetimeTests.cs:line 85
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Void** arguments, Signature sig, Boolean isConstructor)
   at System.Reflection.MethodBaseInvoker.InvokeWithNoArgs(Object obj, BindingFlags invokeAttr) in /_/src/libraries/System.Private.CoreLib/src/System/Reflection/MethodBaseInvoker.cs:line 57

@kunalspathak
Copy link
Member

Something odd going since last month, given that we had to point fix the tests to make them pass on jit stress pipelines

@ericstj
Copy link
Member

ericstj commented Sep 13, 2024

Ok, so I got @hoyosjs's help to get the managed stacks. I had to build an amd64_arm64 cross dac and then tell the debugger to use it. Here's the managed state:

0:004> !clrstack -all
OS Thread Id: 0x1f4
        Child SP               IP Call Site
000000217637D5C8 00007ffe756af924 [InlinedCallFrame: 000000217637d5c8] 
000000217637D5C8 00007ffe05b29dcc [InlinedCallFrame: 000000217637d5c8] 
000000217637D590 00007ffe05b29dcc System.Threading.Monitor.Wait(System.Object, Int32) [/_/src/coreclr/System.Private.CoreLib/src/System/Threading/Monitor.CoreCLR.cs @ 157]
000000217637D670 00007ffe05b422fc System.Threading.ManualResetEventSlim.Wait(Int32, System.Threading.CancellationToken) [/_/src/libraries/System.Private.CoreLib/src/System/Threading/ManualResetEventSlim.cs @ 581]
000000217637D6F0 00007ffe05b6f308 System.Threading.Tasks.Task.SpinThenBlockingWait(Int32, System.Threading.CancellationToken) [/_/src/libraries/System.Private.CoreLib/src/System/Threading/Tasks/Task.cs @ 3073]
000000217637D750 00007ffe05b6efdc System.Threading.Tasks.Task.InternalWaitCore(Int32, System.Threading.CancellationToken) [/_/src/libraries/System.Private.CoreLib/src/System/Threading/Tasks/Task.cs @ 3001]
000000217637D7A0 00007ffe05c00698 System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task, System.Threading.Tasks.ConfigureAwaitOptions) [/_/src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/TaskAwaiter.cs @ 104]
000000217637D7D0 00007ffda8cfd6b8 Microsoft.Extensions.Hosting.HostingAbstractionsHostExtensions.Run(Microsoft.Extensions.Hosting.IHost) [/_/src/libraries/Microsoft.Extensions.Hosting.Abstractions/src/HostingAbstractionsHostExtensions.cs @ 53]
000000217637D800 00007ffda8cb9c8c Microsoft.Extensions.Hosting.UseWindowsServiceTests+c.b__4_0() [/_/src/libraries/Microsoft.Extensions.Hosting.WindowsServices/tests/UseWindowsServiceTests.cs @ 43]
000000217637DCD8 00007ffe081b1e64 [HelperMethodFrame_PROTECTOBJ: 000000217637dcd8] System.RuntimeMethodHandle.InvokeMethod(System.Object, Void**, System.Signature, Boolean)
000000217637DE70 00007ffe05c0d4cc System.Reflection.MethodBaseInvoker.InvokeWithNoArgs(System.Object, System.Reflection.BindingFlags) [/_/src/libraries/System.Private.CoreLib/src/System/Reflection/MethodBaseInvoker.cs @ 57]
000000217637DEB0 00007ffe05c2a9f0 System.Reflection.RuntimeMethodInfo.Invoke(System.Object, System.Reflection.BindingFlags, System.Reflection.Binder, System.Object[], System.Globalization.CultureInfo) [/_/src/libraries/System.Private.CoreLib/src/System/Reflection/RuntimeMethodInfo.cs @ 126]
000000217637DF00 00007ffda8cb4b8c Microsoft.DotNet.RemoteExecutor.Program.Main(System.String[]) [/_/src/Microsoft.DotNet.RemoteExecutor/src/Program.cs @ 57]
OS Thread Id: 0x2014
        Child SP               IP Call Site
0000002176AFFE10 00007ffe756af924 [DebuggerU2MCatchHandlerFrame: 0000002176affe10] 
OS Thread Id: 0xd64
        Child SP               IP Call Site
0000002176DFF638 00007ffe756af3b4 [InlinedCallFrame: 0000002176dff638] Interop+Advapi32.g____PInvoke|35_0(IntPtr)
0000002176DFF638 00007ffda8d008b8 [InlinedCallFrame: 0000002176dff638] Interop+Advapi32.g____PInvoke|35_0(IntPtr)
0000002176DFF610 00007ffda8d008b8 Interop+Advapi32.StartServiceCtrlDispatcher(IntPtr) [/_/artifacts/obj/System.ServiceProcess.ServiceController/Release/net9.0-windows/Microsoft.Interop.LibraryImportGenerator/Microsoft.Interop.LibraryImportGenerator/LibraryImports.g.cs @ 531]
0000002176DFF6E0 00007ffda8d000d0 System.ServiceProcess.ServiceBase.Run(System.ServiceProcess.ServiceBase[]) [/_/src/libraries/System.ServiceProcess.ServiceController/src/System/ServiceProcess/ServiceBase.cs @ 605]
0000002176DFF750 00007ffda8cfff6c System.ServiceProcess.ServiceBase.Run(System.ServiceProcess.ServiceBase) [/_/src/libraries/System.ServiceProcess.ServiceController/src/System/ServiceProcess/ServiceBase.cs @ 656]
0000002176DFF780 00007ffda8cffd94 Microsoft.Extensions.Hosting.WindowsServices.WindowsServiceLifetime.Run() [/_/src/libraries/Microsoft.Extensions.Hosting.WindowsServices/src/WindowsServiceLifetime.cs @ 100]
0000002176DFFB10 00007ffe081b1e64 [DebuggerU2MCatchHandlerFrame: 0000002176dffb10] 
OS Thread Id: 0x2038
        Child SP               IP Call Site
00000021772BF920 00007ffe756af924 [InlinedCallFrame: 00000021772bf920] 
00000021772BF920 00007ffe05b31570 [InlinedCallFrame: 00000021772bf920] 
00000021772BF8E0 00007ffe05b31570 System.Threading.WaitHandle.WaitOneNoCheck(Int32, Boolean, System.Object, WaitHandleWaitSourceMap) [/_/src/libraries/System.Private.CoreLib/src/System/Threading/WaitHandle.cs @ 183]
00000021772BF9D0 00007ffe05b5fd88 System.Threading.PortableThreadPool+GateThread.GateThreadStart() [/_/src/libraries/System.Private.CoreLib/src/System/Threading/PortableThreadPool.GateThread.cs @ 48]
00000021772BFE60 00007ffe081b1e64 [DebuggerU2MCatchHandlerFrame: 00000021772bfe60] 

So this looks to me like the service is still running. It should have thrown an exception on start and exited. The dump shows that it's the correct service running. It's almost as if the exception was swallowed. I'll look more closely at how this is supposed to work and see if I can build a theory.

@ericstj
Copy link
Member

ericstj commented Sep 14, 2024

This looks wrong:

000000217637D800 00007ffda8cb9c8c Microsoft.Extensions.Hosting.UseWindowsServiceTests+c.b__4_0() [/_/src/libraries/Microsoft.Extensions.Hosting.WindowsServices/tests/UseWindowsServiceTests.cs @ 43]
000000217637DCD8 00007ffe081b1e64 [HelperMethodFrame_PROTECTOBJ: 000000217637dcd8] System.RuntimeMethodHandle.InvokeMethod(System.Object, Void**, System.Signature, Boolean)
000000217637DE70 00007ffe05c0d4cc System.Reflection.MethodBaseInvoker.InvokeWithNoArgs(System.Object, System.Reflection.BindingFlags) [/_/src/libraries/System.Private.CoreLib/src/System/Reflection/MethodBaseInvoker.cs @ 57]
000000217637DEB0 00007ffe05c2a9f0 System.Reflection.RuntimeMethodInfo.Invoke(System.Object, System.Reflection.BindingFlags, System.Reflection.Binder, System.Object[], System.Globalization.CultureInfo) [/_/src/libraries/System.Private.CoreLib/src/System/Reflection/RuntimeMethodInfo.cs @ 126]
000000217637DF00 00007ffda8cb4b8c Microsoft.DotNet.RemoteExecutor.Program.Main(System.String[]) [/_/src/Microsoft.DotNet.RemoteExecutor/src/Program.cs @ 57]

This callstack has the entrypoint running this service:

public void CanCreateService()
{
using var serviceTester = WindowsServiceTester.Create(() =>
{
using IHost host = new HostBuilder()
.UseWindowsService()
.Build();
host.Run();
});

And if I check the commandline args:

String:      Microsoft.Extensions.Hosting.WindowsServices.Tests, Version=9.0.0.0, Culture=neutral, PublicKeyToken=adb9793829ddae60
String:      Microsoft.Extensions.Hosting.UseWindowsServiceTests+<>c
String:      <CanCreateService>b__4_0
String:      C:\h\w\A23808C4\t\54acuhno.ogc

However if I check the service name:

String:      ExceptionOnStartIsPropagated

So it looks to me like the service was registered with the wrong command line arguments!

I think I found the problem. We create a single RemoteInvokeOptions that we share across all calls to register the service.

private static RemoteInvokeOptions remoteInvokeOptions = new RemoteInvokeOptions() { Start = false };

Seems OK right? An options class should just immutable state to pass parameters around.

Not so fast! This options type has a ProcessStartInfo
https://github.com/dotnet/arcade/blob/bd749a18dd72b8fcb5257ae01a4faf6dd487cc5f/src/Microsoft.DotNet.RemoteExecutor/src/RemoteInvokeOptions.cs#L21

And that instance is actually mutated before it's copied out into the final returned handle:
https://github.com/dotnet/arcade/blob/bd749a18dd72b8fcb5257ae01a4faf6dd487cc5f/src/Microsoft.DotNet.RemoteExecutor/src/RemoteExecutor.cs#L434-L467

So we've had this latent threading bug here forever. If we have multiple tests running they'll race at mutating that psi and the one that actually gets returned and used to register the service could be wrong.

Fix here is simple - don't reuse the RemoteInvokeOptions instance.

@ericstj ericstj assigned ericstj and unassigned kunalspathak Sep 14, 2024
@ericstj ericstj added area-Extensions-Hosting and removed area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI labels Sep 14, 2024
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-extensions-hosting
See info in area-owners.md if you want to be subscribed.

@dotnet-policy-service dotnet-policy-service bot added the in-pr There is an active PR which will close this issue when it is merged label Sep 14, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Oct 17, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
arch-arm64 area-Extensions-Hosting blocking-clean-ci-optional Blocking optional rolling runs in-pr There is an active PR which will close this issue when it is merged JitStress CLR JIT issues involving JIT internal stress modes os-windows
Projects
None yet
8 participants