Runtime crashing on startup on emulated x64-on-ARM64 Windows #100425

Closed
v-ainigao opened this issue Mar 14, 2024 · 28 comments · Fixed by #102333

Comments

@v-ainigao

Repro steps:

  1. Install the .NET 9 Preview 3 x64 SDK from https://aka.ms/dotnet/9.0.1xx/daily/dotnet-sdk-win-x64.exe
  2. Point the PATH environment variable at the x64 install location, "C:\Program Files\dotnet\x64".
  3. Create a new console app:
    dotnet new console

Expected Result:
Project can be created successfully.

Actual Result:
Creating or building a project fails, which is blocking.

dotnet --info
.NET SDK:
Version: 9.0.100-preview.3.24163.23
Commit: 33549194e2
Workload version: 9.0.100-manifests.42eb89a3
MSBuild version: 17.10.0-preview-24162-02+0326fd7c9

Runtime Environment:
OS Name: Windows
OS Version: 10.0.22631
OS Platform: Windows
RID: win-x64
Base Path: C:\Program Files\dotnet\x64\sdk\9.0.100-preview.3.24163.23\

.NET workloads installed:
There are no installed workloads to display.

Host:
Version: 9.0.0-preview.3.24162.24
Architecture: x64
Commit: e4fceb3

.NET SDKs installed:
9.0.100-preview.3.24163.23 [C:\Program Files\dotnet\x64\sdk]

.NET runtimes installed:
Microsoft.AspNetCore.App 9.0.0-preview.3.24162.20 [C:\Program Files\dotnet\x64\shared\Microsoft.AspNetCore.App]
Microsoft.NETCore.App 9.0.0-preview.3.24162.24 [C:\Program Files\dotnet\x64\shared\Microsoft.NETCore.App]
Microsoft.WindowsDesktop.App 9.0.0-preview.3.24162.12 [C:\Program Files\dotnet\x64\shared\Microsoft.WindowsDesktop.App]

Other architectures found:
arm64 [C:\Program Files\dotnet]
x86 [C:\Program Files (x86)\dotnet]
registered at [HKLM\SOFTWARE\dotnet\Setup\InstalledVersions\x86\InstallLocation]

Environment variables:
Not set

global.json file:
Not found

Learn more:
https://aka.ms/dotnet/info

Download .NET:
https://aka.ms/dotnet/download


@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Mar 14, 2024
@v-ainigao
Author

v-ainigao commented Mar 14, 2024

With the .NET 9 Preview 3 x86 SDK installed, creating a project on an ARM64 OS works normally.

@richaverma1

@v-ainigao is this a regression from .NET9 Preview 2?

@marcpopMSFT
Member

@elinor-fung @agocke since this appears to be a host crash

@elinor-fung
Member

@v-ainigao do you have a dump you could share?

Do other operations - for example dotnet new list, dotnet build, dotnet tool install - work? Does running an app that has already been built work (dotnet <appDll>)?

Since dotnet --info seems to work, this should not be a problem with the host itself.
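
The checks asked about above could be scripted roughly as follows. This is only a sketch: `app.dll` is a placeholder for an application that has already been built, the 120-second timeout is an arbitrary assumption, and `dotnet tool install` is left out because it needs a package name that the thread does not give.

```python
import subprocess

# Commands taken from the question above; "app.dll" is a placeholder path.
COMMANDS = [
    ["dotnet", "--info"],
    ["dotnet", "new", "list"],
    ["dotnet", "build"],
    ["dotnet", "app.dll"],
]


def run_checks(commands, runner=subprocess.run, timeout=120):
    """Run each command and record 'ok', 'failed (code)', 'timed out' or 'could not start'."""
    results = {}
    for cmd in commands:
        key = " ".join(cmd)
        try:
            proc = runner(cmd, capture_output=True, text=True, timeout=timeout)
            results[key] = "ok" if proc.returncode == 0 else f"failed ({proc.returncode})"
        except subprocess.TimeoutExpired:
            # A hang (for example a stuck runtime suspension) ends up here.
            results[key] = "timed out"
        except OSError:
            # dotnet is not on PATH or could not be launched.
            results[key] = "could not start"
    return results


if __name__ == "__main__":
    for name, status in run_checks(COMMANDS).items():
        print(f"{name}: {status}")
```

The timeout matters here because a hang and a crash look different: a crash returns a non-zero exit code, while a hang only shows up once the timeout expires.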

@v-ainigao
Author

> @v-ainigao is this a regression from .NET9 Preview 2?

This issue does not repro with earlier builds of .NET 9 Preview 3 or with .NET 9 Preview 2.

@v-ainigao
Author

v-ainigao commented Mar 15, 2024

> @v-ainigao do you have a dump you could share?
>
> Do other operations - for example dotnet new list, dotnet build, dotnet tool install - work? Does running an app that has already been built work (dotnet <appDll>)?
>
> Since dotnet --info seems to work, this should not be a problem with the host itself.

This is with today's latest .NET 9 Preview 3 x64 SDK from GitHub, running on a freshly built ARM64 OS.
The memory dump file has been placed at this path: \mlangfs1\public\v-aini#39513dump file.zip
dotnet new console, dotnet new list, dotnet build and dotnet tool install do not work.
Running an already built application does not work either.
But dotnet --info works fine.

@elinor-fung
Member

elinor-fung commented Mar 15, 2024

> Running already built application not working.

The screenshot shows dotnet build. Did you also check just running the application? For example dotnet bin\Release\net9.0\a.dll.

> This memory dump file has been placed in this path. \mlangfs1\public\v-aini#39513dump file.zip

The main thread is waiting to suspend the runtime for a GC:

00  ARM64EC 0000000d`ca17cf80 00007ffe`14419094     ntdll!#ZwWaitForSingleObject+0x14
01  ARM64EC 0000000d`ca17cf90 00007ffe`1454d4b4     KERNELBASE!WaitForSingleObjectEx+0x84 [minkernel\kernelbase\synch.c @ 1328] 
02  ARM64EC 0000000d`ca17d020 00007ffd`a21ee540     KERNELBASE!$ientry_thunk$cdecl$i8$i8m8i8i8+0x24
03    AMD64 (Inline Function) --------`--------     coreclr!CLREventWaitHelper2+0x6 [D:\a\_work\1\s\src\coreclr\vm\synch.cpp @ 372] 
04    AMD64 0000000d`ca17d0d0 00007ffd`a221a13e     coreclr!CLREventWaitHelper+0x20 [D:\a\_work\1\s\src\coreclr\vm\synch.cpp @ 397] 
05    AMD64 (Inline Function) --------`--------     coreclr!CLREventBase::WaitEx+0xf [D:\a\_work\1\s\src\coreclr\vm\synch.cpp @ 466] 
06    AMD64 (Inline Function) --------`--------     coreclr!CLREventBase::Wait+0xf [D:\a\_work\1\s\src\coreclr\vm\synch.cpp @ 412] 
07    AMD64 0000000d`ca17d130 00007ffd`a2238912     coreclr!ThreadSuspend::SuspendRuntime+0x31e [D:\a\_work\1\s\src\coreclr\vm\threadsuspend.cpp @ 3600] 
08    AMD64 0000000d`ca17d230 00007ffd`a223862a     coreclr!ThreadSuspend::SuspendEE+0x116 [D:\a\_work\1\s\src\coreclr\vm\threadsuspend.cpp @ 5778] 
09    AMD64 0000000d`ca17d310 00007ffd`a2238177     coreclr!GCToEEInterface::SuspendEE+0x26 [D:\a\_work\1\s\src\coreclr\vm\gcenv.ee.cpp @ 51] 
0a    AMD64 0000000d`ca17d340 00007ffd`a22384be     coreclr!WKS::GCHeap::GarbageCollectGeneration+0xf3 [D:\a\_work\1\s\src\coreclr\gc\gc.cpp @ 50661] 
0b    AMD64 0000000d`ca17d390 00007ffd`a22fac8a     coreclr!WKS::gc_heap::trigger_gc_for_alloc+0x26 [D:\a\_work\1\s\src\coreclr\gc\gc.cpp @ 18788] 
0c    AMD64 0000000d`ca17d3c0 00007ffd`a22a99f5     coreclr!WKS::gc_heap::try_allocate_more_space+0x5126e [D:\a\_work\1\s\src\coreclr\gc\gc.cpp @ 18915] 
0d    AMD64 0000000d`ca17d420 00007ffd`a21de818     coreclr!WKS::gc_heap::allocate_more_space+0x31 [D:\a\_work\1\s\src\coreclr\gc\gc.cpp @ 19415] 
0e    AMD64 (Inline Function) --------`--------     coreclr!WKS::gc_heap::allocate+0x5a [D:\a\_work\1\s\src\coreclr\gc\gc.cpp @ 19446] 
0f    AMD64 0000000d`ca17d450 00007ffd`a21c730d     coreclr!WKS::GCHeap::Alloc+0x88 [D:\a\_work\1\s\src\coreclr\gc\gc.cpp @ 49629] 
10    AMD64 (Inline Function) --------`--------     coreclr!Alloc+0xb6 [D:\a\_work\1\s\src\coreclr\vm\gchelpers.cpp @ 227] 
11    AMD64 (Inline Function) --------`--------     coreclr!AllocateObject+0x128 [D:\a\_work\1\s\src\coreclr\vm\gchelpers.cpp @ 1095] 
12    AMD64 (Inline Function) --------`--------     coreclr!AllocateObject+0x12e [D:\a\_work\1\s\src\coreclr\vm\gchelpers.h @ 68] 
13    AMD64 0000000d`ca17d490 00007ffd`a031ef0b     coreclr!JIT_New+0x20d [D:\a\_work\1\s\src\coreclr\vm\jithelpers.cpp @ 2481] 

@dotnet/dotnet-diag I can't seem to get sos working for the dump. This is x64 dotnet running on arm64 Windows. The identified target architecture 0xa641 isn't listed as an IMAGE_FILE_MACHINE value - any ideas?

SOS does not support the current target architecture '' (0xa641). A 32 bit target may require a 32 bit debugger or vice versa. In general, try to use the same bitness for the debugger and target process.

@mikem8361 mikem8361 assigned mikem8361 and unassigned mikem8361 Mar 16, 2024
@mikem8361
Member

You must be running on ARM64EC (0xa641). There is already an issue in the diagnostics repo, and a PR that Juan started, but the PR still needs some finishing touches to fix this.
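
For reference, the value in the SOS error can be decoded with a small lookup. The sketch below lists a handful of `IMAGE_FILE_MACHINE_*` constants as I understand them from the Windows SDK header `winnt.h`; the table is deliberately incomplete, and the helper name `machine_name` is made up for illustration.

```python
# A few IMAGE_FILE_MACHINE_* values from winnt.h (not an exhaustive list).
IMAGE_FILE_MACHINE = {
    0x014C: "I386",
    0x8664: "AMD64",
    0xAA64: "ARM64",
    0xA641: "ARM64EC",
    0xA64E: "ARM64X",
}


def machine_name(value: int) -> str:
    """Return the short name for a machine type value, or a hex fallback."""
    return IMAGE_FILE_MACHINE.get(value, f"unknown (0x{value:04x})")


print(machine_name(0xA641))
```

This matches the `ARM64EC` frames at the top of the stack trace above, where the emulated x64 code calls into ARM64EC system binaries.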

@v-jieyan2

v-jieyan2 commented Mar 19, 2024

This problem does not appear on Windows Server 2022.

@v-ainigao
Author

v-ainigao commented Mar 21, 2024

This issue still occurs today on ARM64 Windows 11.
The memory dump file has been placed at this path: \mlangfs1\public\v-aini#39513dump file(2)\dumpfile.zip
CLI version: 9.0.100-preview.3.24165.20 (runtime 9.0.0-preview.3.24162.31)

@v-ainigao
Author

This issue still repros with the 9.0.100-preview.3.24175.24 (runtime 9.0.0-preview.3.24172.9) x64 SDK on an ARM64 OS.

@richaverma1

@marcpopMSFT is this blocking .NET 9 Preview 3 release? It is a regression from Preview 2.

@marcpopMSFT
Member

@richaverma1 do other scenarios work (i.e. build, running an app with dotnet, etc.)?

Since it's the host crashing and Elinor appears blocked on analyzing the dump because of an unrelated issue, I'm not sure how to tell how critical this is without knowing which additional scenarios might be impacted.

CC @baronfel @elinor-fung

@marcpopMSFT
Member

If it's limited to workload installs when running the x64 sdk on an arm64 machine, I wouldn't make it a blocker.

@richaverma1

@marcpopMSFT it is failing on the basic scenario of creating a console app, but on an ARM64 OS with the x64 SDK. Not sure how many customers use that config.

@agocke agocke transferred this issue from dotnet/sdk Mar 28, 2024
@agocke agocke changed the title from "[NETSDKE2E][ARM64] Install .net9preview3 X64 SDK create project on Arm64 OS, the system will be blocking." to "Runtime crashing on startup on emulated x64-on-ARM64 Windows" Mar 28, 2024
@v-ainigao
Author

v-ainigao commented Apr 1, 2024

This issue also repros on 9.0.100-preview.4.24181.1 (runtime 9.0.0-preview.4.24178.9).

@nagilson
Member

nagilson commented Apr 1, 2024

> You must be running on a arm64ec (0xa641). There is already a diagnostic repo issue and a PR that Juan started but still needs some finishing touches to fix this.

Thank you for letting us know. Is there any update on that? @mikem8361

@mikem8361
Member

We haven't had a chance to make any progress on this PR/arm64ec support.

@nagilson
Member

nagilson commented Apr 1, 2024

@marcpopMSFT I wouldn't say this is blocking, but if it's shipping like this we should probably create a known issue for it.

@v-ainigao
Author

Today I did some sanity checks in the CLI on the latest .NET 9 Preview 4 build, 9.0.100-preview.4.24209.11 (runtime 9.0.0-preview.4.24204.3). The issue is fixed there, so I am closing it.

@dotnet-policy-service dotnet-policy-service bot removed the untriaged New issue has not been triaged by the area owner label Apr 10, 2024
@v-ainigao
Author

Today I reproduced this issue again with the latest x64 SDK, 9.0.100-preview.4.24215.10 (runtime 9.0.0-preview.4.24211.4), on an ARM64 OS, so I have reopened it. Can you help take a look? @marcpopMSFT
The memory dump file has been placed at this path: \mlangfs1\public\v-aini\dump.zip
dotnet --info works fine.

@v-ainigao v-ainigao reopened this Apr 16, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Apr 16, 2024
@v-ainigao
Author

v-ainigao commented May 7, 2024

This issue also repros on an ARM64 OS with the .NET 9 Preview 5 x64 SDK.

@tommcdon
Member

tommcdon commented May 7, 2024

I was able to reproduce the issue with an x64 SDK running on Windows ARM64. The issue does not reproduce with DOTNET_LegacyExceptionHandling=1, so the problem is evidently related to the new EH model. @janvorli would you mind taking a look?
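
To try the same experiment, the variable can be set for a single child process rather than machine-wide. A minimal sketch, assuming the x64 `dotnet` is the one found on PATH; the helper names are hypothetical.

```python
import os
import subprocess


def legacy_eh_env(base=None):
    """Copy an environment and set DOTNET_LegacyExceptionHandling=1 in the copy."""
    env = dict(os.environ if base is None else base)
    env["DOTNET_LegacyExceptionHandling"] = "1"
    return env


def run_with_legacy_eh(args):
    """Run a dotnet command with the legacy exception handling switch set."""
    return subprocess.run(["dotnet", *args], env=legacy_eh_env())


# Example (needs the x64 SDK on PATH, so it is left commented out):
# run_with_legacy_eh(["new", "console"])
```

Scoping the variable to one process keeps the comparison clean: the same command can be run with and without the switch from the same shell.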

@v-ainigao
Author

This issue still repros with the latest .NET 9 Preview 4 x64 SDK on an ARM64 OS.
SDK version: 9.0.100-preview.4.24260.3

@janvorli
Member

Windows developers helped me investigate the issue. It turns out to be a Windows bug that is fixed in 24H2, so the recommendation is to update Windows to that version. We will also add a workaround to the runtime that avoids the buggy Windows API when running as emulated x64 on older versions of ARM64 Windows. Please note that this workaround may increase GC runtime suspension time when there are many threads.
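
The condition described above can be illustrated with a small sketch. It is not the runtime's actual check, which this thread does not show. It assumes that Windows 11 24H2 corresponds to build number 26100, and the function and parameter names are invented for the example.

```python
# Assumption: Windows 11 24H2 is build 26100; older builds are treated as affected.
WINDOWS_11_24H2_BUILD = 26100


def parse_build(os_version: str) -> int:
    """Extract the build number from a 'major.minor.build' string such as '10.0.22631'."""
    return int(os_version.split(".")[2])


def should_avoid_special_apc(process_arch: str, native_arch: str, os_version: str) -> bool:
    """True for an x64 process emulated on ARM64 Windows older than 24H2."""
    emulated_x64 = process_arch == "x64" and native_arch == "arm64"
    return emulated_x64 and parse_build(os_version) < WINDOWS_11_24H2_BUILD


# OS version taken from the dotnet --info output in the original report.
print(should_avoid_special_apc("x64", "arm64", "10.0.22631"))
```

Under these assumptions, the machine from the original report (OS Version 10.0.22631, x64 host on ARM64) falls into the affected group.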

@v-ainigao
Author

> Windows developers helped me to investigate the issue. It turns out that it is a windows bug that is fixed in 24H2. So, the recommendation is to update Windows to this version. We will also add a workaround to runtime that will prevent using the Windows API with the bug when running x64 emulated on arm64 Windows on older Windows versions. But please note that this workaround may have some performance consequences in GC runtime suspension time in case of a lot of threads.

Thank you for the explanation, but I can't create a 24H2 machine on Azure yet.

janvorli added a commit to janvorli/runtime that referenced this issue May 16, 2024
In ARM64 windows older than 24H2, the special APC is broken when running
x64 emulation. The callback that gets invoked doesn't get an argument
with correct CONTEXT of the interrupted location. This change disables
using the special APC for runtime suspension when running on the
affected Windows versions.

Close dotnet#100425
janvorli added a commit that referenced this issue May 16, 2024
)

* Workaround broken special APC when running x64 on arm64 windows

In ARM64 windows older than 24H2, the special APC is broken when running
x64 emulation. The callback that gets invoked doesn't get an argument
with correct CONTEXT of the interrupted location. This change disables
using the special APC for runtime suspension when running on the
affected Windows versions.

Close #100425

* Make the same fix for NativeAOT
@v-ainigao
Author

This issue does not repro with the .NET 9 Preview 5 x64 SDK on an ARM64 OS.
SDK version: 9.0.100-preview.5.24271.7 (runtime 9.0.0-preview.5.24268.2)

Ruihan-Yin pushed a commit to Ruihan-Yin/runtime that referenced this issue May 30, 2024
…et#102333)

@github-actions github-actions bot locked and limited conversation to collaborators Jun 22, 2024