Support .NET on Apple Silicon with Rosetta 2 emulation #44897
Comments
I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label. |
I've modified the details on the |
@richlander Is this issue also for .NET Core 3.1? |
.NET Core 3.1 support would be highly appreciated: it is the current LTS version, and therefore the version AWS Lambda supports as a runtime until the next LTS version (.NET 6). |
@sdmaclea I'm the main developer of Rosetta 2, and I'm particularly interested in the last item that you mention:
What's the simplest set of commands I can run on a runtime checkout to see these failures? I'm not completely clear on which combination of Release/Debug components I should be using. |
@zwarich So the exact number of failing tests was actually 76. Rerunning the tests with #45226 fixed most of them. Most of the remaining failed tests look like they are possibly related to division emulation. These are the currently failing tests:
# Divide modulo failures
./artifacts/tests/coreclr/OSX.x64.Release/JIT/IL_Conformance/Old/Conformance_Base/rem_i8/rem_i8.sh
./artifacts/tests/coreclr/OSX.x64.Release/JIT/IL_Conformance/Old/Conformance_Base/div_i8/div_i8.sh
./artifacts/tests/coreclr/OSX.x64.Release/JIT/Directed/coverage/oldtests/ovfldiv1_il_d/ovfldiv1_il_d.sh
./artifacts/tests/coreclr/OSX.x64.Release/JIT/Directed/coverage/oldtests/ovflrem1_il_d/ovflrem1_il_d.sh
./artifacts/tests/coreclr/OSX.x64.Release/JIT/Directed/coverage/oldtests/ovfldiv1_il_r/ovfldiv1_il_r.sh
./artifacts/tests/coreclr/OSX.x64.Release/JIT/Directed/coverage/oldtests/ovflrem1_il_r/ovflrem1_il_r.sh
./artifacts/tests/coreclr/OSX.x64.Release/JIT/CodeGenBringUpTests/DivConst_r/DivConst_r.sh
./artifacts/tests/coreclr/OSX.x64.Release/JIT/CodeGenBringUpTests/ModConst_r/ModConst_r.sh
./artifacts/tests/coreclr/OSX.x64.Release/JIT/CodeGenBringUpTests/DivConst_ro/DivConst_ro.sh
./artifacts/tests/coreclr/OSX.x64.Release/JIT/CodeGenBringUpTests/ModConst_do/ModConst_do.sh
./artifacts/tests/coreclr/OSX.x64.Release/JIT/CodeGenBringUpTests/DivConst_d/DivConst_d.sh
./artifacts/tests/coreclr/OSX.x64.Release/JIT/CodeGenBringUpTests/ModConst_ro/ModConst_ro.sh
./artifacts/tests/coreclr/OSX.x64.Release/JIT/CodeGenBringUpTests/DivConst_do/DivConst_do.sh
./artifacts/tests/coreclr/OSX.x64.Release/JIT/CodeGenBringUpTests/ModConst_d/ModConst_d.sh
./artifacts/tests/coreclr/OSX.x64.Release/JIT/jit64/regress/vsw/543645/test/test.sh
# Rosetta 2 CPU ID is not recognized (not surprising).
./artifacts/tests/coreclr/OSX.x64.Release/JIT/HardwareIntrinsics/X86/X86Base/CpuId_r/CpuId_r.sh
./artifacts/tests/coreclr/OSX.x64.Release/JIT/HardwareIntrinsics/X86/X86Base/CpuId_ro/CpuId_ro.sh
# assertion failed: GPR thread_set_state is unsupported while in sa_tramp
./artifacts/tests/coreclr/OSX.x64.Release/JIT/Regression/CLR-x86-JIT/V1.1-M1-Beta1/b143840/b143840/b143840.sh
./artifacts/tests/coreclr/OSX.x64.Release/baseservices/exceptions/regressions/V1/SEH/VJ/ExternalException/ExternalException.sh
The simplest set of commands to build and reproduce the failed tests:
# build on macos x64 (Intel)
# checkout https://github.com/dotnet/runtime/tree/release/5.0
git checkout origin/release/5.0
# Install build dependencies (once) using homebrew
brew bundle --no-lock --file eng/Brewfile
# Apply patches as needed
# See https://github.com/dotnet/runtime/pull/45226
# Possibly see https://github.com/dotnet/runtime/issues/45222#issuecomment-734016047
# build the .NET Core runtime
./build.sh clr+libs -x64 -c release
# build the tests
src/coreclr/build-test.sh -release -priority1
# compress tests on build host
tar -zcf rosetta5_4k.tgz artifacts/tests/coreclr/OSX.x64.Release artifacts/bin/coreclr/OSX.x64.Release artifacts/tests/coreclr/OSX.x64.Release/Tests/Core_Root artifacts/obj/coreclr/OSX.x64.Release/tests
# uncompress on Apple Silicon
tar -xf rosetta5_4k.tgz
# The individual test can be run like
export test=./artifacts/tests/coreclr/OSX.x64.Release/JIT/IL_Conformance/Old/Conformance_Base/rem_i8/rem_i8.sh
chmod u+x $test
$test -coreroot=$PWD/artifacts/tests/coreclr/OSX.x64.Release/Tests/Core_Root |
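To rerun the whole list of failing tests rather than one script at a time, a loop along these lines may help. This is only a sketch: the helper name `run_failing_tests`, the per-test `.log` files, and the summary format are made up for illustration. It assumes each test script accepts `-coreroot=` as shown above and exits with 0 on success and non-zero on failure.

```shell
# Hypothetical helper: run each test script against a Core_Root and tally results.
# Assumption: a test script exits 0 when it passes and non-zero when it fails.
run_failing_tests() {
  local core_root=$1
  shift
  local passed=0 failed=0 script
  for script in "$@"; do
    if [ ! -f "$script" ]; then
      echo "SKIP (missing): $script"
      continue
    fi
    chmod u+x "$script"
    # Keep each test's output next to the script so failures can be inspected later.
    if "$script" -coreroot="$core_root" >"$script.log" 2>&1; then
      passed=$((passed + 1))
    else
      failed=$((failed + 1))
      echo "FAIL: $script (see $script.log)"
    fi
  done
  echo "passed=$passed failed=$failed"
}
```

It would be called with the Core_Root path first and the test scripts after it, for example `run_failing_tests "$PWD/artifacts/tests/coreclr/OSX.x64.Release/Tests/Core_Root" ./artifacts/tests/coreclr/OSX.x64.Release/JIT/IL_Conformance/Old/Conformance_Base/rem_i8/rem_i8.sh`.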
These actually pass on an M1 machine (rather than a DTK).
I was running without the activations-via-signals patch, where you see a different issue, which can be worked around by using
This assertion indicates that one thread was trying to use
There might be other issues that I am not aware of, but I have investigated all of the ones mentioned here and I believe they can all be addressed in Rosetta. There should be no workarounds required in .NET (at least on M1, as opposed to the DTK).
@sdmaclea What commands/scripts should I use to test further whether .NET is working correctly, e.g. longer tests and stress tests? |
@zwarich thank you so much for all the insight! Regarding the
Is there a way to work around such an issue so that we can still redirect the failing thread to our code that can handle the exception? It seems that if thread_set_state returned an error code instead of asserting, we could possibly just resume the thread without changing the context. The signal handler would then execute, and after it returns, the instruction that triggered the hardware exception would be re-executed. The whole process above would repeat, but this time thread_set_state would succeed. |
@zwarich Thanks. We have lots of different stress test modes. We will try to light them up in CI when we get sufficient M1 hardware. I am not sure which of them to point you at. There are a lot of tests. Many of them are focused on functional correctness and stressing the JIT or the GC. I am not sure how the coverage would be in terms of Rosetta emulation coverage.
One of the most difficult tests is actually getting the SDK stable enough to build a large project. So building the runtime on M1 might be a good smoke test. This was failing with a deadlock 90% of the time (presumably due to the X86_FLOAT_STATE64 issue). So you could build the native runtime on M1, using almost the same instructions I gave you above:
# build .NET 6.0 master on macos arm64 (M1)
# checkout https://github.com/dotnet/runtime
git checkout origin/master
# Install build dependencies (once) using homebrew
# Last I checked I had to hack this a bit to get it to work on DTK
# Last time I cheated and did
#`arch -arch x86_64 brew bundle --no-lock --file eng/Brewfile`
# for at least a subset of the dependencies
brew bundle --no-lock --file eng/Brewfile
# Apply patches as needed
# See https://github.com/dotnet/runtime/pull/45226
# Possibly see https://github.com/dotnet/runtime/issues/45222#issuecomment-734016047
# build the .NET Core runtime
./build.sh clr+libs -arm64 -c release
# build the tests
src/coreclr/build-test.sh -release -arm64 -priority1 |
I'd be surprised if JIT stress modes would exhibit uniquely challenging behavior on Rosetta. However, to run the most common JIT stress, set environment variable
@sandreenko Can also advise.
GC Stress would likely be more "stressful" to the system. The two most common settings are setting environment variable
btw, running the tests locally can be done with |
For GC you'd want to run the GC functional tests + stress, which is documented here (it looks like this file was not moved when the test src was moved) - https://github.com/dotnet/coreclr/blob/release/3.0/tests/src/GC/Stress/stress_run_readme.txt The stress tests themselves should be in the same place in the new test src location. |
@danwalmsley thank you! |
I installed the |
I cannot get .NET 5 ASP.NET amd64 container images to work on M1. It is apparently supposed to work.
An Arm64 image works:
rich@MacBook-Air ~ % docker run --rm -p 8080:80 mcr.microsoft.com/dotnet/samples:aspnetapp
A small sample works with amd64:
docker run --platform linux/amd64 --rm -p 8080:80 mcr.microsoft.com/dotnet/samples
My experience with the
rich@MacBook-Air ~ % docker run --platform linux/amd64 --rm -p 8080:80 mcr.microsoft.com/dotnet/samples:aspnetapp
Unhandled exception. System.IO.IOException: Function not implemented
at System.IO.FileSystemWatcher.StartRaisingEvents()
at System.IO.FileSystemWatcher.StartRaisingEventsIfNotDisposed()
at System.IO.FileSystemWatcher.set_EnableRaisingEvents(Boolean value)
at Microsoft.Extensions.FileProviders.Physical.PhysicalFilesWatcher.TryEnableFileSystemWatcher()
at Microsoft.Extensions.FileProviders.Physical.PhysicalFilesWatcher.CreateFileChangeToken(String filter)
at Microsoft.Extensions.FileProviders.PhysicalFileProvider.Watch(String filter)
at Microsoft.Extensions.Configuration.FileConfigurationProvider.<.ctor>b__1_0()
at Microsoft.Extensions.Primitives.ChangeToken.OnChange(Func`1 changeTokenProducer, Action changeTokenConsumer)
at Microsoft.Extensions.Configuration.FileConfigurationProvider..ctor(FileConfigurationSource source)
at Microsoft.Extensions.Configuration.Json.JsonConfigurationSource.Build(IConfigurationBuilder builder)
at Microsoft.Extensions.Configuration.ConfigurationBuilder.Build()
at Microsoft.Extensions.Hosting.HostBuilder.BuildAppConfiguration()
at Microsoft.Extensions.Hosting.HostBuilder.Build()
at aspnetapp.Program.Main(String[] args) in /source/aspnetapp/Program.cs:line 16
qemu: uncaught target signal 6 (Aborted) - core dumped |
@richlander Based on:
I would assume docker is using qemu rather than Rosetta to make this work. |
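One way to check which translation layer is in play is to ask the OS directly. The sketch below assumes the `sysctl.proc_translated` key that Apple documents for detecting Rosetta (1 when the calling process is translated, 0 when it runs natively, and missing on systems without Rosetta). The function name `translation_status` is made up here. Inside a Linux container the key does not exist, so the function reports `unknown`; that is consistent with emulation by qemu rather than Rosetta, but it does not prove it.

```shell
# Hypothetical helper: report whether the current process runs under Rosetta.
translation_status() {
  local translated
  # sysctl fails if the key is missing (Linux, Macs without Rosetta) or the tool is not installed.
  if translated=$(sysctl -n sysctl.proc_translated 2>/dev/null); then
    case "$translated" in
      1) echo "rosetta" ;;
      0) echo "native" ;;
      *) echo "unknown" ;;
    esac
  else
    echo "unknown"
  fi
}

# uname -m reports the architecture the process sees, not necessarily the hardware.
echo "machine: $(uname -m), translation: $(translation_status)"
```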
It appears to be unsigned.
$ codesign -dvvv dotnet
Executable=/Users/stmaclea/git/dotnet-sdk-5.0.103-osx-x64/dotnet
Identifier=dotnet-55554944b2da1d4ea11a33dbabc8bfe88ecd1722
Format=Mach-O thin (x86_64)
CodeDirectory v=20100 size=960 flags=0x2(adhoc) hashes=22+5 location=embedded
Hash type=sha256 size=32
CandidateCDHash sha256=a73826b25ee1b05734de9ca3560c399a8fccac4d
CandidateCDHashFull sha256=a73826b25ee1b05734de9ca3560c399a8fccac4df0ce0a774e547440891312ee
Hash choices=sha256
CMSDigest=a73826b25ee1b05734de9ca3560c399a8fccac4df0ce0a774e547440891312ee
CMSDigestType=2
CDHash=a73826b25ee1b05734de9ca3560c399a8fccac4d
Signature=adhoc
Info.plist=not bound
TeamIdentifier=not set
Sealed Resources=none
Internal requirements count=0 size=12
libraries are unsigned.
$ find . -name \*.dylib -print -exec codesign -dvvv '{}' ';'
./host/fxr/5.0.3/libhostfxr.dylib
./host/fxr/5.0.3/libhostfxr.dylib: code object is not signed at all
./shared/Microsoft.NETCore.App/5.0.3/libcoreclr.dylib
./shared/Microsoft.NETCore.App/5.0.3/libcoreclr.dylib: code object is not signed at all
./shared/Microsoft.NETCore.App/5.0.3/libSystem.Native.dylib
./shared/Microsoft.NETCore.App/5.0.3/libSystem.Native.dylib: code object is not signed at all
./shared/Microsoft.NETCore.App/5.0.3/libSystem.IO.Compression.Native.dylib
./shared/Microsoft.NETCore.App/5.0.3/libSystem.IO.Compression.Native.dylib: code object is not signed at all
./shared/Microsoft.NETCore.App/5.0.3/libSystem.Security.Cryptography.Native.Apple.dylib
./shared/Microsoft.NETCore.App/5.0.3/libSystem.Security.Cryptography.Native.Apple.dylib: code object is not signed at all
./shared/Microsoft.NETCore.App/5.0.3/libmscordaccore.dylib
./shared/Microsoft.NETCore.App/5.0.3/libmscordaccore.dylib: code object is not signed at all
./shared/Microsoft.NETCore.App/5.0.3/libSystem.Net.Security.Native.dylib
./shared/Microsoft.NETCore.App/5.0.3/libSystem.Net.Security.Native.dylib: code object is not signed at all
./shared/Microsoft.NETCore.App/5.0.3/libmscordbi.dylib
./shared/Microsoft.NETCore.App/5.0.3/libmscordbi.dylib: code object is not signed at all
./shared/Microsoft.NETCore.App/5.0.3/libhostpolicy.dylib
./shared/Microsoft.NETCore.App/5.0.3/libhostpolicy.dylib: code object is not signed at all
./shared/Microsoft.NETCore.App/5.0.3/libSystem.Security.Cryptography.Native.OpenSsl.dylib
./shared/Microsoft.NETCore.App/5.0.3/libSystem.Security.Cryptography.Native.OpenSsl.dylib: code object is not signed at all
./shared/Microsoft.NETCore.App/5.0.3/libdbgshim.dylib
./shared/Microsoft.NETCore.App/5.0.3/libdbgshim.dylib: code object is not signed at all
./shared/Microsoft.NETCore.App/5.0.3/libclrjit.dylib
./shared/Microsoft.NETCore.App/5.0.3/libclrjit.dylib: code object is not signed at all
./packs/Microsoft.NETCore.App.Host.osx-x64/5.0.3/runtimes/osx-x64/native/libnethost.dylib
./packs/Microsoft.NETCore.App.Host.osx-x64/5.0.3/runtimes/osx-x64/native/libnethost.dylib: code object is not signed at all |
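The listing above can be condensed into a single count. A rough sketch, assuming `codesign` reports unsigned files with the exact phrase `code object is not signed at all` as in the output above; `count_unsigned` is a name made up for this example.

```shell
# Hypothetical helper: read codesign output on stdin and count unsigned code objects.
count_unsigned() {
  # grep -c prints the number of matching lines; it exits 1 when that number is 0,
  # so the status is ignored to keep the helper usable under `set -e`.
  grep -c 'code object is not signed at all' || true
}

# Example fed with two lines copied from the listing above.
printf '%s\n' \
  './host/fxr/5.0.3/libhostfxr.dylib: code object is not signed at all' \
  './shared/Microsoft.NETCore.App/5.0.3/libcoreclr.dylib: code object is not signed at all' \
  | count_unsigned
```

On macOS the real input would come from something like `find . -name \*.dylib -exec codesign -dvvv '{}' ';' 2>&1 | count_unsigned`. The `2>&1` is there on the assumption that `codesign` writes these diagnostics to stderr.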
Thanks on that. Sent mail to folks to get a resolution on the signing. |
Hm, I downloaded and installed a fresh 5.0.103 from dot.net and everything checks out. Example ...
|
Hmm I downloaded using the link @richlander provided. https://download.visualstudio.microsoft.com/download/pr/3de2d949-fcb5-4586-a217-2c33854d295f/943f0d92252338e11fd11b002a3a3861/dotnet-sdk-5.0.103-osx-x64.tar.gz |
Where are the download.visualstudio.com links populated from? |
https://github.com/dotnet/core/blob/master/release-notes/5.0/releases.json is the one which drives the downloads. These are generated during the release process and the bits are tested and picked from the release file drops. |
I believe we only sign and notarize pkgs. @mmitche, can you confirm that? |
I believe that is correct. I don't think you can notarize a non-pkg. |
We need to create an issue for this somewhere.
|
I think it still makes sense to distribute the macOS tar.gz. We've been doing that forever, and I have always used that package and never had any issues except on the M1. It still works on my x64 Mac Mini with Big Sur installed and SIP enabled. Our builds use the .tar.gz packages too without problems. |
Ack. We should create an issue so we can have a formal discussion in an appropriate place. |
cc @mmitche |
I think this could be done in 5.0 reasonably easily, and I think it would apply to the current 6.0 process. Otherwise we could modify the post-build signing in 6.0 to handle tar.gz files. |
Is this still relevant for the .NET 5 milestone? (The EOS for this version is already on May 8, 2022). |
@richlander for that q. |
.NET 6 is the first fully supported SDK for Rosetta 2 emulation. See dotnet/sdk#22380 for details. |
Apple has announced plans to transition its Mac hardware line to a new Arm64-based chip that they refer to as “Apple Silicon”.
Initial .NET support will be through .NET running on the Rosetta 2 emulator. Longer term native support for Apple Silicon is planned for .NET 6.
While it is hoped that Rosetta 2 emulation will just work, the .NET runtime is complicated and real issues will make this a non-trivial task.
Current known issues
Apple Silicon uses a 16K memory page size. The .NET 5 stack probe code doesn't handle this yet. See [release/5.0] Use minimum supported PAGE_SIZE as stack probe step #45226. Per Apple this only affects the DTK and is fixed on M1 Silicon. Edit: I've verified that on a real M1 device the page size is 4K and the related issue doesn't occur.
Rosetta 2 emulation crashes with a fatal failure when calling thread_get_state with x86_FLOAT_STATE64. This is because the emulator does not emulate AVX support, but the function should simply return an error. Edit: This is fixed in the macOS 11.2 beta release.
Rosetta 2 emulation doesn't populate exceptionState.__trapno for kernel entries other than hardware exceptions (for example, syscalls). This means we fail to inject code necessary for garbage collection and sometimes deadlock. Edit: This is at least partially fixed in the macOS 11.2 beta release, but I can still see hangs during compilation of the managed parts of the .NET runtime and tests. It is being investigated on the Apple side.
With [release/5.0] Use minimum supported PAGE_SIZE as stack probe step #45226 & janvorli@aee81ac, 19 runtime tests were failing under Rosetta 2 emulation that pass on native macOS x64. All the coreclr Pri 1 tests are now passing except two tests (mentioned in the comments) that are failing with:
assertion failed: GPR thread_set_state is unsupported while in sa_tramp. (ThreadContextRegisterState.cpp:1250 thread_set_state_gpr_64)
Debugging using VS Code doesn't work. It was partially fixed in the macOS 11.2 beta release: it is now possible to run an application under the debugger and break on a breakpoint, but an attempt to single-step or continue from that state still fails. This is caused by the iret instruction emulation not honoring the trace flag. Edit: This is fixed in macOS 11.2 beta 2.
New issue in macOS 11.2 beta: dotnet build and possibly other .NET applications often fail with
assertion failed [abi_info.u.translated_code.instruction_extents.kind == InstructionOffsetKind::Syscall]: on sigreturn exit path but instruction isn't marked as a syscall (ThreadContextRegisterState.cpp:381 x86_gpr_state_from_arm_state)
Edit: This is fixed in macOS 11.2 beta 2.
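For the page-size item in the known issues list, the value that a given machine (or a translated shell) reports can be checked directly. A small sketch, assuming `getconf PAGESIZE` is available, as it is on macOS and Linux. The 4096-byte probe step is an assumption of this example, standing in for the "minimum supported PAGE_SIZE" named in the title of #45226.

```shell
page_size=$(getconf PAGESIZE)
probe_step=4096   # assumed minimum supported page size, in bytes

echo "page size: ${page_size} bytes"
if [ "$page_size" -gt "$probe_step" ]; then
  # Stepping by the minimum size touches every page whichever page size is in effect.
  echo "probes per page at a ${probe_step}-byte step: $((page_size / probe_step))"
else
  echo "page size does not exceed the ${probe_step}-byte probe step"
fi
```

Running it once in a native arm64 shell and once under `arch -arch x86_64` would show whether the two environments report different page sizes.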