Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Make sure static ctor flag check uses a volatile load #105832

Merged
merged 3 commits into from
Aug 4, 2024

Conversation

jakobbotsch
Copy link
Member

@jakobbotsch jakobbotsch commented Aug 1, 2024

Otherwise the hardware is allowed to reorder loads of its static fields to happen before the initialization check.

Fixes #105441 (comment)

Otherwise the hardware is allowed to reorder loads of its static fields
to happen before the initialization check.
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Aug 1, 2024
@jakobbotsch

This comment was marked as resolved.

This comment was marked as resolved.

@jakobbotsch

This comment was marked as resolved.

This comment was marked as resolved.

@jakobbotsch

This comment was marked as resolved.

This comment was marked as resolved.

@AndyAyersMS
Copy link
Member

The test build hit that msbuild error, I don't seem to be able to restart it for some reason.

@jakobbotsch
Copy link
Member Author

/azp run runtime, runtime-coreclr outerloop

Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@AndyAyersMS
Copy link
Member

This is getting a similar looking failure as in #105441 -- not surprising since the bug is in the 9p6 runtime jit.

We will need to modify the test script for crossgen2 to run against live runtime. May not be simple if crossgen2 really needs the capabilities of a full runtime host, though I believe it will run ok with corerun as host.

Running CrossGen2:  dotnet /root/helix/work/correlation/crossgen2/crossgen2.dll @/root/helix/work/workitem/e/JIT/opt/JIT.opt/switchWithSideEffects.dll.rsp  
Unhandled exception. ILCompiler.CodeGenerationFailedException: Code generation failed for method '[switchWithSideEffects]SwitchWithSideEffects.Method12(S2&,float32&,S2&,uint32,S1,int64&,int64&,S1,S2&)'
 ---> System.NullReferenceException: Object reference not set to an instance of an object.
   at Internal.JitInterface.CorInfoImpl._beginInlining(IntPtr thisHandle, IntPtr* ppException, CORINFO_METHOD_STRUCT_* inlinerHnd, CORINFO_METHOD_STRUCT_* inlineeHnd) in /_/src/coreclr/tools/Common/JitInterface/CorInfoImpl_generated.cs:line 154
   --- End of inner exception stack trace -

@EgorBo
Copy link
Member

EgorBo commented Aug 2, 2024

We will need to modify the test script for crossgen2 to run against live runtime. May not be simple if crossgen2 really needs the capabilities of a full runtime host, though I believe it will run ok with corerun as host.

I think it should be simple enough for it to be built with live NativeAOT, perhaps @MichalStrehovsky @jkotas know how?

@jkotas
Copy link
Member

jkotas commented Aug 2, 2024

I think it should be simple enough for it to be built with live NativeAOT, perhaps @MichalStrehovsky @jkotas know how?

The shipping crossgen2 binary is built with live built native AOT. Switching crossgen2 tests to use that would require a bunch of yaml and test script changes. It may also require some extra work to mitigate the throughput impact - checked crossgen2 running on release runtime may be a lot faster than checked crossgen2 runtime on checked live-built runtime.

@jkotas
Copy link
Member

jkotas commented Aug 2, 2024

There is a good chance that the bug does not repro in native AOT version of crossgen2. The act of switching to native AOT can make the bug to go away and we would not be able to tell whether the bug is fixed by running crossgen2.

If we are just looking for workarounds, the easiest workaround may be to enable tiered compilation for crossgen2.

@jakobbotsch
Copy link
Member Author

jakobbotsch commented Aug 2, 2024

This is getting a similar looking failure as in #105441 -- not surprising since the bug is in the 9p6 runtime jit.

Right...

I was definitely hoping for a simple way to validate that this fixes the problem, but looks like that may be hard. I'll see if I can repro the bug on my M1 with a decently high frequency and validate it manually.

@jakobbotsch
Copy link
Member Author

Haven't been able to validate the fix on my M1 -- I haven't seen a repro of the original problem on it.

Anyway, the dumps over in #105827 definitely show that the array field is null right after initialization, so it seems very likely for this to be the issue, and for now we can probably take it on faith that it solves the original problem.

@jakobbotsch jakobbotsch marked this pull request as ready for review August 2, 2024 14:45
@jakobbotsch
Copy link
Member Author

cc @dotnet/jit-contrib

@jakobbotsch jakobbotsch merged commit dee8a8b into dotnet:main Aug 4, 2024
108 checks passed
@jakobbotsch jakobbotsch deleted the acquire-class-init-flags branch August 4, 2024 00:22
@MichalStrehovsky
Copy link
Member

We will need to modify the test script for crossgen2 to run against live runtime. May not be simple if crossgen2 really needs the capabilities of a full runtime host, though I believe it will run ok with corerun as host.

I think it should be simple enough for it to be built with live NativeAOT, perhaps @MichalStrehovsky @jkotas know how?

If you want to use the same nativeAOT-compiled crossgen2 in testing that we ship in nugets, that's a hard fix, #80154 has been trying to do that for a long time.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Test failure: NullReferenceException in Internal.JitInterface.CorInfoImpl._beginInlining
5 participants