codegenarmarch.cpp:3201 Assertion failed '(gcInfo.gcRegGCrefSetCur & killMask) == 0' #65395

Closed
EgorBo opened this issue Feb 15, 2022 · 7 comments · Fixed by #65432
Assignees
Labels
arch-arm32 area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI blocking-outerloop Blocking the 'runtime-coreclr outerloop' and 'runtime-libraries-coreclr outerloop' runs JitStress CLR JIT issues involving JIT internal stress modes os-linux Linux OS (any supported distro)
Milestone

Comments

@EgorBo
Member

EgorBo commented Feb 15, 2022

I incorrectly closed these as dups of #65311:
#65283
#65284
#65285
#65286

but it looks like it's a different assert:

Assert failure(PID 3755 [0x00000eab], Thread: 3755 [0x0eab]): Assertion failed '(gcInfo.gcRegGCrefSetCur & killMask) == 0' in 'System.Collections.Concurrent.ConcurrentDictionary`2[GenericLookupKey,__Canon][ILCompiler.DependencyAnalysis.ReadyToRunSymbolNodeFactory+GenericLookupKey,System.__Canon]:GrowTable(Tables[GenericLookupKey,__Canon]):this' during 'Generate code' (IL size 515)

File: /__w/1/s/src/coreclr/jit/codegenarmarch.cpp Line: 3201
@dotnet-issue-labeler dotnet-issue-labeler bot added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI untriaged New issue has not been triaged by the area owner labels Feb 15, 2022
@ghost

ghost commented Feb 15, 2022

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.


@EgorBo
Member Author

EgorBo commented Feb 15, 2022

cc @dotnet/jit-contrib

@EgorBo EgorBo added arch-arm32 JitStress CLR JIT issues involving JIT internal stress modes os-linux Linux OS (any supported distro) labels Feb 15, 2022
@BruceForstall BruceForstall added this to the 7.0.0 milestone Feb 15, 2022
@BruceForstall BruceForstall added blocking-outerloop Blocking the 'runtime-coreclr outerloop' and 'runtime-libraries-coreclr outerloop' runs and removed untriaged New issue has not been triaged by the area owner labels Feb 15, 2022
@echesakov
Contributor

I have seen this in #64857 - bisecting now

@BruceForstall
Member

Note that these are all JitStressRegs=2/3

@echesakov
Contributor

echesakov commented Feb 15, 2022

It looks like these asserts were changed in 3fc8b09 (#63763) and have been failing since then.

Here is how to reproduce the issue:

D:\echesako\src\runtime>src\coreclr\scripts\superpmi.py replay -arch x86 -target_arch arm -target_os Linux -jit_name clrjit_universal_arm_x86.dll -filter libraries.crossgen2 -jitoption JitStressRegs=2 -jitoption TieredCompilation=0 -compile 146598 -log_level debug
================ Logging to D:\echesako\spmi\superpmi.25.log
Using JIT/EE Version from jiteeversionguid.h: 63009f0c-662a-485b-bac1-ff67be6c7f9d
Using clrjit_universal_arm_x86.dll from Core_Root: D:\echesako\src\runtime\artifacts\tests\coreclr\windows.x86.Checked\Tests\Core_Root\clrjit_universal_arm_x86.dll
Found download cache directory "D:\echesako\spmi\mch\63009f0c-662a-485b-bac1-ff67be6c7f9d.Linux.arm" and --force_download not set; skipping download
SuperPMI replay
------------------------------------------------------------
Start time: 11:53:06
JIT Path: D:\echesako\src\runtime\artifacts\tests\coreclr\windows.x86.Checked\Tests\Core_Root\clrjit_universal_arm_x86.dll
Using MCH files:
  D:\echesako\spmi\mch\63009f0c-662a-485b-bac1-ff67be6c7f9d.Linux.arm\libraries.crossgen2.Linux.arm.checked.mch
Using superpmi.exe from Core_Root: D:\echesako\src\runtime\artifacts\tests\coreclr\windows.x86.Checked\Tests\Core_Root\superpmi.exe

Temp Location: C:\Users\echesako\AppData\Local\Temp\tmpo7iy7m3s

Running SuperPMI replay of D:\echesako\spmi\mch\63009f0c-662a-485b-bac1-ff67be6c7f9d.Linux.arm\libraries.crossgen2.Linux.arm.checked.mch
Invoking: D:\echesako\src\runtime\artifacts\tests\coreclr\windows.x86.Checked\Tests\Core_Root\superpmi.exe -v ewmi -r C:\Users\echesako\AppData\Local\Temp\tmpo7iy7m3s\repro -c 146598 -target arm -jitoption JitStressRegs=2 -jitoption TieredCompilation=0 -f C:\Users\echesako\AppData\Local\Temp\tmpo7iy7m3s\libraries.crossgen2.Linux.arm.checked.mch_fail.mcl -metricsSummary C:\Users\echesako\AppData\Local\Temp\tmpo7iy7m3s\libraries.crossgen2.Linux.arm.checked.mch_metrics.csv D:\echesako\src\runtime\artifacts\tests\coreclr\windows.x86.Checked\Tests\Core_Root\clrjit_universal_arm_x86.dll D:\echesako\spmi\mch\63009f0c-662a-485b-bac1-ff67be6c7f9d.Linux.arm\libraries.crossgen2.Linux.arm.checked.mch
ERROR: Exception thrown: DebugBreak or AV Exception 123
ERROR: main method 146598 of size 500 failed to load and compile correctly.
ISSUE: <ASSERT> #146598 D:\echesako\src\runtime\src\coreclr\jit\codegenarmarch.cpp (3196) - Assertion failed '(gcInfo.gcRegGCrefSetCur & killMask) == 0' in 'Microsoft.CodeAnalysis.CSharp.Binder:BindIndexerOrIndexedPropertyAccess(Microsoft.CodeAnalysis.CSharp.CSharpSyntaxNode,Microsoft.CodeAnalysis.CSharp.BoundExpression,Microsoft.CodeAnalysis.ArrayBuilder`1[Microsoft.CodeAnalysis.CSharp.Symbols.PropertySymbol],Microsoft.CodeAnalysis.CSharp.AnalyzedArguments,Microsoft.CodeAnalysis.DiagnosticBag):Microsoft.CodeAnalysis.CSharp.BoundExpression:this' during 'Generate code' (IL size 500)

Compilation failures
Method numbers with compilation failures:
146598
Replay summary:
  Replay failures in 1 MCH files:
    D:\echesako\spmi\mch\63009f0c-662a-485b-bac1-ff67be6c7f9d.Linux.arm\libraries.crossgen2.Linux.arm.checked.mch
Finish time: 11:53:06
Elapsed time: 0:00:00.087880

cc @jakobbotsch

@jakobbotsch jakobbotsch self-assigned this Feb 15, 2022
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Feb 15, 2022
jakobbotsch added a commit to jakobbotsch/runtime that referenced this issue Feb 15, 2022
Revert these particular fragments back to what they were before CFG.
This should fix outerloop but I will need to take a closer look tomorrow
as this particular code is not right for the CFG validator.

Fix dotnet#65395
@jakobbotsch
Member

Slightly smaller repro with arm altjit and JitStress=2, JitStressRegs=2:

using System.Numerics;
using System.Runtime.CompilerServices;

public class Program
{
    public static void Main()
    {
        Matrix<float> foo = default;
        foo = foo * foo;
    }

    public struct Matrix<T> where T : struct
    {
        private T[] _data;
        public int xCount, yCount;
        private int _xTileCount;
        private int _yTileCount;
        private int _flattenedCount;
        private static readonly int s_tileSize = Vector<T>.Count;

        public Matrix(int theXCount, int theYCount)
        {
            // Round up the dimensions so that we don't have to deal with remnants.
            int vectorCount = Vector<T>.Count;
            _xTileCount = (theXCount + vectorCount - 1) / vectorCount;
            _yTileCount = (theYCount + vectorCount - 1) / vectorCount;
            xCount = _xTileCount * vectorCount;
            yCount = _yTileCount * vectorCount;
            _flattenedCount = xCount * yCount;
            _data = new T[_flattenedCount];
        }


        [MethodImpl(MethodImplOptions.NoInlining)]
        public static void Transpose(Matrix<T> m, int xStart, int yStart, Vector<T>[] result)
        {
        }

        public static Matrix<T> operator *(Matrix<T> left, Matrix<T> right)
        {
            Matrix<T> result = new Matrix<T>(left.xCount, right.yCount);
            Vector<T>[] temp = new Vector<T>[s_tileSize];
            T[] temp2 = new T[s_tileSize];
            Transpose(right, 0, 0, temp);
            Vector<T> leftTileRow = new Vector<T>(left._data, left.yCount);

            for (int n = 0; n < s_tileSize; n++)
            {
                temp2[n] = Vector.Dot(leftTileRow, temp[n]);
            }
            new Vector<T>(result._data, result.yCount);
            return result;
        }
    }
}

@jakobbotsch
Member

In the stress mode we end up with a series of spills/reloads around setting up call arguments:

0000EE  9839           ldr     r0, [sp+0xe4]    // [V02 arg1]
0000F0  993A           ldr     r1, [sp+0xe8]    // [V02 arg1+0x04]
0000F2  9A3B           ldr     r2, [sp+0xec]    // [V02 arg1+0x08]
0000F4  9B3C           ldr     r3, [sp+0xf0]    // [V02 arg1+0x0c]
0000F6  900D           str     r0, [sp+0x34]    // [TEMP_02]
0000F8  F642 10B0      movw    r0, 0x29b0
0000FC  F6C0 0073      movt    r0, 0x873
000100  900C           str     r0, [sp+0x30]    // [TEMP_01]
000102  980D           ldr     r0, [sp+0x34]    // [TEMP_02]
000104  F8DD C030      ldr     r12, [sp+0x30]   // [TEMP_01]
000108  47E0           blx     r12              // Matrix`1[Single][System.Single]:Transpose(Matrix`1[Single],int,int,System.Numerics.Vector`1[System.Single][])
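
The four loads into r0-r3 above are consistent with the Matrix<float> argument being split between registers and the stack. Here is a rough sketch of that arithmetic, assuming six 4-byte instance fields (24 bytes) and four 4-byte argument registers on arm32. SplitStructArg, SplitLayout and the constants are hypothetical names for illustration, not JIT APIs, and the real ABI classification has more rules (alignment, floating-point registers) than this models.

```cpp
#include <cassert>

// Assumption: arm32 passes integer arguments in r0-r3, one 4-byte slot each.
constexpr unsigned kArgRegCount = 4;
constexpr unsigned kSlotSize    = 4;

// Hypothetical result type: how many 4-byte slots of a struct argument
// land in registers and how many spill over to the outgoing stack area.
struct SplitLayout
{
    unsigned regSlots;
    unsigned stackSlots;
};

// Toy model of splitting a by-value struct argument starting at the
// first free argument register (0 means r0).
inline SplitLayout SplitStructArg(unsigned structSizeBytes, unsigned firstFreeReg)
{
    assert(firstFreeReg <= kArgRegCount);

    unsigned slots    = (structSizeBytes + kSlotSize - 1) / kSlotSize;
    unsigned freeRegs = kArgRegCount - firstFreeReg;
    unsigned regSlots = (slots < freeRegs) ? slots : freeRegs;

    return SplitLayout{regSlots, slots - regSlots};
}
```

Under those assumptions, SplitStructArg(24, 0) gives 4 register slots and 2 stack slots, which matches the loads of [V02 arg1] through [V02 arg1+0x0c] into r0-r3. The first slot would be the _data array reference, so r0 would hold a GC ref going into the call.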

It looks like genConsumeArgSplitStruct assumes that genCall will handle clearing the GC info:

void CodeGen::genConsumeArgSplitStruct(GenTreePutArgSplit* putArgNode)
{
    assert(putArgNode->OperGet() == GT_PUTARG_SPLIT);
    assert(putArgNode->gtHasReg());
    genUnspillRegIfNeeded(putArgNode);
    // Skip updating GC info
    // GC info for all argument registers will be cleared in caller
    genCheckConsumeNode(putArgNode);
}

We can either go back to that behavior (having genCall clear the GC info for the argument registers) or update GC info as part of consuming the arg split node. I am more inclined to do the latter to put it in line with how we do GC reporting for other arg nodes.
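
To make the two options concrete, here is a toy model of the failing check. It is only a sketch: ToyGcInfo, RegBit, KillCheckPasses and the two Consume... helpers are hypothetical stand-ins, not the JIT's real types or functions. Only the field name gcRegGCrefSetCur and the shape of the expression come from the assert message, and the register choices in the usage note are assumptions.

```cpp
#include <cassert>
#include <cstdint>

// One bit per register, as in a typical register mask.
using RegMask = std::uint32_t;

inline RegMask RegBit(unsigned reg)
{
    assert(reg < 32);
    return RegMask{1} << reg;
}

// Hypothetical stand-in for the JIT's GC tracking state.
struct ToyGcInfo
{
    RegMask gcRegGCrefSetCur = 0; // registers currently reported as holding GC refs

    void MarkGcRef(unsigned reg) { gcRegGCrefSetCur |= RegBit(reg); }
    void MarkRegsDead(RegMask regs) { gcRegGCrefSetCur &= ~regs; }
};

// Same shape as the failing assert: no register killed by the call
// may still be reported as holding a live GC ref.
inline bool KillCheckPasses(const ToyGcInfo& gc, RegMask killMask)
{
    return (gc.gcRegGCrefSetCur & killMask) == 0;
}

// Option 1 (the quoted code): consuming the split arg leaves GC info alone
// and relies on the call codegen to clear the argument registers later.
inline void ConsumeSplitArgSkippingGcUpdate(ToyGcInfo&, RegMask) {}

// Option 2: consuming the split arg marks its registers dead right away.
inline void ConsumeSplitArgUpdatingGc(ToyGcInfo& gc, RegMask argRegs)
{
    gc.MarkRegsDead(argRegs);
}
```

As an example, suppose r0 is marked as a GC ref after the reload and the kill mask covers r0-r3 plus r12. With option 1, KillCheckPasses returns false unless the call path clears the argument registers before checking. With option 2 the bits are already clear when the check runs, so the check no longer depends on the order of operations inside the call codegen.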

jakobbotsch added a commit to jakobbotsch/runtime that referenced this issue Feb 16, 2022
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Feb 17, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Mar 20, 2022