Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Assertion failed 'endOffset > startOffset' during 'Emit GC+EH tables' #106379

Closed
amanasifkhalid opened this issue Aug 14, 2024 · 4 comments · Fixed by #107568
Closed

JIT: Assertion failed 'endOffset > startOffset' during 'Emit GC+EH tables' #106379

amanasifkhalid opened this issue Aug 14, 2024 · 4 comments · Fixed by #107568
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI in-pr There is an active PR which will close this issue when it is merged
Milestone

Comments

@amanasifkhalid
Copy link
Member

amanasifkhalid commented Aug 14, 2024

Hit by Antigen on Linux arm64:

// Found by Antigen

using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.Arm;
using System.Runtime.Intrinsics.X86;
using System.Numerics;
public class TestClass
{
    public struct S1
    {
        public struct S1_D1_F1
        {
            public int int_0;
        }
        public sbyte sbyte_1;
    }
    public struct S2
    {
        public Vector<uint> v_uint_2;
    }
    static Vector64<short> s_v64_short_20 = Vector64.Create(((short)(4)), -5, 4, 2);
    static Vector64<int> s_v64_int_22 = Vector64<int>.AllBitsSet;
    static Vector64<long> s_v64_long_24 = Vector64<long>.AllBitsSet;
    static Vector128<int> s_v128_int_32 = Vector128.CreateScalar(((int)(2)));
    static S1 s_s1_52 = new S1();
    static S2 s_s2_53 = new S2();
    Vector64<short> v64_short_71 = Vector64.Create(((short)(4)), -2, -1, 4);
    Vector64<long> v64_long_75 = Vector64<long>.Zero;
    Vector128<int> v128_int_83 = Vector128.Create(((int)(0)), 2, 4, -1);
    S1 s1_103 = new S1();
    S2 s2_104 = new S2();
    static int s_loopInvariant = 8;
    public S2 Method4(out S1 p_s1_198, S2 p_s2_199, ref S2 p_s2_200, out S1 p_s1_201, ref S1 p_s1_202, S2 p_s2_203, S2 p_s2_204, out S1 p_s1_205)
    {
        unchecked
        {
            p_s1_198 = s1_103;
            p_s1_201 = p_s1_198;
            p_s1_205 = s1_103;
            try
            {
                s_v64_long_24 = ((Vector64<long>)(((Vector64<long>)(s_v64_long_24 = ((Vector64<long>)(v64_long_75 = v64_long_75)))) | s_v64_long_24));
            }
            finally
            {
                int __loopvar0 = s_loopInvariant, __loopSecondaryVar0_0 = s_loopInvariant - 9;
                for (;;)
                {
                    if (__loopvar0 >= s_loopInvariant + 3)
                        break;
                    try
                    {
                        v128_int_83 -= ((Vector128<int>)(((Vector128<int>)(((Vector128<int>)(s_v128_int_32 -= v128_int_83)) | ((Vector128<int>)(s_v128_int_32 | Vector128.WithUpper(s_v128_int_32, s_v64_int_22))))) | v128_int_83));
                    }
                    finally
                    {
                        v64_short_71 *= ((Vector64<short>)(s_v64_short_20 *= ((Vector64<short>)(Vector64.LessThanOrEqual(v64_short_71, s_v64_short_20) - ((Vector64<short>)(((Vector64<short>)(s_v64_short_20 * s_v64_short_20)) - ((Vector64<short>)(Vector64<short>.AllBitsSet * v64_short_71))))))));
                    }
                }
            }
            return s_s2_53;
        }
    }
    public void Method0()
    {
        unchecked
        {
            S1 s1_2942 = new S1();
            S2 s2_2943 = new S2();
            s_s2_53 = Method4(out s1_103, s_s2_53, ref s2_104, out s_s1_52, ref s1_103, s2_2943, s2_104, out s1_2942);
        }
    }
    public static void Main(string[] args)
    {
        new TestClass().Method0();
    }
}
/*
Got output diff:
--------- Baseline ---------  

Environment:



--------- Test ---------  

Environment:

set DOTNET_JitStress=1

Assert failure(PID 8212 [0x00002014], Thread: 14468 [0x3884]): Assertion failed 'endOffset > startOffset' in 'TestClass:Method4(byref,TestClass+S2,byref,byref,byref,TestClass+S2,TestClass+S2,byref):TestClass+S2:this' during 'Emit GC+EH tables' (IL size 295; hash 0x801cd1ad; Tier0-FullOpts)
    File: C:\wk\runtime\src\coreclr\jit\unwindarmarch.cpp:1645
    Image: C:\aman\Core_Root\corerun.exe
*/

cc @dotnet/jit-contrib

@amanasifkhalid amanasifkhalid added this to the 9.0.0 milestone Aug 14, 2024
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Aug 14, 2024
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@amanasifkhalid
Copy link
Member Author

On ARM platforms, the emitter splits up exceptionally large methods and reports EH info in 512KB chunks. Under rare stress circumstances (0.5% odds of firing when JitStress is enabled), the emitter will instead try to split the method into fragments of at most 4 bytes in size. This stress mode fired for the above example, and the emitter's splitting logic created a zero-size fragment at the end of the method body.

Since having a zero-sized fragment will trigger asserts, I suppose it is preferable to not respect the desired chunk size, and skip the last split if it will create a zero-sized fragment. Doing so doesn't have correctness consequences under stress modes, as far as I know. But under normal circumstances, if we have a candidate chunk that is exactly 512KB and we decide to not split it to avoid creating a zero-sized fragment, we will probably break ISA invariants around EH data. This seems like a highly unlikely scenario (TODO comments suggest this chunk limit is 1MB on ARM64, making it even more unlikely there), and fixing it would probably require padding out the last chunk to not be zero-sized; I don't think we need to fix this yet.

Since this only impacts stress modes for the time being, and the odds of this specific stress mode firing are quite low, I don't think we need to backport a fix for this, but I'm happy to do so if I get pushback.

@amanasifkhalid amanasifkhalid modified the milestones: 9.0.0, 10.0.0 Sep 9, 2024
@dotnet-policy-service dotnet-policy-service bot added the in-pr There is an active PR which will close this issue when it is merged label Sep 9, 2024
@BruceForstall
Copy link
Member

I was finally able to repro this myself and investigate.

The case here is quite weird: there is a loop within a 'try' block in a funclet that includes a nested 'finally' handler funclet that physically precedes the loop's "main" funclet. The alignment code considers aligning this loop, so it inserts "align" pseudo-instructions before the first block of the loop, which happens to be the first funclet (which is a single-block 'finally' funclet). Thus, the alignment instructions are "between" the end of the main body and the beginning of the funclet region. The IR when placing the 'align' instructions is:

***************  Natural loop graph
L00 header: BB08
  Members (5): BB06;[BB08..BB11]
  Entry: BB07 -> BB08
  Exit: BB08 -> BB12
  Back: BB11 -> BB08

Aligning L00 that starts at BB06, weight=900.0001 >= 300.
Marking BB05 before the loop with BBF_HAS_ALIGN for loop at BB06
Found 1 candidates for loop alignment

*************** Finishing PHASE Place 'align' instructions
Trees after Place 'align' instructions

---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd preds           weight    IBC [IL range]   [jump]                            [EH region]        [flags]
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BB01 [0000]  1                             1     100 [000..026)-> *(1)                    (always)                     i LIR IBC
BB02 [0001]  1  1    BB01                  1     100 [026..04F)-> *(1)                    (always) T1      try { }     i LIR IBC keep
BB03 [0009]  1       BB02                  1     100 [???..???)-> BB07(1)                 (callf )                     i LIR IBC internal
BB04 [0010]  1       BB12                  1     100 [???..???)-> *(1)                    (callfr)                     i LIR IBC internal xentry
BB05 [0008]  1       BB04                  1     100 [118..11E)                           (return)                     i LIR IBC xentry has-align
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ funclets follow
BB06 [0006]  2     0 BB10                  9.00  900 [0BC..117)-> BB11(1)                 (finret)    H0 F finally { } i LIR IBC keep xentry flet align
BB07 [0002]  2     1 BB03                  1.00  100 [04F..05B)-> *(1)                    (always)    H1 F finally {   i LIR IBC keep xentry flet
BB08 [0003]  2     1 BB07,BB11            10.00 1000 [05B..069)-> BB12(0.1),*(0.9)        ( cond )    H1               i LIR IBC loophead xentry bwd bwd-target
BB09 [0005]  1  0  1 BB08                  9.00  900 [069..0BC)-> *(1)                    (always) T0 H1   try { }     i LIR IBC keep xentry bwd bwd-src
BB10 [0011]  1     1 BB09                  9.00  900 [???..???)-> BB06(1)                 (callf )    H1               i LIR IBC internal xentry
BB11 [0012]  1     1 BB06                  9.00  900 [???..???)-> BB08(1)                 (callfr)    H1               i LIR IBC internal xentry
BB12 [0007]  1     1 BB08                  1.00  100 [117..118)-> BB04(1)                 (finret)    H1   }           i LIR IBC xentry
---------------------------------------------------------------------------------------------------------------------------------------------------------------------

during codegen, 16 bytes of 'align' instructions are inserted after the main function, in a new IG. Later, loop alignment decides to not align the loop, so it zeros out these alignment instructions, leaving a zero-sized IG.

It seems unlikely there are any scenarios besides the alignment case described above that will create a zero-sized IG. It does seem like it would be possible to hit this in real code, though of course it would be extremely rare.

@BruceForstall
Copy link
Member

@kunalspathak The above comment describes a situation that surprised me somewhat, that arises from loops now being able to contain multiple different funclets, that are laid out in non-contiguous "arbitrary" fashion compared to the loop itself. Do you think the alignment code will "do the right thing" in all the possible cases? E.g., what if there are multiple loops meriting alignment, each containing different funclets; inserting one alignment pad might conflict with and/or prevent the other loop's alignment pad.

@github-actions github-actions bot locked and limited conversation to collaborators Oct 20, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI in-pr There is an active PR which will close this issue when it is merged
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants