Add support for Sve.VectorTableLookup() #103989

Merged (4 commits) on Jun 26, 2024

SwapnilGaikwad
Contributor

Contributes towards #99957.

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

@SwapnilGaikwad
Contributor Author

@a74nh @kunalspathak @dotnet/arm64-contrib @arch-arm64-sve

@dotnet-policy-service bot added the community-contribution label (indicates that the PR has been added by a community member) on Jun 25, 2024
Contributor

Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics
See info in area-owners.md if you want to be subscribed.

@SwapnilGaikwad
Contributor Author

Some of the stress tests fail when tiered compilation is not enabled. The generated code fails to load the results correctly from the stack. I'm not sure if this is a known issue or to do with the PR.
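
The results in this comment follow a regular layout: one `------- {config} -------` header per configuration, followed by a `Test failed:` line when that configuration fails. A minimal sketch for listing the failing configurations from such output is given here. It is not the script that produced the log; the function name `failing_configs` and the abbreviated `sample` text are illustrative assumptions.

```python
import ast
import re

# Matches header lines such as: ------------------- {'JitStress': '1'} -------------------
HEADER = re.compile(r"^-+ (\{.*\}) -+$")

def failing_configs(log_text):
    """Return the config dicts whose section contains a 'Test failed:' line."""
    failures = []
    current = None
    for line in log_text.splitlines():
        stripped = line.strip()
        match = HEADER.match(stripped)
        if match:
            # The config is printed as a Python dict literal, so parse it safely.
            current = ast.literal_eval(match.group(1))
        elif stripped == "Test failed:" and current is not None:
            failures.append(current)
            current = None  # count each configuration at most once
    return failures

# Abbreviated excerpt of the output (failure details omitted).
sample = """\
------------------- {'JitMinOpts': '1'} -------------------
------------------- {'JitStress': '1'} -------------------
Test failed:
------------------- {'JitStress': '1', 'TieredCompilation': '1'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x10'} -------------------
Test failed:
"""

for config in failing_configs(sample):
    print(config)
```

On the excerpt, only the two configurations without `TieredCompilation` are reported, which matches the pattern described in this comment.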

Stress test results
===================Running default===================
------------------- {} -------------------
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_VectorTableLookup_float() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_VectorTableLookup_double() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_VectorTableLookup_sbyte() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_VectorTableLookup_short() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_VectorTableLookup_int() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_VectorTableLookup_long() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_VectorTableLookup_byte() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_VectorTableLookup_ushort() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_VectorTableLookup_uint() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_VectorTableLookup_ulong() : 7
===================Running jitstress===================
------------------- {'JitMinOpts': '1'} -------------------
------------------- {'JitStress': '1'} -------------------
Test failed:
..........................................
..........................................
Sve.VectorTableLookup<UInt32>(Vector<UInt32>, Vector<UInt32>): RunBasicScenario_Load failed:
    left: (89522, 207830, 829429, 64753)
   right: (2, 4, 4, 5)
  result: (0, 0, 0, 0)
..........................................
System.Exception: One or more scenarios did not complete as expected.
   at JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_VectorTableLookup_uint() in /home/user/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.VectorTableLookup.uint.cs:line 62
   at Program.<<Main>$>g__TestExecutor3461|0_3462(StreamWriter tempLogSw, StreamWriter statsCsvSw, <>c__DisplayClass0_0&) in /home/user/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/generated/XUnitWrapperGenerator/XUnitWrapperGenerator.XUnitWrapperGenerator/FullRunner.g.cs:line 86757
------------------- {'JitStress': '2'} -------------------
Test failed:
..........................................
..........................................
Sve.VectorTableLookup<UInt16>(Vector<UInt16>, Vector<UInt16>): RunBasicScenario_Load failed:
    left: (51791, 60947, 32856, 46948, 62939, 46141, 51821, 55843)
   right: (14, 12, 11, 13, 2, 10, 6, 12)
  result: (51791, 51791, 51791, 51791, 51791, 51791, 51791, 51791)
..........................................
System.Exception: One or more scenarios did not complete as expected.
   at JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_VectorTableLookup_ushort() in /home/user/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.VectorTableLookup.ushort.cs:line 62
   at Program.<<Main>$>g__TestExecutor3460|0_3461(StreamWriter tempLogSw, StreamWriter statsCsvSw, <>c__DisplayClass0_0&) in /home/user/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/generated/XUnitWrapperGenerator/XUnitWrapperGenerator.XUnitWrapperGenerator/FullRunner.g.cs:line 86733
------------------- {'JitStress': '1', 'TieredCompilation': '1'} -------------------
------------------- {'JitStress': '2', 'TieredCompilation': '1'} -------------------
------------------- {'TailcallStress': '1'} -------------------
------------------- {'ReadyToRun': '0'} -------------------
===================Running jitstressregs===================
------------------- {'JitStressRegs': '1'} -------------------
------------------- {'JitStressRegs': '2'} -------------------
------------------- {'JitStressRegs': '3'} -------------------
------------------- {'JitStressRegs': '4'} -------------------
------------------- {'JitStressRegs': '8'} -------------------
------------------- {'JitStressRegs': '0x10'} -------------------
------------------- {'JitStressRegs': '0x80'} -------------------
------------------- {'JitStressRegs': '0x1000'} -------------------
------------------- {'JitStressRegs': '0x2000'} -------------------
===================Running jitstress2-jitstressregs===================
------------------- {'JitStress': '2', 'JitStressRegs': '1'} -------------------
Test failed:
..........................................
..........................................
Sve.VectorTableLookup<UInt16>(Vector<UInt16>, Vector<UInt16>): RunBasicScenario_Load failed:
    left: (2945, 38808, 36847, 63854, 45928, 47191, 21003, 2168)
   right: (14, 6, 3, 8, 1, 0, 3, 1)
  result: (2945, 2945, 2945, 2945, 2945, 2945, 2945, 2945)
..........................................
System.Exception: One or more scenarios did not complete as expected.
   at JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_VectorTableLookup_ushort() in /home/user/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.VectorTableLookup.ushort.cs:line 62
   at Program.<<Main>$>g__TestExecutor3460|0_3461(StreamWriter tempLogSw, StreamWriter statsCsvSw, <>c__DisplayClass0_0&) in /home/user/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/generated/XUnitWrapperGenerator/XUnitWrapperGenerator.XUnitWrapperGenerator/FullRunner.g.cs:line 86733
------------------- {'JitStress': '2', 'JitStressRegs': '2'} -------------------
Test failed:
..........................................
..........................................
Sve.VectorTableLookup<UInt16>(Vector<UInt16>, Vector<UInt16>): RunBasicScenario_Load failed:
    left: (35417, 18738, 19120, 56266, 54154, 3297, 30870, 64220)
   right: (3, 10, 4, 6, 5, 9, 3, 14)
  result: (35417, 35417, 35417, 35417, 35417, 35417, 35417, 35417)
..........................................
System.Exception: One or more scenarios did not complete as expected.
   at JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_VectorTableLookup_ushort() in /home/user/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.VectorTableLookup.ushort.cs:line 62
   at Program.<<Main>$>g__TestExecutor3460|0_3461(StreamWriter tempLogSw, StreamWriter statsCsvSw, <>c__DisplayClass0_0&) in /home/user/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/generated/XUnitWrapperGenerator/XUnitWrapperGenerator.XUnitWrapperGenerator/FullRunner.g.cs:line 86733
------------------- {'JitStress': '2', 'JitStressRegs': '3'} -------------------
Test failed:
..........................................
..........................................
Sve.VectorTableLookup<UInt16>(Vector<UInt16>, Vector<UInt16>): RunBasicScenario_Load failed:
    left: (18664, 61837, 60036, 6464, 42446, 1442, 48064, 22271)
   right: (4, 1, 3, 14, 5, 10, 5, 8)
  result: (18664, 18664, 18664, 18664, 18664, 18664, 18664, 18664)
..........................................
System.Exception: One or more scenarios did not complete as expected.
   at JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_VectorTableLookup_ushort() in /home/user/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.VectorTableLookup.ushort.cs:line 62
   at Program.<<Main>$>g__TestExecutor3460|0_3461(StreamWriter tempLogSw, StreamWriter statsCsvSw, <>c__DisplayClass0_0&) in /home/user/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/generated/XUnitWrapperGenerator/XUnitWrapperGenerator.XUnitWrapperGenerator/FullRunner.g.cs:line 86733
------------------- {'JitStress': '2', 'JitStressRegs': '4'} -------------------
Test failed:
..........................................
..........................................
Sve.VectorTableLookup<UInt16>(Vector<UInt16>, Vector<UInt16>): RunBasicScenario_Load failed:
    left: (1941, 41084, 39600, 39940, 19511, 32251, 56315, 19880)
   right: (8, 1, 13, 7, 6, 14, 15, 8)
  result: (1941, 1941, 1941, 1941, 1941, 1941, 1941, 1941)
..........................................
System.Exception: One or more scenarios did not complete as expected.
   at JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_VectorTableLookup_ushort() in /home/user/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.VectorTableLookup.ushort.cs:line 62
   at Program.<<Main>$>g__TestExecutor3460|0_3461(StreamWriter tempLogSw, StreamWriter statsCsvSw, <>c__DisplayClass0_0&) in /home/user/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/generated/XUnitWrapperGenerator/XUnitWrapperGenerator.XUnitWrapperGenerator/FullRunner.g.cs:line 86733
------------------- {'JitStress': '2', 'JitStressRegs': '8'} -------------------
Test failed:
..........................................
..........................................
Sve.VectorTableLookup<UInt16>(Vector<UInt16>, Vector<UInt16>): RunBasicScenario_Load failed:
    left: (13635, 22647, 22004, 43430, 50490, 32554, 4114, 39613)
   right: (5, 9, 10, 7, 2, 1, 10, 4)
  result: (13635, 13635, 13635, 13635, 13635, 13635, 13635, 13635)
..........................................
System.Exception: One or more scenarios did not complete as expected.
   at JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_VectorTableLookup_ushort() in /home/user/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.VectorTableLookup.ushort.cs:line 62
   at Program.<<Main>$>g__TestExecutor3460|0_3461(StreamWriter tempLogSw, StreamWriter statsCsvSw, <>c__DisplayClass0_0&) in /home/user/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/generated/XUnitWrapperGenerator/XUnitWrapperGenerator.XUnitWrapperGenerator/FullRunner.g.cs:line 86733
------------------- {'JitStress': '2', 'JitStressRegs': '0x10'} -------------------
Test failed:
..........................................
..........................................
Sve.VectorTableLookup<UInt16>(Vector<UInt16>, Vector<UInt16>): RunBasicScenario_Load failed:
    left: (15405, 42755, 10930, 20318, 46988, 30754, 18167, 36219)
   right: (7, 7, 8, 14, 0, 8, 9, 6)
  result: (15405, 15405, 15405, 15405, 15405, 15405, 15405, 15405)
..........................................
System.Exception: One or more scenarios did not complete as expected.
   at JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_VectorTableLookup_ushort() in /home/user/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.VectorTableLookup.ushort.cs:line 62
   at Program.<<Main>$>g__TestExecutor3460|0_3461(StreamWriter tempLogSw, StreamWriter statsCsvSw, <>c__DisplayClass0_0&) in /home/user/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/generated/XUnitWrapperGenerator/XUnitWrapperGenerator.XUnitWrapperGenerator/FullRunner.g.cs:line 86733
------------------- {'JitStress': '2', 'JitStressRegs': '0x80'} -------------------
Test failed:
..........................................
..........................................
Sve.VectorTableLookup<UInt16>(Vector<UInt16>, Vector<UInt16>): RunBasicScenario_Load failed:
    left: (13535, 30888, 4737, 58628, 1890, 12837, 60028, 17997)
   right: (10, 1, 3, 13, 1, 4, 3, 14)
  result: (13535, 13535, 13535, 13535, 13535, 13535, 13535, 13535)
..........................................
System.Exception: One or more scenarios did not complete as expected.
   at JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_VectorTableLookup_ushort() in /home/user/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.VectorTableLookup.ushort.cs:line 62
   at Program.<<Main>$>g__TestExecutor3460|0_3461(StreamWriter tempLogSw, StreamWriter statsCsvSw, <>c__DisplayClass0_0&) in /home/user/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/generated/XUnitWrapperGenerator/XUnitWrapperGenerator.XUnitWrapperGenerator/FullRunner.g.cs:line 86733
------------------- {'JitStress': '2', 'JitStressRegs': '0x1000'} -------------------
Test failed:
..........................................
..........................................
Sve.VectorTableLookup<UInt16>(Vector<UInt16>, Vector<UInt16>): RunBasicScenario_Load failed:
    left: (63705, 30556, 31009, 37154, 63659, 37081, 40913, 49876)
   right: (5, 5, 1, 15, 4, 3, 12, 12)
  result: (63705, 63705, 63705, 63705, 63705, 63705, 63705, 63705)
..........................................
System.Exception: One or more scenarios did not complete as expected.
   at JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_VectorTableLookup_ushort() in /home/user/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.VectorTableLookup.ushort.cs:line 62
   at Program.<<Main>$>g__TestExecutor3460|0_3461(StreamWriter tempLogSw, StreamWriter statsCsvSw, <>c__DisplayClass0_0&) in /home/user/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/generated/XUnitWrapperGenerator/XUnitWrapperGenerator.XUnitWrapperGenerator/FullRunner.g.cs:line 86733
------------------- {'JitStress': '2', 'JitStressRegs': '0x2000'} -------------------
Test failed:
..........................................
..........................................
Sve.VectorTableLookup<UInt16>(Vector<UInt16>, Vector<UInt16>): RunBasicScenario_Load failed:
    left: (39329, 16349, 40825, 25564, 54383, 42243, 59327, 33437)
   right: (2, 15, 11, 10, 3, 9, 5, 3)
  result: (39329, 39329, 39329, 39329, 39329, 39329, 39329, 39329)
..........................................
System.Exception: One or more scenarios did not complete as expected.
   at JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_VectorTableLookup_ushort() in /home/user/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.VectorTableLookup.ushort.cs:line 62
   at Program.<<Main>$>g__TestExecutor3460|0_3461(StreamWriter tempLogSw, StreamWriter statsCsvSw, <>c__DisplayClass0_0&) in /home/user/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/generated/XUnitWrapperGenerator/XUnitWrapperGenerator.XUnitWrapperGenerator/FullRunner.g.cs:line 86733
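
For reading the failures above, a scalar model of the lookup may help. It assumes the SVE `TBL` behaviour in which lane `i` of the result is `left[right[i]]`, or zero when the index is out of range, and a 128-bit vector (8 `UInt16` lanes, as printed in the log). The helper name `tbl` is illustrative; this is a sketch, not the test's validation code.

```python
def tbl(table, indices):
    """Scalar model of SVE TBL: an out-of-range index selects zero."""
    n = len(table)
    return [table[i] if 0 <= i < n else 0 for i in indices]

# Values copied from the {'JitStress': '2'} UInt16 failure above.
left = [51791, 60947, 32856, 46948, 62939, 46141, 51821, 55843]
right = [14, 12, 11, 13, 2, 10, 6, 12]
observed = [51791] * 8

expected = tbl(left, right)
print(expected)  # [0, 0, 0, 0, 32856, 0, 51821, 0]

# The observed result equals left[0] in every lane, which is what the
# model produces when every index is zero.
print(observed == tbl(left, [0] * 8))  # True
```

Under these assumptions, the observed UInt16 results are what the lookup would return for an all-zero index vector. That is only an observation about the values; the log alone does not establish the cause.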

@kunalspathak added the arm-sve label (work related to arm64 SVE/SVE2 support) on Jun 25, 2024
@kunalspathak
Member

> if this is a known issue or to do with the PR

Can you share the disassembly for the failing test?

@SwapnilGaikwad
Contributor Author

> Can you share the disassembly for the failing test?

Assembly for a failing test
; Assembly listing for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_VectorTableLookup_ushort:RunBasicScenario_Load():this (FullOpts)
; Emitting BLENDED_CODE for generic ARM64 - Unix
; FullOpts code
; optimized code
; fp based frame
; fully interruptible
; No PGO data
; 0 inlinees with PGO data; 5 single block inlinees; 0 inlinees without PGO data
; Final local variable assignments
;
;  V00 this         [V00,T00] ( 10, 10   )     ref  ->  x19         this class-hnd single-def <JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_VectorTableLookup_ushort>
;* V01 loc0         [V01    ] (  0,  0   )  simd16  ->  zero-ref    HFA(simd16)  <System.Numerics.Vector`1[ushort]>
;  V02 loc1         [V02,T11] (  2,  2   )  simd16  ->   d8         HFA(simd16)  <System.Numerics.Vector`1[ushort]>
;  V03 tmp0         [V03,T10] (  1,  1   )     int  ->  [fp+0x2C]  do-not-enreg[V] "GSCookie dummy"
;# V04 OutArgs      [V04    ] (  1,  1   )  struct ( 0) [sp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
;* V05 tmp2         [V05    ] (  0,  0   )  simd16  ->  zero-ref    "impAppendStmt"
;  V06 tmp3         [V06,T04] (  2,  4   )    long  ->  x20         "impAppendStmt"
;* V07 tmp4         [V07    ] (  0,  0   )    long  ->  zero-ref    "impAppendStmt"
;  V08 tmp5         [V08,T01] (  3,  6   )   byref  ->  x20         single-def "Inlining Arg"
;* V09 tmp6         [V09    ] (  0,  0   )    long  ->  zero-ref    ld-addr-op "Inline stloc first use temp"
;  V10 tmp7         [V10,T05] (  2,  4   )    long  ->   x0         "Inlining Arg"
;  V11 tmp8         [V11,T03] (  3,  6   )    long  ->   x1         "Inlining Arg"
;  V12 tmp9         [V12,T02] (  3,  6   )   byref  ->  x21         single-def "Inlining Arg"
;* V13 tmp10        [V13    ] (  0,  0   )    long  ->  zero-ref    ld-addr-op "Inline stloc first use temp"
;  V14 tmp11        [V14,T06] (  2,  4   )    long  ->   x0         "argument with side effect"
;  V15 tmp12        [V15,T07] (  2,  4   )    long  ->  x21         "argument with side effect"
;  V16 tmp13        [V16,T08] (  2,  4   )    long  ->   x3         "argument with side effect"
;  V17 GsCookie     [V17    ] (  1,  1   )    long  ->  [fp+0x30]  do-not-enreg[X] addr-exposed "GSSecurityCookie"
;  V18 cse0         [V18,T09] (  3,  3   )    mask  ->   p0         "CSE #02"
;  TEMP_01                                  simd16  ->  [fp+0x1C]
;
; Lcl frame size = 40

G_M53463_IG01:  ;; offset=0x0000
            stp     fp, lr, [sp, #-0x60]!
            stp     d8, d9, [sp, #0x38]
            stp     x19, x20, [sp, #0x48]
            str     x21, [sp, #0x58]
            mov     fp, sp
            movz    x1, #0x5678
            movk    x1, #0x1234 LSL #16
            movk    x1, #0xDEF0 LSL #32
            movk    x1, #0x9ABC LSL #48
            str     x1, [fp, #0x30]     // [V17 GsCookie]
            mov     x19, x0
                                                ;; size=44 bbWeight=1 PerfScore 8.00
G_M53463_IG02:  ;; offset=0x002C
            movz    x0, #0xBAE0
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG03:  ;; offset=0x0030
            movk    x0, #0x8029 LSL #16
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG04:  ;; offset=0x0034
            movk    x0, #0xFFFF LSL #32
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG05:  ;; offset=0x0038
            movz    x1, #0xEA48      // code for TestLibrary.TestFramework:BeginScenario(System.String)
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG06:  ;; offset=0x003C
            movk    x1, #0x35DC LSL #16
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG07:  ;; offset=0x0040
            movk    x1, #0xFFFF LSL #32
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG08:  ;; offset=0x0044
            ldr     x1, [x1]
                                                ;; size=4 bbWeight=1 PerfScore 3.00
G_M53463_IG09:  ;; offset=0x0048
            blr     x1
                                                ;; size=4 bbWeight=1 PerfScore 1.00
G_M53463_IG10:  ;; offset=0x004C
            ldrsb   wzr, [x19]
                                                ;; size=4 bbWeight=1 PerfScore 3.00
G_M53463_IG11:  ;; offset=0x0050
            add     x0, x19, #48
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG12:  ;; offset=0x0054
            movz    x1, #0x4EA0      // code for JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_VectorTableLookup_ushort+DataTable:get_inArray1Ptr():ulong:this
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG13:  ;; offset=0x0058
            movk    x1, #0x35DC LSL #16
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG14:  ;; offset=0x005C
            movk    x1, #0xFFFF LSL #32
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG15:  ;; offset=0x0060
            ldr     x1, [x1]
                                                ;; size=4 bbWeight=1 PerfScore 3.00
G_M53463_IG16:  ;; offset=0x0064
            blr     x1
                                                ;; size=4 bbWeight=1 PerfScore 1.00
G_M53463_IG17:  ;; offset=0x0068
            ptrue   p0.h
                                                ;; size=4 bbWeight=1 PerfScore 2.00
G_M53463_IG18:  ;; offset=0x006C
            ld1h    { z8.h }, p0/z, [x0]
                                                ;; size=4 bbWeight=1 PerfScore 8.00
G_M53463_IG19:  ;; offset=0x0070
            str     q8, [fp, #0x1C]     // [TEMP_01]
                                                ;; size=4 bbWeight=1 PerfScore 1.00
G_M53463_IG20:  ;; offset=0x0074
            add     x0, x19, #48
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG21:  ;; offset=0x0078
            movz    x1, #0x4EB8      // code for JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_VectorTableLookup_ushort+DataTable:get_inArray2Ptr():ulong:this
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG22:  ;; offset=0x007C
            movk    x1, #0x35DC LSL #16
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG23:  ;; offset=0x0080
            movk    x1, #0xFFFF LSL #32
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG24:  ;; offset=0x0084
            ldr     x1, [x1]
                                                ;; size=4 bbWeight=1 PerfScore 3.00
G_M53463_IG25:  ;; offset=0x0088
            blr     x1
                                                ;; size=4 bbWeight=1 PerfScore 1.00
G_M53463_IG26:  ;; offset=0x008C
            ld1h    { z16.h }, p0/z, [x0]
                                                ;; size=4 bbWeight=1 PerfScore 8.00
G_M53463_IG27:  ;; offset=0x0090
            ldr     q8, [fp, #0x1C]     // [TEMP_01]
                                                ;; size=4 bbWeight=1 PerfScore 2.00
G_M53463_IG28:  ;; offset=0x0094
            tbl     z8.h, { z8.h }, z16.h
                                                ;; size=4 bbWeight=1 PerfScore 2.00
G_M53463_IG29:  ;; offset=0x0098
            add     x0, x19, #48
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG30:  ;; offset=0x009C
            movz    x1, #0x4ED0      // code for JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_VectorTableLookup_ushort+DataTable:get_outArrayPtr():ulong:this
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG31:  ;; offset=0x00A0
            movk    x1, #0x35DC LSL #16
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG32:  ;; offset=0x00A4
            movk    x1, #0xFFFF LSL #32
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG33:  ;; offset=0x00A8
            ldr     x1, [x1]
                                                ;; size=4 bbWeight=1 PerfScore 3.00
G_M53463_IG34:  ;; offset=0x00AC
            mov     v9.d[0], v8.d[1]
                                                ;; size=4 bbWeight=1 PerfScore 1.00
G_M53463_IG35:  ;; offset=0x00B0
            blr     x1
                                                ;; size=4 bbWeight=1 PerfScore 1.00
G_M53463_IG36:  ;; offset=0x00B4
            mov     v8.d[1], v9.d[0]
                                                ;; size=4 bbWeight=1 PerfScore 1.00
G_M53463_IG37:  ;; offset=0x00B8
            str     q8, [x0]
                                                ;; size=4 bbWeight=1 PerfScore 1.00
G_M53463_IG38:  ;; offset=0x00BC
            add     x20, x19, #48
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG39:  ;; offset=0x00C0
            add     x0, x20, #32
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG40:  ;; offset=0x00C4
            movz    x1, #0x4468      // code for System.Runtime.InteropServices.GCHandle:AddrOfPinnedObject():long:this
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG41:  ;; offset=0x00C8
            movk    x1, #0x35B1 LSL #16
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG42:  ;; offset=0x00CC
            movk    x1, #0xFFFF LSL #32
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG43:  ;; offset=0x00D0
            ldr     x1, [x1]
                                                ;; size=4 bbWeight=1 PerfScore 3.00
G_M53463_IG44:  ;; offset=0x00D4
            blr     x1
                                                ;; size=4 bbWeight=1 PerfScore 1.00
G_M53463_IG45:  ;; offset=0x00D8
            ldr     x1, [x20, #0x18]
                                                ;; size=4 bbWeight=1 PerfScore 3.00
G_M53463_IG46:  ;; offset=0x00DC
            sub     x2, x1, #1
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG47:  ;; offset=0x00E0
            add     x0, x1, x0
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG48:  ;; offset=0x00E4
            sub     x0, x0, #1
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG49:  ;; offset=0x00E8
            bic     x20, x0, x2
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG50:  ;; offset=0x00EC
            add     x21, x19, #48
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG51:  ;; offset=0x00F0
            add     x0, x21, #40
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG52:  ;; offset=0x00F4
            movz    x1, #0x4468      // code for System.Runtime.InteropServices.GCHandle:AddrOfPinnedObject():long:this
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG53:  ;; offset=0x00F8
            movk    x1, #0x35B1 LSL #16
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG54:  ;; offset=0x00FC
            movk    x1, #0xFFFF LSL #32
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG55:  ;; offset=0x0100
            ldr     x1, [x1]
                                                ;; size=4 bbWeight=1 PerfScore 3.00
G_M53463_IG56:  ;; offset=0x0104
            blr     x1
                                                ;; size=4 bbWeight=1 PerfScore 1.00
G_M53463_IG57:  ;; offset=0x0108
            ldr     x1, [x21, #0x18]
                                                ;; size=4 bbWeight=1 PerfScore 3.00
G_M53463_IG58:  ;; offset=0x010C
            movz    x2, #0x4F00      // code for JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_VectorTableLookup_ushort+DataTable:Align(ulong,ulong):ulong
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG59:  ;; offset=0x0110
            movk    x2, #0x35DC LSL #16
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG60:  ;; offset=0x0114
            movk    x2, #0xFFFF LSL #32
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG61:  ;; offset=0x0118
            ldr     x2, [x2]
                                                ;; size=4 bbWeight=1 PerfScore 3.00
G_M53463_IG62:  ;; offset=0x011C
            blr     x2
                                                ;; size=4 bbWeight=1 PerfScore 1.00
G_M53463_IG63:  ;; offset=0x0120
            mov     x21, x0
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG64:  ;; offset=0x0124
            add     x0, x19, #48
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG65:  ;; offset=0x0128
            movz    x1, #0x4ED0      // code for JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_VectorTableLookup_ushort+DataTable:get_outArrayPtr():ulong:this
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG66:  ;; offset=0x012C
            movk    x1, #0x35DC LSL #16
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG67:  ;; offset=0x0130
            movk    x1, #0xFFFF LSL #32
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG68:  ;; offset=0x0134
            ldr     x1, [x1]
                                                ;; size=4 bbWeight=1 PerfScore 3.00
G_M53463_IG69:  ;; offset=0x0138
            blr     x1
                                                ;; size=4 bbWeight=1 PerfScore 1.00
G_M53463_IG70:  ;; offset=0x013C
            mov     x3, x0
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG71:  ;; offset=0x0140
            mov     x2, x21
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG72:  ;; offset=0x0144
            mov     x0, x19
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG73:  ;; offset=0x0148
            mov     x1, x20
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG74:  ;; offset=0x014C
            movz    x4, #0xBAE0
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG75:  ;; offset=0x0150
            movk    x4, #0x8029 LSL #16
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG76:  ;; offset=0x0154
            movk    x4, #0xFFFF LSL #32
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG77:  ;; offset=0x0158
            movz    x5, #0x5050      // code for JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_VectorTableLookup_ushort:ValidateResult(ulong,ulong,ulong,System.String):this
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG78:  ;; offset=0x015C
            movk    x5, #0x35DC LSL #16
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG79:  ;; offset=0x0160
            movk    x5, #0xFFFF LSL #32
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG80:  ;; offset=0x0164
            ldr     x5, [x5]
                                                ;; size=4 bbWeight=1 PerfScore 3.00
G_M53463_IG81:  ;; offset=0x0168
            blr     x5
                                                ;; size=4 bbWeight=1 PerfScore 1.00
G_M53463_IG82:  ;; offset=0x016C
            movz    xip0, #0x5678
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG83:  ;; offset=0x0170
            movk    xip0, #0x1234 LSL #16
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG84:  ;; offset=0x0174
            movk    xip0, #0xDEF0 LSL #32
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG85:  ;; offset=0x0178
            movk    xip0, #0x9ABC LSL #48
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG86:  ;; offset=0x017C
            ldr     xip1, [fp, #0x30]   // [V17 GsCookie]
                                                ;; size=4 bbWeight=1 PerfScore 2.00
G_M53463_IG87:  ;; offset=0x0180
            cmp     xip0, xip1
                                                ;; size=4 bbWeight=1 PerfScore 0.50
G_M53463_IG88:  ;; offset=0x0184
            beq     G_M53463_IG90
                                                ;; size=4 bbWeight=1 PerfScore 1.00
G_M53463_IG89:  ;; offset=0x0188
            bl      CORINFO_HELP_FAIL_FAST
                                                ;; size=4 bbWeight=1 PerfScore 1.00
G_M53463_IG90:  ;; offset=0x018C
            ldr     x21, [sp, #0x58]
            ldp     x19, x20, [sp, #0x48]
            ldp     d8, d9, [sp, #0x38]
            ldp     fp, lr, [sp], #0x60
            ret     lr
                                                ;; size=20 bbWeight=1 PerfScore 6.00

; Total bytes of code 416, prolog size 44, PerfScore 116.50, instruction count 104, allocated bytes for code 416 (MethodHash=56ae2f28) for method JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_VectorTableLookup_ushort:RunBasicScenario_Load():this (FullOpts)
; ============================================================

Beginning scenario: RunBasicScenario_Load
Sve.VectorTableLookup<UInt16>(Vector<UInt16>, Vector<UInt16>): RunBasicScenario_Load failed:
    left: (614, 26433, 11891, 21803, 61677, 27467, 51172, 65480)
   right: (2, 10, 5, 8, 0, 10, 7, 10)
  result: (614, 614, 614, 614, 614, 614, 614, 614)
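
For comparison with the failing output above, here is a minimal scalar sketch of what a table lookup is expected to produce for these inputs. It assumes the eight printed lanes are the whole vector and that indices outside the table yield zero, as the SVE `TBL` instruction does. The helper name `vector_table_lookup` is hypothetical, used only for illustration, and is not part of the test suite.

```python
def vector_table_lookup(table, indices):
    """Scalar model of a TBL-style lookup (assumed semantics).

    Each output lane takes table[index]; an index outside the table
    produces 0 instead of raising an error.
    """
    n = len(table)
    return [table[i] if 0 <= i < n else 0 for i in indices]


# Inputs copied from the failing RunBasicScenario_Load output.
left = [614, 26433, 11891, 21803, 61677, 27467, 51172, 65480]
right = [2, 10, 5, 8, 0, 10, 7, 10]

expected = vector_table_lookup(left, right)
print(expected)  # [11891, 0, 27467, 0, 614, 0, 65480, 0]
```

Under these assumptions the reported result, with every lane equal to `left[0]`, is what the model returns for an all-zero index vector. The log alone does not establish that as the cause.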

@SwapnilGaikwad marked this pull request as ready for review June 26, 2024 10:40
@kunalspathak
Member

> if this is a known issue or to do with the PR

It is a known issue.

Member

@kunalspathak left a comment

LGTM. Thanks!

@kunalspathak merged commit 6373209 into dotnet:main Jun 26, 2024
145 of 167 checks passed
@SwapnilGaikwad deleted the github-sve-vectorTableLookup branch June 27, 2024 09:04
@github-actions bot locked and limited conversation to collaborators Jul 28, 2024
Labels
area-System.Runtime.Intrinsics
arm-sve (Work related to arm64 SVE/SVE2 support)
community-contribution (Indicates that the PR has been added by a community member)
new-api-needs-documentation
2 participants