[Arm64] Implement ASIMD widening, narrowing, saturating intrinsics #35612

echesakov · 2020-04-29T17:43:15Z

Implement API as approved in #32512 (with corrections regarding their ISA class - see my comment here)

Note that I am using the following temporary names as per comment:

AddReturningHighNarrowUpper and AddReturningHighNarrowLower
AddReturningRoundedHighNarrowUpper and AddReturningRoundedHighNarrowLower
SubtractReturningHighNarrowUpper and SubtractReturningHighNarrowLower.

I will update the method names after the next round of API review.
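For readers unfamiliar with the temporary names, here is a rough per-element sketch of what the "ReturningHighNarrow" operations compute, assuming they correspond to the ASIMD addhn, raddhn and subhn instructions. The function names and the fixed 16-bit to 8-bit element width are illustrative assumptions, not the JIT or API implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-element models for 16-bit inputs narrowed to 8 bits.
// Arithmetic wraps at the 16-bit element width, as a vector lane would.

// addhn: add, then keep the most significant half of the 16-bit sum.
std::uint8_t addHighNarrow(std::uint16_t a, std::uint16_t b) {
    const std::uint16_t sum = static_cast<std::uint16_t>(a + b);
    return static_cast<std::uint8_t>(sum >> 8);
}

// raddhn: same, but add a rounding constant (half of the discarded range) first.
std::uint8_t addRoundedHighNarrow(std::uint16_t a, std::uint16_t b) {
    const std::uint16_t sum = static_cast<std::uint16_t>(a + b + 0x80);
    return static_cast<std::uint8_t>(sum >> 8);
}

// subhn: subtract, then keep the most significant half of the difference.
std::uint8_t subtractHighNarrow(std::uint16_t a, std::uint16_t b) {
    const std::uint16_t diff = static_cast<std::uint16_t>(a - b);
    return static_cast<std::uint8_t>(diff >> 8);
}
```

In this model the Lower and Upper variants differ only in where the narrowed elements are written (a new 64-bit result, or the upper half of an existing 128-bit vector), not in the per-element arithmetic.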

The change is quite straightforward. Special handling is required only for AddWideningLower, AddWideningUpper, SubtractWideningLower and SubtractWideningUpper, since each of those maps to four different instructions (saddl{2}, saddw{2}, uaddl{2} and uaddw{2}; ssubl{2}, ssubw{2}, usubl{2} and usubw{2}) and does not fit into our table-driven model.
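To make the four-way split concrete, here is a scalar sketch of the signed sbyte-to-short add forms. The type aliases and helper names are hypothetical stand-ins chosen for illustration; this models instruction semantics only and is not code from the PR.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar models of the sbyte -> short forms; not JIT code.
using V8B  = std::array<std::int8_t, 8>;   // 64-bit vector of sbyte
using V16B = std::array<std::int8_t, 16>;  // 128-bit vector of sbyte
using V8H  = std::array<std::int16_t, 8>;  // 128-bit vector of short

// Wrap an int to 16 bits the way a vector lane would (modular on all
// mainstream compilers, and guaranteed modular since C++20).
static std::int16_t wrap16(int v) {
    return static_cast<std::int16_t>(static_cast<std::uint16_t>(v));
}

// saddl: widen both 64-bit operands, then add.
V8H saddl(const V8B& a, const V8B& b) {
    V8H r{};
    for (std::size_t i = 0; i < 8; ++i) r[i] = wrap16(a[i] + b[i]);
    return r;
}

// saddl2: same, but reads the upper halves of two 128-bit operands.
V8H saddl2(const V16B& a, const V16B& b) {
    V8H r{};
    for (std::size_t i = 0; i < 8; ++i) r[i] = wrap16(a[i + 8] + b[i + 8]);
    return r;
}

// saddw: the first operand is already wide; only the second is widened.
V8H saddw(const V8H& a, const V8B& b) {
    V8H r{};
    for (std::size_t i = 0; i < 8; ++i) r[i] = wrap16(a[i] + b[i]);
    return r;
}

// saddw2: like saddw, but widens the upper half of a 128-bit second operand.
V8H saddw2(const V8H& a, const V16B& b) {
    V8H r{};
    for (std::size_t i = 0; i < 8; ++i) r[i] = wrap16(a[i] + b[i + 8]);
    return r;
}
```

The unsigned forms (uaddl, uaddw and their "2" variants) follow the same shape with zero extension instead of sign extension, which is why a single intrinsic name ends up covering four instructions.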

~~The change depends on removing HW_Flag_UnfixedSIMDSize in #35594 - I didn't want to keep adding the flag for the added intrinsics for no particular reason.~~

Fixes #32512

ghost · 2020-04-29T17:43:17Z

Tagging subscribers to this area: @tannergooding
Notify danmosemsft if you want to be subscribed.

Dotnet-GitSync-Bot · 2020-04-29T17:43:30Z

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, to please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

echesakov · 2020-04-29T17:43:32Z

@CarolEidt @tannergooding PTAL

echesakov · 2020-04-29T17:44:43Z

@TamarChristinaArm PTAL

echesakov · 2020-04-30T02:46:57Z

Rebased on top of 1f436e0

tannergooding · 2020-04-30T16:03:10Z

src/coreclr/src/jit/hwintrinsic.cpp

+                if ((intrinsic == NI_AdvSimd_AddWideningUpper) || (intrinsic == NI_AdvSimd_SubtractWideningUpper))
+                {
+                    assert(varTypeIsSIMD(op1->TypeGet()));
+                    retNode->AsHWIntrinsic()->SetOtherBaseType(getBaseTypeOfSIMDType(argClass));


Why do widening and narrowing need an "other" base type? The return type is always twice the size and of the same signedness of the inputs for these operations.

> Why do widening and narrowing need an "other" base type?

Narrowing intrinsics do not need one; only AddWideningUpper and SubtractWideningUpper do.

They don't need the type of operand 1 itself (which is always TYP_SIMD16) but its element type, so that I can distinguish between each pair of these three overloads:

Vector128<int> AddWideningUpper(Vector128<short> left, Vector128<short> right);
Vector128<short> AddWideningUpper(Vector128<short> left, Vector128<sbyte> right);
Vector128<int> AddWideningUpper(Vector128<int> left, Vector128<short> right);

Why can't we just use the second argument for the base type? That is what HW_Flag_BaseTypeFromSecondArg is for.

Oh nevermind, I missed that int = short + short and int = int + short need to be differentiated

Interesting - I renamed the gtIndexBaseType to gtOtherBaseType when I restructured the IR, because I figured there would be cases aside from gather where we'd need an additional base type - I didn't think it would be used that quickly :-)
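The disambiguation discussed in this thread can be sketched as follows. This is a minimal illustration under stated assumptions: the Elem and Ins enums and the selectAddWideningUpper helper are hypothetical names, and the real JIT stores the first operand's element type through SetOtherBaseType rather than passing it as a parameter.

```cpp
#include <cassert>

// Hypothetical stand-ins for the JIT's element types and instructions.
enum class Elem { SByte, Byte, Short, UShort, Int, UInt, Long, ULong };
enum class Ins { saddl2, uaddl2, saddw2, uaddw2 };

// Element size in bytes.
int sizeOf(Elem e) {
    switch (e) {
        case Elem::SByte: case Elem::Byte:   return 1;
        case Elem::Short: case Elem::UShort: return 2;
        case Elem::Int:   case Elem::UInt:   return 4;
        default:                             return 8;
    }
}

bool isUnsigned(Elem e) {
    return e == Elem::Byte || e == Elem::UShort || e == Elem::UInt || e == Elem::ULong;
}

// Picks the instruction for AddWideningUpper from both element types.
// The second operand alone is not enough: (Short, Short) and (Int, Short)
// share the same second element type but need different instructions.
Ins selectAddWideningUpper(Elem op1Elem, Elem op2Elem) {
    assert(sizeOf(op1Elem) == sizeOf(op2Elem) ||
           sizeOf(op1Elem) == 2 * sizeOf(op2Elem));
    const bool firstIsWide = sizeOf(op1Elem) != sizeOf(op2Elem);
    if (isUnsigned(op2Elem)) {
        return firstIsWide ? Ins::uaddw2 : Ins::uaddl2;
    }
    return firstIsWide ? Ins::saddw2 : Ins::saddl2;
}
```

With this shape, int = short + short selects the "long" form and int = int + short selects the "wide" form, which is the distinction that made the extra base type necessary.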

tannergooding · 2020-04-30T16:06:13Z

src/coreclr/src/jit/hwintrinsiccodegenarm64.cpp

+                assert(varTypeIsIntegral(intrin.baseType));
+                if (intrin.op1->TypeGet() == TYP_SIMD8)
+                {
+                    ins = varTypeIsUnsigned(intrin.baseType) ? INS_uaddl : INS_saddl;


We should be able to get rid of this logic once we change the tables to take simdSize into account during lookup, correct?

We can, if we make the simdSize and baseType of an intrinsic independent of each other, i.e. if in addition to HW_Flag_BaseTypeFromFirstArg and HW_Flag_BaseTypeFromSecondArg we had HW_Flag_SimdSizeFromFirstArg and HW_Flag_SimdSizeFromSecondArg, which would allow, for example, setting simdSize from op1 and baseType from op2.

Then, the corresponding rows in hwintrinsiclistarm64.h could be updated as

HARDWARE_INTRINSIC(AdvSimd, AddWideningLower, 8, 2, {INS_saddl, INS_uaddl, … HW_Flag_SimdSizeFromFirstArg|HW_Flag_BaseTypeFromSecondArg)
HARDWARE_INTRINSIC(AdvSimd, AddWideningLower, 16, 2, {INS_saddw, INS_uaddw, … HW_Flag_SimdSizeFromFirstArg|HW_Flag_BaseTypeFromSecondArg)
HARDWARE_INTRINSIC(AdvSimd, AddWideningUpper, 8, 2, {INS_saddl2, INS_uaddl2, … HW_Flag_SimdSizeFromFirstArg|HW_Flag_BaseTypeFromSecondArg)
HARDWARE_INTRINSIC(AdvSimd, AddWideningUpper, 16, 2, {INS_saddw2, INS_uaddw2, … HW_Flag_SimdSizeFromFirstArg|HW_Flag_BaseTypeFromSecondArg)

In other words, if we allow the x and y coordinates of the hwintrinsiclistarm64.h table (where x is baseType and y is the tuple (intrinsicId, simdSize)) to be orthogonal, then the answer to your question is yes.
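As a rough sketch of the orthogonal lookup described above, the following models only the two AddWideningLower rows and only a signed/unsigned pair of instruction columns. The Row struct, kTable and lookup are hypothetical names for illustration; the real table is generated from HARDWARE_INTRINSIC macros and has one column per base type.

```cpp
#include <array>

// Hypothetical stand-ins; not the JIT's NamedIntrinsic or instruction enums.
enum class Intrinsic { AddWideningLower };
enum class Ins { invalid, saddl, uaddl, saddw, uaddw };

// One row per (intrinsic, simdSize) pair: the "y" coordinate of the table.
struct Row {
    Intrinsic id;
    int       simdSize;     // taken from the first operand
    Ins       signedIns;    // column chosen when the second operand is signed
    Ins       unsignedIns;  // column chosen when the second operand is unsigned
};

constexpr std::array<Row, 2> kTable{{
    {Intrinsic::AddWideningLower, 8, Ins::saddl, Ins::uaddl},
    {Intrinsic::AddWideningLower, 16, Ins::saddw, Ins::uaddw},
}};

// simdSize comes from op1 and the base type (here reduced to signedness)
// comes from op2, so the two coordinates are resolved independently.
Ins lookup(Intrinsic id, int simdSizeOfOp1, bool op2IsUnsigned) {
    for (const Row& row : kTable) {
        if (row.id == id && row.simdSize == simdSizeOfOp1) {
            return op2IsUnsigned ? row.unsignedIns : row.signedIns;
        }
    }
    return Ins::invalid;
}
```

With such a table, the size check in the codegen switch would move into the lookup itself, which is the simplification the question was asking about.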

echesakov · 2020-04-30T17:51:56Z

Fixed merge conflicts and rebased on top of aa81328

CarolEidt

LGTM with some non-blocking comments & questions.
I only lightly reviewed the test helper functions and related changes.

CarolEidt · 2020-04-30T18:19:10Z

src/coreclr/src/jit/hwintrinsic.cpp

+                if ((intrinsic == NI_AdvSimd_AddWideningUpper) || (intrinsic == NI_AdvSimd_SubtractWideningUpper))
+                {
+                    assert(varTypeIsSIMD(op1->TypeGet()));
+                    retNode->AsHWIntrinsic()->SetOtherBaseType(getBaseTypeOfSIMDType(argClass));


Interesting - I renamed the gtIndexBaseType to gtOtherBaseType when I restructured the IR, because I figured there would be cases aside from gather where we'd need an additional base type - I didn't think it would be used that quickly :-)

CarolEidt · 2020-04-30T18:25:42Z

src/coreclr/src/jit/hwintrinsiccodegenarm64.cpp

@@ -205,7 +205,11 @@ void CodeGen::genHWIntrinsic(GenTreeHWIntrinsic* node)
    emitAttr emitSize;
    insOpts  opt = INS_OPTS_NONE;

-    if ((intrin.category == HW_Category_SIMDScalar) || (intrin.category == HW_Category_Scalar))
+    if (intrin.category == HW_Category_SIMDScalar)


The changes here highlight the fact that there are many subtleties with regard to emit sizes - the "actual type" of the baseType, the "raw" (emitTypeSize) of the baseType, and the emitSize of the node itself. I think it would be worth some additional comments both here and in the cases where we use emitTypeSize(node) for moves.

Agree, I will follow up and add the comments

CarolEidt · 2020-04-30T18:31:28Z

src/coreclr/src/jit/hwintrinsiccodegenarm64.cpp

+            case NI_AdvSimd_AddWideningUpper:
+            case NI_AdvSimd_SubtractWideningLower:
+            case NI_AdvSimd_SubtractWideningUpper:
+                GetEmitter()->emitIns_R_R_R(ins, emitSize, targetReg, op1Reg, op2Reg, opt);


I believe this is identical to case NI_Crc32_ComputeCrc32: et al. Is there a reason you didn't combine it? Is the idea to keep them in order, and rely on the C++ compiler to de-duplicate the code? (I see there's some duplication already).

No, I just missed this. Thanks for spotting this.

I will follow up and de-duplicate this in a separate PR, where I also need to address another of your suggestions.

echesakov · 2020-05-01T02:39:24Z

Both runtime (Mono Product Build Android x86 debug) and runtime (Mono Product Build OSX x64 debug) succeeded on the second attempt - https://dev.azure.com/dnceng/public/_build/results?buildId=625278 - merging

echesakov added the arch-arm64, area-System.Runtime.Intrinsics and area-CodeGen-coreclr labels Apr 29, 2020

Dotnet-GitSync-Bot added the new-api-needs-documentation label Apr 29, 2020

echesakov force-pushed the Arm64-ASIMD-Widening-Narrowing-Saturating-Intrinsics branch from f5004a2 to 97e7d77 Compare April 30, 2020 02:46

tannergooding reviewed Apr 30, 2020

tannergooding approved these changes Apr 30, 2020

echesakov mentioned this pull request Apr 30, 2020

Add VectorTableList and TableVectorExtension intrinsics #35600

Merged

echesakov added 16 commits April 30, 2020 10:13

Add AddSaturateScalar in AdvSimd.Arm64 in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

03807cd

Add SubtractSaturateScalar in AdvSimd.Arm64 in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

67deb7e

Add AbsoluteDifferenceWideningLower in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

061e04e

Add AbsoluteDifferenceWideningLowerAndAdd in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

b3cc6e1

Add AbsoluteDifferenceWideningUpper in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

92abf74

Add AbsoluteDifferenceWideningUpperAndAdd in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

b6edc22

Add AddPairwiseWidening{Scalar} in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

cc22751

Add AddPairwiseWideningAndAdd{Scalar} in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

e6e9fab

Add AddSaturate{Scalar} in AdvSimd in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

9301bf3

Add AddWideningLower in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

9ee6714

Add AddWideningUpper in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

4d09a1b

Add FusedAddHalving in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

7ae7150

Add FusedAddRoundedHalving in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

61d6211

Add FusedSubtractHalving in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

0ac3bfb

Add MultiplyWideningLower in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

ecdfd1d

Add MultiplyWideningLowerAndAdd in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

8d24607

echesakov added 23 commits April 30, 2020 10:27

Add AddSaturateScalar and SubtractSaturateScalar in AdvSimd_Arm64 in hwintrinsiclistarm64.h

23efe6d

Add AddSaturate{Scalar} and SubtractSaturate{Scalar} in AdvSimd in hwintrinsiclistarm64.h

f212547

Add FusedAdd{Rounded}Halving and FusedSubtractHalving in AdvSimd in hwintrinsiclistarm64.h

7404c75

Add AddWideningLower and AddWideningUpper in hwintrinsiclistarm64.h

cf8fed3

Add SubtractWideningLower and SubtractWideningUpper in hwintrinsiclistarm64.h

2d19e56

Add MultiplyWideningLower and MultiplyWideningUpper in hwintrinsiclistarm64.h

f514ba2

Add AbsoluteDifferenceWideningLower and AbsoluteDifferenceWideningUpper in hwintrinsiclistarm64.h

380fbe8

Put HW_Flag_Commutative on AbsoluteDifference in hwintrinsiclistarm64.h

7cd8068

Add AbsoluteDifferenceWideningLowerAndAdd and AbsoluteDifferenceWideningUpperAndAdd in hwintrinsiclistarm64.h

b6b7d5e

Add MultiplyWideningLowerAndAdd and MultiplyWideningLowerAndSubtract in hwintrinsiclistarm64.h

2f759a2

Add MultiplyWideningUpperAndAdd and MultiplyWideningUpperAndSubtract in hwintrinsiclistarm64.h

0282090

Add AddPairwiseWidening{Scalar} in hwintrinsiclistarm64.h

93f5fe2

Add AddPairwiseWideningAndAdd{Scalar} in hwintrinsiclistarm64.h

db98a0f

Add AddReturningHighNarrowLower and AddReturningHighNarrowUpper in hwintrinsiclistarm64.h

9722ee8

Add AddReturningRoundedHighNarrowLower and AddReturningRoundedHighNarrowUpper in hwintrinsiclistarm64.h

38aeb95

Add SubtractReturningHighNarrowLower and SubtractReturningHighNarrowUpper in hwintrinsiclistarm64.h

f249aa2

Add SubtractReturningRoundedHighNarrowLower and SubtractReturningRoundedHighNarrowUpper in hwintrinsiclistarm64.h

f1363ff

emitSize for SIMDScalar should be emitTypeSize not emitActualTypeSize in hwintrinsiccodegenarm64.cpp

90bdd39

For RMW intrinsics "mov targetReg, op1Reg" should have size of return value in hwintrinsiccodegenarm64.cpp

86350e9

Implement AddWideningLower, AddWideningUpper, SubtractWideningLower and SubtractWideningUpper in hwintrinsiccodegenarm64.cpp

0447f9d

Pass element type of first operand in OtherBaseType for AddWideningUpper and SubtractWideningUpper in hwintrinsic.cpp

e3b0037

Update AdvSimd/ AdvSimd.Arm64/

e1a799e

Update System.Runtime.Intrinsics.Experimental.cs

417c253

echesakov force-pushed the Arm64-ASIMD-Widening-Narrowing-Saturating-Intrinsics branch from 97e7d77 to 417c253 Compare April 30, 2020 17:51

CarolEidt approved these changes Apr 30, 2020

echesakov merged commit a156293 into dotnet:master May 1, 2020

echesakov deleted the Arm64-ASIMD-Widening-Narrowing-Saturating-Intrinsics branch May 1, 2020 02:47

ghost locked as resolved and limited conversation to collaborators Dec 9, 2020
