-
Notifications
You must be signed in to change notification settings - Fork 12.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ARM] Add NEON support for ISD::ABDS/ABDU nodes. #94504
Conversation
@llvm/pr-subscribers-backend-arm Author: Simon Pilgrim (RKSimon) ChangesAs noted on #94466, NEON has ABDS/ABDU instructions but only handles them via intrinsics, plus some VABDL custom patterns. This patch flags basic ABDS/ABDU for neon types as legal. I'm not clear how ARM handles intrinsic -> ISD mapping, so I've left all the intrinsics handling in place so far at the moment. Ideally all the VABD/VABA/VABDL/VABAL handling should be moved to using the abds/abdu nodes - but am I on the right track? Fixes #94466 Full diff: https://github.com/llvm/llvm-project/pull/94504.diff 2 Files Affected:
diff --git a/llvm/lib/Target/ARM/ARMISelLowering.cpp b/llvm/lib/Target/ARM/ARMISelLowering.cpp
index 5212d2c620b75..0f4eafa8e5baf 100644
--- a/llvm/lib/Target/ARM/ARMISelLowering.cpp
+++ b/llvm/lib/Target/ARM/ARMISelLowering.cpp
@@ -205,9 +205,9 @@ void ARMTargetLowering::addTypeForNEON(MVT VT, MVT PromotedLdStVT) {
setOperationAction(ISD::SDIVREM, VT, Expand);
setOperationAction(ISD::UDIVREM, VT, Expand);
- if (!VT.isFloatingPoint() &&
- VT != MVT::v2i64 && VT != MVT::v1i64)
- for (auto Opcode : {ISD::ABS, ISD::SMIN, ISD::SMAX, ISD::UMIN, ISD::UMAX})
+ if (!VT.isFloatingPoint() && VT != MVT::v2i64 && VT != MVT::v1i64)
+ for (auto Opcode : {ISD::ABS, ISD::ABDS, ISD::ABDU, ISD::SMIN, ISD::SMAX,
+ ISD::UMIN, ISD::UMAX})
setOperationAction(Opcode, VT, Legal);
if (!VT.isFloatingPoint())
for (auto Opcode : {ISD::SADDSAT, ISD::UADDSAT, ISD::SSUBSAT, ISD::USUBSAT})
diff --git a/llvm/lib/Target/ARM/ARMInstrNEON.td b/llvm/lib/Target/ARM/ARMInstrNEON.td
index 21a5817252aea..f25706a267741 100644
--- a/llvm/lib/Target/ARM/ARMInstrNEON.td
+++ b/llvm/lib/Target/ARM/ARMInstrNEON.td
@@ -5655,6 +5655,34 @@ def VABDhq : N3VQInt<1, 0, 0b11, 0b1101, 0, N3RegFrm, IIC_VBINQ,
"vabd", "f16", v8f16, v8f16, int_arm_neon_vabds, 1>,
Requires<[HasNEON, HasFullFP16]>;
+let Predicates = [HasNEON] in {
+def : Pat<(v2i32 (abds (v2i32 DPR:$opA), (v2i32 DPR:$opB))),
+ (VABDsv2i32 DPR:$opA, DPR:$opB)>;
+def : Pat<(v4i16 (abds (v4i16 DPR:$opA), (v4i16 DPR:$opB))),
+ (VABDsv4i16 DPR:$opA, DPR:$opB)>;
+def : Pat<(v8i8 (abds (v8i8 DPR:$opA), (v8i8 DPR:$opB))),
+ (VABDsv8i8 DPR:$opA, DPR:$opB)>;
+def : Pat<(v4i32 (abds (v4i32 QPR:$opA), (v4i32 QPR:$opB))),
+ (VABDsv4i32 QPR:$opA, QPR:$opB)>;
+def : Pat<(v8i16 (abds (v8i16 QPR:$opA), (v8i16 QPR:$opB))),
+ (VABDsv8i16 QPR:$opA, QPR:$opB)>;
+def : Pat<(v16i8 (abds (v16i8 QPR:$opA), (v16i8 QPR:$opB))),
+ (VABDsv16i8 QPR:$opA, QPR:$opB)>;
+
+def : Pat<(v2i32 (abdu (v2i32 DPR:$opA), (v2i32 DPR:$opB))),
+ (VABDuv2i32 DPR:$opA, DPR:$opB)>;
+def : Pat<(v4i16 (abdu (v4i16 DPR:$opA), (v4i16 DPR:$opB))),
+ (VABDuv4i16 DPR:$opA, DPR:$opB)>;
+def : Pat<(v8i8 (abdu (v8i8 DPR:$opA), (v8i8 DPR:$opB))),
+ (VABDuv8i8 DPR:$opA, DPR:$opB)>;
+def : Pat<(v4i32 (abdu (v4i32 QPR:$opA), (v4i32 QPR:$opB))),
+ (VABDuv4i32 QPR:$opA, QPR:$opB)>;
+def : Pat<(v8i16 (abdu (v8i16 QPR:$opA), (v8i16 QPR:$opB))),
+ (VABDuv8i16 QPR:$opA, QPR:$opB)>;
+def : Pat<(v16i8 (abdu (v16i8 QPR:$opA), (v16i8 QPR:$opB))),
+ (VABDuv16i8 QPR:$opA, QPR:$opB)>;
+}
+
// VABDL : Vector Absolute Difference Long (Q = | D - D |)
defm VABDLs : N3VLIntExt_QHS<0,1,0b0111,0, IIC_VSUBi4Q,
"vabdl", "s", int_arm_neon_vabds, zext, 1>;
@@ -5662,9 +5690,9 @@ defm VABDLu : N3VLIntExt_QHS<1,1,0b0111,0, IIC_VSUBi4Q,
"vabdl", "u", int_arm_neon_vabdu, zext, 1>;
let Predicates = [HasNEON] in {
-def : Pat<(v8i16 (abs (sub (zext (v8i8 DPR:$opA)), (zext (v8i8 DPR:$opB))))),
+def : Pat<(v8i16 (zext (abdu (v8i8 DPR:$opA), (v8i8 DPR:$opB)))),
(VABDLuv8i16 DPR:$opA, DPR:$opB)>;
-def : Pat<(v4i32 (abs (sub (zext (v4i16 DPR:$opA)), (zext (v4i16 DPR:$opB))))),
+def : Pat<(v4i32 (zext (abdu (v4i16 DPR:$opA), (v4i16 DPR:$opB)))),
(VABDLuv4i32 DPR:$opA, DPR:$opB)>;
}
|
Yeah it sounds OK to me. I think I would expect the intrinsics to be converted to ISD::ABD nodes, as we do for AArch64. |
Cheers, I'll tidy this up and update all the patterns. |
✅ With the latest revision this PR passed the C/C++ code formatter. |
llvm/lib/Target/ARM/ARMInstrNEON.td
Outdated
(zext node:$in2)), (i32 $shift))>; | ||
|
||
let Predicates = [HasNEON] in { | ||
def : Pat<(xor (v2i64 (abd_shr (v2i32 DPR:$opA), (v2i32 DPR:$opB), 63)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this possible to keep, or is the pattern now different?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to avoid the need for it in this patch, otherwise it can wait until #92576
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
…nd hasOperation. Avoids the need to explicitly test both commuted variants and doesn't match custom lowering after legalization. Cleanup for #94504
Should be ready now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure lets do it. LGTM
Cheers - it turned into a bit of chicken-and-egg situation :) |
As noted on llvm#94466, NEON has ABDS/ABDU instructions but only handles them via intrinsics, plus some VABDL custom patterns. Fixes llvm#94466
As noted on #94466, NEON has ABDS/ABDU instructions but only handles them via intrinsics, plus some VABDL custom patterns. This patch flags basic ABDS/ABDU for neon types as legal.
Ideally all the VABD/VABA/VABDL/VABAL handling should be moved to using the abds/abdu nodes - but am I on the right track?
Fixes #94466