The extend used for the pointer arithmetic creates a double-wide index vector in the header, which later needs to be unzipped and spliced to recover the <8 x i32> vector in the loop.
But if we sink that extend into the loop, then ISel will pick up the index vector as <8 x i32> and do the right thing. E.g.:
I prototyped this with the following patch, but obviously it is a big hammer since it is sinking all extends and not just those that feed masked gather/scatter intrinsics. I'm not sure where this selective sinking should be done, so I'll turn it over to the experts at Arm for a real solution.
index dd431cc6f4f5..bfe1b9b60c1b 100644
--- a/llvm/lib/CodeGen/CodeGenPrepare.cpp
+++ b/llvm/lib/CodeGen/CodeGenPrepare.cpp
@@ -8058,9 +8058,12 @@ bool CodeGenPrepare::optimizeInst(Instruction *I, ModifyDT &ModifiedDT) {
     if (isa<ZExtInst>(I) || isa<SExtInst>(I)) {
       /// Sink a zext or sext into its user blocks if the target type doesn't
       /// fit in one register
-      if (TLI->getTypeAction(CI->getContext(),
-                             TLI->getValueType(*DL, CI->getType())) ==
-          TargetLowering::TypeExpandInteger) {
+      EVT VT = TLI->getValueType(*DL, CI->getType());
+      TargetLowering::LegalizeTypeAction Action =
+          TLI->getTypeAction(CI->getContext(), VT);
+
+      if (Action == TargetLowering::TypeExpandInteger ||
+          Action == TargetLowering::TypeSplitVector) {
         return SinkCast(CI);
       } else {
         if (TLI->optimizeExtendOrTruncateConversion(
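For discussion, here is a rough sketch of what a more selective rule than the patch above might look like: keep the existing behavior for expanded integers, but sink a split-vector extend only when it feeds a masked gather/scatter. This is a toy model, not LLVM code. `TypeAction`, `UserKind`, `ExtendInfo`, `feedsGatherScatter`, and `shouldSinkExtend` are all hypothetical names made up for illustration, and the sketch assumes the users of the extend have already been collected somehow.

```cpp
#include <vector>

// Toy stand-ins, NOT the LLVM API: every name here is hypothetical.
enum class TypeAction { Legal, ExpandInteger, SplitVector };
enum class UserKind { Other, MaskedGather, MaskedScatter };

struct ExtendInfo {
  TypeAction action;           // how the extended result type would be legalized
  std::vector<UserKind> users; // assumed: users reached through the address computation
};

// True if any recorded user is a masked gather or masked scatter.
bool feedsGatherScatter(const ExtendInfo &ext) {
  for (UserKind k : ext.users)
    if (k == UserKind::MaskedGather || k == UserKind::MaskedScatter)
      return true;
  return false;
}

// Existing rule: always sink when the integer type gets expanded.
// Selective addition: sink a split-vector extend only if it feeds a
// masked gather/scatter, rather than sinking every such extend.
bool shouldSinkExtend(const ExtendInfo &ext) {
  if (ext.action == TypeAction::ExpandInteger)
    return true;
  return ext.action == TypeAction::SplitVector && feedsGatherScatter(ext);
}
```

Where such a check would actually live (CodeGenPrepare or a target hook) is still the open question raised above.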
Masked gather/scatter intrinsics could generate better code if the extend that produces the index vector is sunk into the gather/scatter's block.
For example:
This currently generates: