Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[InstCombine] Fold mul (sext bool X), Y into select X, -Y, 0 #84792

Merged
merged 2 commits into from
Mar 15, 2024

Conversation

SahilPatidar
Copy link
Contributor

@SahilPatidar SahilPatidar commented Mar 11, 2024

@SahilPatidar
Copy link
Contributor Author

@dtcxzyw

@dtcxzyw dtcxzyw requested a review from goldsteinn March 11, 2024 16:43
@SahilPatidar
Copy link
Contributor Author

Hey, I've made the suggested changes. Could you please take a look and let me know if anything incorrect.

@dtcxzyw dtcxzyw changed the title [InstCombine] Optimize x * !x to 0 for vector #84608 [InstCombine] Fold mul (sext bool X), Y into select X, -Y, 0 Mar 13, 2024
@dtcxzyw dtcxzyw marked this pull request as ready for review March 13, 2024 06:55
@dtcxzyw dtcxzyw requested a review from nikic as a code owner March 13, 2024 06:55
@llvmbot
Copy link
Member

llvmbot commented Mar 13, 2024

@llvm/pr-subscribers-llvm-transforms

Author: None (SahilPatidar)

Changes

Alive2: https://alive2.llvm.org/ce/z/-XEPRP

Resolve #84608


Full diff: https://github.com/llvm/llvm-project/pull/84792.diff

3 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp (+14)
  • (modified) llvm/test/Transforms/InstCombine/mul.ll (+10)
  • (modified) llvm/test/Transforms/InstCombine/sub-xor-cmp.ll (+3-5)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp b/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
index 0bd4b6d1a835af..fd1b4dbc46ba50 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
@@ -448,6 +448,20 @@ Instruction *InstCombinerImpl::visitMul(BinaryOperator &I) {
   if (match(Op1, m_ZExt(m_Value(X))) && X->getType()->isIntOrIntVectorTy(1))
     return SelectInst::Create(X, Op0, ConstantInt::getNullValue(Ty));
 
+  // mul (sext X), Y -> select X, -Y, 0
+  // mul Y, (sext X) -> select X, -Y, 0
+  Value *SExtOp;
+  if (match(Op0, m_OneUse(m_SExt(m_Value(SExtOp)))) &&
+      SExtOp->getType()->isIntOrIntVectorTy(1)) {
+    return SelectInst::Create(SExtOp, Builder.CreateNeg(Op1),
+                              ConstantInt::getNullValue(Op1->getType()));
+  }
+  if (match(Op1, m_OneUse(m_SExt(m_Value(SExtOp)))) &&
+      SExtOp->getType()->isIntOrIntVectorTy(1)) {
+    return SelectInst::Create(SExtOp, Builder.CreateNeg(Op0),
+                              ConstantInt::getNullValue(Op0->getType()));
+  }
+
   Constant *ImmC;
   if (match(Op1, m_ImmConstant(ImmC))) {
     // (sext bool X) * C --> X ? -C : 0
diff --git a/llvm/test/Transforms/InstCombine/mul.ll b/llvm/test/Transforms/InstCombine/mul.ll
index e7141d7c25ad21..a7e55df21c1068 100644
--- a/llvm/test/Transforms/InstCombine/mul.ll
+++ b/llvm/test/Transforms/InstCombine/mul.ll
@@ -2049,3 +2049,13 @@ define i32 @zext_negpow2_use(i8 %x) {
   %r = mul i32 %zx, -16777216 ; -1 << 24
   ret i32 %r
 }
+
+define i32 @mul_sext_icmp_with_zero(i32 %x) {
+; CHECK-LABEL: @mul_sext_icmp_with_zero(
+; CHECK-NEXT:    ret i32 0
+;
+  %cmp = icmp eq i32 %x, 0
+  %sext = sext i1 %cmp to i32
+  %mul = mul i32 %sext, %x
+  ret i32 %mul
+}
diff --git a/llvm/test/Transforms/InstCombine/sub-xor-cmp.ll b/llvm/test/Transforms/InstCombine/sub-xor-cmp.ll
index 2e1ff0a21a3def..461c9b0fb1e0c0 100644
--- a/llvm/test/Transforms/InstCombine/sub-xor-cmp.ll
+++ b/llvm/test/Transforms/InstCombine/sub-xor-cmp.ll
@@ -117,11 +117,9 @@ define i64 @sext_diff_i1_xor_sub_1(i64 %a, i1 %b, i1 %c) {
 define i64 @sext_multi_uses(i64 %a, i1 %b, i64 %x) {
 ; CHECK-LABEL: define i64 @sext_multi_uses(
 ; CHECK-SAME: i64 [[A:%.*]], i1 [[B:%.*]], i64 [[X:%.*]]) {
-; CHECK-NEXT:    [[C:%.*]] = sext i1 [[B]] to i64
-; CHECK-NEXT:    [[TMP1:%.*]] = sub i64 0, [[A]]
-; CHECK-NEXT:    [[E:%.*]] = select i1 [[B]], i64 [[TMP1]], i64 [[A]]
-; CHECK-NEXT:    [[F:%.*]] = mul i64 [[C]], [[X]]
-; CHECK-NEXT:    [[R:%.*]] = add i64 [[F]], [[E]]
+; CHECK-NEXT:    [[TMP1:%.*]] = add i64 [[X]], [[A]]
+; CHECK-NEXT:    [[TMP2:%.*]] = sub i64 0, [[TMP1]]
+; CHECK-NEXT:    [[R:%.*]] = select i1 [[B]], i64 [[TMP2]], i64 [[A]]
 ; CHECK-NEXT:    ret i64 [[R]]
 ;
   %c = sext i1 %b to i64

Copy link

github-actions bot commented Mar 13, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@dtcxzyw
Copy link
Member

dtcxzyw commented Mar 14, 2024

Please fix the formatting.

dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Mar 14, 2024
Copy link
Member

@dtcxzyw dtcxzyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with some nits.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me.

Codegen looks a bit worse, at least for 8-bit: https://alive2.llvm.org/ce/z/5D7Ban Do you think an undo transform in the backend would be worthwhile?

@dtcxzyw
Copy link
Member

dtcxzyw commented Mar 14, 2024

Codegen looks a bit worse

mul typically takes ~3 cycles. Do you mean we should revert the transform to reduce code size?

@nikic
Copy link
Contributor

nikic commented Mar 14, 2024

Codegen looks a bit worse

mul typically takes ~3 cycles. Do you mean we should revert the transform to reduce code size?

Hm, good point. Then it's fine, at least for architectures with cmov.

@dtcxzyw dtcxzyw requested review from nikic and goldsteinn March 15, 2024 07:59
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dtcxzyw dtcxzyw merged commit e61e260 into llvm:main Mar 15, 2024
4 checks passed
@dtcxzyw dtcxzyw mentioned this pull request Mar 22, 2024
dtcxzyw added a commit that referenced this pull request Mar 26, 2024
This patch removes APIs that creating NUW neg. It is a trivial case
because `sub nuw 0, X` always gets simplified into zero.
I believe there is no optimization opportunities in the real-world
applications that we can take advantage of the nuw flag.

Motivated by
#84792 (comment).

Compile-time improvement:
https://llvm-compile-time-tracker.com/compare.php?from=d1f182c895728d89c5c3d198b133e212a5d9d4a3&to=da7b7478b7cbb32c09d760f6b8d0e67901e0d533&stat=instructions:u
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimize x * !x to 0 for vector
6 participants