[InstCombine] Missed optimization : fold `X > C2 ? X + C1 : C2 + C1` to `max(X, C2) + C1` #82414

XChy · 2024-02-20T20:27:27Z

Alive2 proof: https://alive2.llvm.org/ce/z/ERjNs4

Motivating example

define i32 @src(i32 %x) {
  %add = add nsw i32 %x, 16
  %cmp = icmp sgt i32 %x, 1008
  %s = select i1 %cmp, i32 %add, i32 1024
  ret i32 %s
}

can be folded to:

define i32 @tgt(i32 %x) {
  %smax = call i32 @llvm.smax.i32(i32 %x, i32 1008)
  %add = add nuw nsw i32 %smax, 16
  ret i32 %add
}

LLVM handles this when C1 or C2 is not constant, but misses the fold when both are constants. Although this example doesn't show better codegen, I think it's a better canonicalization.
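As an informal cross-check of the fold (not a substitute for the Alive2 proof), here is a minimal Python sketch that compares both shapes exhaustively at a small bit width. The helper names `to_signed`, `src`, `tgt`, and `fold_is_sound` are made up for illustration, and the model uses plain wrapping arithmetic, so it says nothing about the `nsw`/`nuw` flags or poison.

```python
def to_signed(value, bits=8):
    """Wrap an integer into the two's-complement range for `bits` bits."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value >> (bits - 1) else value


def src(x, c1, c2, bits=8):
    # Models: select (icmp sgt x, c2), (add x, c1), (c2 + c1)
    if x > c2:
        return to_signed(x + c1, bits)
    return to_signed(c2 + c1, bits)


def tgt(x, c1, c2, bits=8):
    # Models: add (smax x, c2), c1
    return to_signed(max(x, c2) + c1, bits)


def fold_is_sound(bits=4):
    """Compare src and tgt for every x, C1, C2 representable in `bits` bits."""
    lo, hi = -(1 << (bits - 1)), 1 << (bits - 1)
    return all(
        src(x, c1, c2, bits) == tgt(x, c1, c2, bits)
        for x in range(lo, hi)
        for c1 in range(lo, hi)
        for c2 in range(lo, hi)
    )
```

Calling `fold_is_sound(4)` walks all 4096 combinations of 4-bit values; the default width is kept small because the search grows cubically with the number of representable values.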

Real-world motivation

This snippet of IR is derived from protobuf/generated_message_tctable_lite.cc after the O3 pipeline (the original IR is from llvm-opt-benchmark).
The original IR is too big to attach here; please email me if you would like a copy.

Let me know if you can confirm that it's an optimization opportunity, thanks.

nikic · 2024-02-20T20:31:15Z

Does this produce better optimizations in the surrounding context? I'm not sure this transform is worthwhile.

XChy · 2024-02-20T20:35:57Z

> Does this produce better optimizations in the surrounding context? I'm not sure this transform is worthwhile.

No, just as this example shows. I don't insist on it, since it doesn't bring better codegen.

XChy · 2024-02-20T20:58:38Z

Detected a similar but more complex pattern just now: https://alive2.llvm.org/ce/z/pTzsqM
This one produces better optimizations in the surrounding context, see also: https://godbolt.org/z/9d5n7o1er

pinskia · 2024-02-20T22:19:49Z

> Does this produce better optimizations in the surrounding context? I'm not sure this transform is worthwhile.

On RISC-V, it would allow using the smax instruction even for the original simplified example, instead of the more complex code.

dtcxzyw · 2024-02-21T00:08:37Z

> > Does this produce better optimizations in the surrounding context? I'm not sure this transform is worthwhile.
>
> On RISC-V, it would allow using the smax instruction even for the original simplified example, instead of the more complex code.

It depends on the materialization cost for the immediates.

XChy · 2024-03-19T17:34:56Z

I found cases where this produces better optimizations in the surrounding context, mainly when the select is used multiple times.
An example: https://alive2.llvm.org/ce/z/7Hb43S, which eliminates an add instruction.

nikic · 2024-04-13T00:19:16Z

I believe rust-lang/rust#123845 is another motivating case for this. That one is smin with multi-use condition and one-use add.

veera-sivarajan · 2024-10-29T17:19:13Z

I'd like to work on this.

…C2) BOp C1` (llvm#116888)

Fixes llvm#82414.

General Proof: https://alive2.llvm.org/ce/z/ERjNs4
Proof for Tests: https://alive2.llvm.org/ce/z/K-934G

This PR transforms `select` instructions of the form `select (Cmp X C1) (BOp X C2) C3` to `BOp (min/max X C1) C2` iff `C3 == BOp C1 C2`. This helps in eliminating a noop loop in rust-lang/rust#123845 but does not improve optimizations.
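The condition in that description can be illustrated with a small Python sketch. This is only a model of the stated rule, not the InstCombine implementation: `try_fold`, `select_form`, and the two lookup tables are hypothetical names, only signed greater-than and less-than are covered, and Python's unbounded integers ignore wrapping and poison flags.

```python
import operator

# Hypothetical table: which min/max replaces each comparison predicate.
MINMAX_FOR_PRED = {"sgt": max, "slt": min}
PREDICATES = {"sgt": operator.gt, "slt": operator.lt}


def select_form(pred, bop, c1, c2, c3):
    """The original shape: select (Cmp X C1) (BOp X C2) C3."""
    compare = PREDICATES[pred]
    return lambda x: bop(x, c2) if compare(x, c1) else c3


def try_fold(pred, bop, c1, c2, c3):
    """Return BOp (min/max X C1) C2 as a function, or None if C3 != BOp C1 C2."""
    if pred not in MINMAX_FOR_PRED or bop(c1, c2) != c3:
        return None
    minmax = MINMAX_FOR_PRED[pred]
    return lambda x: bop(minmax(x, c1), c2)
```

With the constants from the motivating example, `try_fold("sgt", operator.add, 1008, 16, 1024)` returns a function that agrees with `select_form` on every input, while changing `c3` to any value other than 1024 makes `try_fold` return `None`.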

github-actions bot added the new issue label Feb 20, 2024

EugeneZelenko added llvm:instcombine missed-optimization and removed new issue labels Feb 20, 2024

nikic mentioned this issue Apr 13, 2024

Noop loop is only optimized away when the range is half-open rust-lang/rust#123845

Open

nikic assigned veera-sivarajan Nov 5, 2024

veera-sivarajan mentioned this issue Nov 19, 2024

[InstCombine] Fold X Pred C2 ? X BOp C1 : C2 BOp C1 to min/max(X, C2) BOp C1 #116888

Merged

nikic closed this as completed in #116888 Dec 2, 2024

nikic closed this as completed in 979a035 Dec 2, 2024

EugeneZelenko added the llvm:analysis label Dec 2, 2024
