Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't use usub.with.overflow intrinsic #103299

Merged
merged 1 commit into from
Oct 30, 2022
Merged

Conversation

nikic
Copy link
Contributor

@nikic nikic commented Oct 20, 2022

The canonical form of a usub.with.overflow check in LLVM are separate sub + icmp instructions, rather than a usub.with.overflow intrinsic. Using usub.with.overflow will generally result in worse optimization potential.

The backend will attempt to form usub.with.overflow when it comes to actual instruction selection. This is not fully reliable, but I believe this is a better tradeoff than using the intrinsic in IR.

Fixes #103285.

The canonical form of a usub.with.overflow check in LLVM are
separate sub + icmp instructions, rather than a usub.with.overflow
intrinsic. Using usub.with.overflow will generally result in worse
optimization potential.

The backend will attempt to form usub.with.overflow when it comes
to actual instruction selection. This is not fully reliable, but
I believe this is a better tradeoff than using the intrinsic in
IR.

Fixes rust-lang#103285.
@rustbot rustbot added A-testsuite Area: The testsuite used to check the correctness of rustc T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Oct 20, 2022
@rust-highfive
Copy link
Collaborator

r? @wesleywiser

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Oct 20, 2022
@nikic
Copy link
Contributor Author

nikic commented Oct 20, 2022

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 20, 2022
@bors
Copy link
Contributor

bors commented Oct 20, 2022

⌛ Trying commit 7833012 with merge 9e5787ab4583e67d16150a014f6c118dfa47ab43...

@bors
Copy link
Contributor

bors commented Oct 20, 2022

☀️ Try build successful - checks-actions
Build commit: 9e5787ab4583e67d16150a014f6c118dfa47ab43 (9e5787ab4583e67d16150a014f6c118dfa47ab43)

@rust-timer
Copy link
Collaborator

Queued 9e5787ab4583e67d16150a014f6c118dfa47ab43 with parent 4b3b731, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (9e5787ab4583e67d16150a014f6c118dfa47ab43): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean1 range count2
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
2.5% [2.5%, 2.5%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) - - 0

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean1 range count2
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-2.3% [-2.3%, -2.3%] 1
All ❌✅ (primary) - - 0

Footnotes

  1. the arithmetic mean of the percent change 2

  2. number of relevant changes 2

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 20, 2022
Copy link
Member

@wesleywiser wesleywiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

@wesleywiser
Copy link
Member

@bors r+

@bors
Copy link
Contributor

bors commented Oct 21, 2022

📌 Commit 7833012 has been approved by wesleywiser

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 21, 2022
@bors
Copy link
Contributor

bors commented Oct 30, 2022

⌛ Testing commit 7833012 with merge f42b6fa...

@bors
Copy link
Contributor

bors commented Oct 30, 2022

☀️ Test successful - checks-actions
Approved by: wesleywiser
Pushing f42b6fa to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Oct 30, 2022
@bors bors merged commit f42b6fa into rust-lang:master Oct 30, 2022
@rustbot rustbot added this to the 1.67.0 milestone Oct 30, 2022
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (f42b6fa): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.3% [1.2%, 1.4%] 2
Regressions ❌
(secondary)
3.6% [3.2%, 4.1%] 6
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 1.3% [1.2%, 1.4%] 2

Max RSS (memory usage)

This benchmark run did not return any relevant results for this metric.

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-3.0% [-3.2%, -2.8%] 2
All ❌✅ (primary) - - 0

@rustbot rustbot added the perf-regression Performance regression. label Oct 30, 2022
@nnethercote
Copy link
Contributor

This is benchmark noise.

@rustbot label: +perf-regression-triaged

@rustbot rustbot added the perf-regression-triaged The performance regression has been triaged. label Oct 30, 2022
Aaron1011 pushed a commit to Aaron1011/rust that referenced this pull request Jan 6, 2023
Don't use usub.with.overflow intrinsic

The canonical form of a usub.with.overflow check in LLVM are separate sub + icmp instructions, rather than a usub.with.overflow intrinsic. Using usub.with.overflow will generally result in worse optimization potential.

The backend will attempt to form usub.with.overflow when it comes to actual instruction selection. This is not fully reliable, but I believe this is a better tradeoff than using the intrinsic in IR.

Fixes rust-lang#103285.
bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 19, 2024
Make `checked` ops emit *unchecked* LLVM operations where feasible

For things with easily pre-checked overflow conditions -- shifts and unsigned subtraction -- write then checked methods in such a way that we stop emitting wrapping versions of them.

For example, today <https://rust.godbolt.org/z/qM9YK8Txb> neither
```rust
a.checked_sub(b).unwrap()
```
nor
```rust
a.checked_sub(b).unwrap_unchecked()
```
actually optimizes to `sub nuw`.  After this PR they do.

cc rust-lang#103299
bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 20, 2024
…bilee

Make `checked` ops emit *unchecked* LLVM operations where feasible

For things with easily pre-checked overflow conditions -- shifts and unsigned subtraction -- write the checked methods in such a way that we stop emitting wrapping versions of them.

For example, today <https://rust.godbolt.org/z/qM9YK8Txb> neither
```rust
a.checked_sub(b).unwrap()
```
nor
```rust
a.checked_sub(b).unwrap_unchecked()
```
actually optimizes to `sub nuw`.  After this PR they do.

cc rust-lang#103299
bors added a commit to rust-lang-ci/rust that referenced this pull request May 12, 2024
Invert comparison in `uN::checked_sub`

After rust-lang#124114, LLVM no longer combines the comparison and subtraction in `uN::checked_sub` when either operand is a constant (demo: https://rust.godbolt.org/z/MaeoYbsP1). The difference is more pronounced when the expression is slightly more complex (https://rust.godbolt.org/z/4rPavsYdc).

This is due to the use of `>=` here:

https://github.com/rust-lang/rust/blob/ee97564e3a9f9ac8c65103abb37c6aa48d95bfa2/library/core/src/num/uint_macros.rs#L581-L593

For constant `C`, LLVM eagerly converts `a >= C` into `a > C - 1`, but the backend can only combine `a < C` with `a - C`, not `C - 1 < a` and `a - C`: https://github.com/llvm/llvm-project/blob/e586556e375fc5c4f7e76b5c299cb981f2016108/llvm/lib/CodeGen/CodeGenPrepare.cpp#L1697-L1742

This PR[^1] simply inverts the `>=` into `<` to restore the LLVM magic, and somewhat align this with the implementation of `uN::overflowing_sub` from rust-lang#103299.

When the result is stored as an `Option` (rather than being branched/cmoved on), the discriminant is `self >= rhs`. This PR doesn't affect the codegen (and relevant tests) of that since LLVM will negate `self < rhs` to `self >= rhs` when necessary.

[^1]: Note to `self`: My very first contribution to publicly-used code. Hopefully like what I should learn to always be, tiny and humble.
fmease added a commit to fmease/rust that referenced this pull request May 15, 2024
Invert comparison in `uN::checked_sub`

After rust-lang#124114, LLVM no longer combines the comparison and subtraction in `uN::checked_sub` when either operand is a constant (demo: https://rust.godbolt.org/z/MaeoYbsP1). The difference is more pronounced when the expression is slightly more complex (https://rust.godbolt.org/z/4rPavsYdc).

This is due to the use of `>=` here:

https://github.com/rust-lang/rust/blob/ee97564e3a9f9ac8c65103abb37c6aa48d95bfa2/library/core/src/num/uint_macros.rs#L581-L593

For constant `C`, LLVM eagerly converts `a >= C` into `a > C - 1`, but the backend can only combine `a < C` with `a - C`, not `C - 1 < a` and `a - C`: https://github.com/llvm/llvm-project/blob/e586556e375fc5c4f7e76b5c299cb981f2016108/llvm/lib/CodeGen/CodeGenPrepare.cpp#L1697-L1742

This PR[^1] simply inverts the `>=` into `<` to restore the LLVM magic, and somewhat align this with the implementation of `uN::overflowing_sub` from rust-lang#103299.

When the result is stored as an `Option` (rather than being branched/cmoved on), the discriminant is `self >= rhs`. This PR doesn't affect the codegen (and relevant tests) of that since LLVM will negate `self < rhs` to `self >= rhs` when necessary.

[^1]: Note to `self`: My very first contribution to publicly-used code. Hopefully like what I should learn to always be, tiny and humble.
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request May 15, 2024
Rollup merge of rust-lang#125038 - ivan-shrimp:checked_sub, r=joboet

Invert comparison in `uN::checked_sub`

After rust-lang#124114, LLVM no longer combines the comparison and subtraction in `uN::checked_sub` when either operand is a constant (demo: https://rust.godbolt.org/z/MaeoYbsP1). The difference is more pronounced when the expression is slightly more complex (https://rust.godbolt.org/z/4rPavsYdc).

This is due to the use of `>=` here:

https://github.com/rust-lang/rust/blob/ee97564e3a9f9ac8c65103abb37c6aa48d95bfa2/library/core/src/num/uint_macros.rs#L581-L593

For constant `C`, LLVM eagerly converts `a >= C` into `a > C - 1`, but the backend can only combine `a < C` with `a - C`, not `C - 1 < a` and `a - C`: https://github.com/llvm/llvm-project/blob/e586556e375fc5c4f7e76b5c299cb981f2016108/llvm/lib/CodeGen/CodeGenPrepare.cpp#L1697-L1742

This PR[^1] simply inverts the `>=` into `<` to restore the LLVM magic, and somewhat align this with the implementation of `uN::overflowing_sub` from rust-lang#103299.

When the result is stored as an `Option` (rather than being branched/cmoved on), the discriminant is `self >= rhs`. This PR doesn't affect the codegen (and relevant tests) of that since LLVM will negate `self < rhs` to `self >= rhs` when necessary.

[^1]: Note to `self`: My very first contribution to publicly-used code. Hopefully like what I should learn to always be, tiny and humble.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-testsuite Area: The testsuite used to check the correctness of rustc merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Between opt-level=1 and opt-level=2, LLVM deoptimizes the output of ptr::addr, if it ptr::addr is #[inline]
7 participants