Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimization regression in 1.32+ #59352

Open
adrian17 opened this issue Mar 21, 2019 · 4 comments
Open

Optimization regression in 1.32+ #59352

adrian17 opened this issue Mar 21, 2019 · 4 comments
Labels
A-codegen Area: Code generation I-slow Issue: Problems and improvements with respect to performance of generated code. P-medium Medium priority regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@adrian17
Copy link

The following function:

pub fn num_to_digit(num: char) -> u32 {
    if num.is_digit(8) { num.to_digit(8).unwrap() } else { 0 }
}

With 1.31 produced the following:

  add edi, -48
  xor eax, eax
  cmp edi, 8
  cmovb eax, edi
  ret

Which looks reasonable, as unwrap() should never fail.
But in 1.32 and above it produces an unnecessary panic branch:

  push rax
  mov ecx, edi
  and ecx, -8
  xor eax, eax
  cmp ecx, 48
  jne .LBB0_3
  add edi, -48
  mov eax, edi
  cmp edi, 8
  jae .LBB0_2
.LBB0_3:
  pop rcx
  ret
.LBB0_2:
  lea rdi, [rip + .L__unnamed_1]
  call qword ptr [rip + _ZN4core9panicking5panic17h6f50c0de2dcd7974E@GOTPCREL]
  ud2

Interestingly, when I manually inline is_digit:

#[inline]
pub fn is_digit(self, radix: u32) -> bool {
self.to_digit(radix).is_some()
}

pub fn num_to_digit_fast1(num: char) -> u32 {
    if num.to_digit(8).is_some() { num.to_digit(8).unwrap() } else { 0 }
}

Or just use unwrap_or:

pub fn num_to_digit_fast2(num: char) -> u32 {
    num.to_digit(8).unwrap_or(0)
}

It produces good code on all compiler versions.

Godbolt instance I tested this on: https://godbolt.org/z/8yg7x0

@jonas-schievink jonas-schievink added I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. regression-from-stable-to-stable Performance or correctness regression from one stable version to another. labels Mar 22, 2019
@andjo403
Copy link
Contributor

have bisected this regression and the PR where the regression started is #55932

seems like the inliner are not able to inline the function any more after that PR

@spastorino spastorino added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Jun 17, 2020
@LeSeulArtichaut LeSeulArtichaut added the A-codegen Area: Code generation label Jun 17, 2020
@spastorino spastorino added P-medium Medium priority and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Jun 17, 2020
@mati865
Copy link
Contributor

mati865 commented Jun 17, 2020

Worth noting with -Zmir-opt-level=2 the regression is gone:

example::num_to_digit_fast:
  add edi, -48
  xor eax, eax
  cmp edi, 8
  cmovb eax, edi
  ret

example::num_to_digit_fast2:
  add edi, -48
  xor eax, eax
  cmp edi, 7
  cmovbe eax, edi
  ret

@bugadani
Copy link
Contributor

bugadani commented Oct 7, 2020

As far as I can see, this can be solved if mir inlining gets out of mir-opt-level 2.

bugadani added a commit to bugadani/rust that referenced this issue Jan 15, 2021
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue Jan 15, 2021
Add test for rust-lang#59352

Issue rust-lang#59352 reported an optimization regression with rustc 1.32.0+. That regression could be tracked to a change that caused a function to miss the size limit of llvm's inlining, which results in an unreachable panicing branch being generated.
Enabling mir inline solves the issue, but is currently only done for `mir-opt-level>=2`.

This PR adds a test that can serve as a regression test for rust-lang#59352, if/when mir inlining gets mature enough for opt-level 1, or some other optimization can remove the panic.
bors added a commit to rust-lang-ci/rust that referenced this issue Jan 16, 2021
…laumeGomez

Rollup of 6 pull requests

Successful merges:

 - rust-lang#77693 (Add test for rust-lang#59352)
 - rust-lang#80515 (Improve JS performance by storing length before comparing to it in loops)
 - rust-lang#81030 (Update mdbook)
 - rust-lang#81033 (Remove useless `clean::Variant` struct)
 - rust-lang#81049 (inline: Round word-size cost estimates up)
 - rust-lang#81054 (Drop a few unneeded borrows)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@adrian17
Copy link
Author

FYI, looks like this got partially fixed somewhere in 1.64 (no clue by which commit though):

example::num_to_digit:
  lea ecx, [rdi - 48]
  mov edx, edi
  and edx, -8
  xor eax, eax
  cmp edx, 48
  cmove eax, ecx
  ret

And 1.65 optimized it more, it looks mostly the same as before the regression now (only difference being add -> lea):

example::num_to_digit:
  lea ecx, [rdi - 48]
  xor eax, eax
  cmp ecx, 8
  cmovb eax, ecx
  ret

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-codegen Area: Code generation I-slow Issue: Problems and improvements with respect to performance of generated code. P-medium Medium priority regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

7 participants