Missed optimization with mir-opt-level 4 #100408

Open
leonardo-m opened this issue Aug 11, 2022 · 2 comments
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
C-bug Category: This is a bug.
I-slow Issue: Problems and improvements with respect to performance of generated code.
T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@leonardo-m

This code uses an immediately invoked closure:

use std::num::NonZeroI32;

pub fn foo(x: NonZeroI32) -> i32 {
	(|x: &NonZeroI32| 33 / x.get())(&x)
}

Compiled using:

rustc 1.65.0-nightly (f03ce3096 2022-08-08)
binary: rustc
commit-hash: f03ce30962cf1b2a5158667eabae8bf6e8d1cb03
commit-date: 2022-08-08
host: x86_64-unknown-linux-gnu
release: 1.65.0-nightly
LLVM version: 14.0.6

With arguments:

--edition 2021 -C opt-level=3 -Z mir-opt-level=3

Gives nice asm, with no division-by-zero check:

foo:
        mov     eax, 33
        xor     edx, edx
        idiv    edi
        ret

But raising mir-opt-level to 4:

--edition 2021 -C opt-level=3 -Z mir-opt-level=4

Gives asm similar to what you get without the closure at all, including a zero check and a panic path:

foo:
        test    edi, edi
        je      .LBB0_2
        mov     eax, 33
        xor     edx, edx
        idiv    edi
        ret
.LBB0_2:
        push    rax
        lea     rdi, [rip + str.0]
        lea     rdx, [rip + .L__unnamed_1]
        mov     esi, 25
        call    qword ptr [rip + core::panicking::panic@GOTPCREL]
        ud2
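
To make the comparison with the closure-free version concrete, here is a minimal sketch of the two forms side by side. `foo` is the function from the report; `foo_direct` is a hypothetical name for the variant without the closure and is not part of the original report. Both compute the same value, so any difference is only in the generated code.

```rust
use std::num::NonZeroI32;

// The function from the report: the division happens inside an
// immediately invoked closure that takes `x` by reference.
pub fn foo(x: NonZeroI32) -> i32 {
    (|x: &NonZeroI32| 33 / x.get())(&x)
}

// Hypothetical comparison variant (not in the report): the same
// division written directly, without the closure.
pub fn foo_direct(x: NonZeroI32) -> i32 {
    33 / x.get()
}
```

Compiling both with the flags listed above and diffing the output is one way to check whether a given toolchain still shows the difference.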

(Inspired by: https://old.reddit.com/r/rust/comments/wkw55b/help_matching_c_codegen_for_a_small_function/ijr7o5x/ ).

leonardo-m added the C-bug label on Aug 11, 2022
@Noratrieb
Member

I suspected that UnreachablePropagation being broken was the cause here, but the bad assembly is still emitted with my fix for that pass applied, so it's probably not it.

@tmiasko
Contributor

tmiasko commented Aug 12, 2022

Looks like SROA drops the load range metadata (similar to what happened with !nonnull in #37945).
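
If the non-zero range information is lost on the loaded value, one generic way to restate it in source is an explicit unreachable hint after the load. The sketch below is an assumption, not something proposed in this thread: `foo_hinted` is a hypothetical name, and whether it actually restores the short assembly has not been checked against the nightly from the report.

```rust
use std::hint::unreachable_unchecked;
use std::num::NonZeroI32;

// Hypothetical variant (not from the thread): read the value through the
// closure as before, then restate the "never zero" fact explicitly.
pub fn foo_hinted(x: NonZeroI32) -> i32 {
    let v = (|x: &NonZeroI32| x.get())(&x);
    if v == 0 {
        // SAFETY: `v` was read from a `NonZeroI32`, which can never hold 0.
        unsafe { unreachable_unchecked() }
    }
    33 / v
}
```

The hint only duplicates an invariant that `NonZeroI32` already guarantees, so the function's behavior is unchanged for every valid input.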

tmiasko added the A-LLVM and I-slow labels on Aug 12, 2022
Noratrieb added the T-compiler label on Apr 5, 2023