Optimization opportunity for constructor functions #49539
Comments
Also note that the pointer we give to

pub fn foo() -> Vec<Foo> {
    vec![Foo::new()]
}

Compiles to the following when inlined:

push r14
push rbx
sub rsp, 56
mov r14, rdi
lea rdx, [rsp + 8]
mov edi, 512
mov esi, 1
call __rust_alloc@PLT
mov rbx, rax
test rbx, rbx
je .LBB2_1
mov esi, 42
mov edx, 512
mov rdi, rbx
call memset@PLT
mov qword ptr [r14], rbx
mov qword ptr [r14 + 8], 1
mov qword ptr [r14 + 16], 1
mov rax, r14
add rsp, 56
pop rbx
pop r14
ret

And to the following when not inlined:

push r15
push r14
push rbx
sub rsp, 528
mov r14, rdi
lea rdx, [rsp + 16]
mov edi, 512
mov esi, 1
call __rust_alloc@PLT
mov rbx, rax
test rbx, rbx
je .LBB3_1
lea r15, [rsp + 16]
mov rdi, r15
call Foo::new
mov edx, 512
mov rdi, rbx
mov rsi, r15
call memcpy@PLT
mov qword ptr [r14], rbx
mov qword ptr [r14 + 8], 1
mov qword ptr [r14 + 16], 1
mov rax, r14
add rsp, 528
pop rbx
pop r14
pop r15
ret

While this could be:

push r14
push rbx
sub rsp, 56
mov r14, rdi
lea rdx, [rsp + 8]
mov edi, 512
mov esi, 1
call __rust_alloc@PLT
mov rbx, rax
test rbx, rbx
je .LBB2_1
mov rdi, rbx
call Foo::new
mov qword ptr [r14], rbx
mov qword ptr [r14 + 8], 1
mov qword ptr [r14 + 16], 1
mov rax, r14
add rsp, 56
pop rbx
pop r14
ret

Cc @rust-lang/wg-codegen

Isn't this essentially #13707 (comment) just for tuple struct constructors?

maybe?
Please note that looking at assembly can hide the reasons for the generated code. But generally, these sorts of things are nowadays present in the MIR, and later stages can't necessarily remove them because they lack the information to make correctness assumptions. The plan is to get something like #47954 into the compiler in the coming months.
This optimizes well on nightly: https://godbolt.org/z/sGza7c This is because LLVM 12 can perform call slot optimization with a GEP destination. |
Consider the following constructor:
(stupid newtype with large and stupid content to trigger a recognizable memset call)

Now, let's say we use the constructor in some way:
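A minimal sketch of such a constructor and one use of it. The 512-byte size and the fill value 42 are taken from the assembly quoted in the comments; the exact definitions, the use through `Box`, and the name `boxed` are assumptions for illustration.

```rust
// Newtype around a large array, so that constructing it shows up as an
// easily recognizable memset of 512 bytes.
pub struct Foo([u8; 512]);

impl Foo {
    pub fn new() -> Self {
        Foo([42; 512])
    }
}

// Hypothetical use of the constructor: move the freshly built value
// into a heap allocation.
pub fn boxed() -> Box<Foo> {
    Box::new(Foo::new())
}
```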
Typical Rust code has lots of constructs like this, in more elaborate forms.
The code above compiles to the following straightforward code:
Now, if for some reason the constructor is not inlined (and that can happen), here is what this becomes:
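One way to reproduce the non-inlined case is sketched here. It assumes the same 512-byte newtype as in the comments, and uses `#[inline(never)]` as a stand-in for the optimizer deciding not to inline. With the compilers discussed in this thread, the caller built the value in a stack temporary and then copied it into the allocation with `memcpy`.

```rust
pub struct Foo([u8; 512]);

impl Foo {
    // Assumption: forcing the constructor out of line mimics the case
    // where the optimizer chooses not to inline it.
    #[inline(never)]
    pub fn new() -> Self {
        Foo([42; 512])
    }
}

// Same use as in the comments: the value returned by `Foo::new` has to
// end up inside the vector's heap buffer.
pub fn foo() -> Vec<Foo> {
    vec![Foo::new()]
}
```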
I don't see a reason why this couldn't be the following instead:
avoiding a useless copy that inlining avoided.
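For comparison, the effect being asked for can be written out by hand. This is only a sketch: `new_in` and `foo_in_place` are hypothetical names, not APIs from this issue or from the standard library. The constructor receives the destination directly, which is what the desired code does when it passes the freshly allocated pointer to `Foo::new` as the return slot.

```rust
use std::mem::MaybeUninit;

pub struct Foo([u8; 512]);

impl Foo {
    // Hypothetical helper: initialise a `Foo` directly inside `dst`
    // instead of returning it by value.
    #[inline(never)]
    pub fn new_in(dst: &mut MaybeUninit<Foo>) {
        // SAFETY: `dst` is valid for writes of one `Foo` (512 bytes), and
        // any byte pattern is a valid `[u8; 512]`.
        unsafe { std::ptr::write_bytes(dst.as_mut_ptr(), 42, 1) };
    }
}

pub fn foo_in_place() -> Vec<Foo> {
    let mut v: Vec<Foo> = Vec::with_capacity(1);
    // The spare capacity is the uninitialised heap slot; fill it in place.
    Foo::new_in(&mut v.spare_capacity_mut()[0]);
    // SAFETY: element 0 was fully initialised by `new_in` above.
    unsafe { v.set_len(1) };
    v
}
```

The point of the issue is that the compiler could do this transformation itself for the ordinary `vec![Foo::new()]` form, without any `unsafe` in user code.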