Use out pointer more eagerly (return value optimisation) #7298

huonw · 2013-06-22T14:29:13Z

The following pattern occurs reasonably often in the standard lib:

fn new() -> SomeStruct {
   let mut foo = SomeStruct { empty... };
   // ... initialise foo ...
   foo
}

which currently uses a memcpy at the end to copy the result into the out pointer, rather than just directly placing foo into it to begin with.

An example:

pub fn foo() -> [uint, .. 8] {
    [0, .. 8]
}
pub fn bar() -> [uint, .. 8] {
    let a = [0, .. 8];
    a
}

Compiling with -O -S --emit-llvm:

define void @_ZN3foo15_5274bbbd7427d03_00E([8 x i64]* nocapture, { i64, %tydesc*, i8*, i8*, i8 } addrspace(1)* nocapture) #1 {
static_allocas:
  %2 = bitcast [8 x i64]* %0 to i8*
  call void @llvm.memset.p0i8.i64(i8* %2, i8 0, i64 64, i32 8, i1 false)
  ret void
}

define void @_ZN3bar15_5274bbbd7427d03_00E([8 x i64]* nocapture, { i64, %tydesc*, i8*, i8*, i8 } addrspace(1)* nocapture) #1 {
static_allocas:
  %2 = alloca [8 x i64], align 8
  %3 = bitcast [8 x i64]* %2 to i8*
  call void @llvm.memset.p0i8.i64(i8* %3, i8 0, i64 64, i32 8, i1 false)
  %4 = bitcast [8 x i64]* %0 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %4, i8* %3, i64 64, i32 8, i1 false)
  ret void
}
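For readers who want to reproduce this with a current toolchain, here is a minimal sketch of the same two functions in present-day syntax. It assumes `[uint, .. 8]` maps to `[u64; 8]` on a 64-bit target. The `foo_lowered` and `bar_lowered` functions are hypothetical, hand-written illustrations of what the IR above does, with an explicit `out` parameter standing in for the hidden out pointer (`%0`). They are not what the compiler literally emits.

```rust
// Modern-syntax equivalents of the issue's `foo` and `bar`
// (assumption: `uint` corresponds to `u64` on a 64-bit target).
pub fn foo() -> [u64; 8] {
    [0; 8]
}

pub fn bar() -> [u64; 8] {
    let a = [0; 8];
    a
}

// Hypothetical hand-lowered form of `foo`: the caller's slot is written
// directly, which corresponds to the single memset in the first listing.
pub fn foo_lowered(out: &mut [u64; 8]) {
    *out = [0; 8];
}

// Hypothetical hand-lowered form of `bar`: a local is initialised first
// (the alloca plus memset), then copied into the caller's slot (the memcpy).
pub fn bar_lowered(out: &mut [u64; 8]) {
    let a = [0u64; 8];
    *out = a;
}
```

Saved as a library file, this can be inspected with something like `rustc -O --emit=llvm-ir --crate-type=lib` followed by the file name. The output of a current compiler will not match the 2013 listings above line for line.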


mstewartgallus · 2013-06-23T02:12:41Z

I thought this might be because the code was only compiled at optimization level 2, but even at optimization level 3 the output is still the same. By the way, what does optimization level 3 actually enable over level 2?

define void @_ZN3foo17_87e6ad83abde57f23_00E([8 x i64]* nocapture, { i64, %tydesc*, i8*, i8*, i8 } addrspace(1)* nocapture) #1 {
static_allocas:
   %2 = bitcast [8 x i64]* %0 to i8*
   call void @llvm.memset.p0i8.i64(i8* %2, i8 0, i64 64, i32 8, i1 false)
   ret void
}

define void @_ZN3bar17_87e6ad83abde57f23_00E([8 x i64]* nocapture, { i64, %tydesc*, i8*, i8*, i8 } addrspace(1)* nocapture) #1 {
static_allocas:
   %2 = alloca [8 x i64], align 8
   %3 = bitcast [8 x i64]* %2 to i8*
   call void @llvm.memset.p0i8.i64(i8* %3, i8 0, i64 64, i32 8, i1 false)
   %4 = bitcast [8 x i64]* %0 to i8*
   call void @llvm.memcpy.p0i8.p0i8.i64(i8* %4, i8* %3, i64 64, i32 8, i1 false)
   ret void
}

huonw · 2013-06-23T02:16:44Z

If I'm reading back::passes correctly, --opt-level=3 adds argpromotion and loop-vectorize.

This brings Rust in line with how `clang` handles return pointers.

Example:

pub fn bar() -> [uint, .. 8] {
    let a = [0, .. 8];
    a
}

Before:

; Function Attrs: nounwind uwtable
define void @_ZN3bar17ha4635c6f704bfa334v0.0E([8 x i64]* nocapture, { i64, %tydesc*, i8*, i8*, i8 }* nocapture readnone) #1 {
"function top level":
  %a = alloca [8 x i64], align 8
  %2 = bitcast [8 x i64]* %a to i8*
  call void @llvm.memset.p0i8.i64(i8* %2, i8 0, i64 64, i32 8, i1 false)
  %3 = bitcast [8 x i64]* %0 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %3, i8* %2, i64 64, i32 8, i1 false)
  ret void
}

After:

; Function Attrs: nounwind uwtable
define void @_ZN3bar17ha4635c6f704bfa334v0.0E([8 x i64]* noalias nocapture sret, { i64, %tydesc*, i8*, i8*, i8 }* nocapture readnone) #1 {
"function top level":
  %2 = bitcast [8 x i64]* %0 to i8*
  call void @llvm.memset.p0i8.i64(i8* %2, i8 0, i64 64, i32 8, i1 false)
  ret void
}

Closes #9072
Closes #7298

thestinger · 2014-08-16T07:12:11Z

This has regressed in LLVM after the issue was originally closed. Despite having both sret and dereferenceable, this does not optimize out.

luqmana · 2014-08-22T04:29:09Z

So it turns out that it's the lifetime markers that are stopping LLVM from optimizing this into just one memset. I have a patch for LLVM upstream: http://reviews.llvm.org/D5020

Update our LLVM snapshot to master (as of ~ Wed Oct 1 18:49:58 2014 +0000). Since my patches have landed upstream this fixes #13429 and #7298.

luqmana · 2014-10-05T17:54:30Z

Closed by #17776.


thestinger mentioned this issue Sep 10, 2013

add sret + noalias to the out pointer parameter #9100

Closed

thestinger closed this as completed in b2eb1c0 Sep 17, 2013

thestinger reopened this Aug 16, 2014

luqmana mentioned this issue Oct 4, 2014

Update LLVM. #17776

Merged

luqmana added the A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. label Oct 4, 2014

bors added a commit that referenced this issue Oct 5, 2014

auto merge of #17776 : luqmana/rust/ul, r=alexcrichton

3b8c528


luqmana closed this as completed Oct 5, 2014
