Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable mutable noalias for LLVM >= 12 #82834

Merged
merged 6 commits into from
Mar 22, 2021
Merged

Conversation

nikic
Copy link
Contributor

@nikic nikic commented Mar 6, 2021

Enable mutable noalias by default on LLVM 12, as previously known miscompiles have been resolved. Now it's time to find the next one ;)

@nikic
Copy link
Contributor Author

nikic commented Mar 6, 2021

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 6, 2021
@bors
Copy link
Contributor

bors commented Mar 6, 2021

⌛ Trying commit f8452a5159f665c77df589e44b37fa17fa508df0 with merge 3352d8ec2c800d702ecbbdc68f5502978773c4f3...

@bors
Copy link
Contributor

bors commented Mar 6, 2021

☀️ Try build successful - checks-actions
Build commit: 3352d8ec2c800d702ecbbdc68f5502978773c4f3 (3352d8ec2c800d702ecbbdc68f5502978773c4f3)

@rust-timer
Copy link
Collaborator

Queued 3352d8ec2c800d702ecbbdc68f5502978773c4f3 with parent 51748a8, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (3352d8ec2c800d702ecbbdc68f5502978773c4f3): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Mar 6, 2021
@bjorn3
Copy link
Member

bjorn3 commented Mar 6, 2021

When looking at incr-unchanged to exclude any slowdown in LLVM, there are many improvements of up to 1.4%, a few slowdowns of up to 0.6% and a regression of ctfe-stress of 1.8%.

When looking at all benchmarks, there are big losses of up to 3.3%. For the style-servo-opt full run for example LLVM_module_codegen took 28.5% longer.

@joshtriplett
Copy link
Member

I don't think we should expect this to be compile-time-neutral; LLVM does have to do more work. It'd be helpful to have runtime performance benchmarks to compare these to. Ultimately, we may end up trading compile-time for runtime here.

@bjorn3
Copy link
Member

bjorn3 commented Mar 8, 2021

It'd be helpful to have runtime performance benchmarks to compare these to.

I was looking at incr-unchanged for that. Those bail out before LLVM gets invoked.

@joshtriplett
Copy link
Member

@bjorn3 I meant "runtime performance" as in "performance of the compiled code". I'd expect the primary benefit of noalias to be better code generation, not faster code generation.

@panstromek
Copy link
Contributor

@joshtriplett I think that was the idea. If you enable noalias for the compiler, you expect the Rust part of it to get faster, so incr-unchanged can roughly tell you that.

@nikic nikic force-pushed the mutable-noalias branch 2 times, most recently from ce8fb3b to 73006de Compare March 18, 2021 21:59
@nikic nikic changed the title [perf] Try enabling mutable noalias again Enabling mutable noalias for LLVM >= 12 Mar 18, 2021
@nikic nikic force-pushed the mutable-noalias branch from 73006de to ee2eefe Compare March 18, 2021 22:19
@nikic
Copy link
Contributor Author

nikic commented Mar 18, 2021

I think this is ready for review now, r? @nagisa

I followed @RalfJung's suggestion to not apply noalias to !Unpin, which should avoid issues with self-referential structures in practice, though the more general T-lang problem remains open.

Copy link
Member

@nagisa nagisa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that when looking at the performance results you can open the -Zself-profile outputs to see where the regressions end up being. For example: old vs new.

There are two areas of comptime regressions here: one is in LLVM internals (which makes sense as LLVM is doing strictly more work optimizing things now) and the other in generating the IR (which also makes sense as we're generating more IR)

This LGTM to me overall. r=me with or without @RalfJung's comment addressed.

compiler/rustc_codegen_llvm/src/abi.rs Show resolved Hide resolved
@nikic
Copy link
Contributor Author

nikic commented Mar 19, 2021

It's probably worthwhile to check the perf results again with the full implementation. The previous perf run was with it unconditionally enabled, without the Unpin and version checks.

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 19, 2021
@bors
Copy link
Contributor

bors commented Mar 19, 2021

⌛ Trying commit ee2eefeecbc5a8af4c5e81a05bffdad00912f3f8 with merge 989a6de73b924cf690879f45fab0a3889c267b43...

@bors
Copy link
Contributor

bors commented Mar 19, 2021

☀️ Try build successful - checks-actions
Build commit: 989a6de73b924cf690879f45fab0a3889c267b43 (989a6de73b924cf690879f45fab0a3889c267b43)

@rust-timer
Copy link
Collaborator

Queued 989a6de73b924cf690879f45fab0a3889c267b43 with parent eb95ace, future comparison URL.

@bjorn3
Copy link
Member

bjorn3 commented Mar 22, 2021

Yes, noalias is enabled by default after this PR. The perf runs are on a rustc version that is built using a rustc version built using the same sources as this PR: The bootstrap rustc builds a rustc which itself builds a rustc. This last rustc is what is shipped to the user and what is used for perf runs.

@felix91gr
Copy link
Contributor

Ah, perfect. Thank you, @bjorn3 💛

@the8472
Copy link
Member

the8472 commented Mar 23, 2021

I don't think we should expect this to be compile-time-neutral; LLVM does have to do more work. It'd be helpful to have runtime performance benchmarks to compare these to. Ultimately, we may end up trading compile-time for runtime here.

Would tracking the produced binary size in perf.rlo make sense as a cheaper proxy measure instead doing full benchmarks? On the assumption that LLVM optimizing more results in things being optimized away and thus smaller code.

@memoryruins
Copy link
Contributor

tracking the produced binary size in perf.rlo

relevant issue rust-lang/rustc-perf#145

@felix91gr
Copy link
Contributor

I don't think we should expect this to be compile-time-neutral; LLVM does have to do more work. It'd be helpful to have runtime performance benchmarks to compare these to. Ultimately, we may end up trading compile-time for runtime here.

Would tracking the produced binary size in perf.rlo make sense as a cheaper proxy measure instead doing full benchmarks? On the assumption that LLVM optimizing more results in things being optimized away and thus smaller code.

I'm not 100% sure that that would always be the case. LLVM optimizes by default for runtime performance, and many times that would entail a larger binary size. Inlining and loop unrolling are very common examples of this. Monomorphization as well.

That said, maybe there's something related to the noalias optimizations that makes LLVM reduce the binary size. I think some testing (or asking around, although with how new the feature is, maybe not many know what it does to binary size in practice) is in order :3

@the8472
Copy link
Member

the8472 commented Mar 23, 2021

I'm not 100% sure that that would always be the case.

correlation < 1 is expected, which is why wrote "proxy measure". I just hope it's still big enough to be useful.

hawkw added a commit to hawkw/mycelium that referenced this pull request Mar 28, 2021
In light of rust-lang/rust#82834, we must ensure that the intrusive
linked list pointers never get mutable-noalias optimizations (see also
rust-lang/rust#63818). Adding a `PhantomPinned` to the `Links` struct
ensures it will always be `!Unpin`, disabling mutable-noalias.

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
hawkw added a commit to hawkw/mycelium that referenced this pull request Mar 28, 2021
In light of rust-lang/rust#82834, we must ensure that the intrusive
linked list pointers never get mutable-noalias optimizations (see also
rust-lang/rust#63818). Adding a `PhantomPinned` to the `Links` struct
ensures it will always be `!Unpin`, disabling mutable-noalias.

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
@jyn514 jyn514 modified the milestones: 1.53.0, 1.54.0 Aug 25, 2021
LeSeulArtichaut added a commit to LeSeulArtichaut/rust that referenced this pull request Aug 25, 2021
Add mutable-noalias to the release notes for 1.54

It was enabled in rust-lang#82834 and disabled in 1.53 by rust-lang#86036, but it was never disabled on (then) nightly, so it still landed in 1.54.

This was mentioned on rust-lang#86696 but never made it into the release notes.

r? `@XAMPPRocky` cc `@nikic`
github-actions bot pushed a commit to gnoliyil/fuchsia that referenced this pull request May 3, 2022
We haven't seen any regressions upstream since it was last enabled again
in March 2021: rust-lang/rust#82834.

This results in a negligible increase in binary size of 24-56 kiB,
depending on build configuration.

Runtime perf only changed on x64 builds. 46 test cases got faster and
18 test cases got slower, with a couple of significant regressions in
FIDL microbenchmarks. Overall the results look positive.

Fixed: 76297
Change-Id: Id4a2b643e30e748e8d200f9d88c54ecc0ea02b2c
Reviewed-on: https://fuchsia-review.googlesource.com/c/fuchsia/+/674307
Commit-Queue: Tyler Mandry <tmandry@google.com>
Reviewed-by: Dan Johnson <computerdruid@google.com>
@archshift
Copy link
Contributor

Is there any plan to enable emitting LLVM alias.scope / noalias metadata for references not originating in function arguments?

@bjorn3
Copy link
Member

bjorn3 commented Aug 30, 2022

I suspect that depends on deciding a final memory model for rust.

@archshift
Copy link
Contributor

I was under the impression that there are clear cases where we can guarantee no mutable aliasing around references?

@felix91gr
Copy link
Contributor

Is it plausible that you two aren't talking about the same thing?

I'm not entirely sure of what context @archshift is talking about, but I think there are indeed some contexts in which we can guarantee that.

Gui, do you think you could show us an example of what you mean?

I get the feeling that the context you're talking about is more specific than what Bjorn is talking about when they mean that the memory model is needed to know the answer.

@RalfJung
Copy link
Member

An old closed PR is definitely not the right place for such a discussion though. :) Please take this to Zulip, IRLO, or a new issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.