I can't stop writing copy propagation passes #76723
Conversation
@bors try @rust-timer queue
Awaiting bors try build completion
⌛ Trying commit ce5fed7aeec86b6b49236cafdc2017f2edaebc77 with merge dd7c661df15bcb47908d8fe078481eea3ab62938...
☀️ Try build successful - checks-actions, checks-azure
Queued dd7c661df15bcb47908d8fe078481eea3ab62938 with parent 9b41541, future comparison URL.
Finished benchmarking try commit (dd7c661df15bcb47908d8fe078481eea3ab62938): comparison url. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up. @bors rollup=never
1684.1% increased time in MIR optimisations.
Are those compile times (or run times of benchmarks)?
Compile times of the benchmarks.
Oh, the clearing has
@bors try @rust-timer queue
Awaiting bors try build completion
⌛ Trying commit d976615e8bfe62e6a466046aaba45c17d672791f with merge 56e90b1cdbec60d65cf2a4d28e60ede43c1a1880...
☀️ Try build successful - checks-actions, checks-azure
Queued 56e90b1cdbec60d65cf2a4d28e60ede43c1a1880 with parent 07ece44, future comparison URL.
Finished benchmarking try commit (56e90b1cdbec60d65cf2a4d28e60ede43c1a1880): comparison url. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up. @bors rollup=never
Much better. The remaining regressions seem to be caused by codegen unit partitioning from what I can tell.
Do you have some run-time benchmarks too to see the effect of this pull?
@leonardo-m
The improvements are almost entirely in incremental builds and the CTFE stress test, so it looks like we're only benefiting from the reduction of MIR, not LLVM IR (less metadata to write/read = faster incremental builds; less MIR to interpret = faster CTFE). Which is still good, of course, but it won't benefit from-scratch builds.
☔ The latest upstream changes (presumably #76176) made this pull request unmergeable. Please resolve the merge conflicts. Note that reviewers usually do not review pull requests until merge conflicts are resolved! Once you resolve the conflicts, you should change the labels applied by bors to indicate that your PR is ready for review. Post this as a comment to change the labels:
r? @oli-obk
☔ The latest upstream changes (presumably #77373) made this pull request unmergeable. Please resolve the merge conflicts. Note that reviewers usually do not review pull requests until merge conflicts are resolved! Once you resolve the conflicts, you should change the labels applied by bors to indicate that your PR is ready for review. Post this as a comment to change the labels:
Auto-import made a bit of a mess here
Co-authored-by: LingMan <LingMan@users.noreply.github.com>
// _2 = _1;
// use(move _2); <- can replace with `use(_1)`
// Or:
// use(move _2, move _1); <- can replace with `use(_1, _1)`
Hmm, this doesn't sound right. This is only correct if the two places `_1` and `_1` are used at are allowed to overlap.
Are there cases where they are not?
From a codegen perspective it seemed correct to me when I was reviewing the changes, based on the fact that codegen for `Operand::Copy` introduces a copy when passing by ref, unlike `Operand::Move`, which does not.
Yeah, for example call destinations must not alias any call arguments (maybe this only applies to copied arguments, as you mentioned). I ran into this when writing the destination propagation pass.
Unfortunately these invariants haven't really been consciously designed or documented from what I know, but the MIR validator does try to enforce them (or what I think is a reasonable approximation of them).
It seems that this pass is still correct though – aliasing restrictions generally only apply when one of the places is being mutated, and when that happens we invalidate all affected locals.
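To make that invalidation rule concrete, here is a minimal, self-contained sketch (all names are invented for illustration; the real pass works on MIR places, not plain `usize` locals): a table of "copy" facts where mutating a local forgets every fact that mentions it, whether as source or destination.

```rust
use std::collections::HashMap;

/// Toy stand-in for a MIR local.
type Local = usize;

/// Tracks "`dest` currently holds a copy of `src`" facts within one block.
#[derive(Default)]
struct CopyTable {
    known: HashMap<Local, Local>, // dest -> src
}

impl CopyTable {
    /// Record the statement `_dest = _src`.
    fn record_copy(&mut self, dest: Local, src: Local) {
        // The write to `dest` kills any older fact about it.
        self.invalidate(dest);
        self.known.insert(dest, src);
    }

    /// A mutation of `local` invalidates every fact mentioning it,
    /// as either the source or the destination of a copy.
    fn invalidate(&mut self, local: Local) {
        self.known.retain(|&dest, &mut src| dest != local && src != local);
    }

    /// The place a use of `local` may safely be replaced with.
    fn resolve(&self, local: Local) -> Local {
        self.known.get(&local).copied().unwrap_or(local)
    }
}

fn main() {
    let mut table = CopyTable::default();
    table.record_copy(2, 1);         // _2 = _1
    assert_eq!(table.resolve(2), 1); // use(_2) may become use(_1)
    table.invalidate(1);             // _1 = ...  (mutation)
    assert_eq!(table.resolve(2), 2); // the fact is gone; keep using _2
    println!("ok");
}
```

Because only mutations invalidate facts, two read-only uses of the same source in one statement stay legal under this model, which matches the reasoning above.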
Extended the comment
There is also an implicit assumption about when the move invalidates the local. As far as I can see, this pass could make the following transformation, turning:
_2 = _1;
_3 = f(move _1, move _2);
into:
_2 = _1;
_3 = f(move _1, _1);
EDIT: Either way, it looks OK, just for far more subtle reasons.
But generally speaking the aliasing concerns do apply to the inputs alone in the move case. For example, running this optimization and then just before codegen turning copy operands into move operands results in the same miscompilation that affected copy propagation.
Oof, yeah, that doesn't sound like it should work, but maybe it currently does. Promotion of copies to moves is something we do want to do at some point, so that would have to agree with this pass.
if let Some(local) = place.as_local() {
    if let Some(known_place) = self.local_values.get(local) {
        debug!("{:?}: replacing use of {:?} with {:?}", location, place, known_place);
        *operand = Operand::Copy(known_place);
This "demotion" from move to copy could be a problem, since moves are treated quite differently and tend to optimize better. Not sure how to solve that without a pass that promotes them again (which needs dataflow).
we'll probably want such a pass anyway to invoke every now and then in the pipeline (and at the end of it).
If it's too expensive, we could do some caching similar to the predecessor cache.
+ (_4.0: &ViewportPercentageLength) = _1; // scope 0 at $DIR/early_otherwise_branch_68867.rs:22:15: 22:16
StorageLive(_4); // scope 0 at $DIR/early_otherwise_branch_68867.rs:22:14: 22:24
StorageLive(_5); // scope 0 at $DIR/early_otherwise_branch_68867.rs:22:15: 22:16
_5 = _1; // scope 0 at $DIR/early_otherwise_branch_68867.rs:22:15: 22:16
StorageLive(_6); // scope 0 at $DIR/early_otherwise_branch_68867.rs:22:18: 22:23
_6 = _2; // scope 0 at $DIR/early_otherwise_branch_68867.rs:22:18: 22:23
Hmm... I was wondering why these were left around, but then I realized `SimplifyBranches` happens before `SimplifyLocals`.
@@ -23,11 +24,14 @@ fn try_identity(_1: std::result::Result<u32, i32>) -> std::result::Result<u32, i
     }
 }
 scope 6 {
-    debug self => _0; // in scope 6 at $SRC_DIR/core/src/result.rs:LL:COL
+    debug self => _2; // in scope 6 at $SRC_DIR/core/src/result.rs:LL:COL
This seems like a regression in debugging. Once that local is optimized out, there will be no more debug info for `self`.
/// Stores the locals that need to be invalidated when this local is modified or deallocated.
#[derive(Default, Clone)]
struct Invalidation {
    locals: SmallVec<[Local; 4]>,
Please document this magic number (and maybe move to a constant?)
// *again* later on would be a use-after-move. For example:
//     _1 = ...;
//     _2 = move _1;
//     use(move _2); <- can *not* replace with `use(move _1)` or `use(_1)`
Is this expected to get changed in the future if `_2` is only ever used right here?
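The use-after-move restriction the comment above describes has a loose analogue in surface Rust ownership (an illustrative parallel, not the MIR semantics themselves): once a value has been moved out of a binding, the original binding can no longer be named, so an optimization must never reintroduce a use of the moved-from local.

```rust
fn main() {
    let s1 = String::from("hello"); // _1 = ...
    let s2 = s1;                    // _2 = move _1
    // println!("{}", s1);          // error[E0382]: use of moved value: `s1`
    println!("{}", s2);             // only the move destination is usable
}
```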
// (*) Some MIR statements and terminators may forbid aliasing between some of the places
// they access. This only happens between output and input places, never between multiple
// input places, so what this pass is safe: Any local that is mutated will invalidate all
Suggested change:
- // input places, so what this pass is safe: Any local that is mutated will invalidate all
+ // input places, so what this pass does is safe: Any local that is mutated will invalidate all
// Or:
// use(move _2, move _1); <- can replace with `use(_1, _1)`, if the inputs may alias (*)
if let Operand::Copy(place) | Operand::Move(place) = operand {
    if let Some(local) = place.as_local() {
we should be able to apply this to all places that satisfy `place_eligible`, right? Maybe not in this PR, but if nothing speaks against it, leave a FIXME, and if something does, please document it.
☔ The latest upstream changes (presumably #68965) made this pull request unmergeable. Please resolve the merge conflicts. Note that reviewers usually do not review pull requests until merge conflicts are resolved! Once you resolve the conflicts, you should change the labels applied by bors to indicate that your PR is ready for review. Post this as a comment to change the labels:
@jonas-schievink Could you address Oli's review? Thanks for working on these opts!
@jonas-schievink Ping from triage, closing this due to inactivity. Please feel free to reopen or create a new PR when you're ready to work on this again. Thanks!
This implements a simple, easy-to-understand intra-block copy propagation pass. It can clean up temporaries introduced by MIR building and reduces the amount of MIR we end up with, mostly improving the CTFE and incremental compilation benchmarks.
The pass triggers ~120000 times on the whole libstd.
I have not yet looked for improvements in the generated code, but given that this is the first attempt at a copy propagation pass that does not come with compile time regressions this should already be worth it on its own. I don't expect any improvements in generated code since LLVM is already capable of optimizing out redundant copies within basic blocks.
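For readers unfamiliar with the technique, here is a rough sketch of what an intra-block copy propagation pass does, over a deliberately tiny invented IR (these types and names are illustration only; the real pass walks MIR statements and additionally handles moves, projections, and storage markers):

```rust
use std::collections::HashMap;

type Local = usize;

/// Toy statements: just enough to show propagation and invalidation.
#[derive(Debug, PartialEq)]
enum Stmt {
    Copy { dest: Local, src: Local }, // _dest = _src
    Use { arg: Local },               // use(_arg)
    Mutate { dest: Local },           // _dest = <something opaque>
}

/// Rewrites `use(_2)` into `use(_1)` after seeing `_2 = _1`, forgetting
/// facts about any local that is subsequently written.
fn propagate_copies(block: &mut [Stmt]) {
    // Maps a local to the older local it is known to be a copy of.
    let mut known: HashMap<Local, Local> = HashMap::new();
    for stmt in block.iter_mut() {
        match stmt {
            Stmt::Copy { dest, src } => {
                // Resolve chains like `_2 = _1; _3 = _2` down to `_1`.
                let resolved = known.get(src).copied().unwrap_or(*src);
                let d = *dest;
                // The write to `dest` invalidates every fact mentioning it.
                known.retain(|&k, &mut v| k != d && v != d);
                known.insert(d, resolved);
            }
            Stmt::Use { arg } => {
                if let Some(&src) = known.get(arg) {
                    *arg = src; // use the original instead of the copy
                }
            }
            Stmt::Mutate { dest } => {
                let d = *dest;
                known.retain(|&k, &mut v| k != d && v != d);
            }
        }
    }
}

fn main() {
    let mut block = vec![
        Stmt::Copy { dest: 2, src: 1 },
        Stmt::Use { arg: 2 },     // becomes use(_1)
        Stmt::Mutate { dest: 1 }, // _1 changes: forget `_2 = _1`
        Stmt::Use { arg: 2 },     // stays use(_2)
    ];
    propagate_copies(&mut block);
    assert_eq!(block[1], Stmt::Use { arg: 1 });
    assert_eq!(block[3], Stmt::Use { arg: 2 });
    println!("ok");
}
```

Keeping the analysis within a single basic block is what makes this cheap: the state is reset at block boundaries, so no fixpoint dataflow is needed.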