Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue for Non-Lexical Constant Propagation #80038

Closed
4 tasks
felix91gr opened this issue Dec 14, 2020 · 8 comments · Fixed by #101168
Closed
4 tasks

Tracking Issue for Non-Lexical Constant Propagation #80038

felix91gr opened this issue Dec 14, 2020 · 8 comments · Fixed by #101168
Labels
A-const-eval Area: Constant evaluation (MIR interpretation) A-mir-opt Area: MIR optimizations C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@felix91gr
Copy link
Contributor

felix91gr commented Dec 14, 2020

This is a tracking issue for the work I'm doing together with @oli-obk, regarding the upgrade of the MIR const-prop pass into a dataflow-analysis-enhanced Constant Propagation pass. This is part of my undergrad thesis, so as long as I am able to work on it, I should have at least the equivalent of part-time availability for the development of the feature.

The feature gate for this issue is #![feature(non_lexical_constprop)]
(feature gate is not in-tree yet)

What this feature means

Currently, there is a MIR optimization pass that does what is commonly know as Constant Folding. In this post however, I'll refer to it as Constant Propagation, since that is the name that we've given it in the mir-opts directory and documentation 🙂

What the Constant Propagation pass currently does is very limited in scope: it has no awareness of conditional execution (the true and false branches of an if block, for example) and therefore, it limits the propagation range of values known-to-be-constant to just the block in which these values reside.

In this work we will do our best to bring CP up to at least the standard constant propagation passes present in the literature: where the compiler is smart enough to understand control flow, and can conditionally propagate values into control-flow-dependent blocks. For this, the tool we expect to use is the MIR dataflow analysis framework, since CP is a problem known to be well-suited for dataflow analysis.

There are two end goals for this feature so far:

  1. To bring faster compilation times via the rustc frontend giving IR of greater quality to LLVM.
  2. To make certain kinds of static analyses possible, that so far require ad-hoc machinery to even be possible, like this Clippy lint that tries to catch whether or not the user is transmuting a null pointer at compile time. By having better compile-time information on the value of variables, we expect many analyses like this one to be more viable to realize.

Steps

(Subject to further breakup into smaller steps)

  • Implement CP based on the MIR dataflow framework that's currently available in rustc
  • Test the implementation thoroughly (using a feature flag).
  • Adjust relevant documentation (CC @oli-obk is there docs on MIR opts? I think we've discussed this before. Maybe it doesn't make sense to document the optimization per se until we've really found an end to the road of available const-prop enhancements, since CP will definitely get upgraded until that happens).
  • Stabilization PR (see instructions on rustc-dev-guide)

Unresolved Questions

None so far. Please ask away, I'm sure there's something I've missed 🙂

Implementation history

There's no implementation yet, although I can look up the recent changes to the const-prop pass and outline them here.

Closing note

This should probably be called Non-Lexical ConstProp, since Lexical boundaries are precisely as far as ConstProp can reach right now, and dataflow-aware ConstProp will go beyond, just like the Lifetime analysis did ^^

@felix91gr felix91gr added the C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. label Dec 14, 2020
@felix91gr
Copy link
Contributor Author

Here's an addendum for questions that might occur to the readers:

What about Value-Range Analysis?

Value-Range Analysis is something that the MIR-opts do not have at all yet. It's pretty clear that VRA can be treated like a generalization of Constant Propagation, since a Constant of value a can be seen as a range restriction of [x >= a, x <= a]. Therefore if implemented well, it could be used as the tool for this issue's feature as well.

However, its complexity is much greater than simple CP, and its use-cases go beyond the CP use cases as well.

  • The complexity makes it a larger and riskier bet, because we could find ourselves having nothing of use for a long time, maybe even closing the PRs and abandoning the feature because it became larger than what we can handle.
  • Its use-cases go beyond CP, since VRA can be used to eliminate large dead branches of code and optimize many more situations than just the ones that depend on the exact value of certain variables.
    This makes it harder to plan for: what are the test cases? What is the scope? Have we exhausted this feature yet?

The writer of this issue (me, Félix) in particular feels that they cannot really attack it head on just yet.

Ideally, one would implement just VRA and piggyback on it for CP. However, given the overall risk of going directly for VRA, the author prefers to first conquer CP and then go for VRA after getting a better understanding of the whole.

@jonas-schievink jonas-schievink added A-const-eval Area: Constant evaluation (MIR interpretation) A-mir-opt Area: MIR optimizations labels Dec 14, 2020
@jyn514 jyn514 added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Dec 14, 2020
@leonardo-m
Copy link

Value-Range Analysis (and slice-length analysis) is important for lot of things, but I agree it's a better idea to think about CP first.

@oli-obk
Copy link
Contributor

oli-obk commented Dec 15, 2020

cc @rust-lang/wg-mir-opt

using a feature flag, I think?

Ideally we'd use the new (closed PR #77665, feel free to reopen :D) optimization selection framework. This allows us fine grained control over which optimization passes to run via the command line.

is there docs on MIR opts?

There may be a little bit in the rustc guide, but I think keeping optimization specific documentation close to the optimization impl is probably for the best. This way it'll show up in the rustc internal rustdoc generated docs. Though I'll happily concede that point to someone who put more thought into this than I have.

@wesleywiser
Copy link
Member

This sounds great and I'm really excited to see some work on this!

At least in terms of goal 1 (reducing compilation times), it would be good to keep in mind the guidelines about -Zmir-opt-level. It may be necessary to enable or disable some parts of the new pass based on the value of that flag.

I would also recommend starting the new pass from scratch and not attempting to modify the existing one. Since the current one is enabled by default, bugs introduced there can easily result in unsound behavior in the program being compiled. I think it will also make it easier at the end of this project as we can simply delete the old pass and enable the new one.

@felix91gr
Copy link
Contributor Author

I would also recommend starting the new pass from scratch and not attempting to modify the existing one. Since the current one is enabled by default, bugs introduced there can easily result in unsound behavior in the program being compiled. I think it will also make it easier at the end of this project as we can simply delete the old pass and enable the new one.

Indeed, I think I completely agree with both points 🙂

@leonardo-m
Copy link

An example where Value-Range Analysis could help:
#80075

@felix91gr felix91gr changed the title Tracking Issue for Control-Flow-Aware Constant Propagation Tracking Issue for Non-Lexical Constant Propagation Dec 21, 2020
@felix91gr
Copy link
Contributor Author

Another space where VRA can help: #81237

@felix91gr
Copy link
Contributor Author

Small update:

On me.

It's been a hectic month here. I got an injury in my right elbow and that's limiting my ability to use the right arm in general. Also one of our puppers had to have an urgent, very dangerous surgery and that also took away basically all of my attention and energy. Currently I'm working on getting my arm in better shape, and struggling a bit with MH stuff.

On this work.

I've been studying the fundamental theory behind dataflow analysis, because things are currently such that I have no one to bother with my questions about it. So I'm trying to become an expert in the stuff.
During my work on this feature, I'll also fix some issues there are in the documentation, namely that the Convergence Requisites are not sufficient for all cases. Below, a more technical explanation of what I've elucidated so far:

True requirements for a dataflow analysis' semilattice to reach fixpoint

We find the Maximum Fixed Point in our analysis. We work with the join operation, starting with the bottom value and looking for the MFP from there. This MFP can be computed, only if:

  1. The semilattice satisfies the Ascending Chain Condition. That is, all strictly ascending chains (sequences of elements of the form a1 < a2 < a3 < a4 ...) are finite.
  2. The flow functions used are all Monotonic. That is, if f is a flow function in our analysis, and a and b are elements in the semilattice such that a <= b, then we know that f(a) <= f(b) holds as well.

Currently the requisites are weaker, and there are examples of semilattices which satisfy the conditions in our docs which actually don't have a computable, or even an existing MFP.


Sorry if anyone's been trying to find me in Zulip or expected me to get this done quicker. Things are just very hard at the moment. Much love to y'all ❤️

@bors bors closed this as completed in 357f660 Nov 15, 2022
Aaron1011 pushed a commit to Aaron1011/rust that referenced this issue Jan 6, 2023
Add new MIR constant propagation based on dataflow analysis

The current constant propagation in `rustc_mir_transform/src/const_prop.rs` fails to handle many cases that would be expected from a constant propagation optimization. For example:
```rust
let x = if true { 0 } else { 0 };
```
This pull request adds a new constant propagation MIR optimization pass based on the existing dataflow analysis framework. Since most of the analysis is not unique to constant propagation, a generic framework has been extracted. It works on top of the existing framework and could be reused for other optimzations.

Closes rust-lang#80038. Closes rust-lang#81605.

## Todo
### Essential
- [x] [Writes to inactive enum variants](rust-lang#101168 (review)). Resolved by rejecting the registration of places with downcast projections for now. Could be improved by flooding other variants if mutable access to a variant is observed.
- [X] Handle [`StatementKind::CopyNonOverlapping`](rust-lang#101168 (comment)). Resolved by flooding the destination.
- [x] Handle `UnsafeCell` / `!Freeze` correctly.
- [X] Overflow propagation of `CheckedBinaryOp`: Decided to not propagate if overflow flag is `true` (`false` will still be propagated)
- [x] More documentation in general.
- [x] Arguments for correctness, documentation of necessary assumptions.
- [x] Better performance, or alternatively, require `-Zmir-opt-level=3` for now.

### Extra
- [x]  Add explicit unreachability, i.e. upgrading the lattice from $\mathbb{P} \to \mathbb{V}$ to $\set{\bot} \cup (\mathbb{P} \to \mathbb{V})$.
- [x] Use storage statements to improve precision.
- [ ] Consider opening issue for duplicate diagnostics: rust-lang#101168 (comment)
- [ ] Flood moved-from places with $\bot$ (requires some changes for places with tracked projections).
- [ ] Add downcast projections back in.
- [ ] [Algebraic simplifications](rust-lang#101168 (comment)) (possibly with a shared API; done by old const prop).
- [ ] Propagation through slices / arrays.
- [ ] Find other optimizations that are done by old `const_prop.rs`, but not by this one.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-const-eval Area: Constant evaluation (MIR interpretation) A-mir-opt Area: MIR optimizations C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants