save LTO import info and check it when trying to reuse build products #67020
Conversation
r? @eddyb (rust_highfive has picked a reviewer for you, use r? to override)
I've been thinking more about this assertion this morning. I think I should just remove it. Here is my thinking: assuming the LTO import set computation is always based on the current state of the code, and we observe when going from revision₁ to revision₂ that the import set for (green) module A has gone from set₁ to set₂ where set₂ ⊆ set₁, then logically one should be able to reverse the direction of the change (going from revision₂ to revision₁), and LLVM may (or even should) compute a corresponding change in the import set for that same (still green) module A where the import set now goes from set₂ to set₁ (where we still have set₁ ⊇ set₂). That would cause the assertion to fail. Instead, I should just trust the underlying logic: the serialized imports for codegen unit A correspond to whatever information the LTO used for that compilation. If we reuse previously compiled A, then we keep the serialized LTO imports. If we recompile (or re-run LTO optimization) for A, then we use its newly computed LTO imports.
eek! Yes: one can trivially construct a test showing this with a slight modification of the test that I already have on this PR. Fixing that up now.
I'm wondering, do we even need to "overwrite" anything? Or could we make the decision once, before ThinLTO, based on the previous import map? Then save the current map to disk as is, without any merging. It should be correct and deterministic as it is always generated from the entire set of modules, including the cached ones.
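For concreteness, here is a minimal sketch of that "decide once, based on the previous map, then save the current map as-is" strategy. The types and names below are illustrative assumptions, not rustc's actual implementation:

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical type: "CGU name -> modules it imports from", as
// recorded during a ThinLTO run.
type ImportMap = BTreeMap<String, BTreeSet<String>>;

/// For a module whose inputs are otherwise green, reuse the cached
/// post-ThinLTO object only if LLVM chose exactly the same import set
/// this session as when the cached object was built.
fn can_reuse_post_lto(cgu: &str, prev: &ImportMap, curr: &ImportMap) -> bool {
    let empty = BTreeSet::new();
    prev.get(cgu).unwrap_or(&empty) == curr.get(cgu).unwrap_or(&empty)
}
```

The current map is then written to disk unmodified, so it always describes exactly what this session's ThinLTO actually did, with no merging of stale state.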
That is an excellent description of the error scenario btw ❤️ ❤️ ❤️
Here is a version that makes re-use decisions based solely on the previous import map. It's from #52309, which was never merged.
My understanding of how the current import map is generated leads me to think that we cannot get by with using the previous imports and storing the current ones. Otherwise, you'll still end up in the same scenario, just one incremental compile-cycle later. I can try to spell this out in more detail, if need be.
I think the problem arises from making decisions based on the previous state of the object files in combination with the current import map. If the two things match up, i.e. we make decisions about the previous object files based on the previous import map (which always accurately describes them), then the problem would not arise. I'm generally very uneasy about keeping around data from sessions further back than the immediately preceding one.
If you want, I can try to sketch an alternative PR tomorrow morning, so we'd see if the bug gets fixed.
But that is indeed the problem: the so-called "previous object files" could have originated from arbitrarily distant compilation sessions, not just the immediately preceding session. So I do not see how we can avoid accumulating information from an arbitrary number of sessions. |
Yes, but the import map is always re-computed from scratch from the entire set of modules, including the older ones. That will make it up-to-date.
In other words: the import map should always be the same, regardless of what your cache looks like, and even if there is no cache. That makes it a function of the current state of the source code (<= that describes more accurately what I mean).
Okay. I think I see what you are saying. It seems like the import map can change in non-local ways though, right? This non-locality property doesn't contradict what you said; I just want to have a clear mental model. Update: the non-locality property is what makes me worry that we would need to accumulate imports from arbitrarily far back. However, on Zulip, @michaelwoerister and I have sketched out a different variant approach for fixing this that we're both happy with.
Hurray! I was able to construct a pretty small test case for this:

```rust
// ./src/test/incremental/thinlto/import_removed.rs

// revisions: cfail1 cfail2
// compile-flags: -O -Zhuman-readable-cgu-names -Cllvm-args=-import-instr-limit=10
// build-pass

// TODO: Add proper description.

fn main() {
    foo::foo();
    bar::baz();
}

mod foo {
    // In cfail1, foo() gets inlined into main.
    // In cfail2, ThinLTO decides that foo() does not get inlined into main, and
    // instead bar() gets inlined into foo(). But faulty logic in our incr.
    // ThinLTO implementation thought that `main()` is unchanged and thus reused
    // the object file still containing a call to the now non-existent bar().
    pub fn foo() {
        bar()
    }

    // This function needs to be big so that it does not get inlined by ThinLTO
    // but *does* get inlined into foo() once it is declared `internal` in
    // cfail2.
    pub fn bar() {
        println!("quux1");
        println!("quux2");
        println!("quux3");
        println!("quux4");
        println!("quux5");
        println!("quux6");
        println!("quux7");
        println!("quux8");
        println!("quux9");
    }
}

mod bar {
    #[inline(never)]
    pub fn baz() {
        #[cfg(cfail1)]
        {
            crate::foo::bar();
        }
    }
}
```

This test case fails with the expected error on a recent nightly. I have not tested whether the fix in this PR makes it pass, but it should.
(Okay, this is ready for re-review; I incorporated all feedback and adapted the test case. Thanks @michaelwoerister!)
Ping from triage:
```rust
// are doing the ThinLTO in this current compilation cycle.)
//
// See rust-lang/rust#59535.
if let (Some(prev_import_map), true) =
```
Interesting pattern :)
(I went ahead and squashed since I was rebasing anyway to remove the
@bors r=mw
📌 Commit 42b00a4 has been approved by `mw`
🌲 The tree is currently closed for pull requests below priority 100, this pull request will be tested once the tree is reopened
@bors p=200
…ts, r=mw

save LTO import info and check it when trying to reuse build products

Fix #59535

Previous runs of LTO optimization on the previous incremental build can import larger portions of the dependence graph into a codegen unit than the current compilation run is choosing to import. We need to take that into account when we choose to reuse PostLTO-optimization object files from previous compiler invocations.

This PR accomplishes that by serializing the LTO import information on each incremental build. We load up the previous LTO import data as well as the current LTO import data. Then, as we decide whether to reuse previous PostLTO objects or redo LTO optimization, we check whether the LTO import data matches. After we finish with this decision process for every object, we write the LTO import data back to disk.

----

What is the scenario where comparing against past LTO import information is necessary? I've tried to capture it in the comments in the regression test, but here's yet another attempt from me to summarize the situation:

1. Consider a call graph like `[A] -> [B -> D] <- [C]` (where the letters are functions and the modules are enclosed in `[]`).
2. In our specific instance, the earlier compilations were inlining the call to `B` into `A`; thus `A` ended up with an external reference to the symbol `D` in its object code, to be resolved at subsequent link time. The LTO import information provided by LLVM for those runs reflected that: it explicitly says that during those runs, the definition of `B` and the declaration of `D` were imported into `[A]`.
3. The change between incremental builds was that the call `D <- C` was removed.
4. That change, coupled with other decisions within `rustc`, made the compiler decide to make `D` an internal symbol (since it was no longer accessed from other codegen units, this makes sense locally). And then the definition of `D` was inlined into `B` and `D` itself was eliminated entirely.
5. The current LTO import information reported that `B` alone is imported into `[A]` for the *current compilation*. So when the Rust compiler surveyed the dependence graph, it determined that nothing `[A]` imports changed since the last build (and `[A]` itself has not changed either), so it chose to reuse the object code generated during the previous compilation.
6. But that previous object code has an unresolved reference to `D`, and that causes a link-time failure!

----

The interesting thing is that it's quite hard to actually observe the above scenario arising, which is probably why no one has noticed this bug in the year or so since incremental LTO support landed (PR #53673). I've literally spent days trying to observe the bug on my local machine, but haven't managed to find the magic combination of factors to get LLVM and `rustc` to do just the right set of inlining and `internal`-reclassification choices that cause this particular problem to arise.

----

Also, I have tried to be careful about injecting new bugs with this PR. Specifically, I was/am worried that we could get into a scenario where overwriting the current LTO import data with past LTO import data would cause us to "forget" a current import. ~~To guard against this, the PR as currently written always asserts, at overwrite time, that the past LTO import-set is a *superset* of the current LTO import-set. This way, the overwriting process should always be safe to run.~~

* The previous note was written based on the first version of this PR. It has since been revised to use a simpler strategy, where we never attempt to merge the past LTO import information into the current one. We just *compare* them, and act accordingly.
* Also, as you can see from the comments on the PR itself, I was quite right to be worried about forgetting past imports; that scenario was observable via a trivial transformation of the regression test I had devised.
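For concreteness, here is a hedged sketch of the save/load round trip the description refers to. The plain-text encoding (one `cgu: import1 import2 ...` line per CGU) is an assumption for illustration only; rustc's actual on-disk format differs.

```rust
use std::collections::{BTreeMap, BTreeSet};
use std::fs;
use std::io;
use std::path::Path;

// Hypothetical type: "CGU name -> modules it imports from".
type ImportMap = BTreeMap<String, BTreeSet<String>>;

// Serialize the import map computed by this session's ThinLTO run.
fn save_import_map(path: &Path, map: &ImportMap) -> io::Result<()> {
    let mut out = String::new();
    for (cgu, imports) in map {
        out.push_str(cgu);
        out.push(':');
        for import in imports {
            out.push(' ');
            out.push_str(import);
        }
        out.push('\n');
    }
    fs::write(path, out)
}

// Load the import map left behind by the previous session (if any).
fn load_import_map(path: &Path) -> io::Result<ImportMap> {
    let mut map = ImportMap::new();
    for line in fs::read_to_string(path)?.lines() {
        if let Some((cgu, rest)) = line.split_once(':') {
            let imports: BTreeSet<String> =
                rest.split_whitespace().map(str::to_owned).collect();
            map.insert(cgu.to_owned(), imports);
        }
    }
    Ok(map)
}
```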
☀️ Test successful - checks-azure
…file in incremental compilation. This is symmetric to PR rust-lang#67020, which handled the case where the LLVM module's *imports* changed. This commit builds upon the infrastructure added there; the export map is just the inverse of the import map, so we can build the export map at the same time that we load the serialized import map. Fix rust-lang#69798
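Since the export map is just the inverse of the import map, it can be derived on the fly while the serialized import map is loaded. A minimal sketch under the same illustrative types as above (not rustc's actual code):

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical type: "CGU name -> modules it imports from".
type ImportMap = BTreeMap<String, BTreeSet<String>>;

/// Build the export map by inverting the import map: if `importer`
/// imports from `imported`, then `imported` exports to `importer`.
fn invert_imports(imports: &ImportMap) -> ImportMap {
    let mut exports = ImportMap::new();
    for (importer, imported_modules) in imports {
        for imported in imported_modules {
            exports
                .entry(imported.clone())
                .or_default()
                .insert(importer.clone());
        }
    }
    exports
}
```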
…lto-products-when-exports-change, r=nagisa

Do not reuse post LTO products when exports change

Generalizes code from PR rust-lang#67020, which handled the case where imports change. Fix rust-lang#69798
…s-all-green, r=nagisa

attempt to recover perf by removing `exports_all_green`

Attempt to recover perf by removing the `exports_all_green` flag. cc rust-lang#71248

(My hypothesis is that my use of this flag was an overly conservative generalization of PR rust-lang#67020.)
…twco,nikic

Use llvm::computeLTOCacheKey to determine post-ThinLTO CGU reuse

During incremental ThinLTO compilation, we attempt to re-use the optimized (post-ThinLTO) bitcode file for a module if it is 'safe' to do so. Up until now, 'safe' has meant that the set of modules that our current module imports from/exports to is unchanged from the previous compilation session. See PR rust-lang#67020 and PR rust-lang#71131 for more details.

However, this turns out to be insufficient to guarantee that it's safe to reuse the post-LTO module (i.e. that optimizing the pre-LTO module would produce the same result). When LLVM optimizes a module during ThinLTO, it may look at other information from the 'module index', such as whether a (non-imported!) global variable is used. If this information changes between compilation runs, we may end up re-using an optimized module that (for example) had dead-code elimination run on a function that is now used by another module.

Fortunately, LLVM implements its own ThinLTO module cache, which is used when ThinLTO is performed by a linker plugin (e.g. when clang is used to compile a C project). Using this cache directly would require extensive refactoring of our code - but fortunately for us, LLVM provides a function that does exactly what we need. The function `llvm::computeLTOCacheKey` is used to compute a SHA-1 hash from all data that might influence the result of ThinLTO on a module. In addition to the module imports/exports that we manually track, it also hashes information about global variables (e.g. their liveness) which might be used during optimization. By using this function, we shouldn't have to worry about new LLVM passes breaking our module re-use behavior.

In LLVM, the output of this function forms part of the filename used to store the post-ThinLTO module. To keep our current filename structure intact, this PR just writes out the mapping 'CGU name -> Hash' to a file. To determine if a post-LTO module should be reused, we compare hashes from the previous session.

This should unblock PR rust-lang#75199 - by sheer chance, it seems to have hit this issue due to the particular CGU partitioning and optimization decisions that end up getting made.
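Under this scheme, the reuse check reduces to comparing per-CGU cache keys across sessions. A minimal sketch of that comparison (the mapping mirrors the 'CGU name -> Hash' file the PR describes; the type and function names are assumptions):

```rust
use std::collections::BTreeMap;

// Hypothetical: "CGU name -> LTO cache key", where each key is the
// SHA-1 hash produced by llvm::computeLTOCacheKey for that module.
type CguCacheKeys = BTreeMap<String, String>;

/// Reuse the post-ThinLTO module only if its cache key is unchanged
/// from the previous session; a missing entry on either side means
/// we cannot prove reuse is safe.
fn hashes_match(cgu: &str, prev: &CguCacheKeys, curr: &CguCacheKeys) -> bool {
    match (prev.get(cgu), curr.get(cgu)) {
        (Some(old), Some(new)) => old == new,
        _ => false,
    }
}
```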