save LTO import info and check it when trying to reuse build products #67020

pnkfelix · 2019-12-04T13:21:23Z

LTO optimization during a previous incremental build can import larger portions of the dependence graph into a codegen unit than the current compilation run chooses to import. We need to take that into account when we decide whether to reuse post-LTO-optimization object files from previous compiler invocations.

This PR accomplishes that by serializing the LTO import information on each incremental build. We load up the previous LTO import data as well as the current LTO import data. Then as we decide whether to reuse previous PostLTO objects or redo LTO optimization, we check whether the LTO import data matches. After we finish with this decision process for every object, we write the LTO import data back to disk.
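
To make the "serialize, then load it back on the next build" step concrete, here is a minimal sketch. It is not the code from this PR: the `ImportMap` alias, the function names, and the text layout (importer name on one line, each imported module indented by one space, blank line between entries) are all assumptions chosen for illustration.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory form: importing module -> modules it imports from.
type ImportMap = BTreeMap<String, Vec<String>>;

/// Write one entry per importing module: its name on the first line,
/// each imported module on its own line prefixed by a single space,
/// and a blank line to close the entry. (Assumed layout, for illustration.)
fn serialize_imports(map: &ImportMap) -> String {
    let mut out = String::new();
    for (importer, imported) in map {
        out.push_str(importer);
        out.push('\n');
        for name in imported {
            out.push(' ');
            out.push_str(name);
            out.push('\n');
        }
        out.push('\n');
    }
    out
}

/// Parse the text produced by `serialize_imports`.
fn parse_imports(text: &str) -> ImportMap {
    let mut map = ImportMap::new();
    let mut current: Option<String> = None;
    for line in text.lines() {
        if line.is_empty() {
            // A blank line closes the current entry.
            current = None;
        } else if let Some(name) = line.strip_prefix(' ') {
            // An indented line is an import of the current importer.
            if let Some(importer) = &current {
                map.entry(importer.clone()).or_default().push(name.to_string());
            }
        } else {
            // An unindented line starts a new importer entry.
            current = Some(line.to_string());
            map.entry(line.to_string()).or_default();
        }
    }
    map
}
```

Writing the result of `serialize_imports` into the incremental session directory and calling `parse_imports` on it at the start of the next session would give the "previous" map that the reuse decision compares against.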

What is the scenario where comparing against past LTO import information is necessary?

I've tried to capture it in the comments in the regression test, but here's yet another attempt from me to summarize the situation:

1. Consider a call-graph like [A] -> [B -> D] <- [C] (where the letters are functions and the modules are enclosed in []).
2. In our specific instance, the earlier compilations were inlining the call to B into A; thus A ended up with an external reference to the symbol D in its object code, to be resolved at subsequent link time. The LTO import information provided by LLVM for those runs reflected that: it explicitly says that, during those runs, the definition of B and the declaration of D were imported into [A].
3. The change between incremental builds was that the call D <- C was removed.
4. That change, coupled with other decisions within rustc, made the compiler decide to make D an internal symbol (since it was no longer accessed from other codegen units, this makes sense locally). And then the definition of D was inlined into B and D itself was eliminated entirely.
5. The current LTO import information reported that B alone is imported into [A] for the current compilation. So when the Rust compiler surveyed the dependence graph, it determined that nothing [A] imports had changed since the last build (and [A] itself had not changed either), so it chose to reuse the object code generated during the previous compilation.
6. But that previous object code has an unresolved reference to D, and that causes a link time failure!
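
The scenario above can be replayed in a few lines. This is a sketch under stated assumptions, not rustc's logic: imports are modelled at the granularity the description uses (symbols imported into a module), and `import_map`, `naive_can_reuse` and `checked_can_reuse` are hypothetical names.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical model: importing module -> names imported into it.
type ImportMap = BTreeMap<String, BTreeSet<String>>;

/// Build a map from (importer, imported) pairs.
fn import_map(pairs: &[(&str, &str)]) -> ImportMap {
    let mut map = ImportMap::new();
    for (importer, imported) in pairs {
        map.entry(importer.to_string())
            .or_default()
            .insert(imported.to_string());
    }
    map
}

/// Pre-fix reasoning: look only at the *current* import information.
/// Reuse is allowed if the module and everything it currently imports are unchanged.
fn naive_can_reuse(module: &str, curr: &ImportMap, changed: &BTreeSet<String>) -> bool {
    let unchanged = |name: &str| !changed.contains(name);
    let imports_unchanged = match curr.get(module) {
        Some(imports) => imports.iter().all(|name| unchanged(name.as_str())),
        None => true,
    };
    unchanged(module) && imports_unchanged
}

/// Post-fix reasoning: additionally require that the import set recorded
/// when the old object was produced equals the current one.
fn checked_can_reuse(
    module: &str,
    prev: &ImportMap,
    curr: &ImportMap,
    changed: &BTreeSet<String>,
) -> bool {
    prev.get(module) == curr.get(module) && naive_can_reuse(module, curr, changed)
}
```

With `prev = {A: {B, D}}`, `curr = {A: {B}}` and only `C` marked as changed, `naive_can_reuse("A", ..)` accepts the stale object while `checked_can_reuse("A", ..)` rejects it, which is the difference between steps 5 and 6.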

The interesting thing is that it's quite hard to actually observe the above scenario arising, which is probably why no one has noticed this bug in the year or so since incremental LTO support landed (PR #53673).

I've literally spent days trying to observe the bug on my local machine, but haven't managed to find the magic combination of factors to get LLVM and rustc to do just the right set of the inlining and internal-reclassification choices that cause this particular problem to arise.

Also, I have tried to be careful about injecting new bugs with this PR. Specifically, I was/am worried that we could get into a scenario where overwriting the current LTO import data with past LTO import data would cause us to "forget" a current import. To guard against this, the PR as currently written always asserts, at overwrite time, that the past LTO import-set is a superset of the current LTO import-set. This way, the overwriting process should always be safe to run.

The previous note was written based on the first version of this PR. It has since been revised to use a simpler strategy, where we never attempt to merge the past LTO import information into the current one. We just compare them, and act accordingly.
Also, as you can see from the comments on the PR itself, I was quite right to be worried about forgetting past imports; that scenario was observable via a trivial transformation of the regression test I had devised.

rust-highfive · 2019-12-04T13:21:27Z

r? @eddyb

(rust_highfive has picked a reviewer for you, use r? to override)

pnkfelix · 2019-12-04T13:21:41Z

r? @michaelwoerister

src/test/run-make/removing-code-and-incremental-lto/Makefile

pnkfelix · 2019-12-05T10:15:44Z

> I was/am worried that we could get into a scenario where overwriting the current LTO import data with past LTO import data would cause us to "forget" a current import. To guard against this, the PR as currently written always asserts, at overwrite time, that the past LTO import-set is a superset of the current LTO import-set. This way, the overwriting process should always be safe to run.

I've been thinking more about this assertion this morning. I think I should just remove it.

Here is my thinking: assuming the LTO import set computation is always based on the current state of the code, and we observe when going from revision₁ to revision₂ that the import set for (green) module A has gone from set₁ to set₂ where set₂ ⊆ set₁, then logically one should be able to reverse the direction of the change (going from revision₂ to revision₁), and LLVM may (or even should) compute a corresponding change in the import set for that same (still green) module A where the import set now goes from set₂ to set₁ (where we still have set₁ ⊇ set₂). That would cause the assertion to fail.

Instead, I should just trust the underlying logic: the serialized imports for codegen unit A correspond to whatever information the LTO used for that compilation. If we reuse previously compiled A, then we keep the serialized LTO imports. If we recompile (or re-run LTO optimization) for A, then we use its newly computed LTO imports.
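
A tiny sketch of why the superset assertion cannot hold in both directions. The helper names are hypothetical and the sets stand in for set₁ and set₂ from the argument above.

```rust
use std::collections::BTreeSet;

/// Build a set of names (illustrative helper).
fn set(names: &[&str]) -> BTreeSet<String> {
    names.iter().map(|n| n.to_string()).collect()
}

/// The condition the first version of the PR asserted at overwrite time:
/// the past import set must contain every current import.
fn past_is_superset(past: &BTreeSet<String>, current: &BTreeSet<String>) -> bool {
    past.is_superset(current)
}
```

Going from revision₁ to revision₂ with set₁ = {B, D} and set₂ = {B}, `past_is_superset(&set1, &set2)` holds. Reversing the edit swaps the roles, and `past_is_superset(&set2, &set1)` is false, so the assertion would fire on a perfectly legitimate build.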

pnkfelix · 2019-12-05T11:46:08Z

> logically one should be able to reverse the direction of the change (going from revision₂ to revision₁)

eek! Yes: One can trivially construct a test showing this with slight modification of the test that I already have on this PR.

Fixing that up now.

michaelwoerister · 2019-12-05T11:59:20Z

I'm wondering, do we even need to "overwrite" anything? Or could we make the decision once, before ThinLTO, based on the previous import map? Then save the current map to disk as is, without any merging. It should be correct and deterministic as it is always generated from the entire set of modules, including the cached ones.

michaelwoerister · 2019-12-05T11:59:53Z

That is an excellent description of the error scenario btw ❤️ ❤️ ❤️

michaelwoerister · 2019-12-05T12:06:41Z

Here is a version that makes re-use decisions based solely on the previous import map:
https://github.com/rust-lang/rust/pull/52309/files#diff-7495f8d1204a6939d7aa2d406839e509R949-R1008
(EDIT: Clicking the link doesn't scroll down to the function in question, it seems. Look for fn determine_cgu_reuse)

It's from #52309, which was never merged.

pnkfelix · 2019-12-05T12:11:24Z

> I'm wondering, do we even need to "overwrite" anything? Or could we make the decision once, before ThinLTO, based on the previous import map? Then save the current map to disk as is, without any merging. It should be correct and deterministic as it is always generated from the entire set of modules, including the cached ones.

My understanding of how the current import map is generated leads me to think that we cannot get by using the previous imports and storing the current ones. Otherwise, you'll still end up in the same scenario, just one incremental compile-cycle later.

I can try to spell this out in more detail, if need be.

michaelwoerister · 2019-12-05T12:40:28Z

I think the problem arises from making decisions based on the previous state of the object files in combination with the current import map. If the two things match up, i.e. making decisions about the previous object files based on the previous import map (which always accurately describes them), then the problem would not arise.

I'm generally very uneasy about keeping around data from sessions N-x where x > 1. That could add a kind of indeterminism unless you re-create the entire version history of compiling a crate (which would make for a really nasty debugging experience).

michaelwoerister · 2019-12-05T12:41:41Z

If you want, I can try to sketch an alternative PR tomorrow morning, so we'd see if the bug gets fixed.

pnkfelix · 2019-12-05T12:53:56Z

> I think the problem arises from making decision based on the previous state of the object files in combination with the current import map. If the two things match up, i.e. making decisions about the previous object files based on the previous import map (which always accurately describes them), then the problem would not arise.
>
> I'm generally very uneasy about keeping around data from sessions N-x where x > 1. That could add a kind of indeterminism unless you re-create the entire version history of compiling a crate (which would make for a really nasty debugging experience).

But that is indeed the problem: the so-called "previous object files" could have originated from arbitrarily distant compilation sessions, not just the immediately preceding session. So I do not see how we can avoid accumulating information from an arbitrary number of sessions.

michaelwoerister · 2019-12-05T12:58:16Z

> But that is indeed the problem: the so-called "previous object files" can come from arbitrarily distant compilation sessions.

Yes, but the import map is always re-computed from scratch from the entire set of modules, including the older ones. That will make it up-to-date.

michaelwoerister · 2019-12-05T13:00:33Z

In other words: The import map should always be the same, regardless of what your cache looks like and even if there is no cache. That makes it a function of the current state of the source code. (<= that describes more accurately what I mean).

pnkfelix · 2019-12-05T13:09:36Z

Okay. I think I see what you are saying. It seems like the import map can change in non-local ways though, right? I mean, the [A] -> [B -> C] <- [D] example seems to illustrate that a change to D can cause the import map for [A] to change.

This non-locality property doesn't contradict what you said. I just want to have a clear mental model.

Update: the non-locality property is what makes me worry that we would need to accumulate imports from arbitrarily far back. However, on Zulip, @michaelwoerister and I have sketched out a different variant approach for fixing this that we're both happy with.

src/librustc_codegen_llvm/back/lto.rs

michaelwoerister · 2019-12-06T10:49:53Z

Hurray! I was able to construct a pretty small test case for this:

// ./src/test/incremental/thinlto/import_removed.rs

// revisions: cfail1 cfail2
// compile-flags: -O -Zhuman-readable-cgu-names -Cllvm-args=-import-instr-limit=10
// build-pass

// TODO: Add proper description.

fn main() {
    foo::foo();
    bar::baz();
}

mod foo {

    // In cfail1, foo() gets inlined into main.
    // In cfail2, ThinLTO decides that foo() does not get inlined into main, and
    // instead bar() gets inlined into foo(). But faulty logic in our incr.
    // ThinLTO implementation thought that `main()` is unchanged and thus reused
    // the object file still containing a call to the now non-existent bar().
    pub fn foo(){
        bar()
    }

    // This function needs to be big so that it does not get inlined by ThinLTO
    // but *does* get inlined into foo() once it is declared `internal` in
    // cfail2.
    pub fn bar(){
        println!("quux1");
        println!("quux2");
        println!("quux3");
        println!("quux4");
        println!("quux5");
        println!("quux6");
        println!("quux7");
        println!("quux8");
        println!("quux9");
    }
}

mod bar {

    #[inline(never)]
    pub fn baz() {
        #[cfg(cfail1)]
        {
            crate::foo::bar();
        }
    }
}

This test case fails with the expected error on a recent nightly. I have not tested whether the fix in this PR makes it pass, but it should.

pnkfelix · 2019-12-06T14:59:38Z

(okay this is ready for re-review; I incorporated all feedback and adapted the test case, thanks @michaelwoerister !)

JohnCSimon · 2019-12-14T02:46:48Z

Ping from triage:
@michaelwoerister - can you please review this pr?

michaelwoerister · 2019-12-16T12:51:33Z

src/librustc_codegen_llvm/back/lto.rs

+            // are doing the ThinLTO in this current compilation cycle.)
+            //
+            // See rust-lang/rust#59535.
+            if let (Some(prev_import_map), true) =


Interesting pattern :)
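
For readers unfamiliar with the pattern being remarked on: matching a tuple in `if let` lets one branch require both "the option is `Some`" and "a flag is `true`" while binding the payload. The right-hand side of the real expression is not shown in the excerpt above, so the sketch below uses hypothetical arguments (`module_is_green` in particular is an assumption).

```rust
/// Hypothetical helper illustrating `if let (Some(..), true) = (..)`.
fn describe(prev_import_map: Option<&str>, module_is_green: bool) -> String {
    // Matches only when the option is `Some` *and* the flag is `true`,
    // binding the payload to `prev` in the same step.
    if let (Some(prev), true) = (prev_import_map, module_is_green) {
        format!("compare against {}", prev)
    } else {
        // Either there is no previous data or the flag is false.
        "re-run LTO".to_string()
    }
}
```

The alternative would be a nested `if let Some(..) { if flag { .. } }` with the fallback duplicated in two `else` arms; the tuple form keeps a single fallback branch.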

pnkfelix · 2019-12-20T03:48:30Z

(I went ahead and squashed since I was rebasing anyway to remove the run-make based test.)

pnkfelix · 2019-12-20T03:48:46Z

@bors r=mw

bors · 2019-12-20T03:48:48Z

📌 Commit 42b00a4 has been approved by mw

bors · 2019-12-20T03:48:49Z

🌲 The tree is currently closed for pull requests below priority 100, this pull request will be tested once the tree is reopened

Centril · 2019-12-20T20:56:05Z

@bors p=200

bors · 2019-12-20T20:56:14Z

⌛ Testing commit 42b00a4 with merge ccd2383...

…ts, r=mw save LTO import info and check it when trying to reuse build products Fix #59535

bors · 2019-12-21T00:06:51Z

☀️ Test successful - checks-azure
Approved by: mw
Pushing ccd2383 to master...

…file in incremental compilation. This is symmetric to PR rust-lang#67020, which handled the case where the LLVM module's *imports* changed. This commit builds upon the infrastructure added there; the export map is just the inverse of the import map, so we can build the export map at the same time that we load the serialized import map. Fix rust-lang#69798
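
As an illustration of "the export map is just the inverse of the import map", here is a minimal sketch. The `ModuleMap` alias and `invert` function are hypothetical names, not the ones used in the referenced commit.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical shape shared by both maps: module -> set of modules.
type ModuleMap = BTreeMap<String, BTreeSet<String>>;

/// Invert "importer -> modules it imports from" into
/// "exporter -> modules that import from it".
fn invert(imports: &ModuleMap) -> ModuleMap {
    let mut exports = ModuleMap::new();
    for (importer, imported) in imports {
        for source in imported {
            exports
                .entry(source.clone())
                .or_default()
                .insert(importer.clone());
        }
    }
    exports
}
```

Because both maps hold the same edges, the inversion can be done in the same pass that reads the serialized import map, as the commit message notes.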

…lto-products-when-exports-change, r=nagisa Do not reuse post LTO products when exports change Generalizes code from PR rust-lang#67020, which handled the case when imports change. Fix rust-lang#69798

…s-all-green, r=nagisa attempt to recover perf by removing `exports_all_green` flag. cc rust-lang#71248 (My hypothesis is that my use of this flag was an overly conservative generalization of PR rust-lang#67020.)

During incremental ThinLTO compilation, we attempt to re-use the optimized (post-ThinLTO) bitcode file for a module if it is 'safe' to do so. Up until now, 'safe' has meant that the set of modules that our current module imports from/exports to is unchanged from the previous compilation session. See PR rust-lang#67020 and PR rust-lang#71131 for more details.

However, this turns out to be insufficient to guarantee that it's safe to reuse the post-LTO module (i.e. that optimizing the pre-LTO module would produce the same result). When LLVM optimizes a module during ThinLTO, it may look at other information from the 'module index', such as whether a (non-imported!) global variable is used. If this information changes between compilation runs, we may end up re-using an optimized module that (for example) had dead-code elimination run on a function that is now used by another module.

Fortunately, LLVM implements its own ThinLTO module cache, which is used when ThinLTO is performed by a linker plugin (e.g. when clang is used to compile a C project). Using this cache directly would require extensive refactoring of our code - but fortunately for us, LLVM provides a function that does exactly what we need. The function `llvm::computeLTOCacheKey` is used to compute a SHA-1 hash from all data that might influence the result of ThinLTO on a module. In addition to the module imports/exports that we manually track, it also hashes information about global variables (e.g. their liveness) which might be used during optimization. By using this function, we shouldn't have to worry about new LLVM passes breaking our module re-use behavior.

In LLVM, the output of this function forms part of the filename used to store the post-ThinLTO module. To keep our current filename structure intact, this PR just writes out the mapping 'CGU name -> Hash' to a file. To determine if a post-LTO module should be reused, we compare hashes from the previous session.

This should unblock PR rust-lang#75199 - by sheer chance, it seems to have hit this issue due to the particular CGU partitioning and optimization decisions that end up getting made.
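
The "compare hashes from the previous session" step could look roughly like the following. This is a sketch only: the keys are treated as opaque strings (the code does not call into LLVM), and `KeyMap` and `can_reuse_post_lto` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Hypothetical mapping: CGU name -> cache key, stored as an opaque string.
type KeyMap = BTreeMap<String, String>;

/// A post-LTO module is reusable only if a key was recorded for it in the
/// previous session and that key equals the one computed in this session.
fn can_reuse_post_lto(cgu: &str, prev: &KeyMap, curr: &KeyMap) -> bool {
    match (prev.get(cgu), curr.get(cgu)) {
        (Some(prev_key), Some(curr_key)) => prev_key == curr_key,
        // Missing on either side: be conservative and redo the work.
        _ => false,
    }
}
```

Treating a missing entry as "do not reuse" is a design choice of this sketch; it errs on the side of redoing ThinLTO rather than linking a stale object.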

…twco,nikic Use llvm::computeLTOCacheKey to determine post-ThinLTO CGU reuse

rust-highfive assigned eddyb Dec 4, 2019

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 4, 2019

rust-highfive assigned michaelwoerister and unassigned eddyb Dec 4, 2019

pnkfelix added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Dec 4, 2019

pnkfelix mentioned this pull request Dec 5, 2019

run-make/thumb-none-cortex-m is no-op due to incompatible # only-target directives #67018

Closed

pnkfelix force-pushed the issue-59535-accumulate-past-lto-imports branch from 001cf76 to f29e4bd Compare December 5, 2019 12:56

pnkfelix mentioned this pull request Dec 12, 2019

cdylib fails to link with incremental compilation after panic -> no panic transition #67118

Closed

pnkfelix force-pushed the issue-59535-accumulate-past-lto-imports branch from aecb511 to 42b00a4 Compare December 20, 2019 03:47

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Dec 20, 2019

bors added the merged-by-bors This PR was explicitly merged by bors. label Dec 21, 2019

bors merged commit 42b00a4 into rust-lang:master Dec 21, 2019

matthiaskrgr mentioned this pull request Dec 21, 2019

link failure on nightly, windows #67483

Closed

Aaron1011 mentioned this pull request Jan 2, 2020

Incremental compilation linker error: 'relocation R_X86_64_PC32 against undefined hidden symbol' #67802

Closed

Aaron1011 mentioned this pull request Mar 10, 2020

cdylib link error after TLS unused -> used transition #69798

Closed

pnkfelix mentioned this pull request Apr 14, 2020

Do not reuse post LTO products when exports change #71131

Merged

pnkfelix mentioned this pull request Apr 20, 2020

attempt to recover perf by removing exports_all_green #71267

Merged

Aaron1011 mentioned this pull request Sep 18, 2020

Use llvm::computeLTOCacheKey to determine post-ThinLTO CGU reuse #76859

Merged
