Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change how compiler-builtins gets many CGUs #73136

Merged
merged 1 commit into from
Jun 19, 2020

Conversation

alexcrichton
Copy link
Member

This commit intends to fix an accidental regression from #70846. The
goal of #70846 was to build compiler-builtins with a maximal number of
CGUs to ensure that each module in the source corresponds to an object
file. This high degree of control for compiler-builtins is desirable to
ensure that there's at most one exported symbol per CGU, ideally
enabling compiler-builtins to not conflict with the system libgcc as
often.

In #70846, however, only part of the compiler understands that
compiler-builtins is built with many CGUs. The rest of the compiler
thinks it's building with sess.codegen_units(). Notably the
calculation of sess.lto() consults sess.codegen_units(), which when
there's only one CGU it disables ThinLTO. This means that
compiler-builtins is built without ThinLTO, which is quite harmful to
performance! This is the root of the cause from #73135 where intrinsics
were found to not be inlining trivial functions.

The fix applied in this commit is to remove the special-casing of
compiler-builtins in the compiler. Instead the build system is now
responsible for special-casing compiler-builtins. It doesn't know
exactly how many CGUs will be needed but it passes a large number that
is assumed to be much greater than the number of source-level modules
needed. After reading the various locations in the compiler source, this
seemed like the best solution rather than adding more and more special
casing in the compiler for compiler-builtins.

Closes #73135

@rust-highfive
Copy link
Collaborator

r? @Mark-Simulacrum

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 8, 2020
@ehuss
Copy link
Contributor

ehuss commented Jun 8, 2020

Will build-std need something similar?

@tmiasko
Copy link
Contributor

tmiasko commented Jun 8, 2020

@ehuss the original implementation used attribute specifically for build-std case. It would be nice if this logic was applied there at some point. There is also a similar case of debug assertions which should be disabled for compiler builtins.

@alexcrichton
Copy link
Member Author

For build-std I think it sort of depends. The compiler-builtins works regardless of the number of CGUs it is compiled with, it's just that it can work best with system libraries that have conflicting symbols if it has one object per symbol. In that sense for many pure-Rust targets using build-std (e.g. anything embedded) the number of CGUs for compiler-builtins doesn't matter.

In the limit though, yeah, we may have to embed this logic (like debug assertions)

Copy link
Member

@Mark-Simulacrum Mark-Simulacrum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r=me

// Verifies that during compiler_builtins compilation the codegen units are kept
// unmerged. Even when only a single codegen unit is requested with -Ccodegen-units=1.
//
// compile-flags: -Zprint-mono-items=eager -Ccodegen-units=1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth keeping this test with the codegen-units=1000 logic and making sure we're not merging in that case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect not in the sense that there's already likely tests for that, and this is just falling back on existing functionality.

@alexcrichton
Copy link
Member Author

@bors: r=Mark-Simulacrum

@bors
Copy link
Contributor

bors commented Jun 9, 2020

📌 Commit 926c7b6bb53b8a30b712759eaa9641040d15caef has been approved by Mark-Simulacrum

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 9, 2020
@glandium
Copy link
Contributor

There is a regression in 1.44 when building Firefox with cross-language LTO for OSX (and only in that combination), where libxul fails to link with error messages like:

Undefined symbols for architecture x86_64:
  "__ZN52_$LT$u128$u20$as$u20$compiler_builtins..int..Int$GT$12extract_sign17h2f47846013b80758E", referenced from:
      __ZN17compiler_builtins5float4conv13__floatuntidf17h0187dacb8fabec21E in libgkrust.a(compiler_builtins-57149daefc485a08.compiler_builtins.4zbdfqik-cgu.80.rcgu.o)
  "__ZN51_$LT$i32$u20$as$u20$compiler_builtins..int..Int$GT$12aborting_div17hb9ce4562c23cc4a0E", referenced from:
      __ZN17compiler_builtins5float3pow9__powisf217h50ee614f1c4d9a6bE in libgkrust.a(compiler_builtins-57149daefc485a08.compiler_builtins.4zbdfqik-cgu.27.rcgu.o)
  "__ZN52_$LT$i128$u20$as$u20$compiler_builtins..int..Int$GT$12extract_sign17h8299efdd5198e4f5E", referenced from:
      __ZN17compiler_builtins5float4conv11__floattidf17h3723b89a83f7291fE in libgkrust.a(compiler_builtins-57149daefc485a08.compiler_builtins.4zbdfqik-cgu.80.rcgu.o)
ld: symbol(s) not found for architecture x86_64

This can be reproduced with nightly-2020-04-08 (42abbd8) and not with nightly-2020-04-07 (6dee5f1). Further narrowing down with rustup-install-toolchain-master came out with a range between 39b6253 and 42abbd8, which would seem to point at #70846. I'm hoping this change will fix it, but for some reason, I can't reproduce the problem in the first place with a local build of a revision known to be faulty in nightly, so I can't check whether this fixes it. At this point, I'll wait for the next nightly following the merge of this PR to test whether it fixes it, otherwise, I'll open a new issue.

@glandium
Copy link
Contributor

Corollary: if this fixes it, it would be nice to have in a 1.44.1 (or at least in 1.45).

@alexcrichton
Copy link
Member Author

@glandium I'm not 100% certain, but that's probably different from this issue. That may have, however, been fixed by either #72020 or #72325

@glandium
Copy link
Contributor

Both fixes you link have landed a number of days ago, but I can still reproduce with the latest nightly.

@Mark-Simulacrum
Copy link
Member

If this (or some other PR) needs stable backports, then the window for 1.44.1 is rapidly closing, would be good to get them stable-nominated today (for either emergency approval tomorrow's compiler meeting or unilateral approval).

@alexcrichton
Copy link
Member Author

@Mark-Simulacrum this only affects the performance of compiler-builtins which doesn't affect many high-profile platforms (wasm is the main motivation for this), so I would personally think it's ok to not include this in a point release.

@glandium
Copy link
Contributor

I wish I could test this fix...

@Mark-Simulacrum
Copy link
Member

@bors try

This'll at least get us a try build which might help with testing?

@bors
Copy link
Contributor

bors commented Jun 13, 2020

🙅 Please do not try after a pull request has been r+ed. If you need to try, unapprove (r-) it first.

// Ideally this would be specified through an env var to Cargo so Cargo
// knows how many CGUs are for this specific crate, but for now
// per-crate configuration isn't specifiable in the environment.
if crate_name == Some("compiler_builtins") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[profile.release.package.compiler_builtins]
codegen-units = 10000

in the workspace Cargo.toml?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh nice forgot about that, that does indeed work!

This commit intends to fix an accidental regression from rust-lang#70846. The
goal of rust-lang#70846 was to build compiler-builtins with a maximal number of
CGUs to ensure that each module in the source corresponds to an object
file. This high degree of control for compiler-builtins is desirable to
ensure that there's at most one exported symbol per CGU, ideally
enabling compiler-builtins to not conflict with the system libgcc as
often.

In rust-lang#70846, however, only part of the compiler understands that
compiler-builtins is built with many CGUs. The rest of the compiler
thinks it's building with `sess.codegen_units()`. Notably the
calculation of `sess.lto()` consults `sess.codegen_units()`, which when
there's only one CGU it disables ThinLTO. This means that
compiler-builtins is built without ThinLTO, which is quite harmful to
performance! This is the root of the cause from rust-lang#73135 where intrinsics
were found to not be inlining trivial functions.

The fix applied in this commit is to remove the special-casing of
compiler-builtins in the compiler. Instead the build system is now
responsible for special-casing compiler-builtins. It doesn't know
exactly how many CGUs will be needed but it passes a large number that
is assumed to be much greater than the number of source-level modules
needed. After reading the various locations in the compiler source, this
seemed like the best solution rather than adding more and more special
casing in the compiler for compiler-builtins.

Closes rust-lang#73135
@alexcrichton alexcrichton force-pushed the thinlto-compiler-builtins branch from 926c7b6 to d6156e8 Compare June 15, 2020 14:38
@alexcrichton

This comment has been minimized.

@alexcrichton
Copy link
Member Author

@bors: r=Mark-Simulacrum

@bors
Copy link
Contributor

bors commented Jun 15, 2020

📌 Commit d6156e8 has been approved by Mark-Simulacrum

Manishearth added a commit to Manishearth/rust that referenced this pull request Jun 16, 2020
…ins, r=Mark-Simulacrum

Change how compiler-builtins gets many CGUs

This commit intends to fix an accidental regression from rust-lang#70846. The
goal of rust-lang#70846 was to build compiler-builtins with a maximal number of
CGUs to ensure that each module in the source corresponds to an object
file. This high degree of control for compiler-builtins is desirable to
ensure that there's at most one exported symbol per CGU, ideally
enabling compiler-builtins to not conflict with the system libgcc as
often.

In rust-lang#70846, however, only part of the compiler understands that
compiler-builtins is built with many CGUs. The rest of the compiler
thinks it's building with `sess.codegen_units()`. Notably the
calculation of `sess.lto()` consults `sess.codegen_units()`, which when
there's only one CGU it disables ThinLTO. This means that
compiler-builtins is built without ThinLTO, which is quite harmful to
performance! This is the root of the cause from rust-lang#73135 where intrinsics
were found to not be inlining trivial functions.

The fix applied in this commit is to remove the special-casing of
compiler-builtins in the compiler. Instead the build system is now
responsible for special-casing compiler-builtins. It doesn't know
exactly how many CGUs will be needed but it passes a large number that
is assumed to be much greater than the number of source-level modules
needed. After reading the various locations in the compiler source, this
seemed like the best solution rather than adding more and more special
casing in the compiler for compiler-builtins.

Closes rust-lang#73135
Manishearth added a commit to Manishearth/rust that referenced this pull request Jun 16, 2020
…ins, r=Mark-Simulacrum

Change how compiler-builtins gets many CGUs

This commit intends to fix an accidental regression from rust-lang#70846. The
goal of rust-lang#70846 was to build compiler-builtins with a maximal number of
CGUs to ensure that each module in the source corresponds to an object
file. This high degree of control for compiler-builtins is desirable to
ensure that there's at most one exported symbol per CGU, ideally
enabling compiler-builtins to not conflict with the system libgcc as
often.

In rust-lang#70846, however, only part of the compiler understands that
compiler-builtins is built with many CGUs. The rest of the compiler
thinks it's building with `sess.codegen_units()`. Notably the
calculation of `sess.lto()` consults `sess.codegen_units()`, which when
there's only one CGU it disables ThinLTO. This means that
compiler-builtins is built without ThinLTO, which is quite harmful to
performance! This is the root of the cause from rust-lang#73135 where intrinsics
were found to not be inlining trivial functions.

The fix applied in this commit is to remove the special-casing of
compiler-builtins in the compiler. Instead the build system is now
responsible for special-casing compiler-builtins. It doesn't know
exactly how many CGUs will be needed but it passes a large number that
is assumed to be much greater than the number of source-level modules
needed. After reading the various locations in the compiler source, this
seemed like the best solution rather than adding more and more special
casing in the compiler for compiler-builtins.

Closes rust-lang#73135
@bors bors merged commit 1e31a7c into rust-lang:master Jun 19, 2020
@glandium
Copy link
Contributor

I can confirm the Firefox linking error mentioned in #73136 (comment) does not happen anymore with nightly-2020-06-20, which is the first nightly with this fix. I cannot, however, say anything about nightly-2020-06-19, because rustc crashed during compilation (and so did nightly-2020-06-18).

So it would have been nice in 1.44.1, but the second best thing would be to have it in 1.45.

@Mark-Simulacrum Mark-Simulacrum added the beta-nominated Nominated for backporting to the compiler in the beta channel. label Jun 21, 2020
@Mark-Simulacrum
Copy link
Member

Beta-nominating as this (apparently) fixes a regression -- @glandium would be good to file an issue in the future for easier tracking on our end (in addition to comments like #73136 (comment)).

@alexcrichton alexcrichton deleted the thinlto-compiler-builtins branch June 23, 2020 15:18
@alexcrichton
Copy link
Member Author

FWIW this patch as-is uses a relatively new Cargo feature of profile overrides, and if it's backported I'd recommend double-checking that it works before landing.

@spastorino
Copy link
Member

Adding T-compiler so our tools are able to fetch it and include it on the weekly agenda.

@spastorino spastorino added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Jun 24, 2020
@pnkfelix
Copy link
Member

pnkfelix commented Jul 2, 2020

Discussed in last week's T-compiler meeting

We are inclined to backport the earlier version of this PR that doesn't rely on new cargo stuff. See link above for details.

@pnkfelix pnkfelix added the beta-accepted Accepted for backporting to the compiler in the beta channel. label Jul 2, 2020
@Elinvynia Elinvynia mentioned this pull request Jul 6, 2020
@Mark-Simulacrum Mark-Simulacrum removed the beta-nominated Nominated for backporting to the compiler in the beta channel. label Jul 10, 2020
@Mark-Simulacrum Mark-Simulacrum mentioned this pull request Jul 10, 2020
bors added a commit to rust-lang-ci/rust that referenced this pull request Jul 13, 2020
…ulacrum

[beta] next

Backports of:

* rustdoc: Fix doc aliases with crate filtering rust-lang#73644
* rustdoc: Rename invalid_codeblock_attribute lint to be plural rust-lang#74131
* rustc_lexer: Simplify shebang parsing once more rust-lang#73596
* Perform obligation deduplication to avoid buggy `ExistentialMismatch` rust-lang#73485
* Reorder order in which MinGW libs are linked to fix recent breakage rust-lang#73184
* Change how compiler-builtins gets many CGUs rust-lang#73136
* Fix wasm32 being broken due to a NodeJS version bump rust-lang#73885
@cuviper cuviper added this to the 1.46 milestone May 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
beta-accepted Accepted for backporting to the compiler in the beta channel. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

compiler-builtins: Int trait functions are not inlined on wasm