-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
__asan_globals_registered is not comdat when building a staticlib with LTO #113404
Comments
https://bugs.chromium.org/p/chromium/issues/detail?id=1459233 is tracking this for Chromium as it causes our asan bots to fail. |
Here is the code which creates the It's interesting that it uses I don't know how Rust is producing LTOed static libraries, but maybe somewhere along the way there is a bug in how LTO is handling common linkage globals. One possible summary of this issue is that the ODR violation detector is suffering from an ODR violation, we have two flags when we should have one. |
I like this take :) |
I consider myself quite familiar with asan, LTO, Clang Driver, but I know very little about Rust.
gives me this command (after removing rustc --crate-name builtins_static --edition=2021 src/lib.rs --diagnostic-width=159 --crate-type staticlib --emit=dep-info,link -C opt-level=3 -C lto -C metadata=9ef9aaa79a14308e -C extra-filename=-9ef9aaa79a14308e --out-dir /tmp/p/nss-builtins/target/release/deps -L dependency=/tmp/p/nss-builtins/target/release/deps --extern pkcs11_bindings=/tmp/p/nss-builtins/target/release/deps/libpkcs11_bindings-cb6ed583361f31fa.rlib --extern smallvec=/tmp/p/nss-builtins/target/release/deps/libsmallvec-e6a97ae5d501d626.rlib -Zsanitizer=address
Most defined symbols are localized. I do not know why the 4 symbols are special and rustc lto doesn't localize them. I don't know how to rerun the rustc with a locally built ( |
If we use a COMDAT for |
fuchsia-fyi-x64-asan build is broken due to detect_odr_violation in rust.[1] and search for "odr-violation" in [2]. It's a known issue [3] and also impacts other platforms [4]. A previous attempt to disable the sanitize-address-globals-dead-stripping (in https://crrev.com/c/4690811) fixed media_unittests but broke base_unittests. So I ended up deciding to follow the suggestion from the sanitizer itself to use the environment variable. Since there isn't a asan try, it takes 6+ hours to run, I have to manually test the change locally. With this change, both media_unittests and base_unittests are passing when is_asan = true in gn args. [1]: https://chromium-swarm.appspot.com/task?id=6360bb22591abb10 [2]: https://cas-viewer.appspot.com/projects/chromium-swarm/instances/default_instance/blobs/be139e3b5b49518b8a83be3fcdc5c209dc2acc2c9f725c558dd3fc2aac93a558/404237?filename=emulator_log.serial [3]: rust-lang/rust#113404 [4]: https://crbug.com/1459233 Bug: 1459233, 1465997 Change-Id: I36fcbb947f87bafaa618bb5de5d631ad895a1bac Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4688280 Reviewed-by: David Dorwin <ddorwin@chromium.org> Reviewed-by: Arthur Eubanks <aeubanks@google.com> Commit-Queue: Zijie He <zijiehe@google.com> Cr-Commit-Position: refs/heads/main@{#1172603}
fuchsia-fyi-x64-asan build is broken due to detect_odr_violation in rust.[1] and search for "odr-violation" in [2]. It's a known issue [3] and also impacts other platforms [4]. A previous attempt to disable the sanitize-address-globals-dead-stripping (in https://crrev.com/c/4690811) fixed media_unittests but broke base_unittests. So I ended up deciding to follow the suggestion from the sanitizer itself to use the environment variable. Since there isn't a asan try, it takes 6+ hours to run, I have to manually test the change locally. With this change, both media_unittests and base_unittests are passing when is_asan = true in gn args. [1]: https://chromium-swarm.appspot.com/task?id=6360bb22591abb10 [2]: https://cas-viewer.appspot.com/projects/chromium-swarm/instances/default_instance/blobs/be139e3b5b49518b8a83be3fcdc5c209dc2acc2c9f725c558dd3fc2aac93a558/404237?filename=emulator_log.serial [3]: rust-lang/rust#113404 [4]: https://crbug.com/1459233 Bug: 1459233, 1465997 Change-Id: I36fcbb947f87bafaa618bb5de5d631ad895a1bac Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4688280 Reviewed-by: David Dorwin <ddorwin@chromium.org> Reviewed-by: Arthur Eubanks <aeubanks@google.com> Commit-Queue: Zijie He <zijiehe@google.com> Cr-Commit-Position: refs/heads/main@{#1172603} NOKEYCHECK=True GitOrigin-RevId: 78d978e2ea02c1703ed1c65e2da7f04eda84a4ed
Let me try to help move this bug forward by attempting to answer some of the questions above. I have limited experience with linkers, asan, and compilers, so please shout if there are any mistakes below. RE: @MaskRay: I don't know how to rerun the rustc with a locally built (./x.py build) I tweaked the original
(Note that this changes the path where the build artifacts are - e.g. RE: @MaskRay: I do not know why the 4 symbols are special and rustc lto doesn't localize them. I am not sure if these are the reasons, but this is what I've found for some of the symbols emitted by
RE: @MaskRay: I suspect that preventing ___asan_globals_registered from being localized will fix this bug. Can you please elaborate on that? From your #113404 (comment), it seems that you are saying that a change in OTOH, it seems that
|
/cc @eugenis who AFAICT added the |
/cc @tru fyi. Although this isn't fully understood and it may well be that the fix will be on the rust side, this is a rustc/clang interop issue that's new with the upcoming clang 17 release, so perhaps there should at least be a "known issues" release note. |
cc @nikic |
Probably the symbol needs to be added to the symbol export list, see existing handling for __llvm_profile and __msan symbols in
|
@nikic - thanks for the pointer! After adding the following to if tcx.sess.opts.unstable_opts.sanitizer.contains(SanitizerSet::ADDRESS) {
// Similar to profiling, preserve weak asan symbols during LTO.
const ASAN_WEAK_SYMBOLS: [&str; 1] = ["___asan_globals_registered"];
symbols.extend(ASAN_WEAK_SYMBOLS.into_iter().map(|sym| {
let exported_symbol = ExportedSymbol::NoDefId(SymbolName::new(tcx, sym));
(
exported_symbol,
SymbolExportInfo {
level: SymbolExportLevel::C,
kind: SymbolExportKind::Data,
used: false,
},
)
}));
} I am now seeing $ $HOME/src/github/rust/build/x86_64-unknown-linux-gnu/llvm/bin/llvm-nm -gU builtins_static-9bce9c00959dd948.builtins_static.29256d2b2790aa9a-cgu.1.rcgu.o
0000000000000000 T BUILTINSC_GetFunctionList
0000000000000000 V DW.ref.rust_eh_personality
0000000000000000 D _ZN3std3sys4unix4args3imp15ARGV_INIT_ARRAY17h45459e0562778ae4E
0000000000000008 C ___asan_globals_registered
0000000000000000 T rust_eh_personality Repro steps for completeness:
I'll try to put together a PR with the changes above. I think one of the first steps is to figure out if/how to add tests for this. I'll try to see if I can cargo cult something from either e2acaee (Add codegen test that makes sure PGO instrumentation is emitted as expected), 4053e25 (librustc_trans: Mark some profiler symbols as exported to avoid LTO removing them), d8c661a (Mark __msan_keep_going as an exported symbol for LTO), or 2c0845c (Mark __msan_track_origins as an exported symbol for LTO by @nikic - thanks again for the code pointer!), FWIW, I think it would be nice to consolidate the code from the 3 similar cases: Finally, I am a bit surprised that |
This is because Rust assumes that all code provided to (non-plugin) LTO comes from Rust, so it knows about all symbols that are involved. This doesn't hold up for symbols that get injected by LLVM, so they need to be special-cased. |
TL;DR: I would appreciate help with creating a regression test for this. I have trouble creating a regression test for this. Based on the contributor docs, FileCheck tests under // Verifies that AddressSanitizer symbols show up as expected in LLVM IR
// with -Zsanitizer (DO NOT SUBMIT: consider adding non-LTO and fat LTO test flavours/revisions?)
//
// needs-sanitizer-address
// compile-flags: -Zsanitizer=address -C lto
#![crate_type="lib"]
// The test below mimics `CACHED_POW10` from `library/core/src/num/flt2dec/strategy/grisu.rs` which
// (because of incorrect handling of `___asan_globals_registered` during LTO) was incorrectly
// reported as an ODR violation in https://crbug.com/1459233#c1.
//
// See https://github.com/rust-lang/rust/issues/113404 for more discussion.
//
// CHECK: @___asan_globals_registered = common hidden global i64 0
// CHECK: @__start_asan_globals = extern_weak hidden global i64
// CHECK: @__stop_asan_globals = extern_weak hidden global i64
pub static CACHED_POW10: [(u64, i16, i16); 4] = [
(0xe61acf033d1a45df, -1087, -308),
(0xab70fe17c79ac6ca, -1060, -300),
(0xff77b1fcbebcdc4f, -1034, -292),
(0xbe5691ef416bd60c, -1007, -284),
]; AFAICT the test above replicates the following I notice that #113404 (comment) (and my repro steps in #113404 (comment)) reported seeing |
When a module doesn't exports any symbols, You can export an arbitrary symbol. For example mark |
@tmiasko - thanks! I have a working test now :-). |
I have a patch lying around to try to simplify this logic, but I haven't prioritized it because this issue is still outstanding and I don't want to complicate matters while we look for a solution. |
Mark `___asan_globals_registered` as an exported symbol for LTO Export a weak symbol defined by AddressSanitizer instrumentation. Previously, when using LTO, the symbol would get internalized and eliminated. Fixes rust-lang#113404. --------------- FWIW, let me list similar PRs from the past + who reviewed them: * rust-lang#68410 (fixing `__msan_keep_going` and `__msan_track_origins`; reviewed by `@varkor)` * rust-lang#60038 and rust-lang#48346 (fixing `__llvm_profile_raw_version` and `__llvm_profile_filename`; reviewed by `@alexcrichton)`
Mark `___asan_globals_registered` as an exported symbol for LTO Export a weak symbol defined by AddressSanitizer instrumentation. Previously, when using LTO, the symbol would get internalized and eliminated. Fixes rust-lang#113404. --------------- FWIW, let me list similar PRs from the past + who reviewed them: * rust-lang#68410 (fixing `__msan_keep_going` and `__msan_track_origins`; reviewed by ``@varkor)`` * rust-lang#60038 and rust-lang#48346 (fixing `__llvm_profile_raw_version` and `__llvm_profile_filename`; reviewed by ``@alexcrichton)``
Thanks @tmiasko! I've added I note that, since we are saying that |
The latest test failure (see #114642 (comment)) on
This seems to indicate that the product code changes should not unconditionally expect the
So, based on the above bf9820e#diff-6d46444b86c506127f8dd251a0c949845de78baec01de25b7524b3b10d90efd7 should maybe be modified to only recognize |
I don't know how to detect ELF or MachO targets based on This hurdle also reinforces the feeling that I've expressed in #113404 (comment): it feels fragile that we have to keep @tmiasko, @nikic, @rnk, and/or @MaskRay - could you please comment on the above? Do we really want/need |
RE: @nikic: #113404 (comment): Rust assumes that all code provided to (non-plugin) LTO comes from Rust Would it be possible for I tried staring for a while at |
With an understanding that we want to work together and find a practical solution, let me first represent the perspective of the instrumentation tool writer. From the PoV of one writing instrumentation passes, it is surprising that Rust assumes that all symbols originate from Rust, and that instrumentation passes cannot add symbols and rely on the standard linkage rules. From the ASan PoV, all we are doing is inventing a common symbol and assuming that linking will proceed as normal, meaning the common symbol will be deduplicated, and there will be one per DSO. I would argue that this Rust pass is breaking fairly reasonable assumptions. But I'm more interested in practical solutions, so yes, I think using those prefixes would be a good place to start. I'm not sure that list is complete, however.
We could go the direction of having LLVM instrumentation passes annotate symbols somehow, but it would be difficult to enforce that as a requirement for instrumentation passes. Anyone writing a new instrumentation pass is likely to discover the annotation requirement when they first try to instrument Rust code, which doesn't reduce the cost of this issue, it just distributes the fixes from You may be able to put together some heuristic by looking at global variables with weak linkages (common, linkonce_odr, weak, weak_odr, etc). If Rust never or rarely produces weak symbols, you could teach the exporter to leave those alone. That would be the simplest. |
For the staticlib and cdylib crate types we need to hide all symbols except those marked with Footnotes
|
AFAICT the initial fix attempt at bf9820e influences whether So, let me try a different fix approach and recognize ASAN/MSAN/etc-related symbols in the |
I guess one could argue about pushing the knowledge about |
Putting this list of symbols in LLVM sounds practical to me. Is it possible to emulate what Rust is doing using |
Hmmm... I now think that it still makes sense to review and attempt to land the
RE: @rnk: #113404 (comment): Is it possible to emulate what Rust is doing using Maybe. I don't know how to check. Sorry. FWIW, I see that LLVM-level tests that use RE: @rnk: #113404 (comment): Putting this list of symbols in LLVM sounds practical to me. The
|
Status update / summary (I edited this comment on 2023-08-25 at 10:47 PST to add one other potential next step at the very end of the comment):
Can we discuss the next steps?:
|
@bjorn3, in #114946 (comment) you've suggested that there might be issue with Option 2 of the fix. Let me partially reply here, because I have some questions that are not related to Option 2. Do you think we should proceed with Option 1 (or is there another approach that you'd suggest)? If so, then I assume that you agree that to avoid the x86_64-apple-1 test failures we should only inject |
I think so, but I may be misunderstanding what exactly address sanitizer needs in terms of linkage.
Like this: rust/compiler/rustc_codegen_ssa/src/back/metadata.rs Lines 217 to 225 in 22d41ae
Maybe extracting this into a function to ensure it is kept in sync makes sense. |
Thanks! That helps :-). Not sure how I managed to miss this piece of code when greepping the source code for FWIW I've applied the changes suggested above to the newly pushed version of #114642 (e.g. see d9abc46#diff-aa810a3be0834da891b171a8b04b09d0d3bbb76ac27c757cd232235099e062d2) |
… r=tmiasko Preserve ASAN-related symbols during LTO. Fixes rust-lang#113404
Preserve ASAN-related symbols during LTO. Fixes rust-lang/rust#113404
Disclaimer: I tried to create a testcase from scratch, but for some reason I wasn't able to find a way to trigger the use of
__asan_register_elf_globals
instead of__asan_globals_register
.STR:
cd nss-builtins
RUSTFLAGS="-Zsanitizer=address" CARGO_PROFILE_RELEASE_LTO=true cargo +nightly build --release
objdump -t target/release/libbuiltins_static.a | grep asan_globals_registered
Actual output:
Expected output:
Something like:
This doesn't happen without LTO.
The unfortunate consequence is that when the resulting static library is linked with C or C++ code compiled with clang with
-fsanitize=address -fsanitize-address-globals-dead-stripping
(that latter flag is now default in clang trunk), which also uses__asan_register_elf_globals
/__asan_globals_registered
, ODR violation detection kick in complaining about globals defined multiple times, because both the clang-side asan constructor and the rust asan constructor register all the globals. Normally, what happens is that they both use the same__asan_globals_registered
(thus it normally being*COM*
), and set its value, so that only one constructor registers the globals. With the LTOed staticlib, what happens is that there are two distinct__asan_globals_registered
, so both constructors go through.rustc +nightly --version --verbose
:(Edit: fixed typos, changed the peculiar setup with
-Clto
and-Cembed-bitcode=yes
to the more normal LTO, which shows the problem too)The text was updated successfully, but these errors were encountered: