Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attempt to fix CI #582

Merged
merged 1 commit into from
Mar 28, 2024
Merged

Attempt to fix CI #582

merged 1 commit into from
Mar 28, 2024

Conversation

Amanieu
Copy link
Member

@Amanieu Amanieu commented Mar 15, 2024

Upstream issue: rust-lang/rust#121552

Now that rust-lang/rust#121421 is merged, we are no longer emitting calls to functions in core from compiler builtins. However rustc may still codegen such functions, which will make them appear in the .rlib. This shouldn't cause issues since the linker will discard these functions as unreachable.

@RalfJung
Copy link
Member

Does this still error with the latest nightly? It seems like if the new test from rust-lang/rust#122580 passes on windows CI runners, then the build should also succeed here, shouldn't it?

@Amanieu
Copy link
Member Author

Amanieu commented Mar 23, 2024

Seems to be failing only on x86_64-apple-darwin.

@RalfJung
Copy link
Member

Is Apple's linker just particularly bad at removing dead code?

@saethlin
Copy link
Member

Is Apple's linker just particularly bad at removing dead code?

The point of (and design of) rust-lang/rust#122580 is to not rely on linkers to remove dead code which would fail to link. For whatever reason, Linux linkers seem to be quite good at that, so even if you are incautious things tend to work great on Linux and break on other platforms 🙃

@saethlin
Copy link
Member

The undefined symbol references must be only in the LLVM bitcode. How can I disassemble(?) the bitcode to figure out where they are referenced?

@RalfJung
Copy link
Member

Does --emit=llvm-ir not help?

@saethlin
Copy link
Member

🤷 The panic-related functions from core only appear in the emitted IR as function declarations and they are mentioned in a pile of other symbols in @rustc.embedded.module, which almost looks like the symbol table for the bitcode module.

My only guess based on the available data is that the undefined symbol reported by llvm-nm is just indicating that the function is declared but not defined in the bitcode, and it's declared because we created a MonoItem for the panic methods because they are encountered in the MIR.

I think it would be fair to call the CI failure here a false positive.

@Amanieu
Copy link
Member Author

Amanieu commented Mar 23, 2024

The issue on x86_64-apple-darwin here happens with -C embed-bitcode=no, so something else is the issue. I'm having trouble reproducing this myself since I don't have a mac system.

@saethlin
Copy link
Member

saethlin commented Mar 23, 2024

You don't need an Apple system; we have a cross-compiling toolchain.
In a minimal Cargo project, with just a lib.rs that is just #![no_std] and nothing else (this is the workflow the test I added to the compiler is based on) you can run

RUSTFLAGS="-Copt-level=0 -Ccodegen-units=1 -Cembed-bitcode=yes" cargo +nightly build -Zbuild-std=core --target=aarch64-apple-darwin

Then you can use your llvm-nm on that rlib:

~/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/bin/llvm-nm -u /tmp/scratch/target/aarch64-apple-darwin/debug/deps/libcompiler_builtins-3c1087e4468ef7f2.rlib | grep panic

@saethlin
Copy link
Member

Oh 🤦 you said it happens even without embedding bitcode.

@saethlin
Copy link
Member

Actually... can you explain why you think this happens with -Cembed-bitcode=no? I don't see any evidence of that flag being set in CI. The check at the bottom of the test that's failing has found 3 rlibs but only reports one of each of the panic symbols, so I suspect only one of the compiler_builtins rlibs contains them. And one of the builds is a debug build with LTO, and locally I can reproduce the output of llvm-nm on that scenario.

@Amanieu
Copy link
Member Author

Amanieu commented Mar 23, 2024

Actually I just realized you are right. The build with CARGO_PROFILE_DEV_LTO=true causes Cargo not use the -C embed-bitcode=no flag, which means that bitcode is emitted. So that's likely what is causing the last CI failure. I've managed to reproduce it locally.

It seems that the panic symbol is appearing in the symbol table, but there is no relocation which actually uses it.

@saethlin
Copy link
Member

llvm-objdump --syms doesn't print these panic symbols, so I think this is really just llvm-nm inspecting the bitcode to emulate the behavior of native objects when reading bitcode.

@Amanieu
Copy link
Member Author

Amanieu commented Mar 23, 2024

That's fair. I'll just change this test to use llvm-objdump --syms then.

@Amanieu Amanieu force-pushed the ci-symbols branch 4 times, most recently from 8aae9f8 to ce105a4 Compare March 24, 2024 21:22
@RalfJung RalfJung mentioned this pull request Mar 27, 2024
@@ -36,17 +42,11 @@ case $1 in
;;
esac

NM=$(find $(rustc --print sysroot) -name llvm-nm)
NM=$(find $(rustc --print sysroot) \( -name llvm-nm -o -name llvm-nm.exe \) )
Copy link
Contributor

@dpaoliello dpaoliello Mar 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this is causing the issue: comparing a working job to the failing one using llvm-nm instead of nm on i686-pc-windows-gnu is causing the tests to fail.

When I try to run llvm-nm locally in MinGW32, I get error while loading shared libraries: ?: cannot open shared object file: No such file or directory and an exit code of 127. Perhaps we're missing also missing some shared libraries required to run this on the CI machines?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be libLLVM.dll itself. Try prefixing the command with rustup run +nightly. That will set the dynamic linker path as necessary.

Copy link
Member

@RalfJung RalfJung Mar 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test passes on windows, it's only macOS where it fails. See the CI results.

Never mind. I didn't realize the macOS thing was resolved.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be libLLVM.dll itself. Try prefixing the command with rustup run +nightly. That will set the dynamic linker path as necessary.

Good call - that fixed the issue. I don't think it was libLLVM.dll (since that's not in the toolchain), but rather libgcc_s_dw2-1.dll and libwinpthread-1.dll.

I had to do a bit extra since not every build has rustup on the path, and the toolchain isn't always called nightly, but I now have this working: 4242372 @Amanieu feel free to steal that commit, or I can publish the draft PR if you prefer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into this! I implemented a similar workaround based on yours but without the need for a second argument to run.sh.

@Amanieu Amanieu merged commit 8ed91cd into rust-lang:master Mar 28, 2024
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants