Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CARGO env var set incorrectly when Cargo invoked through ld #10113

Closed
jonhoo opened this issue Nov 23, 2021 · 2 comments · Fixed by #11285
Closed

CARGO env var set incorrectly when Cargo invoked through ld #10113

jonhoo opened this issue Nov 23, 2021 · 2 comments · Fixed by #11285
Labels
C-bug Category: bug

Comments

@jonhoo
Copy link
Contributor

jonhoo commented Nov 23, 2021

Problem

Earlier today I got a report from an internal developer about a build failing with an error like this:

  thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: CargoMetadata(".../Cargo.toml", Metadata(Output { status: ExitStatus(ExitStatus(32512)), stdout: "", stderr: "metadata: error while loading shared libraries: metadata: cannot open shared object file\n" }))', build.rs:30:14

The line in question is:

cbindgen::generate(&crate_dir).unwrap()

where

let crate_dir = env::var("CARGO_MANIFEST_DIR").unwrap();

Digging some more using strace, the developer found that the culprit was this execve:

execve("/usr/lib64/ld-2.26.so", ["/usr/lib64/ld-2.26.so", "metadata", "--all-features", "--format-version", "1", "--manifest-path", ".../Cargo.toml"], ...

This led me down a bit of a rabbit hole. The error propagates an error from when cbindgen tries to invoke cargo metadata, which constructs the actual Command here. The name of the command is determined by the CARGO environment variable, which Cargo sets based on Config::cargo_exe. Now that, in turn, is set based on std::env::current_exe, which gets the "full path to the current executable" (Linux impl is reading /proc/self/exe).

Which brings us to /usr/lib64/ld-2.26.so — the Linux dynamic linker. Internally, binaries are executed through the dynamic linker so that we can explicitly set the shared library search path to ensure that the same shared libraries used to build a given binary are used when it is executed. It has a similar effect as setting LD_LIBRARY_PATH, but with the added benefit that it does not also set the shared library search path of any programs executed down the line. There is a lot of underlying complexity here that is probably not worth digging too much into, but the idea is that a binary should ship "with its environment", and so if cargo and some subcommand cargo-foo are built separately, they should use different shared library search paths, namely the ones that map to exactly what they were built with. Executing through the dynamic linker achieves that, LD_LIBRARY_PATH does not. (This is also done by things like go-appimage).

Now, the challenge is that when you execute a binary through the dynamic linker, that's what the "current executable" is set to. Even if you execute it through exec -a (to set argv[0]), /proc/self/exe still points at the linker itself. And since Cargo prefers current_exe (which is /proc/self/exe), it ends up setting CARGO to the linker, which means that build scripts that try to run Cargo using that path fail:

pub fn cargo_exe(&self) -> CargoResult<&Path> {
self.cargo_exe
.try_borrow_with(|| {
fn from_current_exe() -> CargoResult<PathBuf> {
// Try fetching the path to `cargo` using `env::current_exe()`.
// The method varies per operating system and might fail; in particular,
// it depends on `/proc` being mounted on Linux, and some environments
// (like containers or chroots) may not have that available.
let exe = env::current_exe()?.canonicalize()?;
Ok(exe)
}
fn from_argv() -> CargoResult<PathBuf> {
// Grab `argv[0]` and attempt to resolve it to an absolute path.
// If `argv[0]` has one component, it must have come from a `PATH` lookup,
// so probe `PATH` in that case.
// Otherwise, it has multiple components and is either:
// - a relative path (e.g., `./cargo`, `target/debug/cargo`), or
// - an absolute path (e.g., `/usr/local/bin/cargo`).
// In either case, `Path::canonicalize` will return the full absolute path
// to the target if it exists.
let argv0 = env::args_os()
.map(PathBuf::from)
.next()
.ok_or_else(|| anyhow!("no argv[0]"))?;
paths::resolve_executable(&argv0)
}
let exe = from_current_exe()
.or_else(|_| from_argv())
.with_context(|| "couldn't get the path to cargo executable")?;
Ok(exe)

You can also reproduce that with this snippet:

$ cat src/main.rs
fn main() {
    println!("{}", std::env::current_exe().unwrap().display());
    println!("{}", std::env::args().nth(0).unwrap());
}
$ cargo run
   Compiling print_current_exe v0.1.0 (/local/home/jongje/print_current_exe)
    Finished dev [unoptimized + debuginfo] target(s) in 0.26s
     Running `target/debug/print_current_exe`
/local/home/jongje/print_current_exe/target/debug/print_current_exe
target/debug/print_current_exe
$ /usr/lib64/ld-linux-x86-64.so.2 target/debug/print_current_exe
/usr/lib64/ld-2.26.so
target/debug/print_current_exe

Steps

No response

Possible Solution(s)

It's not entirely clear how to fix this. I understand why Cargo prefers current_exe over argv[0], and that's probably the right choice generally. I wonder if perhaps the way to go about this is to have Cargo favor argv[0] if the filename of current_exe matches ld-*.so*?

Notes

No response

Version

cargo 1.56.0 (4ed5d137b 2021-10-04)
release: 1.56.0
commit-hash: 4ed5d137baff5eccf1bae5a7b2ae4b57efad4a7d
commit-date: 2021-10-04
@jonhoo
Copy link
Contributor Author

jonhoo commented Nov 23, 2021

I implemented the proposed fix in #10115 for discussion.

jonhoo pushed a commit to jonhoo/cargo that referenced this issue Nov 23, 2021
@jonhoo
Copy link
Contributor Author

jonhoo commented Jun 17, 2022

Just to leave some breadcrumbs, I think it's likely that this will be solved by whatever solves #10119, so I'm going to close this issue in favor of that one.

@jonhoo jonhoo closed this as completed Jun 17, 2022
jonhoo pushed a commit to jonhoo/cargo that referenced this issue Oct 25, 2022
Currently, Cargo will always set `$CARGO` to point to what it detects
its own path to be (using `std::env::current_exe`). Unfortunately, this
runs into trouble when Cargo is used as a library, or when `current_exe`
is not actually the binary itself (e.g., when invoked through Valgrind
or `ld.so`), since `$CARGO` will not point at something that can be used
as `cargo`. This, in turn, means that users can't currently rely on
`$CARGO` to do the right thing, and will sometimes have to invoke
`cargo` directly from `$PATH` instead, which may not reflect the `cargo`
that's currently in use.

This patch makes Cargo re-use the existing value of `$CARGO` if it's
already set in the environment. For Cargo subcommands, this will mean
that the initial invocation of `cargo` in `cargo foo` will set `$CARGO`,
and then Cargo-as-a-library inside of `cargo-foo` will inherit that
(correct) value instead of overwriting it with the incorrect value
`cargo-foo`. For other execution environments that do not have `cargo`
in their call stack, it gives them the opportunity to set a working
value for `$CARGO`.

One note about the implementation of this is that the test suite now
needs to override `$CARGO` explicitly so that the _user's_ `$CARGO` does
not interfere with the contents of the tests. It _could_ remove `$CARGO`
instead, but overriding it seemed less error-prone.

Fixes rust-lang#10119.
Fixes rust-lang#10113.
jonhoo pushed a commit to jonhoo/cargo that referenced this issue Oct 25, 2022
Currently, Cargo will always set `$CARGO` to point to what it detects
its own path to be (using `std::env::current_exe`). Unfortunately, this
runs into trouble when Cargo is used as a library, or when `current_exe`
is not actually the binary itself (e.g., when invoked through Valgrind
or `ld.so`), since `$CARGO` will not point at something that can be used
as `cargo`. This, in turn, means that users can't currently rely on
`$CARGO` to do the right thing, and will sometimes have to invoke
`cargo` directly from `$PATH` instead, which may not reflect the `cargo`
that's currently in use.

This patch makes Cargo re-use the existing value of `$CARGO` if it's
already set in the environment. For Cargo subcommands, this will mean
that the initial invocation of `cargo` in `cargo foo` will set `$CARGO`,
and then Cargo-as-a-library inside of `cargo-foo` will inherit that
(correct) value instead of overwriting it with the incorrect value
`cargo-foo`. For other execution environments that do not have `cargo`
in their call stack, it gives them the opportunity to set a working
value for `$CARGO`.

One note about the implementation of this is that the test suite now
needs to override `$CARGO` explicitly so that the _user's_ `$CARGO` does
not interfere with the contents of the tests. It _could_ remove `$CARGO`
instead, but overriding it seemed less error-prone.

Fixes rust-lang#10119.
Fixes rust-lang#10113.
bors added a commit that referenced this issue Nov 2, 2022
Make cargo forward pre-existing CARGO if set

Currently, Cargo will always set `$CARGO` to point to what it detects its own path to be (using `std::env::current_exe`). Unfortunately, this runs into trouble when Cargo is used as a library, or when `current_exe` is not actually the binary itself (e.g., when invoked through Valgrind or `ld.so`), since `$CARGO` will not point at something that can be used as `cargo`. This, in turn, means that users can't currently rely on `$CARGO` to do the right thing, and will sometimes have to invoke `cargo` directly from `$PATH` instead, which may not reflect the `cargo` that's currently in use.

This patch makes Cargo re-use the existing value of `$CARGO` if it's already set in the environment. For Cargo subcommands, this will mean that the initial invocation of `cargo` in `cargo foo` will set `$CARGO`, and then Cargo-as-a-library inside of `cargo-foo` will inherit that (correct) value instead of overwriting it with the incorrect value `cargo-foo`. For other execution environments that do not have `cargo` in their call stack, it gives them the opportunity to set a working value for `$CARGO`.

One note about the implementation of this is that the test suite now needs to override `$CARGO` explicitly so that the _user's_ `$CARGO` does not interfere with the contents of the tests. It _could_ remove `$CARGO` instead, but overriding it seemed less error-prone.

Fixes #10119.
Fixes #10113.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: bug
Projects
None yet
1 participant