Moving CARGO_HOME invalidates target caches #10915

Open
overdrivenpotato opened this issue Aug 1, 2022 · 6 comments
Labels
A-rebuild-detection Area: rebuild detection and fingerprinting C-bug Category: bug S-needs-design Status: Needs someone to work further on the design for the feature or fix. NOT YET accepted.

Comments

@overdrivenpotato

Problem

When the CARGO_HOME folder is moved to a new location, subsequent builds invalidate the target folder caches because the source file paths have changed. This matters in CI, where a cache folder can be placed in a new location for every build. If the path changes, cached crates cannot be reused, so they must be rebuilt every time.

Steps

$ export CARGO_HOME=$(pwd)/home1
$ cargo new foo
     Created binary (application) `foo` package
$ cd foo
$ echo 'serde = "*"' >> Cargo.toml
$ cargo build
    Updating crates.io index
  Downloaded serde v1.0.140
  Downloaded 1 crate (76.4 KB) in 0.28s
   Compiling serde v1.0.140
   Compiling foo v0.1.0 (/private/tmp/repro/foo)
    Finished dev [unoptimized + debuginfo] target(s) in 2m 33s
$ cd ..
$ mv home1 home2
$ export CARGO_HOME=$(pwd)/home2
$ cd foo
$ cargo build
   Compiling serde v1.0.140
   Compiling foo v0.1.0 (/private/tmp/repro/foo)
    Finished dev [unoptimized + debuginfo] target(s) in 4.74s

Note that serde is compiled a second time after the CARGO_HOME folder is moved.

Possible Solution(s)

Perhaps it is possible to have the CARGO_HOME portion of crate build paths replaced with something that does not change?

Notes

No response

Version

No response

@overdrivenpotato overdrivenpotato added the C-bug Category: bug label Aug 1, 2022
@ehuss ehuss added the A-rebuild-detection Area: rebuild detection and fingerprinting label Aug 13, 2022
@weihanglo
Member

weihanglo commented Aug 14, 2022

That's because the fingerprint takes source file paths into account [1]. Currently, fingerprint calculation mostly relies on filesystem mtimes and paths, not content hashes. Cargo can never know the user's intent in changing CARGO_HOME, so it chooses to rebuild everything. There were some discussions about switching to content-hash detection [2], but they never reached a conclusion. I feel that a content-hashing approach could fix this issue, though I don't know where it will go 😞

Out of curiosity, could you share more about why you are swapping CARGO_HOME? Which CI does that?

Footnotes

  1. https://github.com/rust-lang/cargo/blob/84941490fd7304317282a89309c1fe2123200a8b/src/cargo/core/compiler/fingerprint.rs#L1340

  2. (Option to) Fingerprint by file contents instead of mtime #6529
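
To make the difference between the two strategies concrete, here is a minimal sketch. It is not Cargo's actual fingerprint code: `SourceFile`, both functions, the simplified paths, and the use of the standard library's `DefaultHasher` are assumptions chosen for illustration only.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical model of one source file as a fingerprint might see it.
struct SourceFile<'a> {
    abs_path: &'a str,
    mtime_secs: u64,
    contents: &'a [u8],
}

/// Path-and-mtime style: the absolute path is part of the hash input.
fn path_mtime_fingerprint(f: &SourceFile) -> u64 {
    let mut h = DefaultHasher::new();
    f.abs_path.hash(&mut h);
    f.mtime_secs.hash(&mut h);
    h.finish()
}

/// Content style: only the bytes of the file are hashed.
fn content_fingerprint(f: &SourceFile) -> u64 {
    let mut h = DefaultHasher::new();
    f.contents.hash(&mut h);
    h.finish()
}

fn main() {
    let body = b"pub fn answer() -> u32 { 42 }";
    // Same file before and after `mv home1 home2` (paths are simplified).
    let before = SourceFile { abs_path: "/tmp/repro/home1/registry/src/serde/lib.rs", mtime_secs: 1_659_312_000, contents: body };
    let after = SourceFile { abs_path: "/tmp/repro/home2/registry/src/serde/lib.rs", mtime_secs: 1_659_312_000, contents: body };

    println!(
        "path+mtime fingerprint changed: {}",
        path_mtime_fingerprint(&before) != path_mtime_fingerprint(&after)
    );
    println!(
        "content fingerprint changed: {}",
        content_fingerprint(&before) != content_fingerprint(&after)
    );
}
```

Under these assumptions, moving the directory changes the first fingerprint even though the mtime and contents are identical, while the content-based one is unaffected.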

@epage
Contributor

epage commented Aug 15, 2022

That's because the fingerprint takes source file paths into account [1]. Currently, fingerprint calculation mostly relies on filesystem mtimes and paths, not content hashes. Cargo can never know the user's intent in changing CARGO_HOME, so it chooses to rebuild everything.

What if we instead hashed relative to CARGO_HOME? So long as everything else has stayed the same, I would think we shouldn't need to worry about whether CARGO_HOME has changed.
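
A rough sketch of what hashing relative to CARGO_HOME could look like. This is an assumption-laden illustration, not a patch against Cargo: `home_relative_hash`, the `"CARGO_HOME"` marker string, and the example paths are all hypothetical, and `DefaultHasher` stands in for whatever hasher Cargo really uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::path::Path;

/// Hash `src` relative to `cargo_home` when it lives under that directory;
/// fall back to hashing the full path otherwise.
fn home_relative_hash(cargo_home: &Path, src: &Path) -> u64 {
    let mut h = DefaultHasher::new();
    match src.strip_prefix(cargo_home) {
        Ok(rel) => {
            // A fixed marker stands in for the (movable) CARGO_HOME prefix.
            "CARGO_HOME".hash(&mut h);
            rel.hash(&mut h);
        }
        Err(_) => src.hash(&mut h),
    }
    h.finish()
}

fn main() {
    let a = home_relative_hash(
        Path::new("/tmp/repro/home1"),
        Path::new("/tmp/repro/home1/registry/src/serde/lib.rs"),
    );
    let b = home_relative_hash(
        Path::new("/tmp/repro/home2"),
        Path::new("/tmp/repro/home2/registry/src/serde/lib.rs"),
    );
    println!("hash stable across the move: {}", a == b);
}
```

Files outside CARGO_HOME still hash by their full path in this sketch, so only the movable prefix is normalized.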

@overdrivenpotato
Author

Which CI does that?

Concourse creates a new work directory with a random ID during every build on some setups, e.g. macOS workers. For example, one build may create /opt/concourse/work_dir/volumes/live/56383523-066e-4a62-77e2-c4c70c3fa52a/volume, only for the next build to be /opt/concourse/work_dir/volumes/live/258f073b-0051-4ae7-68f3-22b902a6e478/volume. Because the volume ID changed, attempting to cache CARGO_HOME inside the volume directory will not work for subsequent builds. It's also not a great solution to move the directory to a location outside of the volume as macOS-based workers don't use containers. Doing so would tamper with the rest of the system. However, currently it seems to be the only solution.

@weihanglo
Member

What if we instead hashed relative to CARGO_HOME?

Personally, I am happy with this. My small concern is that someone may already rely on switching CARGO_HOME to use a different registry index or other configuration in order to achieve a level of reproducibility. However, if that is really the case, introducing more granular cache keys might be better than hashing CARGO_HOME.

@weihanglo
Member

weihanglo commented Dec 15, 2023

This is definitely a duplicate of #10179. Since that one was closed, I'll keep this open.


From #10179 (comment):

I believe this is correct behavior on behalf of Cargo right now because the full source path is used for debug information so it affects the final artifact. "Fixing" this issue would mean somehow doing something along the lines of remapping the paths to the same value.

#12137 -Ztrim-paths introduces a built-in remap mechanism in Cargo. The exact remap rules are under discussion in #13171. Debuginfo seems likely to be addressed soon, so IMO this should be reconsidered.
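
For readers unfamiliar with path remapping, the idea resembles rustc's `--remap-path-prefix FROM=TO` option: a leading directory is swapped for a fixed stand-in before the path is recorded. The sketch below only illustrates that idea. `remap_prefix`, the `/cargo_home` placeholder, and the CI paths are hypothetical and do not reflect the rules being discussed in #13171.

```rust
use std::path::{Path, PathBuf};

/// Replace a leading `from` prefix with `to`; other paths are left untouched.
fn remap_prefix(path: &Path, from: &Path, to: &Path) -> PathBuf {
    match path.strip_prefix(from) {
        Ok(rest) => to.join(rest),
        Err(_) => path.to_path_buf(),
    }
}

fn main() {
    let placeholder = Path::new("/cargo_home"); // hypothetical stand-in
    // Two builds whose CARGO_HOME lives in different CI volumes.
    let first = remap_prefix(
        Path::new("/ci/vol-a/home/registry/src/serde/lib.rs"),
        Path::new("/ci/vol-a/home"),
        placeholder,
    );
    let second = remap_prefix(
        Path::new("/ci/vol-b/home/registry/src/serde/lib.rs"),
        Path::new("/ci/vol-b/home"),
        placeholder,
    );
    println!("{} | {}", first.display(), second.display());
}
```

If the recorded path no longer depends on where CARGO_HOME sits, the objection quoted from #10179 (that the full source path ends up in debug information) would carry less weight.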

My small concern is that someone may already rely on switching CARGO_HOME to use a different registry index or other configuration in order to achieve a level of reproducibility

I would tell my past self that whoever depends on an absolute CARGO_HOME is https://xkcd.com/1172/

@weihanglo weihanglo added the S-needs-design Status: Needs someone to work further on the design for the feature or fix. NOT YET accepted. label Dec 15, 2023
@weihanglo
Member

In #13171 we have an idea for a new subcommand, something like cargo debug, that generates remap rules for debuggers, so that debug info never contains a fixed absolute path and can always use a placeholder instead.
