Cache CARGO_HOME #25

Closed
dmikusa opened this issue Mar 25, 2021 · 3 comments

dmikusa commented Mar 25, 2021

Cargo stores downloaded resources in $CARGO_HOME, which defaults to $HOME/.cargo. That default is not going to be useful in CNBs, so we should set $CARGO_HOME to a layer that has the build and cache flags set. That way downloaded resources are restored for subsequent runs of the buildpack, and if something needs to be recompiled, Cargo will not need to hit the Internet to download them again.

The entire folder does not need to be cached, though, since that retains some duplicate information. According to the Cargo guide on caching the Cargo home in CI, we can trim down the size a bit by storing only these folders:

  • bin/
  • registry/index/
  • registry/cache/
  • git/db/
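
As a rough illustration of the trimming described above, here is a minimal Python sketch that deletes everything under a Cargo home except the four listed folders. The function name `prune_cargo_home` and the `KEEP` tuple are hypothetical, and this is not the buildpack's actual implementation. Note that it also removes top-level files, which is an assumption of this sketch; add entries to `KEEP` if something else needs to survive.

```python
import shutil
from pathlib import Path

# Folders worth caching, as listed above (relative to CARGO_HOME).
KEEP = ("bin", "registry/index", "registry/cache", "git/db")


def prune_cargo_home(cargo_home: Path) -> list:
    """Delete everything under cargo_home that is not inside a KEEP folder.

    Returns the removed paths (relative to cargo_home), sorted.
    """
    keep = [Path(k) for k in KEEP]
    removed = []

    def visit(directory: Path) -> None:
        for entry in sorted(directory.iterdir()):
            rel = entry.relative_to(cargo_home)
            real_dir = entry.is_dir() and not entry.is_symlink()
            if rel in keep:
                continue  # a folder we want to cache: leave it untouched
            if real_dir and any(rel in k.parents for k in keep):
                visit(entry)  # parent of a kept folder: prune inside it
                continue
            if real_dir:
                shutil.rmtree(entry)
            else:
                entry.unlink()
            removed.append(rel.as_posix())

    visit(cargo_home)
    return removed
```

Running this after cargo finishes, and before the layer is cached, would leave only `bin/`, `registry/index/`, `registry/cache/` and `git/db/` in place.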

dmikusa commented Apr 4, 2021

I did some testing on this, and setting CARGO_HOME does indeed help: it prevents cargo from downloading things over and over. It doesn't do as much as I'd hoped, though. Cargo is still rebuilding a lot of files. Looking deeper, this seems to be due to timestamp differences. That seems odd to me, but I haven't had time to dig further. I need to, since the unnecessary rebuilds add way too much overhead.


dmikusa commented Apr 5, 2021

Looked into this a bit more. When cargo install runs, it looks at the mtimes of the files. These appear to be slightly off in a couple of cases. Those differences then ripple out and cause other things to be rebuilt, which on a large project can cause significant delays.

You can set CARGO_LOG=cargo::core::compiler::fingerprint=trace to see more details about the decisions it's making.
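
For anyone wanting to reproduce this, here is a small Python sketch of how the environment for a cargo invocation could be assembled with both variables mentioned in this thread. The helper name `cargo_env` and its parameters are made up for illustration; only the `CARGO_HOME` and `CARGO_LOG` variables and the log filter value come from the discussion above.

```python
import os


def cargo_env(cargo_home, trace_fingerprints=False, base=None):
    """Build an environment dict for running cargo.

    cargo_home:         directory cargo should use for downloads
    trace_fingerprints: when True, enable fingerprint trace logging
    base:               environment to start from (defaults to os.environ)
    """
    env = dict(os.environ if base is None else base)
    env["CARGO_HOME"] = str(cargo_home)
    if trace_fingerprints:
        # Makes cargo log why it decides a unit is fresh or dirty.
        env["CARGO_LOG"] = "cargo::core::compiler::fingerprint=trace"
    return env
```

The result can be passed to something like `subprocess.run(["cargo", "build"], env=cargo_env("/tmp/cargo-home", True))`, where `/tmp/cargo-home` is just a placeholder path.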

Testing locally, this doesn't happen. Nor does it happen when testing with a Dockerfile & docker build.

I suspect this has to do with the way the cache layer is restored: the old mtimes are not persisted. What you end up with is the mtime set to the time the file was extracted, not the time it was originally written by the Rust compiler. When you run cargo next, it checks the mtimes, which are wrong, and rebuilds accordingly.

dmikusa pushed a commit that referenced this issue Apr 18, 2021
- Remove buildpack's layer caching. Caching for the layer could be difficult, so removing it just leaves things up to Cargo, which is the most accurate.
- Creates two folders under the cargo cache layer, `target/` and `home/`. The former is where build files go, while the latter is where `cargo` caches downloads, such as those from crates.io.
- At the moment, a layer cached by the lifecycle does not preserve file mtimes. They are squashed in the name of reproducible builds. To get around this, the buildpack will preserve the mtimes of everything installed after cargo runs, and then restore them the next time before cargo runs. This keeps file mtimes consistent, which is necessary for cargo to work properly.
- Removes unnecessary directories from cargo's home directory, based on https://doc.rust-lang.org/cargo/guide/cargo-home.html#caching-the-cargo-home-in-ci, which saves space on the cache layer.

Resolves: #24 and #25
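
The mtime workaround in the commit message above can be sketched roughly as follows. This is an illustrative Python version, not the code from the commit: the function names `save_mtimes` and `restore_mtimes` and the JSON manifest format are assumptions. The idea is to record every mtime after cargo runs, then reapply them after the cache layer is restored and before cargo runs again.

```python
import json
import os
from pathlib import Path


def save_mtimes(root: Path, manifest: Path) -> int:
    """Record the mtime (nanoseconds) of each file and directory under root.

    Symlinks are skipped. Returns the number of entries recorded.
    """
    mtimes = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            if path.is_symlink():
                continue
            rel = path.relative_to(root).as_posix()
            mtimes[rel] = path.stat().st_mtime_ns
    manifest.write_text(json.dumps(mtimes))
    return len(mtimes)


def restore_mtimes(root: Path, manifest: Path) -> int:
    """Reapply recorded mtimes to paths that still exist under root.

    Returns the number of paths whose mtime was restored.
    """
    mtimes = json.loads(manifest.read_text())
    restored = 0
    for rel, mtime_ns in mtimes.items():
        path = root / rel
        if path.is_symlink() or not path.exists():
            continue
        atime_ns = path.stat().st_atime_ns  # keep the access time as-is
        os.utime(path, ns=(atime_ns, mtime_ns))
        restored += 1
    return restored
```

In this sketch the manifest is assumed to live outside `root` (or at least somewhere that survives the cache round trip), so that it is available before the next cargo invocation.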
ForestEckhardt pushed a commit that referenced this issue Apr 21, 2021

dmikusa commented Apr 21, 2021

Closed by #41

@dmikusa dmikusa closed this as completed Apr 21, 2021