Stabilize automatic garbage collection. #14287

Open. Wants to merge 1 commit into base: master
77 changes: 48 additions & 29 deletions src/cargo/core/gc.rs
@@ -49,9 +49,6 @@ const DEFAULT_AUTO_FREQUENCY: &str = "1 day";
/// It should be cheap to call this multiple times (subsequent calls are
/// ignored), but try not to abuse that.
pub fn auto_gc(gctx: &GlobalContext) {
if !gctx.cli_unstable().gc {
return;
}
if !gctx.network_allowed() {
// As a conservative choice, auto-gc is disabled when offline. If the
// user is indefinitely offline, we don't want to delete things they
@@ -176,49 +173,74 @@ impl GcOpts {
let auto_config = gctx
.get::<Option<AutoConfig>>("gc.auto")?
.unwrap_or_default();
self.update_for_auto_gc_config(&auto_config)
self.update_for_auto_gc_config(&auto_config, gctx.cli_unstable().gc)
}

fn update_for_auto_gc_config(&mut self, auto_config: &AutoConfig) -> CargoResult<()> {
fn update_for_auto_gc_config(
&mut self,
auto_config: &AutoConfig,
unstable_allowed: bool,
) -> CargoResult<()> {
macro_rules! config_default {
($auto_config:expr, $field:ident, $default:expr, $unstable_allowed:expr) => {
if !$unstable_allowed {
// These config options require -Zgc
$default
} else {
$auto_config.$field.as_deref().unwrap_or($default)
}
};
}

self.max_src_age = newer_time_span_for_config(
self.max_src_age,
"gc.auto.max-src-age",
auto_config
.max_src_age
.as_deref()
.unwrap_or(DEFAULT_MAX_AGE_EXTRACTED),
config_default!(
auto_config,
max_src_age,
DEFAULT_MAX_AGE_EXTRACTED,
unstable_allowed
),
)?;
self.max_crate_age = newer_time_span_for_config(
self.max_crate_age,
"gc.auto.max-crate-age",
auto_config
.max_crate_age
.as_deref()
.unwrap_or(DEFAULT_MAX_AGE_DOWNLOADED),
config_default!(
auto_config,
max_crate_age,
DEFAULT_MAX_AGE_DOWNLOADED,
unstable_allowed
),
)?;
self.max_index_age = newer_time_span_for_config(
self.max_index_age,
"gc.auto.max-index-age",
auto_config
.max_index_age
.as_deref()
.unwrap_or(DEFAULT_MAX_AGE_DOWNLOADED),
config_default!(
auto_config,
max_index_age,
DEFAULT_MAX_AGE_DOWNLOADED,
unstable_allowed
),
)?;
self.max_git_co_age = newer_time_span_for_config(
self.max_git_co_age,
"gc.auto.max-git-co-age",
auto_config
.max_git_co_age
.as_deref()
.unwrap_or(DEFAULT_MAX_AGE_EXTRACTED),
config_default!(
auto_config,
max_git_co_age,
DEFAULT_MAX_AGE_EXTRACTED,
unstable_allowed
),
)?;
self.max_git_db_age = newer_time_span_for_config(
self.max_git_db_age,
"gc.auto.max-git-db-age",
auto_config
.max_git_db_age
.as_deref()
.unwrap_or(DEFAULT_MAX_AGE_DOWNLOADED),
config_default!(
auto_config,
max_git_db_age,
DEFAULT_MAX_AGE_DOWNLOADED,
unstable_allowed
),
)?;
Ok(())
}
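
The effect of the `config_default!` macro in this hunk can be illustrated outside of Cargo. The following is a minimal sketch, not Cargo's code: `effective_age` is a hypothetical function name, and it only mirrors the rule that a configured `max-*-age` value is honored when `-Zgc` is allowed and ignored otherwise.

```rust
/// Hypothetical stand-in for one `gc.auto.max-*-age` lookup.
/// Returns the configured value only when unstable options are allowed;
/// otherwise the built-in default wins, as in `config_default!`.
fn effective_age<'a>(
    configured: Option<&'a str>,
    default: &'a str,
    unstable_allowed: bool,
) -> &'a str {
    if !unstable_allowed {
        // The max-*-age options still require -Zgc, so the config is ignored.
        default
    } else {
        // With -Zgc, use the configured value, falling back to the default.
        configured.unwrap_or(default)
    }
}
```

For instance, with a default of `"1 month"` and a configured value of `"2 weeks"`, this sketch returns `"1 month"` when `unstable_allowed` is `false` and `"2 weeks"` when it is `true`.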
@@ -257,9 +279,6 @@ impl<'a, 'gctx> Gc<'a, 'gctx> {
/// This returns immediately without doing work if garbage collection has
/// been performed recently (since `gc.auto.frequency`).
fn auto(&mut self, clean_ctx: &mut CleanContext<'gctx>) -> CargoResult<()> {
if !self.gctx.cli_unstable().gc {
return Ok(());
}
let auto_config = self
.gctx
.get::<Option<AutoConfig>>("gc.auto")?
@@ -278,7 +297,7 @@ impl<'a, 'gctx> Gc<'a, 'gctx> {
return Ok(());
}
let mut gc_opts = GcOpts::default();
gc_opts.update_for_auto_gc_config(&auto_config)?;
gc_opts.update_for_auto_gc_config(&auto_config, self.gctx.cli_unstable().gc)?;
self.gc(clean_ctx, &gc_opts)?;
if !clean_ctx.dry_run {
self.global_cache_tracker.set_last_auto_gc()?;
39 changes: 39 additions & 0 deletions src/doc/src/reference/config.md
@@ -95,6 +95,9 @@ ENV_VAR_NAME_3 = { value = "relative/path", relative = true }
[future-incompat-report]
frequency = 'always' # when to display a notification about a future incompat report

[gc.auto]
frequency = "1 day" # How often to perform automatic garbage collection

[cargo-new]
vcs = "none" # VCS to use ('git', 'hg', 'pijul', 'fossil', 'none')

@@ -663,6 +666,42 @@ Controls how often we display a notification to the terminal when a future incom
* `always` (default): Always display a notification when a command (e.g. `cargo build`) produces a future incompat report
* `never`: Never display a notification

### `[gc]`

The `[gc]` table defines settings for garbage collection in Cargo's caches, which will delete old, unused files.

#### `[gc.auto]`

The `[gc.auto]` table defines settings for automatic garbage collection in Cargo's caches.
When running `cargo` commands, Cargo will automatically track which files you are using within the global cache.
Periodically, Cargo will delete files that have not been used for some period of time.
Currently it will delete files that have to be downloaded from the network if they have not been used in 3 months. Files that can be generated without network access will be deleted if they have not been used in 1 month.

The automatic deletion of files only occurs when running commands that are already doing a significant amount of work, such as all of the build commands (`cargo build`, `cargo test`, `cargo check`, etc.), and `cargo fetch`.

Automatic deletion is disabled if Cargo is offline, such as with `--offline` or `--frozen`, to avoid deleting artifacts that you may need if you are offline for a long period of time.
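
The rules in the paragraphs above can be summarized in a short sketch. This is an illustration under stated assumptions, not Cargo's implementation: `CacheEntry`, `auto_delete`, and `MONTH_SECS` are hypothetical names, the strict `>` comparison is a guess, and the month length is taken to be an average month of 2,629,746 seconds.

```rust
use std::time::Duration;

/// Assumed length of one "month" in seconds (an average month).
const MONTH_SECS: u64 = 2_629_746;

/// Hypothetical classification of entries in the global cache.
enum CacheEntry {
    /// Must be downloaded again if deleted (for example `.crate` files).
    Downloaded,
    /// Can be regenerated without network access (for example extracted sources).
    Regenerable,
}

/// Returns true if this sketch would delete the entry automatically.
fn auto_delete(entry: &CacheEntry, unused_for: Duration, offline: bool) -> bool {
    // Automatic deletion is disabled entirely while offline.
    if offline {
        return false;
    }
    let max_age = match entry {
        CacheEntry::Downloaded => Duration::from_secs(3 * MONTH_SECS),
        CacheEntry::Regenerable => Duration::from_secs(MONTH_SECS),
    };
    unused_for > max_age
}
```

Under these assumptions, an extracted source directory unused for 40 days would be deleted, while a downloaded `.crate` file unused for the same 40 days would be kept.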

> **Note**: This tracking is currently only implemented for the global cache in Cargo's home directory.
> This includes registry indexes and source files downloaded from registries and git dependencies.
> Support for tracking build artifacts is not yet implemented and is tracked in [cargo#13136](https://github.com/rust-lang/cargo/issues/13136).
>
> Additionally, there is an unstable feature to support *manually* triggering garbage collection, and to further customize the configuration options.
> See the [Unstable chapter](unstable.md#gc) for more information.

##### `gc.auto.frequency`
* Type: string
* Default: `"1 day"`
* Environment: `CARGO_GC_AUTO_FREQUENCY`

This option defines how often Cargo will automatically delete unused files in the global cache.
This does *not* define how old the files must be; those thresholds are described [above](#gcauto).

It supports the following settings:

* `"never"` --- Never deletes old files.
* `"always"` --- Checks to delete old files every time Cargo runs.
* An integer followed by "seconds", "minutes", "hours", "days", "weeks", or "months" --- Checks to delete old files at most once per the given time frame.
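
The three kinds of `gc.auto.frequency` value listed above could be modeled roughly as follows. This is a hedged sketch rather than Cargo's code: `Frequency` and `gc_is_due` are hypothetical names, and treating a missing "last run" record as "due" is an assumption made for illustration.

```rust
use std::time::Duration;

/// Hypothetical model of the three kinds of `gc.auto.frequency` value.
#[derive(Debug, PartialEq)]
enum Frequency {
    /// `"never"`: automatic gc is disabled.
    Never,
    /// `"always"`: check on every qualifying command.
    Always,
    /// A time span such as `"1 day"`: check at most once per span.
    Every(Duration),
}

/// Decide whether automatic gc is due, given how long ago it last ran.
/// `since_last` is `None` when no previous run has been recorded.
fn gc_is_due(frequency: &Frequency, since_last: Option<Duration>) -> bool {
    match (frequency, since_last) {
        (Frequency::Never, _) => false,
        (Frequency::Always, _) => true,
        // No recorded run yet: treated as due (an assumption of this sketch).
        (Frequency::Every(_), None) => true,
        (Frequency::Every(interval), Some(elapsed)) => elapsed >= *interval,
    }
}
```

With the default of `"1 day"`, a run one hour after the previous one would be skipped in this model, and a run a full day later would proceed.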
Member

“months” is an approximate number.

cargo/src/cargo/core/gc.rs

Lines 373 to 381 in ea14e86

let factor = match right {
    "second" | "seconds" => 1,
    "minute" | "minutes" => 60,
    "hour" | "hours" => 60 * 60,
    "day" | "days" => 24 * 60 * 60,
    "week" | "weeks" => 7 * 24 * 60 * 60,
    "month" | "months" => 2_629_746, // average is 30.436875 days
    _ => return None,
};

I can foresee someone interpreting “months” as monthly and expecting the cleanup to run on the same day each month, which is not true, especially in February. I don't think this is a thing we can't change after stabilization. Just calling it out in case someone disagrees.
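
The approximation is easy to check by hand from the factors quoted above. The sketch below is illustrative only: `unit_seconds`, `span_seconds`, and `as_days` are hypothetical helper names, not Cargo functions; only the per-unit factors come from the quoted `gc.rs` lines.

```rust
/// Seconds per unit, mirroring the factors in the quoted snippet.
/// Returns `None` for an unknown unit.
fn unit_seconds(unit: &str) -> Option<u64> {
    let factor = match unit {
        "second" | "seconds" => 1,
        "minute" | "minutes" => 60,
        "hour" | "hours" => 60 * 60,
        "day" | "days" => 24 * 60 * 60,
        "week" | "weeks" => 7 * 24 * 60 * 60,
        "month" | "months" => 2_629_746, // average month: 30.436875 days
        _ => return None,
    };
    Some(factor)
}

/// Hypothetical helper: convert "N unit" to whole seconds,
/// returning `None` on an unknown unit or on overflow.
fn span_seconds(count: u64, unit: &str) -> Option<u64> {
    unit_seconds(unit)?.checked_mul(count)
}

/// Express a number of seconds in (fractional) days.
fn as_days(seconds: u64) -> f64 {
    seconds as f64 / 86_400.0
}
```

Under these factors, "6 months" is 15,778,476 seconds, or 182.62125 days, which is about 2.6 days longer than "180 days".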

@teohhanhui, Aug 14, 2024

Why not just remove months? It'd be counter-intuitive to anyone trying to use it... The user can already specify something like 180 days for ~6 months, no?

Contributor

For myself, being able to say "6 months" is much easier than calculating out the number of days and reading the number of days.

@epage But would anyone expect it to be 182.62125 days? Principle of least surprise...

Contributor

There is an inherent approximation when giving a unit. The larger the unit, the larger the approximation. If you say "6 months", you shouldn't care whether that's 180, 186, 182, or 182.62125.

btw, laughing emojis in a technical discussion like this come across as rude.

@teohhanhui, Aug 14, 2024

> btw, laughing emojis in a technical discussion like this come across as rude.

Sigh... Here we go again. Intent does not carry across text (or emojis), so please don't jump to conclusions like that.

I was not even disagreeing with what you said. Just pointing out that it'd be surprising for the user, as the OP of this thread has already pointed out (a different surprising aspect).

> @epage But would anyone expect it to be 182.62125 days? Principle of least surprise...

168 would be surprising; a bit over 180 not so much. The kind of surprise we're trying to avoid is purging data way earlier than expected.

> For myself, being able to say "6 months" is much easier than calculating out the number of days and reading the number of days.

If the user goes to the trouble of customizing this in the config, I don't think having to calculate the number of days would be much of an extra hurdle. In effect, removing months would just simplify things with no real downside (and prevent future support questions where people are arguing over this again / trying to figure out what's going on with this approximation).

Contributor

"A month equals 30 days" is easier for me to accept.


### `[http]`

The `[http]` table defines settings for HTTP behavior. This includes fetching
2 changes: 2 additions & 0 deletions src/doc/src/reference/environment-variables.md
@@ -101,6 +101,7 @@ In summary, the supported environment variables are:
* `CARGO_BUILD_DEP_INFO_BASEDIR` --- Dep-info relative directory, see [`build.dep-info-basedir`].
* `CARGO_CARGO_NEW_VCS` --- The default source control system with [`cargo new`], see [`cargo-new.vcs`].
* `CARGO_FUTURE_INCOMPAT_REPORT_FREQUENCY` --- How often we should generate a future incompat report notification, see [`future-incompat-report.frequency`].
* `CARGO_GC_AUTO_FREQUENCY` --- Configures how often automatic garbage collection runs, see [`gc.auto.frequency`].
* `CARGO_HTTP_DEBUG` --- Enables HTTP debugging, see [`http.debug`].
* `CARGO_HTTP_PROXY` --- Enables HTTP proxy, see [`http.proxy`].
* `CARGO_HTTP_TIMEOUT` --- The HTTP timeout, see [`http.timeout`].
@@ -167,6 +168,7 @@ In summary, the supported environment variables are:
[`cargo-new.email`]: config.md#cargo-newemail
[`cargo-new.vcs`]: config.md#cargo-newvcs
[`future-incompat-report.frequency`]: config.md#future-incompat-reportfrequency
[`gc.auto.frequency`]: config.md#gcautofrequency
[`http.debug`]: config.md#httpdebug
[`http.proxy`]: config.md#httpproxy
[`http.timeout`]: config.md#httptimeout
39 changes: 15 additions & 24 deletions src/doc/src/reference/unstable.md
@@ -1488,36 +1488,18 @@ This will not affect any hard-coded paths in the source code, such as in strings

* Tracking Issue: [#12633](https://github.com/rust-lang/cargo/issues/12633)

The `-Zgc` flag enables garbage-collection within cargo's global cache within the cargo home directory.
This includes downloaded dependencies such as compressed `.crate` files, extracted `src` directories, registry index caches, and git dependencies.
When `-Zgc` is present, cargo will track the last time any index and dependency was used,
and then uses those timestamps to manually or automatically delete cache entries that have not been used for a while.

```sh
cargo build -Zgc
```

### Automatic garbage collection

Automatic deletion happens on commands that are already doing a significant amount of work,
such as all of the build commands (`cargo build`, `cargo test`, `cargo check`, etc.), and `cargo fetch`.
The deletion happens just after resolution and packages have been downloaded.
Automatic deletion is only done once per day (see `gc.auto.frequency` to configure).
Automatic deletion is disabled if cargo is offline such as with `--offline` or `--frozen` to avoid deleting artifacts that may need to be used if you are offline for a long period of time.
The `-Zgc` flag is used to enable certain features related to garbage-collection of cargo's global cache within the cargo home directory.

#### Automatic gc configuration

The automatic gc behavior can be specified via a cargo configuration setting.
The `-Zgc` flag will enable Cargo to read extra configuration options related to garbage collection.
The settings available are:

```toml
# Example config.toml file.

# This table defines the behavior for automatic garbage collection.
[gc.auto]
# The maximum frequency that automatic garbage collection happens.
# Can be "never" to disable automatic-gc, or "always" to run on every command.
frequency = "1 day"
# Anything older than this duration will be deleted in the source cache.
max-src-age = "1 month"
# Anything older than this duration will be deleted in the compressed crate cache.
@@ -1530,9 +1512,13 @@ max-git-co-age = "1 month"
max-git-db-age = "3 months"
```

Note that the [`gc.auto.frequency`] option was stabilized in Rust 1.82.

[`gc.auto.frequency`]: config.md#gcautofrequency

### Manual garbage collection with `cargo clean`

Manual deletion can be done with the `cargo clean gc` command.
Manual deletion can be done with the `cargo clean gc -Zgc` command.
Deletion of cache contents can be performed by passing one of the cache options:

- `--max-src-age=DURATION` --- Deletes source cache files that have not been used since the given age.
@@ -1551,9 +1537,9 @@ A DURATION is specified in the form "N seconds/minutes/days/weeks/months" where
A SIZE is specified in the form "N *suffix*" where *suffix* is B, kB, MB, GB, kiB, MiB, or GiB, and N is an integer or floating point number. If no suffix is specified, the number is the number of bytes.

```sh
cargo clean gc
cargo clean gc --max-download-age=1week
cargo clean gc --max-git-size=0 --max-download-size=100MB
cargo clean gc -Zgc
cargo clean gc -Zgc --max-download-age=1week
cargo clean gc -Zgc --max-git-size=0 --max-download-size=100MB
```
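
The SIZE form described above (a number with an optional suffix) can be read as in the following sketch. It is not Cargo's parser: `parse_size` is a hypothetical name, the case-sensitive suffix matching is an assumption, and the sketch takes decimal suffixes (kB, MB, GB) as powers of 1000 and binary suffixes (kiB, MiB, GiB) as powers of 1024.

```rust
/// Hypothetical parser for the SIZE form: a non-negative integer or
/// floating point number followed by an optional suffix.
/// Returns the size in bytes, or `None` if the input is not understood.
fn parse_size(input: &str) -> Option<u64> {
    let input = input.trim();
    // Split at the first character that cannot be part of the number.
    let split = input
        .find(|c: char| !(c.is_ascii_digit() || c == '.'))
        .unwrap_or(input.len());
    let (number, suffix) = input.split_at(split);
    let number: f64 = number.parse().ok()?;
    let factor: u64 = match suffix.trim() {
        // No suffix means the number is already a byte count.
        "" | "B" => 1,
        "kB" => 1_000,
        "MB" => 1_000_000,
        "GB" => 1_000_000_000,
        "kiB" => 1 << 10,
        "MiB" => 1 << 20,
        "GiB" => 1 << 30,
        _ => return None,
    };
    // Fractional byte counts are truncated toward zero.
    Some((number * factor as f64) as u64)
}
```

In this sketch, `100MB` from the command line above parses to 100,000,000 bytes, and `0` parses to zero bytes.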

## open-namespaces
@@ -1991,3 +1977,8 @@ default behavior.

See the [build script documentation](build-scripts.md#rustc-check-cfg) for information
about specifying custom cfgs.

## Automatic garbage collection

Support for automatically deleting old files was stabilized in Rust 1.82.
More information can be found in the [config chapter](config.md#gcauto).