RFC: Set CARGO_CHECK environment variable when type checking #3748

akiselev · 2024-12-21T04:51:26Z

Add a new environment variable CARGO_CHECK that is set to true when running cargo check so build scripts can skip expensive compilation steps that are unnecessary for Rust type checking, such as compiling external C++ code in cxx based projects.

Rendered

weihanglo · 2024-12-21T05:30:17Z

text/3748-cargo_check_environment_variable.md

FWIW,

The corresponding Cargo issue thread: Inform build scripts whether cargo is checking or building cargo#4001.

A recent summary of that thread: Inform build scripts whether cargo is checking or building cargo#4001 (comment).

Thank you for the links! I searched Rust issues and RFCs but forgot to search the cargo repo ☹️

It doesn't look like the discussion has moved much in years, is there a reason against forcing the issue via RFC at this point?

The Rust landscape has changed significantly since the last major activity in the discussion and rust-analyzer seems to be the de-facto LSP implementation at this point. What are the downsides of introducing an LSP specific check command whose only difference (for now) is setting that environment variable? Or the alternative, allowing callers to set a CARGO_NO_BUILD environment variable themselves , one that is officially sanctioned and supported by Rust/Cargo (if not set automatically)?

I understand and appreciate the hesitance to stabilize contracts without ironing out all of the generalities but the discussion about cargo modes and the cargo check && cargo build feels very ivory tower. Rust is increasingly being used to integrate with C++ code beyond *-sys crates as cxx and other tools mature, and I feel like a way to notify build scripts not to run extraneous steps is very much needed regardless of the aforementioned issues. Personally I always configure rust-analyzer to use a subdirectory so that it doesn't block cargo build anyway, so build caching between cargo check and cargo build wouldn't apply (and I'm curious what fraction of the community does too)

That said, I'm biased as I feel acute pain with cxx/cxx-qt, where sccache doesn't seem to help. Worst case scenario I can set the environment variables myself and use a custom fork.

It doesn't look like the discussion has moved much in years, is there a reason against forcing the issue via RFC at this point?

No. While RFC doesn't really force anything until accepted, it is good to open a discussion. Sometimes people hang out in https://internals.rust-lang.org/ first for pre-RFC, before preparing a more formal proposal here.

weihanglo · 2024-12-21T05:34:16Z

text/3748-cargo_check_environment_variable.md

+
+2. Do Nothing: If we do nothing, build scripts will continue to run all build steps even when it's not necessary, significantly impacting Rust ergonomics when interfacing with exernal languages.
+
+# Prior art


Mind providing link to each prior art so we can find reference easier?

kpreid · 2024-12-21T15:07:50Z

text/3748-cargo_check_environment_variable.md

+- Projects using Protocol Buffers generate Rust code from .proto files
+- bindgen generates Rust bindings from C/C++ headers
+
+Currently, every time rust-analyzer runs `cargo check`, all build scripts must execute their full build process, including steps like compiling C++ code that are only needed for linking but not for type checking. This significantly impacts IDE responsiveness, especially in projects with complex build scripts.


Something seems missing from this description. In my understanding of how to use Cargo, a well-behaved build script should produce an appropriate set of rerun-if-changed directives, meaning that the build script will not be run “every time” unless its dependencies are actually changed.

There are of course specific scenarios where it will be rerun regardless, such as if you are editing one of the build-dependencies, but this paragraph suggests that build scripts predictably run on every cargo check in every project that has one, not just the first cargo check or cargo build. In my mind, if that’s happening, that indicates a bug in the build script. (Or at least an incompleteness — rerunning every time is the default behavior of Cargo given zero rerun-if-* directives, but for any build script that runs long enough to notice, that should be addressed by adding those directives.)

Can you clarify this? Is the build script in this scenario unable to use rerun-if directives for some reason?

You are correct, I miswrote that paragraph. I changed it to Currently, every time rust-analyzer runs cargo check, the build script in the changed crate must execute its full build process, ...

But that’s still not true in general. If a build script reruns, then one of the following situations applies:

The build script does not have any rerun-if-* dependencies declared; that can and should be fixed.

You modified one of its [build-dependencies].

You modified a file that is a rerun-if-changed dependency of the build script, so it should rerun.

You (implicitly or explicitly) changed a environment variable the build script depends on with rerun-if-env-changed, e.g. by running cargo check from two different environments (IDE and separate terminal). Again, this is an actual change in a dependency, though possibly an unintended one.

Changing the Rust code of a crate does not guarantee that its build script will rerun. Build scripts are rerun only if one of their dependencies changes, or if they failed to declare dependencies.

Ah I see what you mean now.

The cxx crate parses Rust source files to generate bindings from cxx::bridge macros and the Rust source files get added via cargo::rerun-if-changed. Normally this wouldn't be a problem since C/C++ bindings are rarely changed, but in cxx-qt projects the bridge code is the interface between QML/JS GUI code and the Rust backend so it receives a lot more attention while incurring additional QT build tooling overhead.

I'll add a paragraph explaining that detail. I didn't realize the issue I was experiencing is so specific to cxx though

kpreid · 2024-12-21T15:28:20Z

text/3748-cargo_check_environment_variable.md

+# Reference-level explanation
+[reference-level-explanation]: #reference-level-explanation
+
+Cargo will set the `CARGO_CHECK` environment variable to `true` when running `cargo check`


I think it is unwise to name and describe this as for cargo check in particular, for two reasons:

A future-compatibility hazard, especially in context of extending this to other modes as noted in the future possibilities section — are build scripts going to have to check for CARGO_CHECK || CARGO_DOC || CARGO_CLIPPY to be efficient? I think we’d see future arguments over whether commands are “check-like enough to count” rather than introducing more, which would spread confusion and create a tradeoff we can just avoid from the start.

Build script authors might decide to intentionally do non-additive things based on the command in use (like generating different bindings code that’s more development-friendly in some way), leading to incorrect caching.

I think it would be better to name and describe as something like:

Cargo will set the CARGO_WILL_NOT_LINK environment variable if Cargo’s build plan is known not to include any linking (and therefore is known not to depend on any foreign object files the build script might produce, or any of its cargo::rustc-link-* directives). Whenever CARGO_WILL_NOT_LINK is present, you may skip producing object files. All generated Rust code or other inputs to the Rust compilation (such as files for include_str! or cargo::rustc-cfg directives) must be unaffected.

The value of CARGO_WILL_NOT_LINK is reserved for making finer distinctions in the future; currently, it is always set to "1", but build scripts should check only for its presence.

This does change the behavior because the definition includes all of cargo check, cargo fix, cargo doc, and cargo clippy, but I think that’s the right thing to do. It’s also very specific about what the build script may do and should not do, to ensure that the results of the build script can still validly be cached.

(I specified the value to be "1" and not "true" because that's what Cargo uses for feature flags, the other existing example of “boolean” environment variables.)

These are great points!

I think your alternative is a better approach, especially not to step on the toes of the parallel cargo modes discussion. I'll wait for more input and then probably update the RFC to the alternative method after the holidays

There are a handful of environment variables that don't have the CARGO_ prefix, e.g. OUT_DIR TARGET NUM_JOBS OPT_LEVEL. My read is that these unprefixed variables are loosely meant to be portable to other build systems that might want to reuse build scripts. Since the proposed feature is certainly something that other build systems could make use of, maybe it should also be unprefixed?

A name like CHECK_ONLY would be reasonably intuitive. WILL_NOT_LINK or SKIP_CODEGEN are a bit more technical but maybe more accurate with e.g. rustdoc. NO_CODEGEN is consistent with the unstable rustc -Zno-codegen. I think Cargo currently gets the check-only functionality by passing --emit=dep-info,metadata.

List of the existing environment variables for reference https://doc.rust-lang.org/cargo/reference/environment-variables.html

kpreid · 2024-12-21T15:28:56Z

text/3748-cargo_check_environment_variable.md

+    generate_rust_bindings();
+
+    // Only compile external code when not type checking
+    if std::env::var("CARGO_CHECK").unwrap() != "true" {


The text below says “The environment variable will not be set…” but this code panics if the environment variable is not set.

Suggested change

if std::env::var("CARGO_CHECK").unwrap() != "true" {

if std::env::var("CARGO_CHECK").is_ok() {

Fixed, thank you

…ipts

clarfonthey · 2024-12-21T23:36:00Z

Is there a reason why you propose an environment variable instead of just, a cfg option? Sure, the environment variable can be accessed in the main function in the build.rs script, but it can't be used for conditional compilation even though that would be incredibly handy as well. This would not only remove the runtime penalty of doing all these things, but the compile time penalty of using them in the build script as well.

You would just have to ensure that this cfg option is only available in the build script itself, and not in the actual crate or its dependencies, since you could do some really naughty things otherwise…

tgross35 · 2024-12-22T01:38:21Z

Is there a reason why you propose an environment variable instead of just, a cfg option? Sure, the environment variable can be accessed in the main function in the build.rs script, but it can't be used for conditional compilation even though that would be incredibly handy as well. This would not only remove the runtime penalty of doing all these things, but the compile time penalty of using them in the build script as well.

Wouldn't this be about the same since Cargo needs to build all build-dependencies regardless of whether it is checking or compiling? I would expect that excluding these dependencies would be more significant than time savings from partially configuring out build.rs (since the build script itself usually doesn't contain a lot of slow-to-compile logic), but that would require being able to specify something like codegen-only build dependencies or features.

One advantage of using an environment variable is that you can compile the build script once and use it for both checking and codegen.

tgross35 · 2024-12-22T01:47:25Z

text/3748-cargo_check_environment_variable.md

+- `cargo run`
+- `cargo test`
+
+# Drawbacks


Related to #3748 (comment): as proposed, this is only beneficial in cases where whatever the build script does is more significant than compiling the build script + deps itself. IME this is indeed the common case since often a crate like cmake or bindgen is needed to generate the headers, actually compiling only adds cmake --build.

tgross35 · 2024-12-22T02:51:26Z

text/3748-cargo_check_environment_variable.md

+
+This feature primarily benefits library authors who maintain build scripts, especially those working with external code generation and compilation. Regular Rust developers using these libraries will automatically benefit from improved IDE performance without needing to modify their code.
+
+# Reference-level explanation


This should address things like caching and output generation. Quoting @alexcrichton in rust-lang/cargo#4001 (comment):

I'm not 100% certain this still fits in Cargo myself. The major downside of implementing a feature like this is that cargo check && cargo build gets slower than it currently is. Cargo currently caches build script invocations for those two commands, which means that all the work done by cargo check is reused by cargo build and isn't redone.

[...]

As an author of crates like cc and cmake it's also somewhat ambiguous to me about what "check mode" would do for libraries like that. Should they do nothing? Type-check the C code? Make sure it all compiles? Basically I don't think that C/C++ have any real meaningful distinction like Rust does for cargo check and cargo build, so there's not really an obvious choice of what these library crates would do, which would require even more opt-in or configuration on behalf of all users.

High level proposal: for a build script, CHECK_ONLY should mean:

Do any relevant checks that don't include codegen, if possible

Do what is necessary to check the Rust code

This can always be disregarded in favor of running the full build script

So for C code, CHECK_ONLY would mean doing the C/++ configure (but not build) step and possibly invoking bindgen. Usually this is ./configure or cmake without --build to verify dependencies and environment, and prepare final headers and source files. bindgen can then be run on these final headers. !CHECK_ONLY then means that make, cmake --build, or cc should be run.

OUT_DIR should be the same between check and build so configuration output can be reused. The build script will probably always run the congigure step, cmake knows how to skip a lot if files are already up to date ¹.

For crates, cc should totally ignore this environment variable and not change anything - the build script author will just be have the option to skip invoking cc if CHECK_ONLY is set. cmake's Config should add new methods like only_configure (always called) and only_build (called if CHECK_ONLY is unset), since Config::build currently does both these things. cmake::build and Config::build could optionally listen to CHECK_ONLY to determine whether to run the only_build step.

Footnotes

Cmake's configure caching can sometimes be pretty slow for bigger projects. It would be possible for build scripts to thumbprint relevant environment and skip invoking cmake if OUT_DIR has an updated thumbprint, but this would be advanced usage. ↩

tgross35 · 2024-12-22T03:10:31Z

text/3748-cargo_check_environment_variable.md

+
+3. **Maintenance Burden**: The Rust and Cargo teams will need to maintain this feature and ensure it remains consistent across different commands and contexts.
+
+# Rationale and alternatives


rust-lang/cargo#10126 proposed CARGO_MODE that adds doc, test, doctest, and a handful of others. Personally I would rather have a single environment variable that says whether or not codegen should happen (as is proposed here) rather than needing to know which modes do and don't require codegen, which isn't robust against adding new modes. Somehow indicating bench/test mode could be useful but this seems better as an independent feature.

clarfonthey · 2024-12-22T03:43:32Z

Wouldn't this be about the same since Cargo needs to build all build-dependencies regardless of whether it is checking or compiling? I would expect that excluding these dependencies would be more significant than time savings from partially configuring out build.rs (since the build script itself usually doesn't contain a lot of slow-to-compile logic), but that would require being able to specify something like codegen-only build dependencies or features.

That's fair, although I guess that my thought process was that dependencies don't change often and thus can have their builds cached, whereas build scripts would essentially need to be built and run no matter what.

Since the RFC mentions rust-analyzer specifically, I'd imagine that the primary case is speeding up type checking between changes of the code when dependencies would already be built, but I guess that also changing the build script would be a potential use case.

I guess it is one of those weird cases where compiling two copies only is a big deal if you need to compile both copies often, whereas here you'd likely only be building one or the other.

akiselev added 2 commits December 20, 2024 20:45

Initial commit for CARGO_CHECK env var proposal

424ffc2

update RFC number and fix typo

80d6c3f

akiselev changed the title ~~RFC: Add CARGO_CHECK environment variable when type checking~~ RFC: Set CARGO_CHECK environment variable when type checking Dec 21, 2024

weihanglo reviewed Dec 21, 2024

View reviewed changes

weihanglo added the T-cargo Relevant to the Cargo team, which will review and decide on the RFC. label Dec 21, 2024

kpreid reviewed Dec 21, 2024

View reviewed changes

akiselev added 4 commits December 21, 2024 10:09

Fixed sentence in motivation that erroneously mentioned all build scr…

aadc16a

…ipts

fixed code sample

a637be2

change truthy value to '1'

19c72e3

Added information about rerun-if-changed

7d28a8b

tgross35 reviewed Dec 22, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFC: Set CARGO_CHECK environment variable when type checking #3748

RFC: Set CARGO_CHECK environment variable when type checking #3748

akiselev commented Dec 21, 2024 •

edited by rustbot

Loading

weihanglo Dec 21, 2024

akiselev Dec 21, 2024 •

edited

Loading

weihanglo Dec 21, 2024

weihanglo Dec 21, 2024

kpreid Dec 21, 2024 •

edited

Loading

akiselev Dec 21, 2024

kpreid Dec 21, 2024 •

edited

Loading

akiselev Dec 21, 2024

kpreid Dec 21, 2024 •

edited

Loading

akiselev Dec 21, 2024

tgross35 Dec 22, 2024 •

edited

Loading

kpreid Dec 21, 2024 •

edited

Loading

akiselev Dec 21, 2024

clarfonthey commented Dec 21, 2024

tgross35 commented Dec 22, 2024

tgross35 Dec 22, 2024

tgross35 Dec 22, 2024

tgross35 Dec 22, 2024

clarfonthey commented Dec 22, 2024


		2. Do Nothing: If we do nothing, build scripts will continue to run all build steps even when it's not necessary, significantly impacting Rust ergonomics when interfacing with exernal languages.

		# Prior art

	if std::env::var("CARGO_CHECK").unwrap() != "true" {
	if std::env::var("CARGO_CHECK").is_ok() {


		This feature primarily benefits library authors who maintain build scripts, especially those working with external code generation and compilation. Regular Rust developers using these libraries will automatically benefit from improved IDE performance without needing to modify their code.

		# Reference-level explanation


		3. Maintenance Burden: The Rust and Cargo teams will need to maintain this feature and ensure it remains consistent across different commands and contexts.

		# Rationale and alternatives

RFC: Set CARGO_CHECK environment variable when type checking #3748

Are you sure you want to change the base?

RFC: Set CARGO_CHECK environment variable when type checking #3748

Conversation

akiselev commented Dec 21, 2024 • edited by rustbot Loading

Choose a reason for hiding this comment

akiselev Dec 21, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kpreid Dec 21, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kpreid Dec 21, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kpreid Dec 21, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tgross35 Dec 22, 2024 • edited Loading

Choose a reason for hiding this comment

kpreid Dec 21, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

clarfonthey commented Dec 21, 2024

tgross35 commented Dec 22, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Footnotes

Choose a reason for hiding this comment

clarfonthey commented Dec 22, 2024

akiselev commented Dec 21, 2024 •

edited by rustbot

Loading

akiselev Dec 21, 2024 •

edited

Loading

kpreid Dec 21, 2024 •

edited

Loading

kpreid Dec 21, 2024 •

edited

Loading

kpreid Dec 21, 2024 •

edited

Loading

tgross35 Dec 22, 2024 •

edited

Loading

kpreid Dec 21, 2024 •

edited

Loading