Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unstable --keep-going flag #10383

Merged
merged 1 commit into from
Mar 30, 2022
Merged

Unstable --keep-going flag #10383

merged 1 commit into from
Mar 30, 2022

Conversation

dtolnay
Copy link
Member

@dtolnay dtolnay commented Feb 12, 2022

Summary

This PR adds an unstable --keep-going flag documented as follows:

cargo build --keep-going (and similarly for check, test etc) will build as many crates in the dependency graph as possible, rather than aborting the build at the first one that fails to build.

For example if the current package depends on dependencies fails and works, one of which fails to build, cargo check -j1 may or may not build the one that succeeds (depending on which one of the two builds Cargo picked to run first), whereas cargo check -j1 --keep-going would definitely run both builds, even if the one run first fails.

The -Z unstable-options command-line option must be used in order to use --keep-going while it is not yet stable:

cargo check --keep-going -Z unstable-options

Prior art

Buck and Bazel and Make all have this flag (though Bazel calls it --keep_going 🤮) with exactly this behavior.

Motivation

I need this in order to make https://github.com/dtolnay/trybuild not super slow.

Trybuild wants to run Cargo on a bunch of test cases, each of which is a bin crate. The bad options currently available are:

  • Give each test case its own target dir and run build on them in parallel. This is bad because all the test cases have the same dependencies in common (whatever dev-dependencies are declared by the project). If there are 100 test cases, all the dependencies would end up getting built 100 times, which is 100x slower than necessary despite the parallelism.

  • Reuse a single target dir for all the test cases. Two Cargos can't operate in parallel on the same target directory, so this forces the test cases to be built serially. This is much slower than necessary on a many-core system, and compounds all of the overheads in Cargo because the project structure must be reloaded by each invocation.

The good option I'd like to switch to is:

  • Run cargo build --bins --keep-going --message-format=json to build all the test cases in parallel. Use the filepaths in the JSON messages to ascribe diagnostics to which bin they're from.

@rust-highfive
Copy link

r? @ehuss

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 12, 2022
Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this! I've often wanted a flag similar to this myself, typically for the reason of "please go keep doing work while I go figure out that one error"

I think though that the error handling should be improved here before landing. For example if you use make -j 10 -k then it will still print the errors for all failed compiles, but here all errors after the first are simply swallowed up and not actually emitted anywhere. In the same manner how we display an error and then punt upwards some arbitrary error I think it would be better to display all errors here (in some fashion that makes them distinctly understandable) and then punt up perhaps something like "N build errors occurred".

Other than that though this seems reasonable to me.

src/bin/cargo/commands/test.rs Outdated Show resolved Hide resolved
return Some(e);
} else {
crate::display_error(&e, &mut cx.bcx.config.shell());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize that this was preexisting, but reading over this again I'm finding it a bit confusing to follow why above cases use handle_error but self.timings.finish and the writeln! below both use diferent custom logic. I think, however, that it should be fine to use handle_error in both cases?

Or rather put another way I think it'd be good to consistently use handle_error here if possible.

drop(shell.warn("build failed, waiting for other jobs to finish..."));
*first_error = Some(anyhow::format_err!("build failed"));
} else {
*first_error = Some(new_err);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I keep getting somewhat confused when I think about this personally, and while I realize that this is preserving what's already happening here I think it might be best to restructure this differently. I believe the original intention of this code was, when possible, to return upwards the error we encountered during compilation if only one was encountered. That being said, however, I don't believe we can reliably return upwards an error so I think it would be better to unconditionally, for any errors encountered during the build, to use crate::display_error and print them here. That way there's much less juggling of trying to preserve some "main" error to return while also ensuring that all other errors are printed.

With a refactoring like that I think it might make sense to change DrainErrors from an enum to a struct with a keep_going flag or similar since the only change in behavior would be whether or not more jobs are spawned but otherwise errors are handled consistently between --keep-going-vs-not

@alexcrichton alexcrichton added the T-cargo Team: Cargo label Feb 16, 2022
@alexcrichton
Copy link
Member

r? @alexcrichton

@rfcbot fcp merge

This is an unstable flag but personally I'd also be ok landing this insta-stable since this flag has precedent in other build systems. In any case I also wanted to get feedback from other Cargo folks if y'all have it.

@rfcbot
Copy link
Collaborator

rfcbot commented Feb 16, 2022

Team member @alexcrichton has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added the proposed-final-comment-period An FCP proposal has started, but not yet signed off. label Feb 16, 2022
@rfcbot rfcbot added the disposition-merge FCP with intent to merge label Feb 16, 2022
@fee1-dead
Copy link
Member

Another prior art: portage, Gentoo Linux's package manager also has --keep-going. Looks like applications that do fallible operations to a graph have a lot in common..

@alexcrichton
Copy link
Member

Thanks for the PR, but I'm going to be stepping down from the Cargo team so I'm going to un-assign myself from this. The Cargo team will help review this when they get a chance.

@alexcrichton alexcrichton removed their assignment Mar 8, 2022
bors added a commit that referenced this pull request Mar 21, 2022
Consistently use crate::display_error on errors during drain

As suggested in #10383 (comment) and #10383 (comment).
@bors
Copy link
Contributor

bors commented Mar 21, 2022

☔ The latest upstream changes (presumably #10394) made this pull request unmergeable. Please resolve the merge conflicts.

@ehuss ehuss self-assigned this Mar 21, 2022
@ehuss
Copy link
Contributor

ehuss commented Mar 21, 2022

@dtolnay I'm not sure what the status is here, but if you rebase it now that #10394 is merged, I can re-review it.

@dtolnay dtolnay mentioned this pull request Mar 22, 2022
2 tasks
@dtolnay
Copy link
Member Author

dtolnay commented Mar 22, 2022

Rebased to resolve conflict with #10394.

The meaningful change in this PR is back down to just one line:

-       if errors.count == 0 {
+       if errors.count == 0 || cx.bcx.build_config.keep_going {

i.e. none of 7e890c33fe012cb267d6fff9698b6dc9e3c4ca76 is needed any longer following #10394.

@ehuss
Copy link
Contributor

ehuss commented Mar 23, 2022

@rfcbot cancel

@rfcbot
Copy link
Collaborator

rfcbot commented Mar 23, 2022

@ehuss proposal cancelled.

@rfcbot rfcbot removed proposed-final-comment-period An FCP proposal has started, but not yet signed off. disposition-merge FCP with intent to merge labels Mar 23, 2022
@ehuss
Copy link
Contributor

ehuss commented Mar 23, 2022

@rfcbot merge

@rust-lang/cargo This is adds an unstable flag --keep-going which is described above. There seems to be some precedent for this in other tools. Restarting FCP since this is a new feature which we haven't discussed previously.

@rfcbot
Copy link
Collaborator

rfcbot commented Mar 23, 2022

Team member @ehuss has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@bors bors added the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Mar 29, 2022
@Mark-Simulacrum
Copy link
Member

@bors retry

(Triaging bors problems, don't mind me.)

@Mark-Simulacrum
Copy link
Member

@bors r- r=ehuss

@bors bors added S-waiting-on-author Status: The marked PR is awaiting some action (such as code changes) from the PR author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Mar 30, 2022
@bors
Copy link
Contributor

bors commented Mar 30, 2022

📌 Commit 4e45f58 has been approved by ehuss

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: The marked PR is awaiting some action (such as code changes) from the PR author. labels Mar 30, 2022
@Mark-Simulacrum
Copy link
Member

@bors r- r=ehuss

@bors bors added S-waiting-on-author Status: The marked PR is awaiting some action (such as code changes) from the PR author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Mar 30, 2022
@bors
Copy link
Contributor

bors commented Mar 30, 2022

📌 Commit 4e45f58 has been approved by ehuss

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: The marked PR is awaiting some action (such as code changes) from the PR author. labels Mar 30, 2022
@bors
Copy link
Contributor

bors commented Mar 30, 2022

⌛ Testing commit 4e45f58 with merge 9090042...

@bors
Copy link
Contributor

bors commented Mar 30, 2022

☀️ Test successful - checks-actions
Approved by: ehuss
Pushing 9090042 to master...

@bors bors merged commit 9090042 into rust-lang:master Mar 30, 2022
@dtolnay dtolnay deleted the keepgoing branch March 30, 2022 18:22
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 31, 2022
Update cargo

13 commits in 109bfbd055325ef87a6e7f63d67da7e838f8300b..1ef1e0a12723ce9548d7da2b63119de9002bead8
2022-03-17 21:43:09 +0000 to 2022-03-31 00:17:18 +0000
- Support `-Zmultitarget` in cargo config (rust-lang/cargo#10473)
- doc: Fix document url for libcurl format (rust-lang/cargo#10515)
- Fix wrong info in "Environment variables" docs (rust-lang/cargo#10513)
- Use the correct flag in --locked --offline error message (rust-lang/cargo#10512)
- Don't treat host/target duplicates as duplicates (rust-lang/cargo#10466)
- Unstable --keep-going flag (rust-lang/cargo#10383)
- Part 1 of RFC2906 - Packages can inherit fields from their root workspace (rust-lang/cargo#10497)
- Remove unused profile support for -Zpanic-abort-tests (rust-lang/cargo#10495)
- HTTP registry implementation (rust-lang/cargo#10470)
- Add a notice about review capacity. (rust-lang/cargo#10501)
- Add tests for ignoring symlinks (rust-lang/cargo#10047)
- Update doc string for deps_of/compute_deps. (rust-lang/cargo#10494)
- Consistently use crate::display_error on errors during drain (rust-lang/cargo#10394)
@rfcbot rfcbot added finished-final-comment-period FCP complete to-announce and removed final-comment-period FCP — a period for last comments before action is taken labels Apr 3, 2022
@ehuss ehuss added this to the 1.61.0 milestone Apr 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-merge FCP with intent to merge finished-final-comment-period FCP complete S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-cargo Team: Cargo to-announce
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants