Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make it possible to turn off subcommands that include dependencies #8384

Closed
wants to merge 7 commits into from

Conversation

est31
Copy link
Member

@est31 est31 commented Jun 19, 2020

There are some dependencies of the cargo crate that are only used by one or two cargo subcommands. Various tools that use cargo from crates.io don't need most of cargo's built in subcommands, but only a subset. For example, cargo-outdated only does resolving, and cargo-udeps only does cargo check with some extras.

This PR adds default-on cargo features for certain subcommands that pull in dependencies. This allows custom cargo subcommands to have a smaller footprint of transitive dependencies and faster build time.

I ran some tests in this repo with cargo +nightly build first with the state prior to this PR and then with --default-features enabled. It showed 5 less crates to compile, as well as 2 seconds in saved build time (57.3s to 55.7s), which mostly stemmed from reduced build time for the cargo crate (28.1s to 26.5s). Apparently the 5 crates avoided weren't part of the critical path of the compilation DAG, but this might be different in scenarios with more or less parallelism than I have. Check how the green line goes down much quicker prior to the 16 second mark:

Graph prior to this PR:
grafik
Graph after this PR with --no-default-features:
grafik

@rust-highfive
Copy link

r? @ehuss

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 19, 2020
@alexcrichton
Copy link
Member

Although checked on CI the downside to this is that this is adding #[cfg] to maintain and keep working over time. The wins here don't really seem all that big (those graphs look almost identical) and it's a lot of annotations throughout we'd have to grapple with. I think the idea here is fine, but it appears that at least this iteration doesn't seem to really pull the weight of the implementation.

@est31
Copy link
Member Author

est31 commented Jun 23, 2020

I've done a bunch of measurements now with varying scenarios. Indeed I have apparently been testing the scenario with the smallest impact. The impact is larger for less powerful configurations. Tables for the whole crate graph time:

description time without PR time with PR + no-default-features reduction seconds reduction percent
debug 8 jobs 53.8s 52.7s 1.1 2.0
release 8 jobs 121.8s 116.3s 5.5 4.5
release 4 jobs 149.8s 142.6s 7.2 4.8
debug 4 jobs 62.1s 60.1s 2.0 3.2

Only cargo crate build time:

description time without PR time with PR + no-default-features reduction seconds reduction percent
debug 8 jobs 27.18s 25.36s 1.82 6.6
release 8 jobs 58.39s 53.58s 4.81 8.2
release 4 jobs 64.76s 59.27s 5.49 8.5
debug 4 jobs 27.3s 25.71s 1.59 5.8

Up to 7 saved seconds, how does that sound?

@est31
Copy link
Member Author

est31 commented Jun 23, 2020

Regarding the concern about the added #[cfg(feature = "...")] statements, I think most of them are quite straightforward, next to modules that concern the subcommands. The only place where they are a bit annoying is in the common_for_install_and_uninstall.rs file, which mainly contains stuff needed by cargo install. What do you think about moving stuff that only cargo install needs into the install-specific file? Would that resolve your concerns?

@alexcrichton
Copy link
Member

TBH the whole crate builds look like the differences are just noise in build times, and the cargo-crate-itself doesn't seem too relevant since it seems like the build time for Cargo only matters when you're building it as a dependency, where it's either never changing or you're building the whole thing from scratch.

The goal here seems to be to reuse Cargo's functionality in external crates. I think that goal is probably best served by splitting Cargo up rather than trying to remove pieces of Cargo that aren't needed. That's both more maintainable for the authors of Cargo and clearer to see what depends on what. It's a much more significant undertaking, though.

@est31
Copy link
Member Author

est31 commented Jun 24, 2020

differences are just noise

I ran the comparison multiple times using multiple scenarios and each time there is a noticeable single digit percent improvement. If it were just noise, there would have been cases where it would be slower or faster on a random basis. I didn't observe that. The difference between the cases stems from the fact that this PR affects them in different ways. In general, it affects buils more that have less cores and more optimizations. I invite you to do cargo +nightly build --release -j 4 --no-default-features -Ztimings yourself on the commit this PR bases on vs the last commit in this PR and check whether it's really 7 seconds shorter. I ran cargo update since I posted the comment with the table above, and now the difference in that specific scenario is even larger, 9 seconds.

For rustc, performance improvement PRs are merged all the time that add single digit percent improvments in compile time, and it's great that this happens. It's the cumulation of multiple such PRs that creates a noticeable "wow" improvement in compile time. See below, I'm already testing how splitting up cargo speeds up the build.

the cargo-crate-itself doesn't seem too relevant

I included it because of the possible impact on the development workflow.

best served by splitting Cargo up rather than trying to remove pieces of Cargo that aren't needed.

IMO ideally it shouldn't be "either or" but "both" :). How would you put subcommands into separate crates so that users of cargo can disable them one by one? There is a lot of subcommands, you'd create lots of very tiny crates. I think for them the optimal solution is to have one crate that contains code for all subcommands and then you have cargo features to turn off demanding subcommands (basically what this PR is doing).

Actually I have done some experiments already, building upon this PR's last commit to do right that, and split up cargo even further (reddit comment). But the wall clock improvements are cumulative, but in a similar ballpark to the improvements of this PR, which you said was noise (hopefully i was able to convince you it isn't!). And it increases the total CPU time by far more than the wall clock time improvements. See these screenshots (8 jobs, --no-default-features enabled in all of them):

baseline as of this PR (52.7s wall clock time):

Screenshot_20200624_174536

cargo-core/cargo split (50.3s wall clock time ; cargo-core finishes a bit faster than the old cargo crate, but the difference is smaller than the added time of the cargo crate that while not affecting wall clock time, does affect total CPU time):

Screenshot_20200624_174347

splitting off cargo-util off cargo-core (49.2s wall clock time ... I agree that this build time improvement is inside the noise interval, improvement might be real but very small):

Screenshot_20200624_174234

@ehuss ehuss assigned alexcrichton and unassigned ehuss Jun 24, 2020
@alexcrichton
Copy link
Member

Sorry I don't mean to say the results are reproducible, I mean moreso to say that they're so small they don't feel to me like they'll carry the weight that comes with the changes. I'm well aware of how the compiler is getting faster with tiny changes, but most of those changes are not causing any real change in maintenance burden. There's a lot of #[cfg] simply sprinkled around as part of this PR, and that's extremely difficult to maintain over time.

I think your use case is using cargo-as-a-library, right? If so I think that the best ways to solve this right now are:

  • Use cargo metadata if you can -- or if you can't try adding metadata printed by cargo metadata to parse
  • Split Cargo into separate crates so you don't have to pull in everything-at-once.

The latter strategy is unfortunately also somewhat tricky because to make Cargo manageable we'll want to make Cargo have its own workspace, but this runs afoul of including Cargo as a submodule into rust-lang/rust where we can't have nested workspaces.

Given that I feel like the best way forward would be to look into seeing if cargo metadata can be extended. That way the build time of cargo itself doesn't matter too too much.

@est31
Copy link
Member Author

est31 commented Jun 26, 2020

I wonder if there are cfg statements that are more straightforward than the ones added by this PR like

#[cfg(feature = "op-package-publish")]
pub fn publish(ws: &Workspace<'_>, opts: &PublishOpts<'_>) -> CargoResult<()> {

and

#[cfg(feature = "op-fix")]
mod fix;

but whatever, I'm giving up trying to convince you that this PR is useful.

Given your reluctance to merge this PR, I doubt I'll invest more time into further improvements, higher hanging fruit etc. I need to be able to trust that my contribution will be valued, that compile time is a priority. This doesn't really induce that trust.

Have a great day.

@est31 est31 closed this Jun 26, 2020
bors added a commit that referenced this pull request Jun 26, 2020
Remove unused remove_dir_all dependency

Originally part of #8384 but sadly the PR got rejected.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants