Auto merge of #9955 - ehuss:benchsuite, r=Eh2406

Add the start of a basic benchmarking suite.

This adds the start of a basic benchmarking suite for cargo. This is fairly rough, but I figure it will change and evolve over time based on what we decide to add and how we use it. There is some documentation in the `benches/README.md` file which gives an overview of what is here and how to use it.

Closes #9935
rust-lang · Oct 12, 2021 · c8b38af
2 parents ad85ec9 + e4da5b2
commit c8b38af

Showing 18 changed files with 676 additions and 12 deletions.
diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
@@ -19,7 +19,7 @@ jobs:
     - run: rustup component add rustfmt
     - run: cargo fmt --all -- --check
     - run: |
-        for manifest in `find crates -name Cargo.toml`
+        for manifest in `find crates benches/benchsuite benches/capture -name Cargo.toml`
         do
           echo check fmt for $manifest
           cargo fmt --all --manifest-path $manifest -- --check
@@ -79,6 +79,15 @@ jobs:
       if: matrix.os == 'macos-latest'
     - run: cargo build --manifest-path crates/credential/cargo-credential-wincred/Cargo.toml
       if: matrix.os == 'windows-latest'
+    - name: Check benchmarks
+      env:
+        # Share the target dir to try to cache a few build-time deps.
+        CARGO_TARGET_DIR: target
+      run: |
+        # This only tests one benchmark since it can take over 10 minutes to
+        # download all workspaces.
+        cargo test --manifest-path benches/benchsuite/Cargo.toml --all-targets -- cargo
+        cargo check --manifest-path benches/capture/Cargo.toml
     - name: Fetch smoke test
       run: ci/fetch-smoke-test.sh
 

diff --git a/benches/README.md b/benches/README.md
@@ -0,0 +1,124 @@
+# Cargo Benchmarking
+
+This directory contains some benchmarks for cargo itself. This uses
+[Criterion] for running benchmarks. It is recommended to read the Criterion
+book to get familiar with how to use it. A basic usage would be:
+
+```sh
+cd benches/benchsuite
+cargo bench
+```
+
+The tests involve downloading the index and benchmarking against some
+real-world and artificial workspaces located in the [`workspaces`](workspaces)
+directory.
+
+**Beware** that the initial download can take a fairly long amount of time (10
+minutes minimum on an extremely fast network) and require significant disk
+space (around 4.5GB). The benchsuite will cache the index and downloaded
+crates in the `target/tmp/bench` directory, so subsequent runs should be
+faster. You can (and probably should) specify individual benchmarks to run to
+narrow it down to a more reasonable set, for example:
+
+```sh
+cargo bench -- resolve_ws/rust
+```
+
+This will only download what's necessary for the rust-lang/rust workspace
+(which is about 330MB) and run the benchmarks against it (which should take
+about a minute). To get a list of all the benchmarks, run:
+
+```sh
+cargo bench -- --list
+```
+
+## Viewing reports
+
+The benchmarks display some basic information on the command-line while they
+run. A more complete HTML report can be found at
+`target/criterion/report/index.html` which contains links to all the
+benchmarks and summaries. Check out the Criterion book for more information on
+the extensive reporting capabilities.
+
+## Comparing implementations
+
+Knowing the raw numbers can be useful, but what you're probably most
+interested in is checking if your changes help or hurt performance. To do
+that, you need to run the benchmarks multiple times.
+
+First, run the benchmarks from the master branch of cargo without any changes.
+To make it easier to compare, Criterion supports naming the baseline so that
+you can iterate on your code and compare against it multiple times.
+
+```sh
+cargo bench -- --save-baseline master
+```
+
+Now you can switch to your branch with your changes. Re-run the benchmarks
+compared against the baseline:
+
+```sh
+cargo bench -- --baseline master
+```
+
+You can repeat the last command as you make changes to re-compare against the
+master baseline.
+
+Without the baseline arguments, it will compare against the last run, which
+can be helpful for comparing incremental changes.
+
+## Capturing workspaces
+
+The [`workspaces`](workspaces) directory contains several workspaces that
+provide a variety of different workspaces intended to provide good exercises
+for benchmarks. Some of these are shadow copies of real-world workspaces. This
+is done with the tool in the [`capture`](capture) directory. The tool will
+copy `Cargo.lock` and all of the `Cargo.toml` files of the workspace members.
+It also adds an empty `lib.rs` so Cargo won't error, and sanitizes the
+`Cargo.toml` to some degree, removing unwanted elements. Finally, it
+compresses everything into a `tgz`.
+
+To run it, do:
+
+```sh
+cd benches/capture
+cargo run -- /path/to/workspace/foo
+```
+
+The resolver benchmarks also support the `CARGO_BENCH_WORKSPACES` environment
+variable, which you can point to a Cargo workspace if you want to try
+different workspaces. For example:
+
+```sh
+CARGO_BENCH_WORKSPACES=/path/to/some/workspace cargo bench
+```
+
+## TODO
+
+This is just a start for establishing a benchmarking suite for Cargo. There's
+a lot that can be added. Some ideas:
+
+* Fix the benchmarks so that the resolver setup doesn't run every iteration.
+* Benchmark [this section of
+  code](https://github.com/rust-lang/cargo/blob/a821e2cb24d7b6013433f069ab3bad53d160e100/src/cargo/ops/cargo_compile.rs#L470-L549)
+  which builds the unit graph. The performance there isn't great, and it would
+  be good to keep an eye on it. Unfortunately that would mean doing a bit of
+  work to make `generate_targets` publicly visible, and there is a bunch of
+  setup code that may need to be duplicated.
+* Benchmark the fingerprinting code.
+* Benchmark running the `cargo` executable. Running something like `cargo
+  build` or `cargo check` with everything "Fresh" would be a good end-to-end
+  exercise to measure the overall overhead of Cargo.
+* Benchmark pathological resolver scenarios. There might be some cases where
+  the resolver can spend a significant amount of time. It would be good to
+  identify if these exist, and create benchmarks for them. This may require
+  creating an artificial index, similar to the `resolver-tests`. This should
+  also consider scenarios where the resolver ultimately fails.
+* Benchmark without `Cargo.lock`. I'm not sure if this is particularly
+  valuable, since we are mostly concerned with incremental builds which will
+  always have a lock file.
+* Benchmark just
+  [`resolve::resolve`](https://github.com/rust-lang/cargo/blob/a821e2cb24d7b6013433f069ab3bad53d160e100/src/cargo/core/resolver/mod.rs#L122)
+  without anything else. This can help focus on just the resolver.
+
+[Criterion]: https://bheisler.github.io/criterion.rs/book/
diff --git a/benches/benchsuite/Cargo.toml b/benches/benchsuite/Cargo.toml
@@ -0,0 +1,21 @@
+[package]
+name = "benchsuite"
+version = "0.1.0"
+edition = "2018"
+license = "MIT OR Apache-2.0"
+homepage = "https://github.com/rust-lang/cargo"
+repository = "https://github.com/rust-lang/cargo"
+documentation = "https://docs.rs/cargo-platform"
+description = "Benchmarking suite for Cargo."
+
+[dependencies]
+cargo = { path = "../.." }
+# Consider removing html_reports in 0.4 and switching to `cargo criterion`.
+criterion = { version = "0.3.5", features = ["html_reports"] }
+flate2 = { version = "1.0.3", default-features = false, features = ["zlib"] }
+tar = { version = "0.4.35", default-features = false }
+url = "2.2.2"
+
+[[bench]]
+name = "resolve"
+harness = false
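
Note on `harness = false` in the `[[bench]]` section above: with the default libtest harness disabled, the bench target has to supply its own `main`. In this suite Criterion is expected to provide that entry point; the sketch below is not Criterion and not code from this PR. It is a stdlib-only illustration of what a hand-written harness does, and the helper name `mean_time` and the stand-in workload are assumptions made up for the example.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper (not part of this PR): runs `f` `iters` times and
/// returns the mean wall-clock time per call.
fn mean_time<F: FnMut()>(iters: u32, mut f: F) -> Duration {
    assert!(iters > 0, "need at least one iteration");
    let start = Instant::now();
    for _ in 0..iters {
        f();
    }
    // `Duration` supports division by `u32`.
    start.elapsed() / iters
}

// With `harness = false`, this `main` is the entry point of the bench target.
fn main() {
    // Stand-in workload; a real benchmark would call into the code under test.
    let mean = mean_time(1_000, || {
        let sum: u64 = (0..1_000u64).sum();
        // Keep the optimizer from discarding the computation.
        std::hint::black_box(sum);
    });
    println!("mean time per iteration: {:?}", mean);
}
```

A loop like this only reports a single mean, with no warm-up, outlier analysis, or baseline comparison, which is presumably why the suite relies on Criterion rather than a hand-rolled harness.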