rustc_metadata: Preprocess search paths for better performance #132910

osiewicz · 2024-11-11T17:42:18Z

Over in Zed we've noticed that loading crates for a large-ish workspace (~100 members of workspace, over 1000 crates being built for the main binary target) can take almost 200ms. We've pinned it down to how rustc searches for paths to dependency files, as it performs a linear search over the list of candidate paths. In our case the candidate list had about 20k entries which we had to iterate over for each dependency being loaded. Our workspace is also pretty bottom-heavy, e.g. most of the workspace members pull in most of the transitive dependencies one way or another, which means that we spend quite some time loading crates at rustc startup.

This commit introduces a simple FilesIndex that's just a BTreeMap under the hood. Since crates are looked up by both prefix and suffix, we perform a range search on said BTree (which constraints the search space based on prefix) and follow up with a linear scan of entries with matching suffixes.

Overall, this commit brings down build time for us in dev scenarios by about 6%. 100ms might not seem like much, but this is a constant cost that each of our workspace crates has to pay, even when said crate is miniscule.

rustbot · 2024-11-11T17:42:27Z

r? @estebank

rustbot has assigned @estebank.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

workingjubilee · 2024-11-11T19:42:17Z

I am pretty sure this should get the attention of

r? @saethlin

compiler/rustc_session/src/search_paths.rs

compiler/rustc_metadata/src/locator.rs

osiewicz · 2024-11-12T18:20:56Z

In 081797d I've additionally started filtering out entries that cannot match any query; basically, given a target spec we can remove entries from the map before any query takes place.
This brings down the runtime for me by a bit; the size of refined set in Zed is ~5k entries versus 25k without refining. Unrefined set includes files such as CGUs. We don't need to check them.

compiler/rustc_metadata/src/creader.rs

saethlin · 2024-11-12T23:31:17Z

Overall, this commit brings down build time for us in dev scenarios by about 6%.

Exactly how did you come up with this figure? I've cloned Zed and I'm looking through profiles of builds of the repo and while I certainly can see an impressive amount of time is being spent in this code, I don't think I can come up with an entire 6%. So I'm wondering if I just don't understand how to interact with this repo or if there's some particular workflow I should be investigating.

osiewicz · 2024-11-13T11:09:21Z

@saethlin this figure is an end-to-end timing of cargo build in root of zed workspace after touching crates/editor/src/editor.rs. This is one of the "hottest" files in our repo and there are multiple crates that depend on editor, thus this is the scenario we've looked at. We use plain touch instead of making code changes as we're not familiar with inner workings of incremental build system of rustc; in theory, touching should be the best case scenario for incremental build, as nothing really changes (well, macros and whatnot can get in a way, but that's not a factor for us) - plus it's enough to trigger a rebuild by cargo.
I've just reran these measurements against the commit this branch was created from (fbcdd72):
fbcdd72: 13.37 12.68 12.66 12.75 13.09 (avg: 12.91)
this branch: 11.48 11.59 11.68 11.52 11.81 (avg: 11.61)

I did not really mean for this figure to stand as an argument for merging this; it's just that I'm aware that 100ms (or whatever the figure is, depending on a package being built) might not seem like much, but it does add up in the long run for us.

compiler/rustc_session/src/search_paths.rs

saethlin · 2024-11-14T02:24:23Z

I did not really mean for this figure to stand as an argument for merging this; it's just that I'm aware that 100ms (or whatever the figure is, depending on a package being built) might not seem like much, but it does add up in the long run for us.

I think you are underestimating how much we care about compiler performance. In general, a 6% win is quite significant.

I don't think I've seen any of the pathology you've identified via Zed in any of our existing benchmark suite. I just checked (what I think is our go-to "big crate"?) cargo-0.60.0 Debug IncrUnchanged (touch a file and run cargo build) and it looks completely different from Zed. I've also profiled builds of Zed a fair bit now and I think there are a number of code paths in rustc that are O(n) but scale with either workspace size or dependency tree size. Hopefully some of my other in-progress work will help.

But with just the two style nits, I think this PR is good to go. I'm very happy to see people identify a problem with the compiler and try to fix it :)

saethlin · 2024-11-14T15:20:56Z

@bors r+ rollup=never

bors · 2024-11-14T15:21:01Z

📌 Commit b07313a has been approved by saethlin

It is now in the queue for this repository.

lqd · 2024-11-14T22:09:37Z

maybe cleanup the commit history?

saethlin · 2024-11-14T22:24:10Z

My kingdom for bors squash.

@bors r-

@osiewicz can you squash the commits together?

Over in Zed we've noticed that loading crates for a large-ish workspace can take almost 200ms. We've pinned it down to how rustc searches for paths, as it performs a linear search over the list of candidate paths. In our case the candidate list had about 20k entries which we had to iterate over for each dependency being loaded. This commit introduces a simple FilesIndex that's just a sorted Vec under the hood. Since crates are looked up by both prefix and suffix, we perform a range search on said Vec (which constraints the search space based on prefix) and follow up with a linear scan of entries with matching suffixes. FilesIndex is also pre-filtered before any queries are performed using available target information; query prefixes/sufixes are based on the target we are compiling for, so we can remove entries that can never match up front. Overall, this commit brings down build time for us in dev scenarios by about 6%. 100ms might not seem like much, but this is a constant cost that each of our workspace crates has to pay, even when said crate is miniscule.

lqd · 2024-11-15T11:47:10Z

@bors r=saethlin

bors · 2024-11-15T11:47:14Z

📌 Commit 42e71bb has been approved by saethlin

It is now in the queue for this repository.

bors · 2024-11-15T11:48:24Z

⌛ Testing commit 42e71bb with merge 76fd471...

bors · 2024-11-15T15:05:53Z

☀️ Test successful - checks-actions
Approved by: saethlin
Pushing 76fd471 to master...

rust-timer · 2024-11-15T16:23:40Z

Finished benchmarking commit (76fd471): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Our benchmarks found a performance regression caused by this PR.
This might be an actual regression, but it can also be just noise.

Next Steps:

If the regression was expected or you think it can be justified,
please write a comment with sufficient written justification, and add
@rustbot label: +perf-regression-triaged to it, to mark the regression as triaged.
If you think that you know of a way to resolve the regression, try to create
a new PR with a fix for the regression.
If you do not understand the regression or you think that it is just noise,
you can ask the @rust-lang/wg-compiler-performance working group for help (members of this group
were already notified of this PR).

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	0.0%	[0.0%, 0.0%]	1
Regressions ❌ (secondary)	2.2%	[0.1%, 4.4%]	2
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.2%	[-0.2%, -0.2%]	2
All ❌✅ (primary)	0.0%	[0.0%, 0.0%]	1

Max RSS (memory usage)

Results (primary -3.0%, secondary 2.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	4.5%	[0.9%, 9.6%]	7
Improvements ✅ (primary)	-3.0%	[-3.0%, -3.0%]	1
Improvements ✅ (secondary)	-2.4%	[-4.3%, -1.6%]	4
All ❌✅ (primary)	-3.0%	[-3.0%, -3.0%]	1

Cycles

Results (primary 2.3%, secondary 2.7%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	2.3%	[2.3%, 2.3%]	1
Regressions ❌ (secondary)	2.7%	[2.7%, 2.7%]	1
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	2.3%	[2.3%, 2.3%]	1

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 788.822s -> 786.926s (-0.24%)
Artifact size: 335.37 MiB -> 335.38 MiB (0.00%)

saethlin · 2024-11-15T16:42:08Z

@rustbot label: +perf-regression-triaged

The reported regressions are extremely marginal, and the discussion above: #132910 (comment) contains instructions for how to demonstrate a quite significant benefit of this change.

rustbot assigned estebank Nov 11, 2024

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Nov 11, 2024

This comment has been minimized.

Sign in to view

rustbot assigned saethlin and unassigned estebank Nov 11, 2024

Mark-Simulacrum reviewed Nov 11, 2024

View reviewed changes

compiler/rustc_session/src/search_paths.rs Outdated Show resolved Hide resolved

petrochenkov self-assigned this Nov 12, 2024

petrochenkov reviewed Nov 12, 2024

View reviewed changes

compiler/rustc_metadata/src/locator.rs Outdated Show resolved Hide resolved

compiler/rustc_metadata/src/locator.rs Show resolved Hide resolved

petrochenkov removed their assignment Nov 12, 2024

bjorn3 reviewed Nov 12, 2024

View reviewed changes

compiler/rustc_metadata/src/creader.rs Outdated Show resolved Hide resolved

saethlin reviewed Nov 13, 2024

View reviewed changes

compiler/rustc_session/src/search_paths.rs Outdated Show resolved Hide resolved

compiler/rustc_session/src/search_paths.rs Outdated Show resolved Hide resolved

This comment has been minimized.

Sign in to view

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 14, 2024

bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Nov 14, 2024

osiewicz force-pushed the crate-loader-smarter-queries branch from b07313a to 42e71bb Compare November 15, 2024 09:37

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Nov 15, 2024

bors added the merged-by-bors This PR was explicitly merged by bors. label Nov 15, 2024

bors merged commit 76fd471 into rust-lang:master Nov 15, 2024
7 checks passed

rustbot added this to the 1.84.0 milestone Nov 15, 2024

bors mentioned this pull request Nov 15, 2024

Support multiple stability attributes on items #131824

Open

rustbot added the perf-regression Performance regression. label Nov 15, 2024

rustbot added the perf-regression-triaged The performance regression has been triaged. label Nov 15, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

rustc_metadata: Preprocess search paths for better performance #132910

rustc_metadata: Preprocess search paths for better performance #132910

osiewicz commented Nov 11, 2024 •

edited

Loading

rustbot commented Nov 11, 2024

This comment has been minimized.

workingjubilee commented Nov 11, 2024

osiewicz commented Nov 12, 2024

saethlin commented Nov 12, 2024

osiewicz commented Nov 13, 2024 •

edited

Loading

saethlin commented Nov 14, 2024

This comment has been minimized.

saethlin commented Nov 14, 2024

bors commented Nov 14, 2024

lqd commented Nov 14, 2024

saethlin commented Nov 14, 2024

lqd commented Nov 15, 2024

bors commented Nov 15, 2024

bors commented Nov 15, 2024

bors commented Nov 15, 2024

rust-timer commented Nov 15, 2024

saethlin commented Nov 15, 2024

rustc_metadata: Preprocess search paths for better performance #132910

rustc_metadata: Preprocess search paths for better performance #132910

Conversation

osiewicz commented Nov 11, 2024 • edited Loading

rustbot commented Nov 11, 2024

This comment has been minimized.

workingjubilee commented Nov 11, 2024

osiewicz commented Nov 12, 2024

saethlin commented Nov 12, 2024

osiewicz commented Nov 13, 2024 • edited Loading

saethlin commented Nov 14, 2024

This comment has been minimized.

saethlin commented Nov 14, 2024

bors commented Nov 14, 2024

lqd commented Nov 14, 2024

saethlin commented Nov 14, 2024

lqd commented Nov 15, 2024

bors commented Nov 15, 2024

bors commented Nov 15, 2024

bors commented Nov 15, 2024

rust-timer commented Nov 15, 2024

Overall result: ❌✅ regressions and improvements - please read the text below

Instruction count

Max RSS (memory usage)

Cycles

Binary size

saethlin commented Nov 15, 2024

osiewicz commented Nov 11, 2024 •

edited

Loading

osiewicz commented Nov 13, 2024 •

edited

Loading