Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rustc_metadata: Preprocess search paths for better performance #132910

Merged
merged 1 commit into from
Nov 15, 2024

Conversation

osiewicz
Copy link
Contributor

@osiewicz osiewicz commented Nov 11, 2024

Over in Zed we've noticed that loading crates for a large-ish workspace (~100 members of workspace, over 1000 crates being built for the main binary target) can take almost 200ms. We've pinned it down to how rustc searches for paths to dependency files, as it performs a linear search over the list of candidate paths. In our case the candidate list had about 20k entries which we had to iterate over for each dependency being loaded. Our workspace is also pretty bottom-heavy, e.g. most of the workspace members pull in most of the transitive dependencies one way or another, which means that we spend quite some time loading crates at rustc startup.

This commit introduces a simple FilesIndex that's just a BTreeMap under the hood. Since crates are looked up by both prefix and suffix, we perform a range search on said BTree (which constraints the search space based on prefix) and follow up with a linear scan of entries with matching suffixes.

Overall, this commit brings down build time for us in dev scenarios by about 6%. 100ms might not seem like much, but this is a constant cost that each of our workspace crates has to pay, even when said crate is miniscule.

@rustbot
Copy link
Collaborator

rustbot commented Nov 11, 2024

r? @estebank

rustbot has assigned @estebank.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Nov 11, 2024
@rust-log-analyzer

This comment has been minimized.

@workingjubilee
Copy link
Member

I am pretty sure this should get the attention of

r? @saethlin

@rustbot rustbot assigned saethlin and unassigned estebank Nov 11, 2024
@petrochenkov petrochenkov self-assigned this Nov 12, 2024
@petrochenkov petrochenkov removed their assignment Nov 12, 2024
@osiewicz
Copy link
Contributor Author

In 081797d I've additionally started filtering out entries that cannot match any query; basically, given a target spec we can remove entries from the map before any query takes place.
This brings down the runtime for me by a bit; the size of refined set in Zed is ~5k entries versus 25k without refining. Unrefined set includes files such as CGUs. We don't need to check them.

@saethlin
Copy link
Member

Overall, this commit brings down build time for us in dev scenarios by about 6%.

Exactly how did you come up with this figure? I've cloned Zed and I'm looking through profiles of builds of the repo and while I certainly can see an impressive amount of time is being spent in this code, I don't think I can come up with an entire 6%. So I'm wondering if I just don't understand how to interact with this repo or if there's some particular workflow I should be investigating.

@osiewicz
Copy link
Contributor Author

osiewicz commented Nov 13, 2024

@saethlin this figure is an end-to-end timing of cargo build in root of zed workspace after touching crates/editor/src/editor.rs. This is one of the "hottest" files in our repo and there are multiple crates that depend on editor, thus this is the scenario we've looked at. We use plain touch instead of making code changes as we're not familiar with inner workings of incremental build system of rustc; in theory, touching should be the best case scenario for incremental build, as nothing really changes (well, macros and whatnot can get in a way, but that's not a factor for us) - plus it's enough to trigger a rebuild by cargo.
I've just reran these measurements against the commit this branch was created from (fbcdd72):
fbcdd72: 13.37 12.68 12.66 12.75 13.09 (avg: 12.91)
this branch: 11.48 11.59 11.68 11.52 11.81 (avg: 11.61)

I did not really mean for this figure to stand as an argument for merging this; it's just that I'm aware that 100ms (or whatever the figure is, depending on a package being built) might not seem like much, but it does add up in the long run for us.

@saethlin
Copy link
Member

I did not really mean for this figure to stand as an argument for merging this; it's just that I'm aware that 100ms (or whatever the figure is, depending on a package being built) might not seem like much, but it does add up in the long run for us.

I think you are underestimating how much we care about compiler performance. In general, a 6% win is quite significant.

I don't think I've seen any of the pathology you've identified via Zed in any of our existing benchmark suite. I just checked (what I think is our go-to "big crate"?) cargo-0.60.0 Debug IncrUnchanged (touch a file and run cargo build) and it looks completely different from Zed. I've also profiled builds of Zed a fair bit now and I think there are a number of code paths in rustc that are O(n) but scale with either workspace size or dependency tree size. Hopefully some of my other in-progress work will help.

But with just the two style nits, I think this PR is good to go. I'm very happy to see people identify a problem with the compiler and try to fix it :)

@rust-log-analyzer

This comment has been minimized.

@saethlin
Copy link
Member

@bors r+ rollup=never

@bors
Copy link
Contributor

bors commented Nov 14, 2024

📌 Commit b07313a has been approved by saethlin

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 14, 2024
@lqd
Copy link
Member

lqd commented Nov 14, 2024

maybe cleanup the commit history?

@saethlin
Copy link
Member

My kingdom for bors squash.

@bors r-

@osiewicz can you squash the commits together?

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Nov 14, 2024
Over in Zed we've noticed that loading crates for a large-ish workspace can take almost 200ms. We've pinned it down to how rustc searches for paths, as it performs a linear search over the list of candidate paths. In our case the candidate list had about 20k entries which we had to iterate over for each dependency being loaded.

This commit introduces a simple FilesIndex that's just a sorted Vec under the hood. Since crates are looked up by both prefix and suffix, we perform a range search on said Vec (which constraints the search space based on prefix) and follow up with a linear scan of entries with matching suffixes.
FilesIndex is also pre-filtered before any queries are performed using available target information; query prefixes/sufixes are based on the target we are compiling for, so we can remove entries that can never match up front.

Overall, this commit brings down build time for us in dev scenarios by about 6%.
100ms might not seem like much, but this is a constant cost that each of our workspace crates has to pay, even when said crate is miniscule.
@osiewicz osiewicz force-pushed the crate-loader-smarter-queries branch from b07313a to 42e71bb Compare November 15, 2024 09:37
@lqd
Copy link
Member

lqd commented Nov 15, 2024

@bors r=saethlin

@bors
Copy link
Contributor

bors commented Nov 15, 2024

📌 Commit 42e71bb has been approved by saethlin

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Nov 15, 2024
@bors
Copy link
Contributor

bors commented Nov 15, 2024

⌛ Testing commit 42e71bb with merge 76fd471...

@bors
Copy link
Contributor

bors commented Nov 15, 2024

☀️ Test successful - checks-actions
Approved by: saethlin
Pushing 76fd471 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Nov 15, 2024
@bors bors merged commit 76fd471 into rust-lang:master Nov 15, 2024
7 checks passed
@rustbot rustbot added this to the 1.84.0 milestone Nov 15, 2024
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (76fd471): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Our benchmarks found a performance regression caused by this PR.
This might be an actual regression, but it can also be just noise.

Next Steps:

  • If the regression was expected or you think it can be justified,
    please write a comment with sufficient written justification, and add
    @rustbot label: +perf-regression-triaged to it, to mark the regression as triaged.
  • If you think that you know of a way to resolve the regression, try to create
    a new PR with a fix for the regression.
  • If you do not understand the regression or you think that it is just noise,
    you can ask the @rust-lang/wg-compiler-performance working group for help (members of this group
    were already notified of this PR).

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

mean range count
Regressions ❌
(primary)
0.0% [0.0%, 0.0%] 1
Regressions ❌
(secondary)
2.2% [0.1%, 4.4%] 2
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-0.2% [-0.2%, -0.2%] 2
All ❌✅ (primary) 0.0% [0.0%, 0.0%] 1

Max RSS (memory usage)

Results (primary -3.0%, secondary 2.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
4.5% [0.9%, 9.6%] 7
Improvements ✅
(primary)
-3.0% [-3.0%, -3.0%] 1
Improvements ✅
(secondary)
-2.4% [-4.3%, -1.6%] 4
All ❌✅ (primary) -3.0% [-3.0%, -3.0%] 1

Cycles

Results (primary 2.3%, secondary 2.7%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
2.3% [2.3%, 2.3%] 1
Regressions ❌
(secondary)
2.7% [2.7%, 2.7%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 2.3% [2.3%, 2.3%] 1

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 788.822s -> 786.926s (-0.24%)
Artifact size: 335.37 MiB -> 335.38 MiB (0.00%)

@rustbot rustbot added the perf-regression Performance regression. label Nov 15, 2024
@saethlin
Copy link
Member

@rustbot label: +perf-regression-triaged

The reported regressions are extremely marginal, and the discussion above: #132910 (comment) contains instructions for how to demonstrate a quite significant benefit of this change.

@rustbot rustbot added the perf-regression-triaged The performance regression has been triaged. label Nov 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.