Querify MonoItem collection #132566

saethlin · 2024-11-03T18:49:57Z

Factored out of #131650. These changes are required for post-mono MIR opts, because the previous implementation would load the MIR for every Instance that we traverse (as well as invoke queries on it). The cost of that would grow massively with post-mono MIR opts because we'll need to load new MIR for every Instance, instead of re-using the optimized_mir for every Instance with the same DefId.

So the approach here is to add two new queries, items_of_instance and size_estimate, which contain the specific information about an Instance's MIR that MirUsedCollector and CGU partitioning need, respectively. Caching these significantly increases the size of the query cache, but that's justified by our improved incrementality (I'm sure walking all the MIR for a huge crate scales quite poorly).

This also changes MonoItems into a type that will retain the traversal order (otherwise we perturb a bunch of diagnostics), and will also eliminate duplicate findings. Eliminating duplicates removes about a quarter of the query cache size growth.

The perf improvements in this PR are inflated because rustc-perf uses -Zincremental-verify-ich, which makes loading MIR a lot slower because MIR contains a lot of Spans and computing the stable hash of a Span is slow. And the primary goal of this PR is to load less MIR. Some squinting at collector profile_local perf-record +stage1 runs suggests the magnitude of the improvements in this PR would be decreased by between a third and a half if that flag weren't being used. Though this effect may apply to the regressions too since most are incr-full and this change also causes such builds to encode more Spans.

saethlin · 2024-11-03T18:50:08Z

@bors try @rust-timer queue

Querify mir collection Factored out of rust-lang#131650, these changes are required for post-mono MIR opts but I want to benchmark them on their own so that I can tune the implementation. r? ghost

bors · 2024-11-03T18:55:06Z

⌛ Testing commit 785a96b with merge 2c06ca1...

bors · 2024-11-03T19:07:22Z

💔 Test failed - checks-actions

saethlin · 2024-11-03T21:39:13Z

@bors try @rust-timer queue

bors · 2024-11-03T21:40:23Z

⌛ Trying commit 7155c30 with merge 292dbc9...

…try> Querify mir collection Factored out of rust-lang#131650, these changes are required for post-mono MIR opts but I want to benchmark them on their own so that I can tune the implementation. r? ghost

bors · 2024-11-03T23:33:23Z

☀️ Try build successful - checks-actions
Build commit: 292dbc9 (292dbc9fe6c4cd01f1788c79bc0170de7fe824f6)

rust-timer · 2024-11-04T02:08:07Z

Finished benchmarking commit (292dbc9): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	1.9%	[0.4%, 6.7%]	9
Regressions ❌ (secondary)	1.6%	[0.2%, 4.0%]	28
Improvements ✅ (primary)	-6.5%	[-24.7%, -0.2%]	69
Improvements ✅ (secondary)	-10.2%	[-67.2%, -0.2%]	21
All ❌✅ (primary)	-5.5%	[-24.7%, 6.7%]	78

Max RSS (memory usage)

Results (primary -7.7%, secondary -5.5%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	4.2%	[3.2%, 5.2%]	2
Regressions ❌ (secondary)	6.3%	[6.3%, 6.3%]	1
Improvements ✅ (primary)	-8.1%	[-23.2%, -1.3%]	67
Improvements ✅ (secondary)	-5.9%	[-16.3%, -2.1%]	27
All ❌✅ (primary)	-7.7%	[-23.2%, 5.2%]	69

Cycles

Results (primary -11.4%, secondary -24.6%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	5.7%	[5.7%, 5.7%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-11.7%	[-32.2%, -1.5%]	56
Improvements ✅ (secondary)	-24.6%	[-64.6%, -2.0%]	10
All ❌✅ (primary)	-11.4%	[-32.2%, 5.7%]	57

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 780.261s -> 782.209s (0.25%)
Artifact size: 335.33 MiB -> 335.25 MiB (-0.02%)

saethlin · 2024-11-04T04:58:01Z

@bors try @rust-timer queue

bors · 2024-11-04T04:59:10Z

⌛ Trying commit 1add34f with merge 381511a...

…try> Querify mir collection Factored out of rust-lang#131650, these changes are required for post-mono MIR opts but I want to benchmark them on their own so that I can tune the implementation. r? ghost

bors · 2024-11-04T06:55:13Z

☀️ Try build successful - checks-actions
Build commit: 381511a (381511aae66e203fd99050cf70e18ecdac59cf78)

rust-timer · 2024-11-04T10:56:25Z

Finished benchmarking commit (381511a): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	1.3%	[0.2%, 6.3%]	16
Regressions ❌ (secondary)	1.4%	[0.2%, 3.7%]	30
Improvements ✅ (primary)	-6.4%	[-24.6%, -0.2%]	69
Improvements ✅ (secondary)	-10.7%	[-67.2%, -0.2%]	20
All ❌✅ (primary)	-5.0%	[-24.6%, 6.3%]	85

Max RSS (memory usage)

Results (primary -7.8%, secondary -5.6%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	2.7%	[1.2%, 4.2%]	2
Regressions ❌ (secondary)	5.8%	[5.8%, 5.8%]	1
Improvements ✅ (primary)	-8.1%	[-23.0%, -1.2%]	66
Improvements ✅ (secondary)	-6.0%	[-16.4%, -2.0%]	24
All ❌✅ (primary)	-7.8%	[-23.0%, 4.2%]	68

Cycles

Results (primary -11.0%, secondary -22.6%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.6%	[0.6%, 0.6%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-11.2%	[-32.0%, -1.3%]	57
Improvements ✅ (secondary)	-22.6%	[-65.7%, -2.8%]	11
All ❌✅ (primary)	-11.0%	[-32.0%, 0.6%]	58

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 780.005s -> 780.627s (0.08%)
Artifact size: 335.27 MiB -> 335.17 MiB (-0.03%)

Kobzol · 2024-11-04T11:09:56Z

Awesome results! ~10-15% icount reduction on debug cargo builds is an incredible improvement. It does seem to grow the query cache/depgraph metadata on disk by 10-20%, but that is IMO acceptable for such a large incremental win.

saethlin · 2024-11-05T20:16:16Z

r? compiler

klensy · 2024-11-05T21:13:45Z

compiler/rustc_monomorphize/src/collector.rs

+
+    fn extend(&mut self, iter: impl IntoIterator<Item = Spanned<MonoItem<'tcx>>>) {
+        for item in iter {
+            self.push(item.clone());


clone is useless (item is Copy anyway)

rust-log-analyzer · 2024-11-08T00:21:34Z

The job mingw-check failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)

#16 2.938 Building wheels for collected packages: reuse
#16 2.939   Building wheel for reuse (pyproject.toml): started
#16 3.190   Building wheel for reuse (pyproject.toml): finished with status 'done'
#16 3.191   Created wheel for reuse: filename=reuse-4.0.3-cp310-cp310-manylinux_2_35_x86_64.whl size=132720 sha256=026f3bb0f1aa8090b861fd0a0939cb1a782396d84c8aab7875096557d637a0f6
#16 3.191   Stored in directory: /tmp/pip-ephem-wheel-cache-10dqia70/wheels/3d/8d/0a/e0fc6aba4494b28a967ab5eaf951c121d9c677958714e34532
#16 3.194 Installing collected packages: boolean-py, binaryornot, tomlkit, reuse, python-debian, markupsafe, license-expression, jinja2, chardet, attrs
#16 3.603 Successfully installed attrs-23.2.0 binaryornot-0.4.4 boolean-py-4.0 chardet-5.2.0 jinja2-3.1.4 license-expression-30.3.0 markupsafe-2.1.5 python-debian-0.1.49 reuse-4.0.3 tomlkit-0.13.0
#16 3.603 WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
#16 DONE 3.7s
---
 Documenting rustc_next_trait_solver v0.0.0 (/checkout/compiler/rustc_next_trait_solver)
error: unresolved link to `rustc_monomorphize::collector`
   --> compiler/rustc_middle/src/mir/mono.rs:537:32
    |
537 | /// See module-level docs of [`rustc_monomorphize::collector`] on some contect for "mentioned" items.
    |
    = note: `-D rustdoc::broken-intra-doc-links` implied by `-D warnings`
    = help: to override `-D warnings` add `#[allow(rustdoc::broken_intra_doc_links)]`


error: could not document `rustc_middle`
warning: build failed, waiting for other jobs to finish...
Command has failed. Rerun with -v to see more details.
  local time: Fri Nov  8 00:21:25 UTC 2024
  network time: Fri, 08 Nov 2024 00:21:25 GMT
##[error]Process completed with exit code 1.
Post job cleanup.

saethlin · 2024-11-08T01:32:03Z

Ouch. Not sure what to do about that now that the item needs to live in a different crate.

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Nov 3, 2024