Shrink `TyKind::FnPtr`. #128812

nnethercote · 2024-08-08T07:20:48Z

By splitting the FnSig within TyKind::FnPtr into FnSigTys and FnHeader, which can be packed more efficiently. This reduces the size of the hot TyKind type from 32 bytes to 24 bytes on 64-bit platforms. This reduces peak memory usage by a few percent on some benchmarks. It also reduces cache misses and page faults similarly, though this doesn't translate to clear cycles or wall-time improvements on CI.

r? @compiler-errors

rustbot · 2024-08-08T07:20:53Z

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

Some changes occurred to the CTFE / Miri engine

cc @rust-lang/miri

Some changes occurred in exhaustiveness checking

cc @Nadrieril

changes to the core type system

cc @compiler-errors, @lcnr

Some changes occurred in compiler/rustc_codegen_cranelift

cc @bjorn3

Some changes occurred in compiler/rustc_sanitizers

cc @rust-lang/project-exploit-mitigations, @rcvalle

Some changes occurred in src/tools/clippy

cc @rust-lang/clippy

nnethercote · 2024-08-08T07:35:18Z

@bors try @rust-timer queue

…try> Shrink `TyKind::FnPtr`. r? `@ghost`

bors · 2024-08-08T07:36:29Z

⌛ Trying commit f8ed8f1 with merge 5cf59d6...

bors · 2024-08-08T09:29:44Z

☀️ Try build successful - checks-actions
Build commit: 5cf59d6 (5cf59d66c11c39fb631bbe241ebfe4cf84296da7)

compiler-errors · 2024-08-08T15:44:12Z

(premature note given that this is still in testing phase, and you already are likely planning on it, but if i don't note it now i will certainly forget: given that this touches rustc type ir, make sure that either lcnr or i get to review this when it's ready)

rust-timer · 2024-08-08T22:17:57Z

Finished benchmarking commit (5cf59d6): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.5%	[-0.5%, -0.5%]	1
All ❌✅ (primary)	-	-	0

Max RSS (memory usage)

Results (primary -2.1%, secondary -0.4%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	6.9%	[6.9%, 6.9%]	1
Regressions ❌ (secondary)	4.1%	[1.9%, 5.6%]	5
Improvements ✅ (primary)	-3.4%	[-5.9%, -0.5%]	7
Improvements ✅ (secondary)	-2.4%	[-3.7%, -1.1%]	11
All ❌✅ (primary)	-2.1%	[-5.9%, 6.9%]	8

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 762.874s -> 762.449s (-0.06%)
Artifact size: 337.10 MiB -> 337.11 MiB (0.00%)

nnethercote · 2024-08-09T03:41:05Z

r? @compiler-errors

This is a peak memory use win in exchange for strictly worse ergonomics when dealing with TyKind::FnPtr. I think it's probably worth it: memory use wins are hard to find, TyKind is one of the most often instantiated types in the entire compiler, it has 21 variants and TyKind::FnPtr is the biggest one by eight bytes, and it's bigger only because the data it contains has too much padding. With my memory profiling hat on, it's screaming out to be optimized. If you check "Show non-relevant results" in the CI perf results you see it's a 0.4% peak memory reduction when averaged across every single benchmark, which might sound small but is actually pretty big. For several benchmarks the reduction is above 2%. Cache misses and page faults have similar numbers.

Here's the type size data:

print-type-size type: `rustc_middle::ty::ty_kind::TyKind<rustc_middle::ty::TyCtxt<'_>>`: 32 bytes, alignment: 8 bytes
print-type-size     discriminant: 1 bytes
print-type-size     variant `FnPtr`: 31 bytes
print-type-size         padding: 7 bytes
print-type-size         field `.0`: 24 bytes, alignment: 8 bytes

print-type-size type: `rustc_middle::infer::canonical::rustc_type_ir::Binder<rustc_middle::ty::TyCtxt<'_>, rustc_middle::ty::ty_kind::FnSig<rustc_middle::ty::TyCtxt<'_>>>`: 24 bytes, alignment: 8 bytes
print-type-size     field `.bound_vars`: 8 bytes
print-type-size     field `.value`: 16 bytes 

print-type-size type: `rustc_middle::ty::ty_kind::FnSig<rustc_middle::ty::TyCtxt<'_>>`: 16 bytes, alignment: 8 bytes
print-type-size     field `.inputs_and_output`: 8 bytes
print-type-size     field `.abi`: 2 bytes
print-type-size     field `.c_variadic`: 1 bytes
print-type-size     field `.safety`: 1 bytes
print-type-size     end padding: 4 bytes

The post-discriminant padding in TyKind is 7 bytes. When the FnSig in broken into two pieces, the compiler moves the four bytes abi/c_variadic/safety data into that 7 bytes.

I previously tried a different tack, interning FnSig and then putting a pointer to the interned value in TyKind::FnPtr, but the rustc_type_ir/rustc_middle split made it much uglier and more invasive than this approach.

rustbot · 2024-08-09T03:41:18Z

Some changes occurred in compiler/rustc_codegen_gcc

cc @antoyo, @GuillaumeGomez

compiler/rustc_type_ir/src/ty_kind.rs

compiler/rustc_type_ir/src/fast_reject.rs

compiler-errors · 2024-08-09T04:03:37Z

I'm generally of the opinion that the impact to the usability of ty::FnPtr is low enough, if you think from a performance perspective that this perf win is worthwhile. 0.4~2% peak memory reduction does seem pretty small to me lol, but I also don't know what those numbers mean in production.

I'm not certain if you're willing to waste time applying my naming nits now given that there's a chance that someone @rust-lang/types might have a differing opinion, given the nature of the change being more invasive than other "change the level of derefs that need to be applied to the type" type changes like interning data. I'd be happy to hold off on unnecessary churn until we can get anyone who wants to speak their mind a chance to do so.

compiler-errors · 2024-08-09T04:19:30Z

Would be worthwhile also to fill out the PR description summarizing the motivation and effect of this change, for anyone who sees this PR tomorrow.

It's unused.

I think it's a little clearer and nicer that way.

By splitting the `FnSig` within `TyKind::FnPtr` into `FnSigTys` and `FnHeader`, which can be packed more efficiently. This reduces the size of the hot `TyKind` type from 32 bytes to 24 bytes on 64-bit platforms. This reduces peak memory usage by a few percent on some benchmarks. It also reduces cache misses and page faults similarly, though this doesn't translate to clear cycles or wall-time improvements on CI.

nnethercote · 2024-08-09T04:39:19Z

I updated the names. It wasn't hard and your suggestions were a clear improvement on the original version.

lcnr · 2024-08-09T08:56:56Z

I think this change is acceptable to decrease the type size by 8 bytes 👍

compiler-errors

OK one final round of tweaks

compiler-errors · 2024-08-09T04:05:00Z

compiler/rustc_codegen_gcc/src/type_of.rs

-                    cx.fn_ptr_backend_type(cx.fn_abi_of_fn_ptr(sig, ty::List::empty()))
-                }
+                ty::FnPtr(ins_out, csa) => cx
+                    .fn_ptr_backend_type(cx.fn_abi_of_fn_ptr(ins_out.with(csa), ty::List::empty())),


Suggested change

.fn_ptr_backend_type(cx.fn_abi_of_fn_ptr(ins_out.with(csa), ty::List::empty())),

.fn_ptr_backend_type(cx.fn_abi_of_fn_ptr(ins_out.with(csa), ty::List::empty())),

One last bikeshed: I think that it's probably better if we expose the with function on the header -- i.e. hdr.with(sig_tys). What do you think?

I don't think there's much difference either way. There's no obvious natural order. Having said that, this order (sig_tys first, then hdr) matches the field order I used in the TyKind::FnPtr variant, which matches the field order in FnSig. So I'm inclined to leave it as is.

compiler/rustc_middle/src/ty/flags.rs

compiler/rustc_middle/src/ty/layout.rs

compiler/rustc_middle/src/ty/walk.rs

compiler/rustc_trait_selection/src/traits/select/candidate_assembly.rs

`is_fn_trait_compatible` is defined on both `FnSig` and `Binder<FnSig>`.

nnethercote · 2024-08-12T06:07:32Z

There are four new commits. I made the changes you suggested, except for the with case, for the reasons explained above.

compiler-errors

One more change that I don't think got applied

compiler/rustc_trait_selection/src/traits/select/candidate_assembly.rs

compiler-errors · 2024-08-14T00:29:28Z

@bors r+ rollup=never

bors · 2024-08-14T00:29:30Z

📌 Commit bbd1c3a has been approved by compiler-errors

It is now in the queue for this repository.

bors · 2024-08-14T00:56:56Z

⌛ Testing commit bbd1c3a with merge e9c965d...

bors · 2024-08-14T03:45:46Z

☀️ Test successful - checks-actions
Approved by: compiler-errors
Pushing e9c965d to master...

rust-timer · 2024-08-14T05:00:53Z

Finished benchmarking commit (e9c965d): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.2%	[-0.6%, -0.1%]	3
All ❌✅ (primary)	-	-	0

Max RSS (memory usage)

Results (primary -2.7%, secondary 0.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.0%	[1.0%, 1.0%]	1
Regressions ❌ (secondary)	4.1%	[2.4%, 5.3%]	4
Improvements ✅ (primary)	-3.6%	[-6.1%, -1.0%]	4
Improvements ✅ (secondary)	-2.5%	[-3.3%, -1.4%]	6
All ❌✅ (primary)	-2.7%	[-6.1%, 1.0%]	5

Cycles

Results (secondary 2.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.1%	[2.1%, 2.1%]	1
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-	-	0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 752.981s -> 754.011s (0.14%)
Artifact size: 341.50 MiB -> 341.48 MiB (-0.01%)

…ompiler-errors Shrink `TyKind::FnPtr`. By splitting the `FnSig` within `TyKind::FnPtr` into `FnSigTys` and `FnHeader`, which can be packed more efficiently. This reduces the size of the hot `TyKind` type from 32 bytes to 24 bytes on 64-bit platforms. This reduces peak memory usage by a few percent on some benchmarks. It also reduces cache misses and page faults similarly, though this doesn't translate to clear cycles or wall-time improvements on CI. r? `@compiler-errors`

Upgrades toolchain to 08/28 Culprit upstream changes: 1. rust-lang/rust#128812 2. rust-lang/rust#128703 3. rust-lang/rust#127679 4. rust-lang/rust-clippy#12993 5. rust-lang/cargo#14370 6. rust-lang/rust#128806 Resolves #3429 By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.

…ompiler-errors Shrink `TyKind::FnPtr`. By splitting the `FnSig` within `TyKind::FnPtr` into `FnSigTys` and `FnHeader`, which can be packed more efficiently. This reduces the size of the hot `TyKind` type from 32 bytes to 24 bytes on 64-bit platforms. This reduces peak memory usage by a few percent on some benchmarks. It also reduces cache misses and page faults similarly, though this doesn't translate to clear cycles or wall-time improvements on CI. r? `@compiler-errors`

nnethercote marked this pull request as draft August 8, 2024 07:20

nnethercote force-pushed the shrink-TyKind-FnPtr branch from e617475 to f8ed8f1 Compare August 8, 2024 07:30

This comment has been minimized.

Sign in to view

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 8, 2024

bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 8, 2024

Auto merge of rust-lang#128812 - nnethercote:shrink-TyKind-FnPtr, r=<…

5cf59d6

…try> Shrink `TyKind::FnPtr`. r? `@ghost`

This comment has been minimized.

Sign in to view

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 8, 2024

nnethercote force-pushed the shrink-TyKind-FnPtr branch 2 times, most recently from 790b78a to 3c7065d Compare August 9, 2024 00:57

rustbot assigned compiler-errors Aug 9, 2024

nnethercote marked this pull request as ready for review August 9, 2024 03:41

compiler-errors requested changes Aug 9, 2024

View reviewed changes

rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 9, 2024

nnethercote added 3 commits August 9, 2024 14:21

Remove Binder::split.

3fc5469

It's unused.

Split split_inputs_and_output in two.

8640998

I think it's a little clearer and nicer that way.

rust-lang deleted a comment from rustbot Aug 9, 2024

nnethercote force-pushed the shrink-TyKind-FnPtr branch from 3c7065d to c4717cc Compare August 9, 2024 04:39

compiler-errors reviewed Aug 10, 2024

View reviewed changes

nnethercote added 4 commits August 12, 2024 15:20

Remove some unnecessary skip_binder calls.

9e95a2b

`is_fn_trait_compatible` is defined on both `FnSig` and `Binder<FnSig>`.

Avoid a FnPtr deconstruct-and-recreate.

f4a3ed0

Give the field in FnSigTys a name.

b9f3db6

Streamline some inputs/output traversals.

bbd1c3a

compiler-errors reviewed Aug 14, 2024

View reviewed changes

compiler/rustc_trait_selection/src/traits/select/candidate_assembly.rs Show resolved Hide resolved

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Aug 14, 2024

bors added the merged-by-bors This PR was explicitly merged by bors. label Aug 14, 2024

bors merged commit e9c965d into rust-lang:master Aug 14, 2024
7 checks passed

rustbot added this to the 1.82.0 milestone Aug 14, 2024

bors mentioned this pull request Aug 14, 2024

Async drop codegen #123948

Open

nnethercote mentioned this pull request Aug 14, 2024

Change the return type of Ty::kind from &TyKind to TyKind. #126069

Closed

jaisnan mentioned this pull request Aug 28, 2024

Upgrade toolchain to 08/28 model-checking/kani#3454

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Shrink `TyKind::FnPtr`. #128812

Shrink `TyKind::FnPtr`. #128812

nnethercote commented Aug 8, 2024 •

edited

Loading

rustbot commented Aug 8, 2024

nnethercote commented Aug 8, 2024

This comment has been minimized.

bors commented Aug 8, 2024

This comment has been minimized.

bors commented Aug 8, 2024

This comment has been minimized.

compiler-errors commented Aug 8, 2024

rust-timer commented Aug 8, 2024

nnethercote commented Aug 9, 2024

rustbot commented Aug 9, 2024

compiler-errors commented Aug 9, 2024

compiler-errors commented Aug 9, 2024

nnethercote commented Aug 9, 2024

lcnr commented Aug 9, 2024

compiler-errors left a comment

compiler-errors Aug 9, 2024 •

edited

Loading

nnethercote Aug 12, 2024 •

edited

Loading

nnethercote commented Aug 12, 2024

compiler-errors left a comment

compiler-errors commented Aug 14, 2024

bors commented Aug 14, 2024

bors commented Aug 14, 2024

bors commented Aug 14, 2024

rust-timer commented Aug 14, 2024

	.fn_ptr_backend_type(cx.fn_abi_of_fn_ptr(ins_out.with(csa), ty::List::empty())),
	.fn_ptr_backend_type(cx.fn_abi_of_fn_ptr(ins_out.with(csa), ty::List::empty())),

Shrink TyKind::FnPtr. #128812

Shrink TyKind::FnPtr. #128812

Conversation

nnethercote commented Aug 8, 2024 • edited Loading

rustbot commented Aug 8, 2024

nnethercote commented Aug 8, 2024

This comment has been minimized.

bors commented Aug 8, 2024

This comment has been minimized.

bors commented Aug 8, 2024

This comment has been minimized.

compiler-errors commented Aug 8, 2024

rust-timer commented Aug 8, 2024

Overall result: ✅ improvements - no action needed

nnethercote commented Aug 9, 2024

rustbot commented Aug 9, 2024

compiler-errors commented Aug 9, 2024

compiler-errors commented Aug 9, 2024

nnethercote commented Aug 9, 2024

lcnr commented Aug 9, 2024

compiler-errors left a comment

Choose a reason for hiding this comment

compiler-errors Aug 9, 2024 • edited Loading

Choose a reason for hiding this comment

nnethercote Aug 12, 2024 • edited Loading

Choose a reason for hiding this comment

nnethercote commented Aug 12, 2024

compiler-errors left a comment

Choose a reason for hiding this comment

compiler-errors commented Aug 14, 2024

bors commented Aug 14, 2024

bors commented Aug 14, 2024

bors commented Aug 14, 2024

rust-timer commented Aug 14, 2024

Overall result: ✅ improvements - no action needed

Shrink `TyKind::FnPtr`. #128812

Shrink `TyKind::FnPtr`. #128812

nnethercote commented Aug 8, 2024 •

edited

Loading

compiler-errors Aug 9, 2024 •

edited

Loading

nnethercote Aug 12, 2024 •

edited

Loading