Shrink `TyKind::FnPtr`. #128812
Some changes occurred to MIR optimizations cc @rust-lang/wg-mir-opt
Some changes occurred to the CTFE / Miri engine cc @rust-lang/miri
Some changes occurred in exhaustiveness checking cc @Nadrieril
changes to the core type system
Some changes occurred in compiler/rustc_codegen_cranelift cc @bjorn3
Some changes occurred in compiler/rustc_sanitizers cc @rust-lang/project-exploit-mitigations, @rcvalle
Some changes occurred in src/tools/clippy cc @rust-lang/clippy
e617475 to f8ed8f1
@bors try @rust-timer queue
…try> Shrink `TyKind::FnPtr`. r? `@ghost`
☀️ Try build successful - checks-actions
(premature note given that this is still in the testing phase, and you are likely already planning on it, but if I don't note it now I will certainly forget: given that this touches rustc type IR, make sure that either lcnr or I get to review this when it's ready)
Finished benchmarking commit (5cf59d6): comparison URL.
Overall result: ✅ improvements - no action needed
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
@bors rollup=never
Instruction count
This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)
Results (primary -2.1%, secondary -0.4%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles
This benchmark run did not return any relevant results for this metric.
Binary size
This benchmark run did not return any relevant results for this metric.
Bootstrap: 762.874s -> 762.449s (-0.06%)
790b78a to 3c7065d
This is a peak memory use win in exchange for strictly worse ergonomics when dealing with Here's the type size data:
The post-discriminant padding in I previously tried a different tack, interning |
Some changes occurred in compiler/rustc_codegen_gcc
I'm generally of the opinion that the impact to the usability of I'm not certain if you're willing to waste time applying my naming nits now given that there's a chance that someone @rust-lang/types might have a differing opinion, given the nature of the change being more invasive than other "change the level of derefs that need to be applied to the type" type changes like interning data. I'd be happy to hold off on unnecessary churn until we can get anyone who wants to speak their mind a chance to do so. |
It would also be worthwhile to fill out the PR description with a summary of the motivation and effect of this change, for anyone who sees this PR tomorrow.
It's unused.
I think it's a little clearer and nicer that way.
By splitting the `FnSig` within `TyKind::FnPtr` into `FnSigTys` and `FnHeader`, which can be packed more efficiently. This reduces the size of the hot `TyKind` type from 32 bytes to 24 bytes on 64-bit platforms. This reduces peak memory usage by a few percent on some benchmarks. It also reduces cache misses and page faults similarly, though this doesn't translate to clear cycles or wall-time improvements on CI.
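The packing effect described above can be sketched with simplified stand-in types. This is only an illustration under assumptions, not rustc's actual definitions: `TyList`, `BoundVars`, the two-variant `TyKindBefore`/`TyKindAfter` enums and the `sizes` helper are all hypothetical, and the layout of default-repr Rust types is unspecified, so the sketch is only expected to show the same direction of change (the split form being smaller), not the exact 32 and 24 byte figures reported for the real `TyKind`.

```rust
use std::mem::size_of;

// Hypothetical stand-ins for interned compiler data; only the fact that
// references to them are thin pointers matters for this sketch.
#[allow(dead_code)]
struct TyList(Vec<u32>);
#[allow(dead_code)]
struct BoundVars(Vec<u32>);

#[allow(dead_code)]
#[derive(Clone, Copy)]
enum Safety { Safe, Unsafe }

#[allow(dead_code)]
#[derive(Clone, Copy)]
enum Abi { Rust, C, System }

// Unsplit signature: one pointer plus three one-byte fields. As a single
// struct it is padded up to pointer alignment.
#[allow(dead_code)]
struct FnSig<'a> {
    inputs_and_output: &'a TyList,
    c_variadic: bool,
    safety: Safety,
    abi: Abi,
}

// Split form: the pointer-sized part ...
#[allow(dead_code)]
struct FnSigTys<'a> {
    inputs_and_output: &'a TyList,
}

// ... and the small byte-aligned part, which the enum layout is free to
// place in the padding next to the discriminant.
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct FnHeader {
    c_variadic: bool,
    safety: Safety,
    abi: Abi,
}

#[allow(dead_code)]
struct Binder<'a, T> {
    value: T,
    bound_vars: &'a BoundVars,
}

// A second data-carrying variant is included so the enum needs a real tag.
#[allow(dead_code)]
enum TyKindBefore<'a> {
    Ref(&'a u32, &'a u32, bool),
    FnPtr(Binder<'a, FnSig<'a>>),
}

#[allow(dead_code)]
enum TyKindAfter<'a> {
    Ref(&'a u32, &'a u32, bool),
    FnPtr(Binder<'a, FnSigTys<'a>>, FnHeader),
}

// Returns (size of the unsplit enum, size of the split enum) in bytes.
fn sizes() -> (usize, usize) {
    (
        size_of::<TyKindBefore<'static>>(),
        size_of::<TyKindAfter<'static>>(),
    )
}

fn main() {
    let (before, after) = sizes();
    println!("before: {before} bytes, after: {after} bytes");
}
```

In the unsplit enum the header bytes are locked inside `FnSig`, behind its own padding, so they cannot share space with the discriminant. Once `FnHeader` is a separate field of the variant, the compiler may place it in the bytes after the discriminant.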
3c7065d to c4717cc
I updated the names. It wasn't hard and your suggestions were a clear improvement on the original version.
I think this change is acceptable to decrease the type size by 8 bytes 👍
OK one final round of tweaks
    cx.fn_ptr_backend_type(cx.fn_abi_of_fn_ptr(sig, ty::List::empty()))
}
ty::FnPtr(ins_out, csa) => cx
    .fn_ptr_backend_type(cx.fn_abi_of_fn_ptr(ins_out.with(csa), ty::List::empty())),
One last bikeshed: I think that it's probably better if we expose the `with` function on the header -- i.e. `hdr.with(sig_tys)`. What do you think?
I don't think there's much difference either way. There's no obvious natural order. Having said that, this order (`sig_tys` first, then `hdr`) matches the field order I used in the `TyKind::FnPtr` variant, which matches the field order in `FnSig`. So I'm inclined to leave it as is.
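For readers following the `with` discussion, here is a minimal sketch of the `sig_tys.with(hdr)` shape, using simplified stand-in types. Everything here is an assumption for illustration: the real method operates on the compiler's own types, and `fn_sig_of`, the string-slice "types" and the two-field header are hypothetical.

```rust
// Simplified, hypothetical stand-ins; not rustc's definitions.
#[derive(Clone, Copy, Debug, PartialEq)]
struct FnHeader {
    c_variadic: bool,
    is_unsafe: bool,
}

#[derive(Clone, Copy, Debug, PartialEq)]
struct FnSigTys<'a> {
    inputs_and_output: &'a [&'a str],
}

#[derive(Clone, Copy, Debug, PartialEq)]
struct FnSig<'a> {
    inputs_and_output: &'a [&'a str],
    c_variadic: bool,
    is_unsafe: bool,
}

impl<'a> FnSigTys<'a> {
    // Recombine the two halves into a full signature. The receiver is the
    // types half, mirroring the (sig_tys, hdr) field order of the variant.
    fn with(self, hdr: FnHeader) -> FnSig<'a> {
        FnSig {
            inputs_and_output: self.inputs_and_output,
            c_variadic: hdr.c_variadic,
            is_unsafe: hdr.is_unsafe,
        }
    }
}

#[derive(Debug)]
enum TyKind<'a> {
    Bool,
    FnPtr(FnSigTys<'a>, FnHeader),
}

// Hypothetical helper: callers that still want a whole signature rebuild it
// at the match site.
fn fn_sig_of<'a>(kind: &TyKind<'a>) -> Option<FnSig<'a>> {
    match *kind {
        TyKind::FnPtr(sig_tys, hdr) => Some(sig_tys.with(hdr)),
        TyKind::Bool => None,
    }
}

fn main() {
    let tys = ["u32", "bool"];
    let kind = TyKind::FnPtr(
        FnSigTys { inputs_and_output: &tys },
        FnHeader { c_variadic: false, is_unsafe: true },
    );
    println!("{:?}", fn_sig_of(&kind));
    println!("{:?}", fn_sig_of(&TyKind::Bool));
}
```

Putting `with` on the header instead would only swap receiver and argument. The sketch follows the order the author chose to keep.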
compiler/rustc_trait_selection/src/traits/select/candidate_assembly.rs
`is_fn_trait_compatible` is defined on both `FnSig` and `Binder<FnSig>`.
There are four new commits. I made the changes you suggested, except for the |
One more change that I don't think got applied
@bors r+ rollup=never
☀️ Test successful - checks-actions
Finished benchmarking commit (e9c965d): comparison URL.
Overall result: ✅ improvements - no action needed
@rustbot label: -perf-regression
Instruction count
This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)
Results (primary -2.7%, secondary 0.1%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles
Results (secondary 2.1%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size
This benchmark run did not return any relevant results for this metric.
Bootstrap: 752.981s -> 754.011s (0.14%)
…ompiler-errors Shrink `TyKind::FnPtr`. By splitting the `FnSig` within `TyKind::FnPtr` into `FnSigTys` and `FnHeader`, which can be packed more efficiently. This reduces the size of the hot `TyKind` type from 32 bytes to 24 bytes on 64-bit platforms. This reduces peak memory usage by a few percent on some benchmarks. It also reduces cache misses and page faults similarly, though this doesn't translate to clear cycles or wall-time improvements on CI. r? `@compiler-errors`
Upgrades toolchain to 08/28
Culprit upstream changes:
1. rust-lang/rust#128812
2. rust-lang/rust#128703
3. rust-lang/rust#127679
4. rust-lang/rust-clippy#12993
5. rust-lang/cargo#14370
6. rust-lang/rust#128806
Resolves #3429
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.