Implement unwrap_unchecked using transmutes when niche-optimizations are in play #102151

Closed
wants to merge 1 commit into from

Conversation

the8472
Member

@the8472 the8472 commented Sep 22, 2022

No description provided.

@rustbot rustbot added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Sep 22, 2022
@rust-highfive
Collaborator

r? @thomcc

(rust-highfive has picked a reviewer for you, use r? to override)

@rustbot
Collaborator

rustbot commented Sep 22, 2022

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

  • Stabilizing library features
  • Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
  • Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
  • Changing public documentation in ways that create new stability guarantees
  • Changing observable runtime behavior of library APIs

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 22, 2022
@the8472
Member Author

the8472 commented Sep 22, 2022

@bors try @rust-timer queue

@rust-timer
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 22, 2022
@bors
Contributor

bors commented Sep 22, 2022

⌛ Trying commit b852d54f74955dd93d08f60cb1d5f31fc17d1104 with merge a4bad50fb7be69e1d11952a70824ba9949d48ad7...

library/core/src/result.rs (outdated, resolved)
@the8472 the8472 force-pushed the unwrap-transmute branch 2 times, most recently from 013a7f9 to b3ca318 Compare September 22, 2022 20:13
@the8472
Member Author

the8472 commented Sep 22, 2022

@bors try

@bors
Contributor

bors commented Sep 22, 2022

⌛ Trying commit b3ca318d62aa09f0e6ce1e234b4932a98d53d724 with merge 3c5bba5b33e8613c13885b42641ada47186c9186...

@bors
Contributor

bors commented Sep 22, 2022

☀️ Try build successful - checks-actions
Build commit: 3c5bba5b33e8613c13885b42641ada47186c9186 (3c5bba5b33e8613c13885b42641ada47186c9186)

@rust-timer
Collaborator

Queued 3c5bba5b33e8613c13885b42641ada47186c9186 with parent e7119a0, future comparison URL.

@rust-timer
Collaborator

Finished benchmarking commit (3c5bba5b33e8613c13885b42641ada47186c9186): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

|  | mean [1] | range | count [2] |
|---|---|---|---|
| Regressions ❌ (primary) | 0.4% | [0.2%, 1.0%] | 21 |
| Regressions ❌ (secondary) | 0.5% | [0.1%, 1.1%] | 10 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.4% | [0.2%, 1.0%] | 21 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

|  | mean [1] | range | count [2] |
|---|---|---|---|
| Regressions ❌ (primary) | 2.7% | [2.7%, 2.7%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -1.7% | [-1.7%, -1.7%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.5% | [-1.7%, 2.7%] | 2 |

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

|  | mean [1] | range | count [2] |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -2.5% | [-2.8%, -2.2%] | 2 |
| Improvements ✅ (secondary) | -1.8% | [-1.8%, -1.8%] | 1 |
| All ❌✅ (primary) | -2.5% | [-2.8%, -2.2%] | 2 |

Footnotes

  1. the arithmetic mean of the percent change

  2. number of relevant changes

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Sep 23, 2022
@thomcc
Member

thomcc commented Sep 23, 2022

Hm, very surprising that this would have any overhead. I kind of suspect it's an artifact of the compilation process, but who knows.

@scottmcm
Member

Maybe it's just that the MIR for it is already pretty good. Essentially it's just:

if compute_discriminant(self) == 1 {
    _0 = move ((_1 as Some).0: T);
} else {
    call std::hint::unreachable_unchecked()
}

For a niched layout, that memcpy is roughly the same to LLVM as the transmute-copy, and noticing that the condition isn't needed at all and dropping the unnecessary code is already pretty cheap for LLVM.

And Rust ends up emitting all of the discriminant-calculation LLVM IR anyway: the if isn't folded away in the polymorphic MIR, so cg_llvm still has to output that code, which seems to lose any advantage the transmute might have had from being easier for LLVM to understand.

Maybe that could be improved by making it if const { mem::size_of::<T>() == mem::size_of::<Self>() } (thanks, #96557!) and reviving some of #91222 to be smarter about the resulting constant branches in codegen.
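As a rough illustration of the idea under discussion, here is a minimal sketch of a size-gated unwrap. The free function `unwrap_unchecked_sketch` is a hypothetical stand-in, not the PR's actual code. It uses a plain `if` on `size_of` rather than `if const`, and it relies on the PR's stated assumption that equal sizes imply a niche layout, which is current compiler behavior rather than a general language guarantee.

```rust
use std::mem;

/// Hypothetical free-function stand-in for `Option::unwrap_unchecked`.
///
/// # Safety
/// The caller must guarantee that `opt` is `Some`.
pub unsafe fn unwrap_unchecked_sketch<T>(opt: Option<T>) -> T {
    if mem::size_of::<T>() == mem::size_of::<Option<T>>() {
        // Assumption taken from the PR's SAFETY comment: equal sizes mean `None`
        // lives in a niche of `T`, so `Some(v)` carries the same bits as `v`.
        let val = unsafe { mem::transmute_copy::<Option<T>, T>(&opt) };
        // The bits now belong to `val`; skip the destructor of the original.
        mem::forget(opt);
        val
    } else {
        // Fallback: the usual match with an unreachable `None` arm.
        match opt {
            Some(v) => v,
            None => unsafe { std::hint::unreachable_unchecked() },
        }
    }
}
```

Both branches are type-checked for every `T`, but the size comparison is a constant after monomorphization, so only one of them should survive in optimized code.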

// SAFETY: Size equality implies niches are involved. And with niches
// transmutes are ok because they don't change bits, only make use of invalid values
unsafe {
let val = mem::transmute_copy(&self);
Member

YMMV: if you want to save the separate forget call, I think you can write this as

Suggested change
- let val = mem::transmute_copy(&self);
+ return mem::transmute_copy(&ManuallyDrop::new(self));

(Since forget is just putting it in a ManuallyDrop and ignoring it these days anyway.)
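To make the two spellings concrete, the sketch below puts them side by side. The helper names `take_via_forget` and `take_via_manually_drop` are hypothetical, and both assume the caller has already established that `Option<T>` and `T` have the same size.

```rust
use std::mem::{self, ManuallyDrop};

/// Hypothetical helper: copy the bits out, then forget the original.
///
/// # Safety
/// `opt` must be `Some`, and `Option<T>` must have the same size as `T`.
pub unsafe fn take_via_forget<T>(opt: Option<T>) -> T {
    debug_assert_eq!(mem::size_of::<T>(), mem::size_of::<Option<T>>());
    let val = unsafe { mem::transmute_copy::<Option<T>, T>(&opt) };
    // Without this, `opt` would be dropped here and `val` would be freed twice.
    mem::forget(opt);
    val
}

/// Hypothetical helper: wrap in `ManuallyDrop` first, so no `forget` is needed.
///
/// # Safety
/// Same contract as `take_via_forget`.
pub unsafe fn take_via_manually_drop<T>(opt: Option<T>) -> T {
    debug_assert_eq!(mem::size_of::<T>(), mem::size_of::<Option<T>>());
    // The temporary `ManuallyDrop` never runs the destructor of its contents.
    unsafe { mem::transmute_copy::<ManuallyDrop<Option<T>>, T>(&ManuallyDrop::new(opt)) }
}
```

With a reference-counted payload such as `Rc`, either helper should leave the strong count unchanged, which is a quick way to check that the original is neither dropped nor duplicated.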

@the8472
Member Author

the8472 commented Sep 23, 2022

@bors try @rust-timer queue

@rust-timer
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 23, 2022
@bors
Contributor

bors commented Sep 23, 2022

⌛ Trying commit 7157cfe with merge 24860e60db7c91164ed67469532d69c3ab700541...

@the8472
Member Author

the8472 commented Sep 23, 2022

> Maybe it's just that the MIR for it is already pretty good. Essentially it's just

But the LLVM IR contains an assume, which, as I understand it, is not without downsides. Transmuting avoids it entirely. https://rust.godbolt.org/z/4xorq6E5T
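For readers who want to compare the two shapes themselves, a small monomorphic pair along these lines can be compiled with `rustc -O --emit=llvm-ir`. The function names are hypothetical and this is not the code behind the linked example. `Option<NonZeroU32>` is used because its layout is documented to match `NonZeroU32`, with `None` represented as zero.

```rust
use std::hint::unreachable_unchecked;
use std::mem;
use std::num::NonZeroU32;

/// Match-based unwrap; the `None` arm is declared unreachable to the optimizer.
///
/// # Safety
/// `opt` must be `Some`.
pub unsafe fn unwrap_via_match(opt: Option<NonZeroU32>) -> NonZeroU32 {
    match opt {
        Some(v) => v,
        None => unsafe { unreachable_unchecked() },
    }
}

/// Transmute-based unwrap; the bits are reinterpreted without a branch.
///
/// # Safety
/// `opt` must be `Some`; transmuting `None` would yield an invalid zero value.
pub unsafe fn unwrap_via_transmute(opt: Option<NonZeroU32>) -> NonZeroU32 {
    unsafe { mem::transmute::<Option<NonZeroU32>, NonZeroU32>(opt) }
}
```

The two functions return the same value for any `Some` input; what differs is how much information about the discriminant reaches the optimizer, which is the trade-off this thread is weighing.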

@bors
Contributor

bors commented Sep 23, 2022

☀️ Try build successful - checks-actions
Build commit: 24860e60db7c91164ed67469532d69c3ab700541 (24860e60db7c91164ed67469532d69c3ab700541)

@rust-timer
Collaborator

Queued 24860e60db7c91164ed67469532d69c3ab700541 with parent 9a963e3, future comparison URL.

@rust-timer
Collaborator

Finished benchmarking commit (24860e60db7c91164ed67469532d69c3ab700541): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

|  | mean [1] | range | count [2] |
|---|---|---|---|
| Regressions ❌ (primary) | 0.4% | [0.2%, 0.7%] | 34 |
| Regressions ❌ (secondary) | 0.5% | [0.2%, 1.1%] | 13 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.4% | [0.2%, 0.7%] | 34 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

|  | mean [1] | range | count [2] |
|---|---|---|---|
| Regressions ❌ (primary) | 3.7% | [3.7%, 3.7%] | 1 |
| Regressions ❌ (secondary) | 2.0% | [2.0%, 2.0%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 3.7% | [3.7%, 3.7%] | 1 |

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

|  | mean [1] | range | count [2] |
|---|---|---|---|
| Regressions ❌ (primary) | 3.2% | [3.2%, 3.2%] | 1 |
| Regressions ❌ (secondary) | 2.6% | [2.6%, 2.6%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -3.9% | [-3.9%, -3.9%] | 1 |
| All ❌✅ (primary) | 3.2% | [3.2%, 3.2%] | 1 |

Footnotes

  1. the arithmetic mean of the percent change

  2. number of relevant changes

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 23, 2022
@the8472
Member Author

the8472 commented Sep 23, 2022

Compile-time losses are consistent with the previous run, but so are the binary-size changes (especially in opt-full builds), and it is spending more time in LLVM. So it is having an effect on the optimizer, just not the one expected.

I'll take a look at the generated assembly; maybe it's diffable.

@the8472
Member Author

the8472 commented Sep 24, 2022

This code is somewhere in RawVec; the right side is this branch.

image

I think LLVM makes use of that assume downstream, when the result of a next_unchecked is re-packaged into another Option.

Labels
perf-regression Performance regression. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue.

8 participants