Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get !nonnull metadata on slice iterators, without assumes #113344

Merged
merged 1 commit into from
Jul 21, 2023

Conversation

scottmcm
Copy link
Member

@scottmcm scottmcm commented Jul 4, 2023

This updates the non-ZST paths to read the end pointer through a pointer-to-NonNull, so that they all get !nonnull metadata.

That means that the last assume(!ptr.is_null()) can be deleted, without impacting codegen -- the codegen tests confirm the LLVM-IR ends up exactly the same as before.

@rustbot
Copy link
Collaborator

rustbot commented Jul 4, 2023

r? @joshtriplett

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jul 4, 2023
@@ -68,7 +68,7 @@ pub struct Iter<'a, T: 'a> {
/// For non-ZSTs, the non-null pointer to the past-the-end element.
///
/// For ZSTs, this is `ptr::invalid(len)`.
end: *const T,
end_or_len: *const T,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish this could use a union, but it can't because that makes it able to store undef, which we don't want.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Time for a MostCertainlyInitialized<T>? 😄

pub fn slice_iter_is_empty(it: &std::slice::Iter<'_, u32>) -> bool {
// CHECK: %[[ENDP:.+]] = getelementptr{{.+}}ptr %it,{{.+}} 1
// CHECK: %[[END:.+]] = load ptr, ptr %[[ENDP]]
// CHECK-SAME: !nonnull
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bors
Copy link
Contributor

bors commented Jul 9, 2023

☔ The latest upstream changes (presumably #113508) made this pull request unmergeable. Please resolve the merge conflicts.

@scottmcm scottmcm force-pushed the alt-slice-zst-handing branch from 54997bf to 2e899a0 Compare July 19, 2023 02:39
@scottmcm
Copy link
Member Author

r? @the8472

@rustbot rustbot assigned the8472 and unassigned joshtriplett Jul 19, 2023
@the8472
Copy link
Member

the8472 commented Jul 20, 2023

Except for the comment it looks good.

Probably should do a perf run, although I do expect that if there are any regressions they would be due to llvm doing more work.

@bors
Copy link
Contributor

bors commented Jul 20, 2023

☔ The latest upstream changes (presumably #113758) made this pull request unmergeable. Please resolve the merge conflicts.

@scottmcm scottmcm force-pushed the alt-slice-zst-handing branch from 2e899a0 to 34732e8 Compare July 20, 2023 18:34
@scottmcm
Copy link
Member Author

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jul 20, 2023
@bors
Copy link
Contributor

bors commented Jul 20, 2023

⌛ Trying commit 34732e8 with merge f1e8277ffa6108869c09034cdcea0749e0856e26...

@bors
Copy link
Contributor

bors commented Jul 20, 2023

☀️ Try build successful - checks-actions
Build commit: f1e8277ffa6108869c09034cdcea0749e0856e26 (f1e8277ffa6108869c09034cdcea0749e0856e26)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (f1e8277ffa6108869c09034cdcea0749e0856e26): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.6% [0.4%, 0.9%] 3
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-5.2% [-8.2%, -0.3%] 5
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -3.0% [-8.2%, 0.9%] 8

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
4.7% [2.4%, 7.4%] 5
Regressions ❌
(secondary)
2.2% [2.2%, 2.2%] 1
Improvements ✅
(primary)
-2.2% [-2.6%, -1.2%] 12
Improvements ✅
(secondary)
-2.5% [-3.0%, -2.0%] 2
All ❌✅ (primary) -0.2% [-2.6%, 7.4%] 17

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.3% [0.8%, 1.9%] 2
Regressions ❌
(secondary)
1.8% [1.8%, 1.8%] 1
Improvements ✅
(primary)
-6.7% [-8.7%, -0.8%] 4
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -4.0% [-8.7%, 1.9%] 6

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.4% [0.1%, 1.2%] 90
Regressions ❌
(secondary)
0.6% [0.0%, 3.8%] 81
Improvements ✅
(primary)
-0.3% [-0.8%, -0.1%] 5
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.4% [-0.8%, 1.2%] 95

Bootstrap: 649.387s -> 652.821s (0.53%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Jul 20, 2023
@scottmcm
Copy link
Member Author

That's probably more neutral than the summary looks, since the 8% wins are in an opt-incr (though squinting at the graph I don't think they're noise?), but I think the perf results are entirely tolerable for landing this. There's unfortunately some more code in debug, since not being able to use a union makes the work this has to do more of a mess than I'd like (rust-lang/unsafe-code-guidelines#73 (comment)), but if it was really bad for perf it'd have a much worse perf impact than this, and this gives LLVM more annotations on the reads, so hopefully will help runtime more too.

@bors r=the8472

@bors
Copy link
Contributor

bors commented Jul 20, 2023

📌 Commit 34732e8 has been approved by the8472

It is now in the queue for this repository.

@bors bors removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 20, 2023
@bors bors added the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Jul 20, 2023
@bors
Copy link
Contributor

bors commented Jul 21, 2023

⌛ Testing commit 34732e8 with merge c720a9c...

@bors
Copy link
Contributor

bors commented Jul 21, 2023

☀️ Test successful - checks-actions
Approved by: the8472
Pushing c720a9c to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jul 21, 2023
@bors bors merged commit c720a9c into rust-lang:master Jul 21, 2023
@rustbot rustbot added this to the 1.73.0 milestone Jul 21, 2023
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (c720a9c): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
8.3% [8.3%, 8.3%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-1.1% [-8.3%, -0.5%] 63
Improvements ✅
(secondary)
-0.7% [-1.1%, -0.3%] 15
All ❌✅ (primary) -1.0% [-8.3%, 8.3%] 64

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
2.6% [2.4%, 2.9%] 2
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-2.2% [-2.3%, -2.1%] 4
Improvements ✅
(secondary)
-2.9% [-2.9%, -2.9%] 1
All ❌✅ (primary) -0.6% [-2.3%, 2.9%] 6

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
9.2% [9.2%, 9.2%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-5.9% [-9.1%, -3.4%] 9
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -4.4% [-9.1%, 9.2%] 10

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.4% [0.0%, 1.8%] 93
Regressions ❌
(secondary)
0.6% [0.0%, 3.8%] 81
Improvements ✅
(primary)
-0.2% [-0.2%, -0.2%] 2
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.4% [-0.2%, 1.8%] 95

Bootstrap: 651.863s -> 649.563s (-0.35%)

@scottmcm scottmcm deleted the alt-slice-zst-handing branch July 21, 2023 04:31
@scottmcm
Copy link
Member Author

The one regression appears to be some really bad luck in codegen, as something went very weird in LTO:
image

Not sure how this change would affect that one so badly, but not other things.

@nnethercote
Copy link
Contributor

Huh, I never realized that kind of graphical view was available on perf.rust-lang.org. TIL.

@the8472
Copy link
Member

the8472 commented Jul 21, 2023

Interesting that it makes such a difference. I would have assumed that the noundef loads and the sub nuw pointer math would already unlock most optimizations.

vec::IntoIter does quite similar stuff. I'll check if this trick can be applied there too.

@pnkfelix
Copy link
Member

Visiting for weekly performance triage

  • cranelift-codegen-0.82.1 opt-full regressed by 8.31%
  • a slew of other benchmarks improved (regex-1.5.5 incr-patched by -8.28%, bitmaps incr by 1.2-1.4%, the rest by -1% or less)
  • overall, a nice win. And @the8472 said they are going to check if they can apply a known trick to address the one regression that was flagged here. That's enough to let me mark this as triaged.

@rustbot label: +perf-regression-triaged

@rustbot rustbot added the perf-regression-triaged The performance regression has been triaged. label Jul 26, 2023
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 21, 2023
mark vec::IntoIter pointers as `!nonnull`

This applies the same NonNull optimizations to `vec::IntoIter` as  rust-lang#113344 did for `slice::Iter`

[Godbolt](https://rust.godbolt.org/z/n1cTea718) showing the test IR on current nightly, note the absence of `!nonnull` on the loads.

r? `@scottmcm`
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 21, 2023
mark vec::IntoIter pointers as `!nonnull`

This applies the same NonNull optimizations to `vec::IntoIter` as  rust-lang#113344 did for `slice::Iter`

[Godbolt](https://rust.godbolt.org/z/n1cTea718) showing the test IR on current nightly, note the absence of `!nonnull` on the loads.

r? `@scottmcm`
bors added a commit to rust-lang-ci/rust that referenced this pull request Jan 4, 2024
mark vec::IntoIter pointers as `!nonnull`

This applies the same NonNull optimizations to `vec::IntoIter` as  rust-lang#113344 did for `slice::Iter`

[Godbolt](https://rust.godbolt.org/z/n1cTea718) showing the test IR on current nightly, note the absence of `!nonnull` on the loads.

r? `@scottmcm`
bors added a commit to rust-lang-ci/rust that referenced this pull request Jan 7, 2024
mark vec::IntoIter pointers as `!nonnull`

This applies the same NonNull optimizations to `vec::IntoIter` as  rust-lang#113344 did for `slice::Iter`

[Godbolt](https://rust.godbolt.org/z/n1cTea718) showing the test IR on current nightly, note the absence of `!nonnull` on the loads.

r? `@scottmcm`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants