Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use partition_point instead of binary_search when looking up source lines #101999

Merged
merged 1 commit into from
Sep 22, 2022

Conversation

the8472
Copy link
Member

@the8472 the8472 commented Sep 18, 2022

In local benchmarks this results in 0.4% fewer cycles in a critical sequential section when compiling libcore.

@rust-highfive
Copy link
Collaborator

r? @davidtwco

(rust-highfive has picked a reviewer for you, use r? to override)

@rustbot rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Sep 18, 2022
@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 18, 2022
@the8472
Copy link
Member Author

the8472 commented Sep 18, 2022

let's see if perf agrees.

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 18, 2022
@rust-log-analyzer

This comment has been minimized.

…ines

In local benchmarks this results in 0.4% fewer cycles in a critical sequential
section when compiling libcore.
@the8472 the8472 force-pushed the source-lines-partition-point branch from 3619b0c to 40b3726 Compare September 18, 2022 22:48
@the8472
Copy link
Member Author

the8472 commented Sep 18, 2022

@bors try

@bors
Copy link
Contributor

bors commented Sep 18, 2022

⌛ Trying commit 40b3726 with merge 28a407643f8543f58b9b3cc67e70649d5827ae9c...

@bors
Copy link
Contributor

bors commented Sep 19, 2022

☀️ Try build successful - checks-actions
Build commit: 28a407643f8543f58b9b3cc67e70649d5827ae9c (28a407643f8543f58b9b3cc67e70649d5827ae9c)

@rust-timer
Copy link
Collaborator

Queued 28a407643f8543f58b9b3cc67e70649d5827ae9c with parent a37499a, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (28a407643f8543f58b9b3cc67e70649d5827ae9c): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean1 range count2
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.3% [-0.9%, -0.2%] 52
Improvements ✅
(secondary)
-0.6% [-1.6%, -0.2%] 35
All ❌✅ (primary) -0.3% [-0.9%, -0.2%] 52

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean1 range count2
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-3.2% [-3.2%, -3.1%] 2
All ❌✅ (primary) - - 0

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean1 range count2
Regressions ❌
(primary)
6.0% [6.0%, 6.0%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 6.0% [6.0%, 6.0%] 1

Footnotes

  1. the arithmetic mean of the percent change 2 3

  2. number of relevant changes 2 3

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 19, 2022
@davidtwco
Copy link
Member

@bors r+

@bors
Copy link
Contributor

bors commented Sep 21, 2022

📌 Commit 40b3726 has been approved by davidtwco

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 21, 2022
@bors
Copy link
Contributor

bors commented Sep 22, 2022

⌛ Testing commit 40b3726 with merge 3e50038...

@bors
Copy link
Contributor

bors commented Sep 22, 2022

☀️ Test successful - checks-actions
Approved by: davidtwco
Pushing 3e50038 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 22, 2022
@bors bors merged commit 3e50038 into rust-lang:master Sep 22, 2022
@rustbot rustbot added this to the 1.66.0 milestone Sep 22, 2022
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (3e50038): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean1 range count2
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.3% [-0.7%, -0.2%] 62
Improvements ✅
(secondary)
-0.6% [-1.6%, -0.2%] 41
All ❌✅ (primary) -0.3% [-0.7%, -0.2%] 62

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean1 range count2
Regressions ❌
(primary)
3.7% [3.7%, 3.7%] 1
Regressions ❌
(secondary)
4.1% [2.5%, 6.8%] 3
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-2.9% [-2.9%, -2.9%] 1
All ❌✅ (primary) 3.7% [3.7%, 3.7%] 1

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean1 range count2
Regressions ❌
(primary)
2.8% [2.8%, 2.8%] 1
Regressions ❌
(secondary)
7.9% [7.9%, 7.9%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 2.8% [2.8%, 2.8%] 1

Footnotes

  1. the arithmetic mean of the percent change 2 3

  2. number of relevant changes 2 3

@nnethercote
Copy link
Contributor

@the8472: Nice improvement! How did you find it?

@nnethercote
Copy link
Contributor

I'm also surprised it's such a good win, given that partition_point just calls binary_search_by.

@the8472
Copy link
Member Author

the8472 commented Sep 27, 2022

How did you find it?

I noticed that libcore has a long sequential bottleneck and ran it under perf and filtered the results down to the bottlenecked thread. The binary search stood out as something that I was familiar with. Almost all of the bottleneck is still there if you want to take a stab 🗡️.

I'm also surprised it's such a good win, given that partition_point just calls binary_search_by.

It turns an unpredictable 3-way branch (lt, eq, gt) into an also unpredictable 2-way branch. Which I think then get turned into cmovs, but I didn't check the assembly. binary_search is worthwhile when you expect to hit the element exactly, i.e. when the eq case can short-circuit. But for getting source lines that would mean looking up the first byte of a line which is unlikely, so partition_point is fine.

@scottmcm
Copy link
Member

scottmcm commented Sep 30, 2022

@nnethercote binary_search used to be the correct thing here, but #74024 changed its implementation to a form that's better for expensive comparators but worse for cheap comparators -- like BytePos comparison.

(On average, the item is found only in the last iteration, since all the iterations before the last one only looked at half the items. Thus checking for equality to early-exit makes all the iterations slower but on average only saves one iteration. It also adds a second exit condition to the loop, which makes it harder to canonicalize and thus can keep LLVM from being able to unroll it.)

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Oct 18, 2022
…r=m-ou-se

More slice::partition_point examples

After seeing the discussion of `binary_search` vs `partition_point` in rust-lang#101999, I thought some more example code could be helpful.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Oct 18, 2022
…r=m-ou-se

More slice::partition_point examples

After seeing the discussion of `binary_search` vs `partition_point` in rust-lang#101999, I thought some more example code could be helpful.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants