use binary search instead of linear for get_val in merge #1548
Codecov Report
@@ Coverage Diff @@
## main #1548 +/- ##
=======================================
Coverage 93.93% 93.93%
=======================================
Files 249 249
Lines 46066 46066
=======================================
Hits 43274 43274
Misses 2792 2792
.binary_search_by(|&offset| offset.cmp(&pos))
.unwrap_or_else(|pos| pos - 1) as DocId; // Offsets start at 0, so -1 is safe

// There are duplicates in the data, but we want the one, where the next offset is not the
Wouldn't it be possible to get this behaviour directly by using `partition_point` instead of `binary_search_by`?
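For context on the question above: `binary_search_by` may return any matching index when the slice contains equal elements, which is presumably why the original code needed a follow-up loop to reach the last duplicate. A minimal sketch of that pattern, assuming a plain `&[u32]` slice of non-decreasing offsets whose first entry is 0; the function name is hypothetical and not taken from the PR:

```rust
/// Hypothetical helper (not from the PR): index of the last entry with
/// `offset <= pos`, found with a binary search plus a duplicate-skipping loop.
/// Assumes `offsets` is sorted, non-empty, and starts at 0.
fn last_at_or_before_with_binary_search(offsets: &[u32], pos: u32) -> usize {
    // Ok(i): some index whose value equals `pos` (any one of the duplicates).
    // Err(i): insertion point, so `i - 1` is the last entry smaller than `pos`.
    let mut idx = offsets
        .binary_search(&pos)
        .unwrap_or_else(|insert_at| insert_at - 1);

    // Walk forward over equal neighbours to reach the last duplicate.
    while idx + 1 < offsets.len() && offsets[idx + 1] == offsets[idx] {
        idx += 1;
    }
    idx
}
```

The extra loop is the part that a single `partition_point` call can make unnecessary.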
I was thinking of something like
`let new_doc_id = self.offsets.partition_point(|&offset| offset > pos) as DocId - 1;`
which should point at the last entry with `offset <= pos` immediately without the second loop, shouldn't it?
cool, that method is rather new, but I had to flip the condition to make the partition work (`[true, true, .., false, false, ..]`)
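To illustrate the flipped condition: `partition_point` expects the predicate to be true for a prefix of the slice and false for the rest, and it returns the index of the first element for which the predicate is false. The sketch below assumes a sorted `&[u32]` slice that starts at 0; both function names are hypothetical and only serve the illustration:

```rust
/// Hypothetical helper: shows the boolean pattern the predicate produces.
/// For ascending offsets, `offset <= pos` yields [true, .., false, ..].
fn predicate_shape(offsets: &[u32], pos: u32) -> Vec<bool> {
    offsets.iter().map(|&offset| offset <= pos).collect()
}

/// Hypothetical helper: index of the last entry with `offset <= pos`.
/// Assumes `offsets` is sorted and `offsets[0] <= pos`, so the
/// subtraction cannot underflow.
fn last_index_at_or_before(offsets: &[u32], pos: u32) -> usize {
    // Index of the first entry that is strictly greater than `pos`.
    let first_greater = offsets.partition_point(|&offset| offset <= pos);
    // One step back is the last entry that still satisfies `offset <= pos`,
    // which is also the last of any run of duplicates.
    first_greater - 1
}
```

With `offset > pos` the pattern on ascending data would be `[false, .., true, ..]`, which is not a valid partition for this method, so its result would be unspecified.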
Ah, I am sorry, I got confused. Yes, you did flip the condition, but shouldn't the extra while loop be unnecessary now, since you already get the last index which fulfils the original condition?
ah yes, I forgot to remove that
I think the mutable binding and the extra braces inside the closure are left over from the original implementation.
thank you @adamreichold
8a59df8 to 15f21d7
- 1u32;
// the offsets are strictly increasing so we can do a binary search on it.

let new_doc_id: DocId = self.offsets.partition_point(|&offset| offset <= pos) as DocId - 1; // Offsets start at 0, so -1 is safe
great comment!
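The "Offsets start at 0, so -1 is safe" comment can be checked with a small sketch: if the first offset is 0, every `pos >= 0` satisfies the predicate for at least one entry, so `partition_point` returns at least 1. The variant below is a defensive illustration rather than the PR's code. It assumes `u64` offsets and a `DocId` alias of `u32`, and the function name is hypothetical:

```rust
// Assumption: mirrors the `DocId` name used in the diff; the real alias may differ.
type DocId = u32;

/// Hypothetical defensive variant: returns `None` instead of underflowing
/// when no entry satisfies `offset <= pos` (empty slice or `offsets[0] > pos`).
fn doc_id_for_pos(offsets: &[u64], pos: u64) -> Option<DocId> {
    // Number of leading entries with `offset <= pos`.
    let first_greater = offsets.partition_point(|&offset| offset <= pos);
    // `checked_sub` yields `None` only when `first_greater` is 0.
    first_greater.checked_sub(1).map(|idx| idx as DocId)
}
```

When the slice is known to start at 0, the `None` branch is unreachable, which is the invariant the code comment relies on.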