-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Apply optimization from #44355 to retain #48065
Conversation
(rust_highfive has picked a reviewer for you, use r? to override) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise seems fine. I'd like another set of eyes though since I'm not really familiar with this code... r? @BurntSushi
src/liballoc/vec.rs
Outdated
@@ -813,14 +813,19 @@ impl<T> Vec<T> { | |||
for i in 0..len { | |||
if !f(&v[i]) { | |||
del += 1; | |||
unsafe { | |||
ptr::read(&v[i]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be a call to drop_in_place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes it should. I didn't know that function existed, so thank you!
@BurntSushi this PR needs your review! |
cc @gankro |
Thanks @Xaeroxe! I wonder, could this perhaps be reimplemented now as |
Probably yes, the only advantage of this approach is that the drop_in_place call here makes the job of the optimizer easier during compilation. That being said, reducing code duplication may outweigh that benefit. |
Ah yeah for now we've mostly been optimizing for readability rather than codegen in libstd so there's no need to worry, this for sure isn't the only case! For now though yeah let's optimize for reducing code duplication |
@alexcrichton Done! Thanks for your help! |
@bors: r+ |
📌 Commit fbad3b2 has been approved by |
Apply optimization from rust-lang#44355 to retain As discussed in rust-lang#44355 this PR applies a similar optimization to `Vec::retain`. For `drain_filter`, a very similar function, this improved performance by up to 20%.
This PR reverts rust-lang#48065, which aimed to optimize `Vec::retain` by making use of `Vec::drain_filter`. Unfortunately at that time, `drain_filter` was unsound. The soundness hole in `Vec::drain_filter` was fixed in rust-lang#61224 by guaranteeing that cleanup logic runs via a nested `Drop`, even in the event of a panic. Implementing this nested drop affects codegen (apparently?) and results in slower code. Fixes rust-lang#65970
Restore original implementation of Vec::retain This PR reverts rust-lang#48065, which aimed to optimize `Vec::retain` by making use of `Vec::drain_filter`. Unfortunately at that time, `drain_filter` was unsound. The soundness hole in `Vec::drain_filter` was fixed in rust-lang#61224 by guaranteeing that cleanup logic runs via a nested `Drop`, even in the event of a panic. Implementing this nested drop affects codegen (apparently?) and results in slower code. Fixes rust-lang#65970
Optimize Vec::retain Use `copy_non_overlapping` instead of `swap` to reduce memory writes, like what we've done in rust-lang#44355 and `String::retain`. rust-lang#48065 already tried to do this optimization but it is reverted in rust-lang#67300 due to bad codegen of `DrainFilter::drop`. This PR re-implement the drop-then-move approach. I did a [benchmark](https://gist.github.com/oxalica/3360eec9376f22533fcecff02798b698) on small-no-drop, small-need-drop, large-no-drop elements with different predicate functions. It turns out that the new implementation is >20% faster in average for almost all cases. Only 2/24 cases are slower by 3% and 5%. See the link above for more detail. I think regression in may-panic cases is due to drop-guard preventing some optimization. If it's permitted to leak elements when predicate function of element's `drop` panic, the new implementation should be almost always faster than current one. I'm not sure if we should leak on panic, since there is indeed an issue (rust-lang#52267) complains about it before.
As discussed in #44355 this PR applies a similar optimization to
Vec::retain
. Fordrain_filter
, a very similar function, this improved performance by up to 20%.