branchless .filter(_).count() #39107
Conversation
r? @sfackler (rust_highfive has picked a reviewer for you, use r? to override)
// branchless count
c += (&mut predicate)(&x) as usize;
}
c
Maybe one doesn't want to argue style over trivial things, but I find the code cramped when something simple would work just as well:
fn count(mut self) -> usize {
    // Attempt to produce branchless count if possible
    let mut count = 0;
    for elt in self.iter {
        count += (self.predicate)(&elt) as usize;
    }
    count
}
The surrounding methods use .by_ref(), which seems archaic (just &mut self.iter would be fine if only a mutable reference is required).
Good idea, will change shortly (perhaps also change the other methods en passant).
Implementation looks good to me. I'll wait for others to see if they agree with this special case.
Thoughts @alexcrichton @aturon? I feel somewhat conflicted.
Seems fine to me, although it'd be nice to have some tests specifically for this combinator now that we're specializing the implementation.
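No such tests appear in this excerpt; the following is a minimal sketch of what a correctness test for the specialized count could look like (the test name and the fold-based reference are assumptions, not code from the PR):

#[test]
fn filter_count_matches_fold() {
    // The specialized count must agree with a straightforward fold.
    let xs: Vec<i32> = (0..100).collect();
    let fast = xs.iter().filter(|&&x| x % 3 == 0).count();
    let slow = xs.iter().filter(|&&x| x % 3 == 0).fold(0, |n, _| n + 1);
    assert_eq!(fast, slow);
    assert_eq!(fast, 34); // 0, 3, ..., 99
}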
Only one of the jobs failed, but the error looks related. Strange.
I think my concern with this is not the specific change, but rather, what's to stop someone from removing this specialization in the future because their particular data runs faster without it? Is a comment documenting it enough?
Only one job actually compiles Rust and runs the tests, so that's expected. @BurntSushi Do you mean that if this is added, it becomes part of guaranteed behaviour? I don't think it should be, and it should be possible to reevaluate changes like this.
@bluss No, I mean, what's to prevent someone, say a year from now, from submitting a PR that removes this specialization for reasons similar to those @llogiq provided? I'm not referring to the behavior, but rather to the process of adding/removing optimizations like this. I suppose the best answer to my concern is an executable benchmark, but I'm not sure we have the infrastructure in place for that? (And maybe this concern just doesn't matter much in practice.)
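The thread does not include such a benchmark; the following is a rough standalone sketch of the kind of measurement that could document the trade-off (the data size, the multiplicative mixing, and the roughly 50/50 predicate are assumptions, and wall-clock timing is only a crude stand-in for a proper harness):

use std::time::Instant;

fn main() {
    // Mixed data gives the branch predictor roughly 50/50 odds, the case
    // where a branchless count is expected to win.
    let data: Vec<u32> = (0..10_000_000u32)
        .map(|i| i.wrapping_mul(2654435761) >> 16)
        .collect();

    let start = Instant::now();
    let n = data.iter().filter(|&&x| x & 1 == 0).count();
    println!("filter().count(): {} in {:?}", n, start.elapsed());

    // Branchy reference loop for comparison.
    let start = Instant::now();
    let mut m = 0usize;
    for &x in &data {
        if x & 1 == 0 {
            m += 1;
        }
    }
    println!("manual if-loop:   {} in {:?}", m, start.elapsed());
}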
@BurntSushi Optimizations like this can be removed/added as we see fit; I don't think we're making a binding contract to implement this method exactly this way, so we're still open to future improvements if necessary. In any case this looks good to go, so @bors: r+
📌 Commit bfabe81 has been approved by alexcrichton
I found that the branchless version is only slower if we have little to no branch misses, which usually isn't the case. I observed speedups between -5% (perfect prediction) and 60% (real-world data).
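The user-facing call is unchanged by this specialization; a trivial usage sketch (the word list is illustrative):

fn main() {
    let words = ["alpha", "beta", "gamma", "delta"];
    // With this change the counting loop adds the predicate result
    // (a bool cast to usize) instead of branching on it.
    let long = words.iter().filter(|w| w.len() > 4).count();
    assert_eq!(long, 3);
}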