Add iterator specialisations for Repeat and Cycle #47370
Could you clarify the rationale for doing this? Who is asking for the maximum value of an infinite iterator?
29ba2f2 to d6e2867
@sfackler: I think this provides the least surprising behaviour, is semantically correct (if we can determine the result of a method call, I think we should do it, rather than looping) and provides more flexibility for generic functions. It's an edge case, but if it's one the compiler can worry about instead of the user, that seems like a plus.
I'm not a fan of this change; it feels very brittle. There are a ton of infinite iterators where this wouldn't be the case, including ones very similar to ones where it applies -- even just `.filter(|_| true)` stops these updates from working.
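A minimal sketch of the brittleness described above, using a hypothetical finite iterator `Upto` (not part of the PR) so that the example terminates. It overrides `count` and records each time the override runs; wrapping the iterator in `.filter(|_| true)` means the adaptor's own implementation is used and the override is never reached.

```rust
use std::cell::Cell;

// Hypothetical iterator, for illustration only: yields next..end and
// overrides `count` with a shortcut that records every call.
struct Upto<'a> {
    next: u32,
    end: u32,
    override_calls: &'a Cell<u32>,
}

impl<'a> Iterator for Upto<'a> {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.next < self.end {
            let value = self.next;
            self.next += 1;
            Some(value)
        } else {
            None
        }
    }

    // Specialised shortcut: answer without stepping through the items.
    fn count(self) -> usize {
        self.override_calls.set(self.override_calls.get() + 1);
        (self.end - self.next) as usize
    }
}

// Returns (count, number of times the override ran).
fn count_direct() -> (usize, u32) {
    let calls = Cell::new(0);
    let n = Upto { next: 0, end: 5, override_calls: &calls }.count();
    (n, calls.get())
}

// Same iterator behind a no-op filter: `Filter` has to test every item,
// so it cannot delegate to the inner `count`.
fn count_through_filter() -> (usize, u32) {
    let calls = Cell::new(0);
    let n = Upto { next: 0, end: 5, override_calls: &calls }
        .filter(|_| true)
        .count();
    (n, calls.get())
}

fn main() {
    println!("direct:   {:?}", count_direct());
    println!("filtered: {:?}", count_through_filter());
}
```

For a finite iterator only the cost differs. For an infinite one, the same effect would decide whether the call returns at all.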
src/libcore/iter/mod.rs
Outdated
@@ -643,6 +643,41 @@ impl<I> Iterator for Cycle<I> where I: Clone + Iterator {
            _ => (usize::MAX, None)
        }
    }

    #[inline]
    fn all<F>(&mut self, f: F) -> bool where F: FnMut(Self::Item) -> bool { self.orig.clone().all(f) }
Using a new clone of `orig` instead of using `iter` is wrong here, since it'll short-circuit in a different place.
Ah, that's true.
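The difference can be sketched as follows. The two helper functions are hypothetical; the second one only models the quoted approach by running `all` on a fresh copy of the source, it is not the PR's code. Both return the same boolean, but the cycle is left at a different position afterwards.

```rust
// Default behaviour: `all` runs on the cycle itself, so it consumes items
// up to and including the first one that fails the predicate.
fn all_on_cycle_itself() -> (bool, Option<i32>) {
    let mut it = vec![1, 2, 3].into_iter().cycle();
    it.next(); // yields 1
    it.next(); // yields 2
    let result = it.all(|x| x < 2); // 3 fails and is consumed
    (result, it.next())
}

// Model of the quoted approach: `all` runs on a fresh copy of the source,
// so the cycle's own position is left untouched.
fn all_on_fresh_clone() -> (bool, Option<i32>) {
    let orig = vec![1, 2, 3].into_iter();
    let mut it = orig.clone().cycle();
    it.next(); // yields 1
    it.next(); // yields 2
    let result = orig.clone().all(|x| x < 2); // stops at 2 inside the copy
    (result, it.next())
}

fn main() {
    println!("on the cycle itself: {:?}", all_on_cycle_itself());
    println!("on a fresh clone:    {:?}", all_on_fresh_clone());
}
```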
    #[inline]
    fn size_hint(&self) -> (usize, Option<usize>) { (usize::MAX, None) }

    #[inline]
    fn nth(&mut self, _: usize) -> Option<A> { self.next() }
I like this one more than the others here, but it's an observable change if `A::clone` has side-effects. Is that allowable?
Oh, I didn't consider that. It seems to me that `A::clone` is an implementation detail that shouldn't be relied upon, but as it's technically observable at the moment, and the documentation doesn't specify otherwise, it might be worth clarifying this.
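To make the observable difference concrete, here is a small sketch. `Noisy`, `PlainRepeat`, `ShortcutRepeat` and `clones_seen` are all hypothetical stand-ins written for this example; the standard `Repeat` is deliberately not used, since its clone count may depend on the library version. One iterator relies on the default `nth`, the other uses the override from the diff.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical element type whose `clone` has an observable side effect.
struct Noisy {
    clones: Rc<Cell<u32>>,
}

impl Clone for Noisy {
    fn clone(&self) -> Self {
        self.clones.set(self.clones.get() + 1);
        Noisy { clones: Rc::clone(&self.clones) }
    }
}

// Stand-in for a repeat iterator that uses the default `nth`.
struct PlainRepeat<A>(A);

impl<A: Clone> Iterator for PlainRepeat<A> {
    type Item = A;
    fn next(&mut self) -> Option<A> {
        Some(self.0.clone())
    }
}

// Stand-in with the override from the diff: skip straight to one `next`.
struct ShortcutRepeat<A>(A);

impl<A: Clone> Iterator for ShortcutRepeat<A> {
    type Item = A;
    fn next(&mut self) -> Option<A> {
        Some(self.0.clone())
    }
    fn nth(&mut self, _: usize) -> Option<A> {
        self.next()
    }
}

// Builds an iterator around a fresh `Noisy`, calls `nth(n)`, and reports
// how many clones were made.
fn clones_seen<I, F>(make: F, n: usize) -> u32
where
    I: Iterator<Item = Noisy>,
    F: FnOnce(Noisy) -> I,
{
    let counter = Rc::new(Cell::new(0));
    let mut it = make(Noisy { clones: Rc::clone(&counter) });
    let _ = it.nth(n);
    counter.get()
}

fn main() {
    println!("default nth(3):  {} clones", clones_seen(|x| PlainRepeat(x), 3));
    println!("shortcut nth(3): {} clones", clones_seen(|x| ShortcutRepeat(x), 3));
}
```

The default `nth(n)` calls `next` for each skipped item and once more for the result, so the side effect runs `n + 1` times instead of once.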
src/libcore/iter/sources.rs
Outdated
    fn nth(&mut self, _: usize) -> Option<A> { self.next() }

    #[inline]
    fn all<F>(&mut self, f: F) -> bool where F: FnMut(A) -> bool { self.any(f) }
I'm confused by this one, since it looks like it can still be infinite. If we want this kind of behavior, wouldn't it be `f(self.element.clone())`?
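A rough sketch of the suggestion, assuming the predicate is pure (a stateful `FnMut` could answer differently on later calls). `repeat_all_shortcut` is a hypothetical free function standing in for the proposed method body, not code from the PR.

```rust
use std::iter;

// Hypothetical helper: every item of a repeat iterator is a clone of the
// same element, so one predicate call decides `all` for a pure predicate.
fn repeat_all_shortcut<A: Clone, F: FnMut(A) -> bool>(element: &A, mut f: F) -> bool {
    f(element.clone())
}

fn main() {
    // Predicate fails: the standard `all` stops at the first item, and the
    // shortcut gives the same answer.
    assert!(!iter::repeat(1).all(|x| x > 5));
    assert!(!repeat_all_shortcut(&1, |x| x > 5));

    // Predicate holds: looping over the items would never finish, while
    // the shortcut returns `true` after a single call.
    assert!(repeat_all_shortcut(&1, |x| x < 5));
    println!("ok");
}
```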
Which generic functions?
IIUC some of these specializations make functions that previously looped infinitely return some value. Similar changes were previously proposed in #47169 where I and others argued against them. The arguments against behavioral changes apply to these behavioral changes as well (the ones about |
Seeing how this change is a little controversial, maybe I should have opened an issue to discuss it first (#47169 discusses some related aspects, but I think the main issues there were not all the same as here as I'll mention below) — but maybe it's helpful to have an example implementation to look at while discussing. (Sorry, slightly long post, but I do really think this behaviour is more sensible and intuitive than the existing behaviour, though I agree it does need to be discussed before making a decision.)
I believe there is a lot of value in the default implementations being "canonical" and all specializations implementing the exact same behavior, just with different performance.[1] (Third party iterators can always screw this up, of course.) It means that programmers don't need to know at all whether a specific iterator overrides some method, not even for reasoning about termination or rare and obscure side effects. It would be extremely surprising and counter-intuitive if, for example, adding a
In short, as I already expressed in #47169, I believe it's an attractive nuisance to try and "Do What I Mean" about infinite iterators some of the time, because then the many more complex cases where we can't DWIM will be very confusing.
[1] This means I would actually be fine with overriding

    #[inline]
    fn all<F>(&mut self, f: F) -> bool where F: FnMut(Self::Item) -> bool {
        self.iter.clone().chain(self.orig.clone()).all(f)
If this short-circuits with an element inside `self.orig`, this iterator should continue with the next element after the matching one in `self.orig`. Currently it'll instead restart with the first element in `self.orig`.
Ah, that's a good point. I'll fix that if the decision is to merge.

    #[inline]
    fn min(self) -> Option<Self::Item> where Self::Item: cmp::Ord {
        cmp::min(self.iter.min(), self.orig.clone().min())
`self.iter` is a subset of `self.orig`, so the minimum of `self.orig` will be at least as small as the minimum of `self.iter`.
Yeah, this is necessary to preserve the consistency of the behaviour of `min` when multiple elements are equally minimum. It's necessary to return the first/leftmost one, which in this case may not be the same as the first one in `self.orig`.
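The tie-breaking point can be illustrated with a small sketch. `Tagged` and `cycle_min` are hypothetical names introduced here: `Tagged` orders by `key` only, so equal minima can be told apart by `tag`, and `cycle_min` models the quoted expression on a slice, assuming `consumed` is at most the slice length. The sketch also guards the case where nothing remains in the current pass, because `None` orders before `Some` and `cmp::min` would otherwise return `None`.

```rust
use std::cmp::{self, Ordering};

// Hypothetical item type: ordered by `key` only, `tag` tells ties apart.
#[derive(Clone, Copy, Debug)]
struct Tagged {
    key: u32,
    tag: char,
}

impl PartialEq for Tagged {
    fn eq(&self, other: &Self) -> bool {
        self.key == other.key
    }
}
impl Eq for Tagged {}
impl PartialOrd for Tagged {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl Ord for Tagged {
    fn cmp(&self, other: &Self) -> Ordering {
        self.key.cmp(&other.key)
    }
}

// Minimum of the endless sequence that starts `consumed` items into the
// first pass over `orig`, keeping the earliest item among equal minima.
fn cycle_min(orig: &[Tagged], consumed: usize) -> Option<Tagged> {
    let rest_min = orig[consumed..].iter().copied().min();
    let full_min = orig.iter().copied().min();
    match rest_min {
        // `cmp::min` returns its first argument on a tie, so an equally
        // small item from the rest of the current pass wins.
        Some(_) => cmp::min(rest_min, full_min),
        // Nothing left in the current pass: the next full pass decides.
        None => full_min,
    }
}

fn main() {
    let orig = [
        Tagged { key: 1, tag: 'a' },
        Tagged { key: 2, tag: 'b' },
        Tagged { key: 1, tag: 'c' },
    ];
    // Two items already consumed: the sequence continues c, a, b, c, ...
    println!("{:?}", cycle_min(&orig, 2));
    // Looking at the source alone would pick 'a' instead.
    println!("{:?}", orig.iter().copied().min());
}
```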
Could similar optimizations be performed for |
@oberien: There's a separate PR for |
👎 on the general idea of replacing an unbounded sequence with a shortcut that is surprising and possibly quickly disappears when the iterator is part of a composition with adaptors. In particular, I think this breaks equivalence between
Anything you improve in |
Okay, I'm inclined to agree that
Edit: Without defining these specialisations for all relevant methods, it gets messy with chaining. Maybe it is too late to make a change to the semantics now, without introducing explicit new traits. |
It seems there's a general consensus about the behaviour of methods on infinite iterators, so I'll go ahead and close the PR. I think a documentation tweak is in order to clarify these details though, as following these discussions I don't think there's a single behaviour that is completely obvious. Thanks for all your input everyone! One last matter I wanted to clear up regarding Does anyone feel there are issues with even |
Not quite what you're talking about, but we do take the liberty to omit clone calls for things that are Copy: https://github.com/rust-lang/rfcs/blob/master/text/1521-copy-clone-semantics.md However, the justification for that is stronger than "Clone should not have side effects period". |
There's at least an effort to maintain the clone order for |
@rkruppe: That does seem to be in the same vein. Arguably, it makes more sense to have to justify why calling @qnighy: Sorry, I don't quite see what you're referring to there. Isn't that an optimisation for |
@varkor Sorry for the lack of explanation. The
struct Foo;
impl Clone for Foo {
    fn clone(&self) -> Self {
        println!("Foo::clone()");
        Foo
    }
}
fn main() {
    [Foo, Foo].iter().cloned().zip([0].into_iter()).collect::<Vec<_>>();
}
produces two calls to Back to Of course, it doesn't necessarily mean that there is a general consensus about the number of clone calls. It's just an example of an existing conservative implementation.
@varkor it sounds like you're planning on closing the PR in #47370 (comment), so I'm going to go ahead and do so now and then you can open a PR for future work. Feel free to continue any discussions, take them to internals etc! If you want to reopen because you're actually not finished with this, ping me and I'll do so.
@aidanhs: that's fine, I was going to close it soon myself!
@varkor I definitely think that you should create an RFC for these suggestions. Together with a solution for #47082, there are several performance benefits to be had. While the exact behaviour is up for discussion, an RFC regarding optimizing adapters on infinite iterators could unveil more edge-cases and further ideas for optimizations.
`Repeat` and `Cycle` are infinite iterators that make use of the default iterator implementations, which means that for some methods for which there is a determinable return value, they infinitely loop instead.
This adds specialisations so that the iterators return values as expected. Also optimises `nth` for `Repeat`.
r? @alexcrichton