Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding method to consume an iterator efficiently #64117

Closed
Phosphorus15 opened this issue Sep 3, 2019 · 5 comments
Closed

Adding method to consume an iterator efficiently #64117

Phosphorus15 opened this issue Sep 3, 2019 · 5 comments
Labels
C-feature-request Category: A feature request, i.e: not implemented / a PR. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@Phosphorus15
Copy link
Contributor

Currently, mapping methods for iterator, e.g. map and filter are implemented in a lazy manner. Which means they would not be calculated until further consumption.
However, sometimes it can be the case that given an Iterator (whatever it actually is) . we just wanted its side-effect and to simply consume it. for instance :

#[derive(Debug)]
struct Item(i32);

fn take_iter<T : Iterator>(iter: T) {
    // we just want to consume the `iter` here
    // do something else
}

fn main() {
    let ve = vec![Item(1), Item(2), Item(3), Item(1)];

    take_iter(ve.iter().filter(|value| value.0 < 3).map(|value| {
        println!("{:?}", value);
        value
    }));
}

Currently, there're several ways of achieving this goal, for example :

fn take_iter<T : Iterator>(iter: T) {
    iter.for_each(|_| {}); // consumed
     // do something else
}
fn take_iter<T : Iterator>(iter: T) {
    iter.count(); // consumed
    // do something else
}

While they're both usable, it's obvious that these methods are at the cost of efficiency, for_each yields a call to closure which is totally unnecessary, while count finally counting the number of elements in vain.

@Phosphorus15
Copy link
Contributor Author

I suggest adding a method consume for the Iterator trait, which takes no params and effectively consumes each element in an iterator to preserve their side-effects only. Then the example above could be written as :

fn take_iter<T : Iterator>(iter: T) {
    iter.consume(); // consumed
    // do something else
}

@jakubadamw
Copy link
Contributor

jakubadamw commented Sep 3, 2019

for_each yields a call to closure which is totally unnecessary.

That call to a no-op closure will most certainly be optimized out in release builds.

In the case of the code that you've provided using map() for side effects is not really idiomatic. The lazy inspect() combinator or the strict for_each() consumer would be more suited for invoking a println!() on each element. inspect() for when you're trying to observe/debug the evaluation of the lazy iterator and for_each() for when you want to consume the iterator and produce predictable output. I'm not sure which it is in your case. I guess the latter?

In either case, I think the proposed scenario does not deserve a separate method. If you find yourself in need of force-consuming an iterator that produces side effects, that is a strong hint that you should move the effectful operations (like println!()) from closures within the lazy composition of your iterator to the part of the program that consumes the iterator, be that a for_each, a for loop or whatever else.

@Centril Centril added C-feature-request Category: A feature request, i.e: not implemented / a PR. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Sep 3, 2019
@Phosphorus15
Copy link
Contributor Author

Yea, most of the time a for_each() is quiet enough, and it would be a rare case that one would like to preserve the side effect of previous mappings only.
But in case that someone wants to perform an I/O action, or just to swiftly printing some debug output by inspect etc. . It would do no harm that we add this method for good (while indeed, not that necessary).

@clarfonthey
Copy link
Contributor

I recommend the idiom .for_each(drop) because it's clearer; what you're suggesting is to basically make an alias for this. And, like was said, this is usually optimised out.

@scottmcm
Copy link
Member

scottmcm commented Sep 4, 2019

As @clarfon noted, this has been discussed before (#48945), leading to the current suggestion to use .for_each(drop) for this, which is recommended in warnings as well:

#[must_use = "if you really need to exhaust the iterator, consider `.for_each(drop)` instead"]
fn collect<B: FromIterator<Self::Item>>(self) -> B where Self: Sized {

As such I'm going to close this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-feature-request Category: A feature request, i.e: not implemented / a PR. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

5 participants