Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Iterator::map_windows #82413

Conversation

LukasKalbertodt
Copy link
Member

This method returns an iterator over mapped windows of the starting iterator. Adding the more straight-forward Iterator::windows is not easily possible right now as the items are stored in the iterator type, meaning the next call would return references to self. This is not allowed by the current Iterator trait design. This might change once GATs have landed.

The idea has been brought up by @m-ou-se here.

I think this method is quite versatile and useful. I hope the examples in the doc comment sufficiently demonstrate that.

One open question: the current design basically requires that the internal array buffer is shifted on each next call, meaning it has an O(iter.count() * N) runtime complexity per next (where N is the const parameter). I expect this method to be called with very small Ns only, but of course it could be used with large arrays. To avoid this shifting, we would need an array version of VecDeque basically. But this would also mean the closure wouldn't get a simple parameter anymore. I think we want this simpler but "slower for large Ns" version, but what do others thing? Maybe the docs should mention this caveat?

@rust-highfive
Copy link
Collaborator

r? @sfackler

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 22, 2021
Comment on lines 37 to 38
buffer.rotate_left(1);
buffer[N - 1] = next;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is slightly wasteful in theory: the first element does not need to be copied to the last slot as it's overwritten later anyway. I expect this to be optimized out anyway, but I haven't checked.

I haven't found a better non-unsafe way to do this. copy_within requires T: Copy, swap would work, but I doubt doing a bunch of swaps is faster than the current version. Of course, the unsafe would be straight-forward but I still wanted to avoid it for now. Any ideas or opinions?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using unsafe in std lib is acceptable if it leads to a performance advantage.

Copy link
Contributor

@JulianKnodt JulianKnodt Feb 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to just write to the first element then rotate?

Edit: On reflection I think this is probably not that costly either way, have to look at godbolt for it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Writing to the first element doesn't seem to change that much compared to the current approach.

I played a bit on godbolt and I found 3 situations:

  • They're optimized almost the same way. This often happen for tiny Ts, but sometimes also for bigger ones.
  • The unsafe version may be optimized to 4-5x less assembly. This often happen for bigger Ts but small N.
  • The unsafe version is optimized to a memmove call and the safe version to either a lot of moves or a loop. This often happen for bigger Ts and bigger N

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that, if .rotate_right(1) and .rotate_left(1) don't optimize to the obvious memmove, then that's a bug in those methods that should be fixed.

(I vaguely recall someone on Discord was going to file me a bug about this...)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SkiFire13 Thanks for testing this stuff. But I agree with @scottmcm in that this should probably optimize to the optimal code basically. The implementation of this can always be changed later, so I wouldn't say it is hugely important to get this as efficient as possible in this PR. Especially before we even agreed we want this method.

@rust-log-analyzer

This comment has been minimized.

@leonardo-m

This comment has been minimized.

@the8472
Copy link
Member

the8472 commented Feb 22, 2021

One open question: the current design basically requires that the internal array buffer is shifted on each next call, meaning it has an O(iter.count() * N) runtime complexity per next (where N is the const parameter). I expect this method to be called with very small Ns only, but of course it could be used with large arrays. To avoid this shifting, we would need an array version of VecDeque basically.

There are a few places where this can be done more efficiently: Vec, BinaryHeap and [].
Adding a []::windowed_iter<const N>() could then return overlapping slices. Perhaps we should offer it only where it's efficient?

@SkiFire13
Copy link
Contributor

Adding a []::windowed_iter() could then return overlapping slices. Perhaps we should offer it only where it's efficient?

Wouldn't this be the same as array_windows?

IMO a great advantage of map_windows compared to array_windows is that it doesn't require to borrow anything, so you can return it from functions that don't or can't take the slice as parameter.

Anyway I don't think this is ready to be in the stdlib. I think it would make more sense to have it in itertools for now. There's also the possibility that we could get the same result by combining an hypotetical StreamingIterator::windows with StreamingIterator::map_iter or something like that.

@the8472
Copy link
Member

the8472 commented Feb 23, 2021

Adding a []::windowed_iter() could then return overlapping slices. Perhaps we should offer it only where it's efficient?

Wouldn't this be the same as array_windows?

Oh. Yeah, exactly what I had in mind.

IMO a great advantage of map_windows compared to array_windows is that it doesn't require to borrow anything, so you can return it from functions that don't or can't take the slice as parameter.

Vec::into_array_windows() could work if you need an owned version.

Of course this is still limiting compared to a general iterator adapter, but you could collect into an intermediate Vec and then create the windows more cheaply from that, e.g.

iter.skip(...).filter_map(...).collect::<Vec<_>>().into_array_windows().fold(...)

@scottmcm
Copy link
Member

A possible way to avoid the excess data movement: use a 2*N-length internal array. Then most of the time you don't need to move anything, but every N elements you need to move the N-1 things in the buffer back to the beginning (which would be a copy_nonoverlapping). That way it'd be amortized-O(1) copying per step, and when it does a copy it'd be in a nice big chunk that should copy efficiently.

@LukasKalbertodt
Copy link
Member Author

Anyway I don't think this is ready to be in the stdlib. I think it would make more sense to have it in itertools for now.

Could you elaborate on that? Why do you feel this is not ready yet?

There's also the possibility that we could get the same result by combining an hypotetical StreamingIterator::windows with StreamingIterator::map_iter or something like that.

Absolutely, and I would love to immediately go that path. But as I said, StreamingIterator (well, hopefully it will be the same as Iterator) is probably still quite a while out.

A possible way to avoid the excess data movement: [...]

That's a great idea! Do you think I should already change my implementation?

@JulianKnodt
Copy link
Contributor

Is there an unsafe way to concat two slices of known size to a new size using transmute or try_into? This would help fix the time complexity while still having a nice type &[T; { tail_to_head + head_to_tail }]?

@the8472
Copy link
Member

the8472 commented Feb 27, 2021

Is there an unsafe way to concat two slices of known size to a new size using transmute or try_into?

Even ignoring the issue that it isn't safe in general it would only work on linear memory. If the data is out of order (i.e. needs rotation) then the slices aren't pointing to a linear section of memory.

@JulianKnodt
Copy link
Contributor

JulianKnodt commented Feb 27, 2021

Hm that's true, but in theory we could add a layer of indirection [&T; <something>](which might be inefficient) but not sure if that's desirable.

Edit:
It's costly because you have to do two memcopies presumably, but here's the general idea: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=a8b8d805a7e52bd2eb634ee681528928

Long post-facto correction: it won't be memcpies but just writing ranges of ptrs into an array, which should be fast?

@JohnCSimon JohnCSimon added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 16, 2021
@m-ou-se
Copy link
Member

m-ou-se commented Mar 21, 2021

Thinking about this some more, I think this api (too) easily hides that .iter().map_windows() on a slice doesn't actually do .array_windows() but instead keeps a separate array of references which it keeps fully shifting at every iteration, which is a lot less efficient.

Maybe .pairwise_map(|a, &b| ..) would be less of a footgun?

@crlf0710 crlf0710 added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Apr 9, 2021
@LukasKalbertodt
Copy link
Member Author

To be honest, I'm not sure. pairwise_map seems like a useful API as well, but as it's more specific, it is not a fully replacement for map_windows.

When it comes to map_windows on a slice iterator: yep, I agree that's unfortunate and would happen in the real world from time to time. But I'm not sure if that's really such a big problem, especially if we point it out in the docs. There are plenty other ways people can end up with iterator-chains that are slower than they could be. vec.iter().cloned() is a classical example here.

@Dylan-DPC-zz
Copy link

@sfackler any updates on this?

@bstrie bstrie added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 12, 2021
@JohnCSimon JohnCSimon added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 31, 2021
@JohnTitor
Copy link
Member

r? @m-ou-se as Steven is now alumni and Mara left some thoughts.

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jul 31, 2021
/// # Panics
///
/// Panics if `N` is 0. This check will most probably get changed to a
/// compile time error before this method gets stabilized.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added an item for this to the tracking issue.

// to 0, we treat them as uninitialized and treat their copies
// as initialized.
unsafe {
ptr::copy_nonoverlapping(buffer_ptr.add(N), buffer_ptr, N - 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The index math on this line seems wrong.

Repro:

fn main() {
    for () in std::iter::repeat("0".to_owned())
        .map_windows(|_: &[_; 3]| {})
        .take(4) {}
}
free(): double free detected in tcache 2
Aborted (core dumped)

@dtolnay dtolnay added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 3, 2021
@LukasKalbertodt
Copy link
Member Author

@dtolnay Thanks for catching that logic bug. I was surprised my tests did not catch that. Digging deeper, I now believe my whole logic is flawed. In particular, I prepare out inside of next, then modify the underlying array and then return out. That seems utterly broken. My tests also have very weird behavior that I suspect might be a sign of UB. In addition to that, the comment by @SkiFire13 is also a good one: iterators adaptors usually don't call next on the underlying iterator on creation.

Soooo I think I will have to rethink my approach. I'm not sure how quickly I can find the time to do that. Just as a heads up that me fixing this could take a while.

@JohnCSimon JohnCSimon added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Aug 22, 2021
@JohnCSimon JohnCSimon added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Sep 6, 2021
@JohnCSimon JohnCSimon added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Sep 27, 2021
@bors
Copy link
Contributor

bors commented Oct 9, 2021

☔ The latest upstream changes (presumably #89703) made this pull request unmergeable. Please resolve the merge conflicts.

@JohnCSimon
Copy link
Member

Triage:
This PR has sat idle for a while. Maybe close it as inactive?

@LukasKalbertodt
Copy link
Member Author

Yeah let's close it. I don't see myself fixing this PR anytime soon. I still think the API is very interesting and worth playing around with, but right now I don't have the resources to land this.

@the8472 the8472 added the S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. label Jan 1, 2022
@frank-king frank-king mentioned this pull request Mar 6, 2022
16 tasks
Dylan-DPC added a commit to Dylan-DPC/rust that referenced this pull request May 18, 2023
…, r=Mark-Simulacrum

Add `Iterator::map_windows`

Tracking issue:  rust-lang#87155.

This is inherited from the old PR  rust-lang#82413.

Unlike rust-lang#82413, this PR implements the `MapWindows` to be lazy: only when pulling from the outer iterator, `.next()` of the inner iterator will be called.

## Implementaion Steps
- [x] Implement `MapWindows` to keep the iterators' [*Laziness*](https://doc.rust-lang.org/std/iter/index.html#laziness) contract.
- [x] Fix the known bug of memory access error.
- [ ] Full specialization of iterator-related traits for `MapWindows`.
    - [x] `Iterator::size_hint`,
    - [x] ~`Iterator::count`~,
    - [x] `ExactSizeIterator` (when `I: ExactSizeIterator`),
    - [x] ~`TrustedLen` (when `I: TrustedLen`)~,
    - [x] `FusedIterator`,
    - [x] ~`Iterator::advance_by`~,
    - [x] ~`Iterator::nth`~,
    - [ ] ...
- [ ] More tests and docs.

## Unresolved Questions:
- [ ] Is there any more iterator-related traits should be specialized?
- [ ] Is the double-space buffer worth?
- [ ] Should there be `rmap_windows` or something else?
- [ ] Taking GAT for consideration, should the mapper function be `FnMut(&[I::Item; N]) -> R` or something like `FnMut(ArrayView<'_, I::Item, N>) -> R`? Where `ArrayView` is mentioned in rust-lang/generic-associated-types-initiative#2.
    - It can save memory, only the same size as the array window is needed,
    - It is more efficient, which requires less data copies,
    - It is possibly compatible with the GATified version of `LendingIterator::windows`.
    - But it prevents the array pattern matching like `iter.map_windows(|_arr: [_; N]| ())`, unless we extend the array pattern to allow matching the `ArrayView`.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request May 19, 2023
…, r=Mark-Simulacrum

Add `Iterator::map_windows`

Tracking issue:  rust-lang#87155.

This is inherited from the old PR  rust-lang#82413.

Unlike rust-lang#82413, this PR implements the `MapWindows` to be lazy: only when pulling from the outer iterator, `.next()` of the inner iterator will be called.

## Implementaion Steps
- [x] Implement `MapWindows` to keep the iterators' [*Laziness*](https://doc.rust-lang.org/std/iter/index.html#laziness) contract.
- [x] Fix the known bug of memory access error.
- [ ] Full specialization of iterator-related traits for `MapWindows`.
    - [x] `Iterator::size_hint`,
    - [x] ~`Iterator::count`~,
    - [x] `ExactSizeIterator` (when `I: ExactSizeIterator`),
    - [x] ~`TrustedLen` (when `I: TrustedLen`)~,
    - [x] `FusedIterator`,
    - [x] ~`Iterator::advance_by`~,
    - [x] ~`Iterator::nth`~,
    - [ ] ...
- [ ] More tests and docs.

## Unresolved Questions:
- [ ] Is there any more iterator-related traits should be specialized?
- [ ] Is the double-space buffer worth?
- [ ] Should there be `rmap_windows` or something else?
- [ ] Taking GAT for consideration, should the mapper function be `FnMut(&[I::Item; N]) -> R` or something like `FnMut(ArrayView<'_, I::Item, N>) -> R`? Where `ArrayView` is mentioned in rust-lang/generic-associated-types-initiative#2.
    - It can save memory, only the same size as the array window is needed,
    - It is more efficient, which requires less data copies,
    - It is possibly compatible with the GATified version of `LendingIterator::windows`.
    - But it prevents the array pattern matching like `iter.map_windows(|_arr: [_; N]| ())`, unless we extend the array pattern to allow matching the `ArrayView`.
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request May 19, 2023
…, r=Mark-Simulacrum

Add `Iterator::map_windows`

Tracking issue:  rust-lang#87155.

This is inherited from the old PR  rust-lang#82413.

Unlike rust-lang#82413, this PR implements the `MapWindows` to be lazy: only when pulling from the outer iterator, `.next()` of the inner iterator will be called.

## Implementaion Steps
- [x] Implement `MapWindows` to keep the iterators' [*Laziness*](https://doc.rust-lang.org/std/iter/index.html#laziness) contract.
- [x] Fix the known bug of memory access error.
- [ ] Full specialization of iterator-related traits for `MapWindows`.
    - [x] `Iterator::size_hint`,
    - [x] ~`Iterator::count`~,
    - [x] `ExactSizeIterator` (when `I: ExactSizeIterator`),
    - [x] ~`TrustedLen` (when `I: TrustedLen`)~,
    - [x] `FusedIterator`,
    - [x] ~`Iterator::advance_by`~,
    - [x] ~`Iterator::nth`~,
    - [ ] ...
- [ ] More tests and docs.

## Unresolved Questions:
- [ ] Is there any more iterator-related traits should be specialized?
- [ ] Is the double-space buffer worth?
- [ ] Should there be `rmap_windows` or something else?
- [ ] Taking GAT for consideration, should the mapper function be `FnMut(&[I::Item; N]) -> R` or something like `FnMut(ArrayView<'_, I::Item, N>) -> R`? Where `ArrayView` is mentioned in rust-lang/generic-associated-types-initiative#2.
    - It can save memory, only the same size as the array window is needed,
    - It is more efficient, which requires less data copies,
    - It is possibly compatible with the GATified version of `LendingIterator::windows`.
    - But it prevents the array pattern matching like `iter.map_windows(|_arr: [_; N]| ())`, unless we extend the array pattern to allow matching the `ArrayView`.
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Aug 13, 2023
…, r=Mark-Simulacrum

Add `Iterator::map_windows`

Tracking issue:  rust-lang#87155.

This is inherited from the old PR  rust-lang#82413.

Unlike rust-lang#82413, this PR implements the `MapWindows` to be lazy: only when pulling from the outer iterator, `.next()` of the inner iterator will be called.

## Implementaion Steps
- [x] Implement `MapWindows` to keep the iterators' [*Laziness*](https://doc.rust-lang.org/std/iter/index.html#laziness) contract.
- [x] Fix the known bug of memory access error.
- [ ] Full specialization of iterator-related traits for `MapWindows`.
    - [x] `Iterator::size_hint`,
    - [x] ~`Iterator::count`~,
    - [x] `ExactSizeIterator` (when `I: ExactSizeIterator`),
    - [x] ~`TrustedLen` (when `I: TrustedLen`)~,
    - [x] `FusedIterator`,
    - [x] ~`Iterator::advance_by`~,
    - [x] ~`Iterator::nth`~,
    - [ ] ...
- [ ] More tests and docs.

## Unresolved Questions:
- [ ] Is there any more iterator-related traits should be specialized?
- [ ] Is the double-space buffer worth?
- [ ] Should there be `rmap_windows` or something else?
- [ ] Taking GAT for consideration, should the mapper function be `FnMut(&[I::Item; N]) -> R` or something like `FnMut(ArrayView<'_, I::Item, N>) -> R`? Where `ArrayView` is mentioned in rust-lang/generic-associated-types-initiative#2.
    - It can save memory, only the same size as the array window is needed,
    - It is more efficient, which requires less data copies,
    - It is possibly compatible with the GATified version of `LendingIterator::windows`.
    - But it prevents the array pattern matching like `iter.map_windows(|_arr: [_; N]| ())`, unless we extend the array pattern to allow matching the `ArrayView`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.