Stop re-implementing slice iterators in vec::IntoIter
#124421
base: master
Conversation
/// Iterator over all the `NonNull<T>` pointers to the elements of a slice.
#[must_use = "iterators are lazy and do nothing unless consumed"]
pub struct NonNullIter<T> {
// FIXME: for some reason, just `self.drain.__iterator_get_unchecked(i)`
// never worked for me. If you know a way to fix that, please do.
Due to the `Copy` bound? `try_get_unchecked` would probably fix that... which is not public. But reimplementing the pointer logic seems simple enough.
I tried `impl TRANC for IntoIter where DrainRaw: TRANC`, which I thought would have made it work, and saved the extra `NonDrop` copy, but it didn't :(
I agree it's no big deal since the logic is simple.
I think it'd make sense to run the microbenchmarks we have for alloc and core here. They're quite noisy so you'll have to quiet your system (disabling turbo boost can help) but they cover more than the codegen tests.
And could `DrainRaw` help with `vec::Drain` and `vec::Splice` too?
library/core/src/slice/drain.rs
// SAFETY: If they're accessing things in random order we don't know when to drop
// things, so only allow this for `Copy` things where that doesn't matter.
It's not the random order that's causing it, it's that the accesses don't update the pointers, which the TRA contract does not require. Like the `NonDrop` already says.
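As a rough illustration of that point (hypothetical names, not the PR's code): a TRA-style accessor reads at an offset without advancing the iterator's pointers, so the iterator's cleanup would still treat that element as not-yet-yielded; restricting it to `Copy` elements makes that harmless because such types have no destructor to run twice.

```rust
use core::ptr::NonNull;

/// Illustrative raw cursor over `len` elements starting at `ptr`.
struct RawCursor<T> {
    ptr: NonNull<T>,
    len: usize,
}

impl<T: Copy> RawCursor<T> {
    /// TRA-style access: read the element at offset `i` without moving `ptr`.
    ///
    /// SAFETY: `i < self.len`. Because `ptr`/`len` are not updated here, a
    /// draining iterator could later drop this element again; with `T: Copy`
    /// (no destructor) that duplication does not matter.
    unsafe fn get_unchecked(&mut self, i: usize) -> T {
        unsafe { self.ptr.as_ptr().add(i).read() }
    }
}
```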
if len >= N {
    // SAFETY: If we have more elements than were requested, they can be
    // read directly because arrays need no extra alignment.
    Ok(unsafe { to_copy.cast::<[T; N]>().read() })
} else {
    let mut raw_ary = MaybeUninit::uninit_array();
    // SAFETY: If we don't have enough elements left, then copy all the
    // ones we do have into the local array, which cannot overlap because
    // new locals are always distinct storage.
    Err(unsafe {
        MaybeUninit::<T>::slice_as_mut_ptr(&mut raw_ary)
            .copy_from_nonoverlapping(to_copy.as_mut_ptr(), len);
        array::IntoIter::new_unchecked(raw_ary, 0..len)
    })
}
Have you checked the codegen of this? When I wrote the vec::IntoIter one I found that copying into the same uninit array in both arms helped a little. But it's possible that it's no longer necessary.
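For reference, a minimal sketch of that alternative shape (simplified to `Copy` elements with made-up names, not the actual library code), where both arms copy into the same local uninit array:

```rust
use core::mem::MaybeUninit;

/// Copy up to `N` elements from `src`, which the caller promises is readable
/// for `len` elements. Both the full-chunk and short-chunk arms write into
/// the same local buffer, so the two copies share one destination.
unsafe fn read_chunk<T: Copy, const N: usize>(src: *const T, len: usize) -> Result<[T; N], usize> {
    let mut buf: [MaybeUninit<T>; N] = [const { MaybeUninit::uninit() }; N];
    let n = usize::min(len, N);
    // SAFETY (per this sketch's contract): `src` is readable for `n` elements,
    // and `buf` is a fresh local, so the two ranges cannot overlap.
    unsafe { buf.as_mut_ptr().cast::<T>().copy_from_nonoverlapping(src, n) };
    if n == N {
        // SAFETY: all `N` slots were just initialized by the copy above.
        Ok(unsafe { buf.as_ptr().cast::<[T; N]>().read() })
    } else {
        // Not enough elements for a full chunk; report how many were copied.
        Err(n)
    }
}
```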
Looks like both are codegen'ing way better than I remember, actually. I think the new `dead_on_unwind` flag on the `sret` is helping a ton for things like this. (It's in https://releases.llvm.org/18.1.0/docs/LangRef.html but not in https://releases.llvm.org/17.0.1/docs/LangRef.html, so new in nightly as of #120055 in Feb.)
I'll add a codegen test.
For example, this PR produces the following:
; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn uwtable
define void @vec_iter_next_chunk_long(ptr dead_on_unwind noalias nocapture noundef writable writeonly sret([152 x i8]) align 8 dereferenceable(152) %_0, ptr noalias nocapture noundef align 8 dereferenceable(32) %it) unnamed_addr #3 personality ptr @__CxxFrameHandler3 {
start:
%_2 = getelementptr inbounds i8, ptr %it, i64 8
tail call void @llvm.experimental.noalias.scope.decl(metadata !54)
tail call void @llvm.experimental.noalias.scope.decl(metadata !57)
%self1.i.i = getelementptr inbounds i8, ptr %it, i64 16
%end.i.i = load ptr, ptr %self1.i.i, align 8, !alias.scope !59, !noalias !54, !nonnull !5, !noundef !5
%subtracted.i.i = load ptr, ptr %_2, align 8, !alias.scope !57, !noalias !54, !nonnull !5, !noundef !5
%0 = ptrtoint ptr %end.i.i to i64
%1 = ptrtoint ptr %subtracted.i.i to i64
%2 = sub nuw i64 %0, %1
%_0.sroa.0.0.sroa.speculated.i.i = tail call noundef i64 @llvm.umin.i64(i64 %2, i64 123)
%_13.i.i = getelementptr inbounds i8, ptr %subtracted.i.i, i64 %_0.sroa.0.0.sroa.speculated.i.i
store ptr %_13.i.i, ptr %_2, align 8, !alias.scope !62, !noalias !54
%_7.i = icmp ugt i64 %2, 122
br i1 %_7.i, label %bb2.i, label %bb3.i
bb3.i: ; preds = %start
%_11.sroa.5.0..sroa_idx.i = getelementptr inbounds i8, ptr %_0, i64 24
tail call void @llvm.memcpy.p0.p0.i64(ptr nonnull align 8 %_11.sroa.5.0..sroa_idx.i, ptr nonnull align 1 %subtracted.i.i, i64 %2, i1 false), !noalias !57
%3 = getelementptr inbounds i8, ptr %_0, i64 8
store i64 0, ptr %3, align 8, !alias.scope !54, !noalias !57
%_11.sroa.4.0..sroa_idx.i = getelementptr inbounds i8, ptr %_0, i64 16
store i64 %2, ptr %_11.sroa.4.0..sroa_idx.i, align 8, !alias.scope !54, !noalias !57
br label %"_ZN96_$LT$core..slice..drain..DrainRaw$LT$T$GT$$u20$as$u20$core..iter..traits..iterator..Iterator$GT$10next_chunk17h4db823eaa1311d9bE.exit"
bb2.i: ; preds = %start
%4 = getelementptr inbounds i8, ptr %_0, i64 1
tail call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 1 dereferenceable(123) %4, ptr noundef nonnull align 1 dereferenceable(123) %subtracted.i.i, i64 123, i1 false), !noalias !57
br label %"_ZN96_$LT$core..slice..drain..DrainRaw$LT$T$GT$$u20$as$u20$core..iter..traits..iterator..Iterator$GT$10next_chunk17h4db823eaa1311d9bE.exit"
"_ZN96_$LT$core..slice..drain..DrainRaw$LT$T$GT$$u20$as$u20$core..iter..traits..iterator..Iterator$GT$10next_chunk17h4db823eaa1311d9bE.exit": ; preds = %bb3.i, %bb2.i
%.sink.i = phi i8 [ 0, %bb2.i ], [ 1, %bb3.i ]
store i8 %.sink.i, ptr %_0, align 8, !alias.scope !54, !noalias !57
ret void
}
which is nearly identical to https://rust.godbolt.org/z/zEddbzcYG.
The difference is that now LLVM has decided to hoist the "move the start pointer forward" into the header instead of having each arm do it. Seems at least plausibly better -- a cmov for something we already have a data dependency on anyway is probably better than each branch having a different store.
Oh, no, maybe it's way newer than I'd realized: #121298
#[inline]
pub(crate) fn make_shortlived_slice<'b>(&'b self) -> &'b [T] {
    // SAFETY: Everything expanded with this macro is readable while
    // the iterator exists and is unchanged, so we can force a shorter
    // lifetime than `'a` to make this safe to call.
    unsafe { self.make_nonnull_slice().as_ref() }
}
I don't get why the explicit lifetime here is necessary. `&self` borrows usually already are shorter than `Self<'a>` lifetimes.
It's definitely not necessary. I just wanted to be explicit about it when `as_ref` returns an unbounded lifetime and there's an `'a` in scope from the impl. I've updated the comment to hopefully communicate better.
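A reduced example of the point being discussed (hypothetical type, not the PR's code): `NonNull::as_ref` returns a reference with an unbounded lifetime, and the explicit `'b` merely documents that the slice is tied to the `&self` borrow rather than to the struct's `'a`.

```rust
use core::marker::PhantomData;
use core::ptr::NonNull;

struct RawIter<'a, T> {
    ptr: NonNull<T>,
    len: usize,
    _marker: PhantomData<&'a T>,
}

impl<'a, T> RawIter<'a, T> {
    fn make_shortlived_slice<'b>(&'b self) -> &'b [T] {
        // SAFETY: the elements stay valid and unchanged for as long as this
        // borrow of `self` lives, so bounding the otherwise-unbounded lifetime
        // from `as_ref` to `'b` (not `'a`) is sound.
        unsafe { NonNull::slice_from_raw_parts(self.ptr, self.len).as_ref() }
    }
}
```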
Rather than encoding things as method calls or token expansions. This isn't directly useful for Iter/IterMut, but sets this up for the next commit that adds another invocation of the macro.
Good idea. Looking into that a bit I end up at `// drop_ptr comes from a slice::Iter which only gives us a &[T] but for drop_in_place // a pointer with mutable provenance is necessary.`, which looks highly promising that this will help simplify things.
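A hedged illustration of the provenance point in that quoted comment (the function and names here are made up for the example): `drop_in_place` wants a pointer carrying mutable provenance, such as one kept as a raw `NonNull` by the drain, rather than one re-derived from a shared `&[T]`.

```rust
use core::ptr::NonNull;

/// Drop `len` elements in place starting at `start`.
///
/// SAFETY (per this sketch's contract): the caller owns these elements and
/// nothing else will access or drop them afterwards.
unsafe fn drop_remaining<T>(start: NonNull<T>, len: usize) {
    unsafe {
        // `start` was never laundered through a `&[T]`, so it keeps the
        // mutable provenance that `drop_in_place` needs.
        core::ptr::drop_in_place(core::ptr::slice_from_raw_parts_mut(start.as_ptr(), len));
    }
}
```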
☔ The latest upstream changes (presumably #123878) made this pull request unmergeable. Please resolve the merge conflicts.
Add `slice::DrainRaw` for internal use

I'd like to separate out this hopefully-non-controversial part of rust-lang#124421 to have fewer commits to juggle there :) r? `@the8472`
Today, `vec::IntoIter` has essentially an inlined copy of an out-of-date version of `slice::Iter`, including all the distinct `T::IS_ZST` checks and everything. That's because it can't actually use `slice::Iter<'a, T>` due to the lifetime and the references. But it's silly that it needs to reimplement all that logic, and that we thus end up needing things like #114205 to redo various optimizations on it after they're done on slice iterators.

This PR addresses that as follows:
1. `slice::NonNullIter<T>`, implemented using the same `iterator!` macro that implements `Iterator` for `Iter` and `IterMut`, so that it'll have all the optimizations without needing to duplicate code, but without the problematic lifetime and SB/TB implications of references.
2. `slice::DrainRaw<T>`, which moves/drops the items in the iterated range.
3. `array::Drain<'a, T>` and `vec::IntoIter<T>` implemented using `DrainRaw`, rather than them both needing to have the "read the pointer on next, drop_in_place on drop" logic.

Thus `vec::IntoIter`'s implementation gets a bunch shorter, since it no longer has to worry about things like the best way to iterate a slice of ZSTs. It can just focus on ensuring that the deallocation happens in the correct place & order.

(This intentionally leaves `slice::Iter` and `slice::IterMut` almost entirely alone, as the details there are very hot.)
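To make that description concrete, here is a minimal sketch of a `DrainRaw`-like iterator with made-up names and the ZST handling omitted; it only illustrates the "read the pointer on next, drop_in_place on drop" split that the PR centralizes, not the actual implementation.

```rust
use core::marker::PhantomData;
use core::ptr::NonNull;

/// Illustrative only: owns (and will eventually drop) the elements in `ptr..end`.
/// Assumes `T` is not a ZST; the real code also has to handle that case.
struct DrainRawSketch<T> {
    ptr: NonNull<T>,
    end: *const T,
    _owns: PhantomData<T>,
}

impl<T> Iterator for DrainRawSketch<T> {
    type Item = T;

    fn next(&mut self) -> Option<T> {
        if self.ptr.as_ptr() as *const T == self.end {
            None
        } else {
            // SAFETY: `ptr` is in bounds and this element is owned by the drain;
            // advancing `ptr` ensures it is never read or dropped a second time.
            unsafe {
                let item = self.ptr.as_ptr().read();
                self.ptr = NonNull::new_unchecked(self.ptr.as_ptr().add(1));
                Some(item)
            }
        }
    }
}

impl<T> Drop for DrainRawSketch<T> {
    fn drop(&mut self) {
        // Drop whatever was not yielded, in place, without reading it out.
        unsafe {
            let remaining = self.end.offset_from(self.ptr.as_ptr()) as usize;
            core::ptr::drop_in_place(core::ptr::slice_from_raw_parts_mut(
                self.ptr.as_ptr(),
                remaining,
            ));
        }
    }
}
```

With a shape like this, a wrapper such as `vec::IntoIter` only has to add the allocation bookkeeping around it, which is the split the description above aims for.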