Add slice methods for indexing via an array of indices. #83608

Merged (1 commit) on Nov 22, 2022

Conversation

Kimundi
Member

@Kimundi Kimundi commented Mar 28, 2021

Disclaimer: It's been a while since I contributed to the main Rust repo, apologies in advance if this is large enough already that it should've been an RFC.


Update:

  • Based on feedback, removed the &[T] variant of this API, and removed the requirements for the indices to be sorted.

Description

This adds the following slice methods to core:

impl<T> [T] {
    pub unsafe fn get_many_unchecked_mut<const N: usize>(&mut self, indices: [usize; N]) -> [&mut T; N];
    pub fn get_many_mut<const N: usize>(&mut self, indices: [usize; N]) -> Option<[&mut T; N]>;
}

This allows creating multiple mutable references to disjoint positions in a slice, which previously required writing some awkward code with split_at_mut() or iter_mut(). For the bounds-checked variant, the indices are checked against each other and against the bounds of the slice, which requires N * (N + 1) / 2 comparison operations.
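As a rough sketch of where the N * (N + 1) / 2 figure comes from: each index needs one bounds comparison (N in total), and each unordered pair of indices needs one equality comparison (N * (N - 1) / 2 in total). The helper name `indices_are_valid` and its comparison counter are made up for illustration and are not part of this PR; the sketch also deliberately avoids early exits so the count is the worst case.

```rust
/// Hypothetical helper (not part of the PR): reports whether all indices are
/// in bounds and pairwise distinct, plus how many comparisons were performed.
fn indices_are_valid<const N: usize>(indices: &[usize; N], len: usize) -> (bool, usize) {
    let mut comparisons = 0;
    let mut valid = true;
    for i in 0..N {
        // One bounds comparison per index: N in total.
        comparisons += 1;
        valid &= indices[i] < len;
        // One equality comparison per unordered pair: N * (N - 1) / 2 in total.
        for j in 0..i {
            comparisons += 1;
            valid &= indices[i] != indices[j];
        }
    }
    (valid, comparisons)
}
```

For N = 3 this performs 3 + 3 = 6 comparisons, which matches the six cmp instructions in the checked codegen listings.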

This has a proof-of-concept standalone implementation here: https://crates.io/crates/index_many

Care has been taken that the implementation passes miri borrow checks and generates straightforward assembly (though this was only checked on x86_64).

Example

let v = &mut [1, 2, 3, 4];
let [a, b] = v.get_many_mut([0, 2]).unwrap();
std::mem::swap(a, b);
*b += 100;
assert_eq!(v, &[3, 2, 101, 4]);
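For comparison, here is a minimal sketch of the kind of split_at_mut() code the description calls awkward, limited to two indices. The function name `get_two_mut` is hypothetical and only serves as an illustration of the workaround.

```rust
/// Hypothetical helper: mutable references to two distinct, in-bounds
/// positions, built only from `split_at_mut`.
fn get_two_mut<T>(slice: &mut [T], i: usize, j: usize) -> Option<(&mut T, &mut T)> {
    if i == j || i >= slice.len() || j >= slice.len() {
        return None;
    }
    if i < j {
        // `left` covers ..j and `right` starts at j, so right[0] is element j.
        let (left, right) = slice.split_at_mut(j);
        Some((&mut left[i], &mut right[0]))
    } else {
        // Mirror case: split at i and keep the result order (i first, j second).
        let (left, right) = slice.split_at_mut(i);
        Some((&mut right[0], &mut left[j]))
    }
}
```

This already needs two branches for two indices, and the number of cases grows quickly for larger N, which is the motivation for a const-generic method.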

Codegen Examples

Disclaimer: Taken from local tests with the standalone implementation.

Unchecked Indexing:

pub unsafe fn example_unchecked(slice: &mut [usize], indices: [usize; 3]) -> [&mut usize; 3] {
    slice.get_many_unchecked_mut(indices)
}
example_unchecked:
 mov     rcx, qword ptr [r9]
 mov     r8, qword ptr [r9 + 8]
 mov     r9, qword ptr [r9 + 16]
 lea     rcx, [rdx + 8*rcx]
 lea     r8, [rdx + 8*r8]
 lea     rdx, [rdx + 8*r9]
 mov     qword ptr [rax], rcx
 mov     qword ptr [rax + 8], r8
 mov     qword ptr [rax + 16], rdx
 ret

Checked Indexing (Option):

pub fn example_option(slice: &mut [usize], indices: [usize; 3]) -> Option<[&mut usize; 3]> {
    slice.get_many_mut(indices)
}
example_option:
 mov     r10, qword ptr [r9 + 8]
 mov     rcx, qword ptr [r9 + 16]
 cmp     rcx, r10
 je      .LBB0_7
 mov     r9, qword ptr [r9]
 cmp     rcx, r9
 je      .LBB0_7
 cmp     rcx, r8
 jae     .LBB0_7
 cmp     r10, r9
 je      .LBB0_7
 cmp     r9, r8
 jae     .LBB0_7
 cmp     r10, r8
 jae     .LBB0_7
 lea     r8, [rdx + 8*r9]
 lea     r9, [rdx + 8*r10]
 lea     rcx, [rdx + 8*rcx]
 mov     qword ptr [rax], r8
 mov     qword ptr [rax + 8], r9
 mov     qword ptr [rax + 16], rcx
 ret
.LBB0_7:
 mov     qword ptr [rax], 0
 ret

Checked Indexing (Panic):

pub fn example_panic(slice: &mut [usize], indices: [usize; 3]) -> [&mut usize; 3] {
    let len = slice.len();
    match slice.get_many_mut(indices) {
        Some(s) => s,
        None => {
            let tmp = indices;
            index_many::sorted_bound_check_failed(&tmp, len)
        }
    }
}
example_panic:
 sub     rsp, 56
 mov     rax, qword ptr [r9]
 mov     r10, qword ptr [r9 + 8]
 mov     r9, qword ptr [r9 + 16]
 cmp     r9, r10
 je      .LBB0_6
 cmp     r9, rax
 je      .LBB0_6
 cmp     r9, r8
 jae     .LBB0_6
 cmp     r10, rax
 je      .LBB0_6
 cmp     rax, r8
 jae     .LBB0_6
 cmp     r10, r8
 jae     .LBB0_6
 lea     rax, [rdx + 8*rax]
 lea     r8, [rdx + 8*r10]
 lea     rdx, [rdx + 8*r9]
 mov     qword ptr [rcx], rax
 mov     qword ptr [rcx + 8], r8
 mov     qword ptr [rcx + 16], rdx
 mov     rax, rcx
 add     rsp, 56
 ret
.LBB0_6:
 mov     qword ptr [rsp + 32], rax
 mov     qword ptr [rsp + 40], r10
 mov     qword ptr [rsp + 48], r9
 lea     rcx, [rsp + 32]
 mov     edx, 3
 call    index_many::bound_check_failed
 ud2

Extensions

There are multiple optional extensions to this.

Indexing With Ranges

This could easily be expanded to allow indexing with [I; N] where I: SliceIndex<Self>. I wanted to keep the initial implementation simple, so I didn't include it yet.

Panicking Variant

We could also add this method:

impl<T> [T] {
    fn index_many_mut<const N: usize>(&mut self, indices: [usize; N]) -> [&mut T; N];
}

This would work similarly to the regular index operator and panic on out-of-bounds indices. The advantage would be that we could more easily ensure good codegen with a useful panic message, which is non-trivial with the Option variant.
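A minimal sketch of how such a panicking variant could be layered on top of the checked one, written as free functions so it compiles on its own. Everything here is an assumption for illustration: `get_many_mut` below is a simple stand-in, not the implementation from this PR, and `many_index_failed` and its message are hypothetical. The cold, non-inlined failure function is one common way to keep the panic path out of the hot code.

```rust
/// Stand-in for the checked method (not the PR's implementation).
fn get_many_mut<T, const N: usize>(slice: &mut [T], indices: [usize; N]) -> Option<[&mut T; N]> {
    let len = slice.len();
    for i in 0..N {
        // Reject out-of-bounds indices and indices that repeat an earlier one.
        if indices[i] >= len || indices[..i].contains(&indices[i]) {
            return None;
        }
    }
    let ptr = slice.as_mut_ptr();
    // SAFETY: every index is in bounds and no index repeats, so the
    // resulting references point to distinct elements of `slice`.
    Some(indices.map(|i| unsafe { &mut *ptr.add(i) }))
}

/// Hypothetical failure path, kept out of line so callers stay small.
#[cold]
#[inline(never)]
fn many_index_failed(indices: &[usize], len: usize) -> ! {
    panic!(
        "indices {:?} are out of bounds or overlapping for slice of length {}",
        indices, len
    )
}

/// Sketch of the panicking variant as a free function.
fn index_many_mut<T, const N: usize>(slice: &mut [T], indices: [usize; N]) -> [&mut T; N] {
    let len = slice.len();
    match get_many_mut(slice, indices) {
        Some(refs) => refs,
        None => many_index_failed(&indices, len),
    }
}
```

Called as `let [a, b] = index_many_mut(&mut v, [2, 0]);`, this hands back two usable mutable references or panics with both the offending indices and the slice length in the message.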

This is implemented in the standalone implementation, and is used as the basis for some of the codegen examples here.

@rust-highfive
Collaborator

r? @sfackler

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Mar 28, 2021
@leonardo-m

To reduce the API surface, is it a good idea to offer only the mut versions? Getting multiple immutable references is already possible in simpler ways.

And regarding the Presorted, in the D language stdlib, the sort functions return a SortedRange (https://dlang.org/phobos/std_algorithm_sorting.html#sort ), a type that can be converted to a regular slice but can also be used as is. A type that remembers the sortedness of a slice is handy: it performs a safer binary search (the compiler forbids you from performing a binary search on an unsorted range by mistake; there is an assumeSorted function (https://dlang.org/phobos/std_range.html#.SortedRangeOptions ), but in debug mode it performs a statistically cheap test that the range is actually sorted), and you can use it in other cases like indexing for your functions.

@leonardo-m

leonardo-m commented Mar 28, 2021

See also #78541 where I asked this for just two indices... that's my most common case.

@Kimundi
Member Author

Kimundi commented Mar 28, 2021

The std lib usually provides variants for both mutable and immutable slices the same way, even if the immutable one can be expressed differently already. See split_at and split_at_mut for example.

@Kimundi Kimundi changed the title [WIP] Add slice methods for indexing via an array of indices. Add slice methods for indexing via an array of indices. Mar 30, 2021
@burdges

burdges commented Apr 4, 2021

I'm unsure exactly how the rules changed, but this is simple enough that it might not require an RFC under the current rules.

I give some related convenience methods around split_at_mut in rust-lang/rfcs#3100 (comment) so if this does not pass then maybe some slice enhancement crate makes sense or already exists.

@bors
Contributor

bors commented Apr 5, 2021

☔ The latest upstream changes (presumably #83530) made this pull request unmergeable. Please resolve the merge conflicts.

@leonardo-m

The std lib usually provides variants for both mutable and immutable slices the same way,

Yeah, but increasing the API surface has cognitive and other kinds of costs. So if a function isn't that useful, it could be better to break the symmetry.

@burdges

burdges commented Apr 5, 2021

An immutable variant often helps whenever anyone writes another method with both mutable and immutable variants, because then their two methods read identically. Our eyes run %s/[_ ]mut//g somewhat accurately over small areas. ;)

What code replaces get_many here?

let a = slice[5];
let b = slice[7];
let c = slice[13];

or

let [a,b,c] = [5,7,13].iter().map(|idx| slice[idx]).collect::<[T;3]>();

@leonardo-m

On Nightly we can write shorter code now :-)

#![feature(array_map)]

let [a, b, c] = [5, 7, 13].map(|idx| slice[idx]);
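As a self-contained version of the one-liner above: array::map has since been stabilized, so on current stable Rust the feature gate is no longer needed. The helper names `pick` and `pick_refs` are hypothetical and only wrap the same expression. Note that for shared references nothing has to be checked beyond bounds, since repeated indices are harmless.

```rust
/// Hypothetical helper: copies the elements at `indices` out of `slice`.
/// Panics on an out-of-bounds index, like regular indexing.
fn pick<T: Copy, const N: usize>(slice: &[T], indices: [usize; N]) -> [T; N] {
    indices.map(|idx| slice[idx])
}

/// Hypothetical helper: same idea, but returning shared references.
/// Duplicate indices are fine here because `&T` may alias.
fn pick_refs<T, const N: usize>(slice: &[T], indices: [usize; N]) -> [&T; N] {
    indices.map(|idx| &slice[idx])
}
```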

@burdges

burdges commented Apr 5, 2021

Nice. :) Appears the reserve_mut variant of split_at_mut almost covers the mutable case too, except indexing requires some care.

#[derive(Default)]
pub struct Increments(Option<usize>);
impl Increments {
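    // Returns the distance from the previously seen offset (or from 0 on the
    // first call) and remembers `o` as the new offset.
    fn next(&mut self, o: usize) -> usize {
        if let Some(offset) = self.0 {
            assert!(o > offset);
            let n = o - offset;
            self.0 = Some(o);
            n
        } else {
            self.0 = Some(o);
            o
        }
    }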
}

impl<T> [T] {
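    // Sketch only: `reserve_mut` is not a std method; it refers to the
    // split_at_mut variant mentioned above.
    fn get_many_mut<'a, const N: usize>(mut self: &'a mut Self, offsets: [usize; N]) -> Option<[&'a mut T; N]>
    {
        let mut off = Increments::default();
        Some(offsets.map( |o| (&mut self).reserve_mut(off.next(o)).first_mut().unwrap() ))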
    }
}

@branpk

branpk commented Apr 5, 2021

Shouldn't this panic if the indices are unsorted? Seems like a pitfall that it silently returns None (which is also used to indicate the indices being out of bounds).

Also since the common case will be for these arrays to be relatively small (I would imagine 2 is the most common case), I wonder if it's even worth requiring the indices to be sorted? Or maybe there could be two separate methods? I'm thinking about other collections like HashMap where we will presumably want this operation as well, eventually. Inconsistent preconditions between HashMap and slice could easily lead to confusion/bugs.

@crlf0710 crlf0710 added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 23, 2021
@JohnCSimon JohnCSimon added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 9, 2021
@camelid camelid added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 28, 2021
@Dylan-DPC-zz

r? @m-ou-se

@rust-highfive rust-highfive assigned m-ou-se and unassigned sfackler May 28, 2021
Member

@m-ou-se m-ou-se left a comment

Thanks for making this!

Also since the common case will be for these arrays to be relatively small (I would imagine 2 is the most common case), I wonder if it's even worth requiring the indices to be sorted?

While discussing this in the libs team meeting recently, we were asking ourselves the same question. We imagine this will mostly be used with small values of N (e.g. 2 or 3), in which case checking for duplicates without requiring it to be sorted would be fine. (Or maybe it could first check if it's sorted to have a fast common case, but fall back to the O(N²) check otherwise?) (And since N will be low in most cases, maybe multiple might be a better name than many. But we can leave the bikeshedding for later. ^^)
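The "check if it's sorted first, otherwise fall back" idea from the parenthetical could look roughly like the sketch below. This is only an illustration of that suggestion under assumed names (`indices_ok` is hypothetical), not what the PR implements: strictly increasing indices are automatically distinct and only the last one needs a bounds check, and anything else goes through the O(N²) pairwise check.

```rust
/// Hypothetical check: true if all indices are in bounds and pairwise distinct.
fn indices_ok<const N: usize>(indices: &[usize; N], len: usize) -> bool {
    // Fast path: strictly increasing indices cannot contain duplicates, and
    // the largest (last) index being in bounds implies all of them are.
    let strictly_sorted = indices.windows(2).all(|w| w[0] < w[1]);
    if strictly_sorted {
        return indices.last().map_or(true, |&last| last < len);
    }
    // Fallback for unsorted input: bounds check each index and compare it
    // against every earlier index.
    for i in 0..N {
        if indices[i] >= len {
            return false;
        }
        for j in 0..i {
            if indices[i] == indices[j] {
                return false;
            }
        }
    }
    true
}
```

With this shape, `[1, 0]` is accepted just like `[0, 1]`, so the checked and unchecked variants agree on which inputs are valid.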

That would solve the question of what to do for unsorted indices (panic vs returning None) by making the question irrelevant.

It'd also match better with the unchecked version. Otherwise we'd have cases where the unchecked version can safely be used, (e.g. [1, 0]), but the checked version would panic or return None, which can be confusing.

And for that same reason, I think it'd be best to weaken the requirements on the non-mut version as much as possible too.

What do you think?

I also have a few comments on the implementation:

Comment on lines 3511 to 3533
let mut arr: mem::MaybeUninit<[&T; N]> = mem::MaybeUninit::uninit();
let arr_ptr = arr.as_mut_ptr();

// SAFETY: We expect `indices` to contain disjunct values that are
// in bounds of `self`.
unsafe {
    for i in 0..N {
        let idx = *indices.get_unchecked(i);
        *(*arr_ptr).get_unchecked_mut(i) = &*slice.get_unchecked(idx);
    }
    arr.assume_init()
}
Member

You can use array::map to simplify this:

Suggested change
indices.map(|i| unsafe { &*slice.get_unchecked(i) })

Member Author

get_unchecked() returns &T, so the &* would be unneeded. Apart from that - I started out with the .map function, but it generated worse code when I tried it. Believe me, I did not arrive at this complicated piece of unsafe code as a first choice. 😄

Comment on lines +3559 to +4135
let mut arr: mem::MaybeUninit<[&mut T; N]> = mem::MaybeUninit::uninit();
let arr_ptr = arr.as_mut_ptr();

// SAFETY: We expect `indices` to contain disjunct values that are
// in bounds of `self`.
unsafe {
    for i in 0..N {
        let idx = *indices.get_unchecked(i);
        *(*arr_ptr).get_unchecked_mut(i) = &mut *slice.get_unchecked_mut(idx);
    }
    arr.assume_init()
}
Member

Suggested change
indices.map(|i| unsafe { &mut *slice.get_unchecked_mut(i) })

Member Author

  • get_unchecked_mut() returns &mut T, so the &mut * would not be needed.
  • This would not compile, because the static borrow checker rejects the overlapping mutable borrows.
  • Writing i.map(|i| unsafe { &mut *(s.get_unchecked_mut(i) as *mut _) }) to go through raw pointers instead does not pass miri checks.

@m-ou-se m-ou-se added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 5, 2021
@bors
Contributor

bors commented Jun 10, 2022

☔ The latest upstream changes (presumably #91970) made this pull request unmergeable. Please resolve the merge conflicts.

@JohnCSimon JohnCSimon added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 3, 2022
@Kimundi
Member Author

Kimundi commented Aug 6, 2022

@Ten0 It was never abandoned from my side. Rather, the idea was that once we have this merged, we could have a follow-up PR for replacing the hardcoded types with generics that do what you propose. Of course, I did not expect this PR to remain open so long...

@rustbot
Collaborator

rustbot commented Aug 6, 2022

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

  • Stabilizing library features
  • Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
  • Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
  • Changing public documentation in ways that create new stability guarantees
  • Changing observable runtime behavior of library APIs

@Kimundi
Member Author

Kimundi commented Aug 6, 2022

@rustbot label +T-libs-api -T-libs

@bors
Contributor

bors commented Aug 20, 2022

☔ The latest upstream changes (presumably #100809) made this pull request unmergeable. Please resolve the merge conflicts.

@JohnCSimon
Member

Triage:
@Kimundi can you please address the merge conflicts?

@Mark-Simulacrum Mark-Simulacrum force-pushed the index_many branch 2 times, most recently from 38d91fc to 95e46a4 Compare November 20, 2022 16:17
@Mark-Simulacrum
Member

Filed a tracking issue (#104642), rebased, and added the link to it from the feature gates here. I think this can go in as-is, but I noted the unresolved questions on the tracking issue in terms of refactoring/expanding the API to fit more use cases.

Normally we'd push back for an ACP but this has received attention from a number of libs-api meetings (albeit a while back) and the desire for it seems clear, even if we wish to iterate on the precise API, which can happen in future PRs.

@bors r+

@bors
Contributor

bors commented Nov 20, 2022

📌 Commit 3fe37b8 has been approved by Mark-Simulacrum

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 20, 2022
bors added a commit to rust-lang-ci/rust that referenced this pull request Nov 22, 2022
…earth

Rollup of 7 pull requests

Successful merges:

 - rust-lang#83608 (Add slice methods for indexing via an array of indices.)
 - rust-lang#95583 (Deprecate the unstable `ptr_to_from_bits` feature)
 - rust-lang#101655 (Make the Box one-liner more descriptive)
 - rust-lang#102207 (Constify remaining `Layout` methods)
 - rust-lang#103193 (mark sys_common::once::generic::Once::new const-stable)
 - rust-lang#104622 (Use clang for the UEFI targets)
 - rust-lang#104638 (Move macro_rules diagnostics to diagnostics module)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 1dd515f into rust-lang:master Nov 22, 2022
@rustbot rustbot added this to the 1.67.0 milestone Nov 22, 2022
ItsDoot pushed a commit to ItsDoot/bevy that referenced this pull request Feb 1, 2023
# Objective

-  std's new APIs do the same thing as `Query::get_multiple_mut`, but are called `get_many`: rust-lang/rust#83608

## Solution

- Find and replace `get_multiple` with `get_many`
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.