Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Vec::flatten (as unstable, of course) #61680

Closed
wants to merge 1 commit into from

Conversation

scottmcm
Copy link
Member

@scottmcm scottmcm commented Jun 9, 2019

This has been a common question on IRC for years -- with people wanting to flatten Vecs of [u8; 4] colours or [f32; 3] linalg vectors. I'm irrationally excited that const generics has a reached a state where this works now.

(Named after Iterator::flatten, which is the same conceptual operation, just slower and doesn't currently work right with array Items.)

@rust-highfive
Copy link
Collaborator

r? @shepmaster

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 9, 2019
@Mark-Simulacrum
Copy link
Member

FWIW I suspect we don't technically need the const generics here with some creative design. However, I suspect @Centril will want to raise language team concerns here?

@scottmcm
Copy link
Member Author

scottmcm commented Jun 9, 2019

@Mark-Simulacrum I tried doing it with FixedSizeArray and was unhappy with how it came out -- it added an unnecessary generic parameter and meant that the ZST cases didn't work (since you can't tell how many items are in an array of ZSTs by its size). Bounding it with a new trait might possibly work, but then that trait would need with macro'd or const-generics'd impls, so that feels meh.

I'm hoping that @Centril would be ok with this since it's unstable, and the unstable message explicitly says that it's waiting on const generics stabilization before being stabilized itself.

Centril
Centril previously requested changes Jun 9, 2019
src/liballoc/vec.rs Outdated Show resolved Hide resolved
src/liballoc/vec.rs Outdated Show resolved Hide resolved
This has been a common question on IRC for years -- with people wanting to flatten Vecs of `[u8; 4]` colours or `[f32; 3]` linalg vectors.  I'm irrationally excited that const generics has a reached a state where this works now.

(Named after `Iterator::flatten`, which is the same conceptual operation, just slower and doesn't currently work right with array `Item`s.)
@Centril
Copy link
Contributor

Centril commented Jun 9, 2019

I'm hoping that @Centril would be ok with this since it's unstable, and the unstable message explicitly says that it's waiting on const generics stabilization before being stabilized itself.

Yep; 👍 -- As long as no one wants to rely on const generics in a stable fashion I'm OK with it.

(Named after Iterator::flatten, which is the same conceptual operation, just slower and doesn't currently work right with array Items.)

Uhm... Iterator::flatten is called so because it is performing the join operation on a monad although there's some funny business re. traits vs. type constructors. I'm not sure that Vec<[T; N]> -> Vec<T> constitutes a monadic join... but I suppose it works if you squint a while at it? cc resident category theorist @varkor.

@Centril Centril dismissed their stale review June 9, 2019 05:22

Done

@scottmcm
Copy link
Member Author

scottmcm commented Jun 9, 2019

@Centril I just mean that v.flatten() does the same thing as v.into_iter().flatten().collect::<Vec<_>>(), but faster and without waiting for us to finally impl IntoIterator for [T; N].

I wasn't trying to make a category theory point 😛

// But for types with an actual size these multiplications
// cannot overflow as the memory is allocated in self, so don't
// codegen a reference to panic machinery.
(self.len() * N, self.capacity() * N)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(self.len() * N, self.capacity() * N)
(self.len() * N, self.capacity() * N)

@shepmaster
Copy link
Member

v.flatten() does the same thing as v.into_iter().flatten().collect::<Vec<_>>(),

I'm not going to get into category theory, but I'll disagree with this point. Vec::flatten is not the same:

fn main() {
    let a = vec![Some(42), None, Some(99)];
    let b: Vec<_> = a.into_iter().flatten().collect();
    println!("{:?}", b);
}
[42, 99]

Notably, this function is much less generic (and thus more powerful), so I think it should not have the same name.

@shepmaster
Copy link
Member

shepmaster commented Jun 9, 2019

Code itself seems reasonable. I'm not sure I agree with the frequency of use.

This has been a common question on IRC for years

Pitting your anecdata against my anecdata, I don't recall this ever coming up on Stack Overflow. Some nearby things I found with a few minutes of searching:

I guess the tracking issue for this could serve as the popularity contest for stabilization though.

@clarfonthey
Copy link
Contributor

clarfonthey commented Jun 9, 2019

It would make sense to also have a version of flatten for nested arrays and slices, although one for Vec would be added on top of that. (Maybe for Box/Rc/Arc<[T]> too?)

I am a bit confused why we can't just make the bounds be IntoIterator and specialise for arrays, though.

// codegen a reference to panic machinery.
(self.len() * N, self.capacity() * N)
};
mem::forget(self);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's kind of unfortunate that we need to do a forget here, but I can't really think of what the alternative would be. Converting to a box first would shrink the capacity, which isn't really desired.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could have into_raw_parts(self) -> (*mut T, usize, usize), the reverse of Vec::from_raw_parts. It may be a slight footgun remembering which usize is the length and capacity though.

Copy link
Contributor

@clarfonthey clarfonthey Jun 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, in hindsight it would have been much better for from_raw_parts to take *mut [T] and usize, rather than *mut T, and two usizes. Unfortunately, we can't change this now.

@varkor
Copy link
Member

varkor commented Jun 10, 2019

Vec::flatten is not the same

Continuing the naming bikeshed, I think something along the lines of concat is more appropriate. "Flatten" to me implies some sort of operation Collection<Collection<T>> -> Collection<T> (with arrays, this might be something like [[T; N]; M] -> [T; N * M]).

@shepmaster
Copy link
Member

I think any one word name is going to obscure the fact that this is a pretty specialty operation and make it look like it's more generic. An utterly verbose name would be {flatten,concat}_arrays_in_place.

I like the in_place modifier as a key, the arrays bit is annoying but unless we want to have an unsafe trait HasMemoryRepresentationLikeArray {}, it seems like we can't accept anything else.

Vec already has *_slice methods, so _array doesn't seem terrible. Vec used to have map_in_place and there are other *_in_place methods in the standard library.

@scottmcm
Copy link
Member Author

Vec<Rgb<u8>> -> Vec<u8>

The is probably the more common case, I agree.

unless we want to have an unsafe trait HasMemoryRepresentationLikeArray

I've been thinking that might be the right way to go, actually. It was talked about a bunch in rust-lang/rfcs#1915, and I think it could still be useful even with const generics. If we have things like <[T; N]>::map, it could well be on a core::array::Array trait, and then it wouldn't be unreasonable for that to be an unsafe trait (like FixedSizeArray is) that things like struct RGBA { or struct Vec3f { could implement (given an appropriate repr).

So I wonder if I should abandon this for now and wait for something like that.

Vec::flatten is not the same:

But it is the same as this:

fn main() {
    let a = vec![vec![42], vec![13], vec![99]];
    let b: Vec<_> = a.into_iter().flatten().collect();
    println!("{:?}", b);
}
[42, 13, 99]

(I just can't use vec![ [42], [13], [9] ] because of #25725.)

@clarfonthey
Copy link
Contributor

unless we want to have an unsafe trait HasMemoryRepresentationLikeArray

Couldn't repr(transparent) be used for this?

@scottmcm
Copy link
Member Author

@clarfon The ones I've seen have wanted named fields, so it's more like struct RGB { r: u8, g: u8, b: u8 }, which would probably be safe with repr(C), or could maybe use a hypothetical repr(array).

@clarfonthey
Copy link
Contributor

@scottmcm that's very fair, although I feel like having precisely named fields like that is something that's not 100% necessary. With point and vector types, it's not really a detriment to use methods to obtain the components, and that it's reasonable to ask people to either a) reconstruct a new vector by component or b) use a separate method which returns a mutable reference whenever they want to modify components individually.

Usually, vectors should be modified using vector functions, rather than scalar functions on their components. But this is my opinion and others probably disagree.

Adding repr(array) for the sake of allowing structs with only fields of the same type to be essentially treated as repr(transparent) over an array seems… excessive.

@shepmaster
Copy link
Member

But it is the same as this:

I'm missing a nuance of your example. vec![vec![1], vec![2]] would not work with Vec::flatten_placeholder_name because the memory layout would differ, right? I feel like your example benefits my counterexample :-)

Going really generic, we could have the hypothetical setup:

unsafe trait HasSameMemoryLayoutAs<T> {}

impl Vec<T> {
    fn flatten_in_place<U>(self) -> Vec<U>
    where
        U: HasSameMemoryLayoutAs<T>,
    {
        // Handwaving goes here
    }
}

Then rub some const generics on it to get blanket implementations for nested arrays... allow third-parties to participate in the trait... and I ran out of brainpower trying to think through the implications...

@scottmcm
Copy link
Member Author

I think I figured out the disconnect:

  • You're saying it's not flatten because there are scenarios that work with flatten but not with this
  • I'm saying that it's flatten because if this works then flatten would do the same thing as this

Is that right? If so, I agree with both positions 🙂

@shepmaster
Copy link
Member

  • You're saying it's not flatten because there are scenarios that work with flatten but not with this

Yep. I think this is my primary issue with this addition.

  • I'm saying that it's flatten because if this works then flatten would do the same thing as this

Kind of. I'd quibble about "same thing" because while the results from both functions will compare equal, they must do different things otherwise you wouldn't need to add this new function 😉 Namely, the proposed function is guaranteed to perform zero memory allocation; it's effectively a fancy transmute.

@shepmaster
Copy link
Member

In the same vein as this change, what about transformations from Vec<[T; 2N]> to Vec<[T; N]>, vice-versa, or even Vec<T> -> Vec<[T; N]>? Similar in spirit to the TryFrom impls for slices to arrays or my PR (#61515) for boxes?

Would any of that give us better names? Can this just be an Into / From implementation?

@scottmcm
Copy link
Member Author

scottmcm commented Jun 13, 2019

I'm going to close this for now; it seems like the right thing for me to do here is go and see if I can come up with a more holistic proposal (maybe like #49792 (comment)) and write a libs RFC for that.

Edits in the future:

@scottmcm scottmcm closed this Jun 13, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants