Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: overloaded slice notation #198

Merged
merged 0 commits into from
Sep 11, 2014
Merged

Conversation

aturon
Copy link
Member

@aturon aturon commented Aug 14, 2014

This RFC adds overloaded slice notation:

  • foo[] for foo.as_slice()
  • foo[n..m] for foo.slice(n, m)
  • foo[n..] for foo.slice_from(n)
  • foo[..m] for foo.slice_to(m)
  • mut variants of all the above

via two new traits, Slice and SliceMut.

It also changes the notation for range match patterns to ..., to
signify that they are inclusive whereas .. in slices are exclusive.

Rendered

Active: https://github.com/rust-lang/rfcs/blob/master/text/0198-slice-notation.md

@metajack
Copy link

I like this. +1

@metajack
Copy link

It occurs to me that other languages tend to use foo[:] for the as_slice part.

@steveklabnik
Copy link
Member

Super 👍 here. I'm not sure I'm aware of precedent either way for [] vs [:], but I don't mind either syntax. foo[..] is another variant that might be more symmetric?

Also, in Ruby, .. vs ... is a question of inclusive vs exclusive, is that worth having?

@aturon
Copy link
Member Author

aturon commented Aug 14, 2014

I think foo[:] is used in Python, where : is used like .. here.

I agree that [..] would be reasonable, but since as_slice is so very common, IMO the shorter the better (and I think [] is reasonably clear).

I think inclusive/exclusive is probably overkill here.

@metajack
Copy link

Python uses [:], but so does Julia, Matlab, and probably a few others. I'm fine with [] I think, I just thought I'd bring up what other syntaxes exist in case it hadn't been considered.

@Valloric
Copy link

+1. Would love to see this!


```rust
trait Slice<Idx, S> {
fn as_slice<'a>(&'a self) -> &'a S;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT, returning a reference like this limits the trait to contiguous-memory slices, e.g. it wouldn't be possible to write x[n..m] on a RingBuf, or rope, or a hypothetical StridedSlice<'a, T> { data: &'a [T], step: uint } type that represents every stepth element of data (i.e. strided[i] == data[i * step]).

I think this may mean it is only possible to use with [T]/str and Vec<T>/String.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be possible to use this with other unsized types as well (i.e., structs with embedded [T] fields).

There are two ways to get the kind of generality you have in mind:

  • HKT. In that case, the type "S" here could instead be a lifetime-indexed family of types, so you could use StridedSlice for example.
  • Lifetime lifting. That is, rather than have the trait methods take &self, have them take self and impl the trait on reference types. (That's more generally a way to encode HKTs.) Problem is, auto-deref and auto-borrowing for method receivers doesn't work for this kind of case: if foo: Vec<T>, you'd have to write (&foo)[] to get a full slice, for example. You can hack around this in various ways, but it quickly gets pretty ugly.

I'm not sure if there's a design that would allow us to smoothly transition to an HKT type parameter later on. Worth thinking about.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be possible to use this with other unsized types as well (i.e., structs with embedded [T] fields).

How many such types will we have for which slicing makes sense? I can only think of minor variations on the 4 types above e.g. different allocators (which Vec<T, Allocator> would solve anyway) and interned strings ala libsyntax's InternedString.

(My point here is this RFC isn't particularly flexible; it's really nice for the places it does work.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add this to the list of things that HKT will allow us ;)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand exactly why HKT is invoked here. I just did some testing with these traits returning just S and everything seemed to work fine.

Again, I am going to protest this RFC just like the Index traits one because it does not support lazy evaluation, which is critical to making efficient mathematical libraries. In particular, it seems clear that if this specific design is kept, it will need to be supplemented by SliceSet and SliceGet like Index/IndexMut should be.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SiegeLord The traits as given work fine. My point about HKT is that you'd need it to make the traits general enough to do what @huonw was describing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SiegeLord I wasn't around when the Index traits were designed. Can you either point me to that earlier discussion, or explain your concerns in more detail?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SiegeLord were you referring to #111?

@alexcrichton
Copy link
Member

I would expect a common question about this syntax to be "Why did you choose .. over :?", or conversely "Why did you choose : over ..?". I'm curious, but what led you to ..? It may be worth adding some rationale for the choice of syntax (whatever it ends up being).

Of the other two bits of syntax introduced, I would predict fewer questions. I would expect that the usage of brackets to be unanimously expected as brackets convey "slice" in rust right now. I think it's definitely necessary to have a mutable slicing variant, and I think that the syntax you've proposed is likely the best one available.

@aturon
Copy link
Member Author

aturon commented Aug 14, 2014

@alexcrichton Updated with rationale.

The `as_slice` operator is particularly important. Since we've moved away from
auto-slicing in coercions, explicit `as_slice` calls have become extremely
common, and are one of the
[leading ergonomic/first impression](https://github.com/rust-lang/rust/issues/14983)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure that e.g. foo(some_string[]); is better from a first impressions/beginners perspective, it's rather cryptic/hard-to-google (sigil like?) syntax, unlike foo(some_string.as_slice()) or just foo(some_string).

@brendanzab
Copy link
Member

Awesome, looks good! I prefer vec[..] for vec.as_slice() because it is more consistent, but it is a minor thing.

@brendanzab
Copy link
Member

Could you include a mention of what standard library types will implement these traits?

@liigo
Copy link
Contributor

liigo commented Aug 14, 2014

I prefer .. range syntax, while : should always used as name:type.


*Mutable slicing*

- `foo[mut]` for `foo.as_mut_slice()`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This puzzles me. Why doesn't indexing use a[mut 0] for this same effect? Is there a between-the-lines proposal here to make IndexMut also require a mut annotation? If not, why does this need an annotation and IndexMut not?

@bluss
Copy link
Member

bluss commented Aug 14, 2014

.. is good, but let me note that match does use it for inclusive ranges, which is a bit of an allround inconsistency. i.e. match ch { 'a' .. 'z' => {} }

Obligatory Python-inspired question, what about multiple dimensions and strides? Python has xs[:, 1:3, :], does it make sense for Rust to mimic that with xs[.., 1..3, ..] for the same purpose? (With this example, [..] is the winner over [] for the full slice syntax.)

And a strided slice would be less efficient and not implementable as &[T]. Python's xs[a:b:c] could be maybe xs[a..b..c] in Rust.

@SiegeLord
Copy link

I have always wanted this sugar, but I always was afraid that it'd be implemented in this very restricted fashion, making it primarily useful for built-in types and Vec. Personally I'd prefer for this not to be if it's so restricted, and rather for macros to be enhanced to support something like the foo.s![1..2] syntax, which would convinently take care of all the stride/multiple dimension queries.

A reference for the multi-dimenstional slicing is D's solution, which might be doable in Rust: http://dlang.org/operatoroverloading#Slice

@aturon
Copy link
Member Author

aturon commented Aug 14, 2014

@SiegeLord It would be helpful if you could spell out the restrictions you're worried about here, and what you'd like to see instead. (Or if you've written about this previously, please post a link.)

@nikomatsakis nikomatsakis mentioned this pull request Aug 14, 2014
@lilyball
Copy link
Contributor

👍

@bluss I agree that this is mildly inconsistent with match, but really it's match that's doing it wrong, .. is traditionally a half-open range.

@SiegeLord The support here should not interfere with your ability to write foo.s![1..2] if we ever get method macros. And I too want to know what restrictions you're talking about. I can't imagine how slicing is ever supposed to work on e.g. a RingBuf, unless you really just want a convenient syntax to produce iterators.

@aturon
Copy link
Member Author

aturon commented Aug 14, 2014

@huonw A quick note regarding the more general version: I talked with @nikomatsakis about it this morning, and it should be possible to add a SliceHKT trait later with a more general (HKT) return type. We would provide a blanket impl of SliceHKT for anything implementing Slice, and switch the notation to use SliceHKT. So it should be backwards-compatible.

That should suffice to eventually handle striding, ropes, etc.

I'll amend the RFC with these points and a few others soon.

@SiegeLord
Copy link

Well, here's a very simple illustration of what would be possible with 'better' operator overloading. This snippet (made very non-generic for illustrative purposes) adds two vectors elementwise and then slices them without allocating a temporary vector. Note how there is nothing to return the reference to in the slice_from and index methods. Additionally, the slice_from takes self by value in order for lifetimes to work out. This isn't a contrived example, and I've built a small linear algebra library based on these idioms. Notably, this is exactly the same thing that Iterator does.

struct Vector
{
    data: Vec<f64>,
}

impl Vector
{
    fn add<'l>(&'l self, other: &'l Vector) -> Adder<'l>
    {
        Adder{ lhs: self, rhs: other }
    }

    fn index(&self, idx: uint) -> f64
    {
        self.data[idx]
    }
}

struct Adder<'l>
{
    lhs: &'l Vector,
    rhs: &'l Vector
}

impl<'l> Adder<'l>
{
    fn index(&self, idx: uint) -> f64
    {
        self.lhs.index(idx) + self.rhs.index(idx)
    }

    fn slice_from(self, from: uint) -> Slicer<'l>
    {
        Slicer{ base: self, from: from }
    }
}

struct Slicer<'l>
{
    base: Adder<'l>,
    from: uint
}

impl<'l> Slicer<'l>
{
    fn index(&self, idx: uint) -> f64
    {
        self.base.index(self.from + idx)
    }
}

fn main()
{
    let a = Vector{ data: vec![1.0, 2.0, 3.0] };
    let b = Vector{ data: vec![4.0, 5.0, 6.0] };
    let res = a.add(&b).slice_from(1);

    assert_eq!(res.index(0), 7.0);
    assert_eq!(res.index(1), 9.0);
}

@aturon
Copy link
Member Author

aturon commented Aug 14, 2014

@SiegeLord Thanks. This seems essentially the same as the point as @huonw raised. In both cases, the problem is that we need HKT in order to abstract over a type like Slicer (which is actually a family of types indexed by lifetimes).

See my comment above to @huonw, namely that we will be able to add this more general form of slice syntax later, after we have HKT, in a backwards-compatible way. I will add details to the RFC as soon as I can, but the basic point is that accepting this syntax now doesn't preclude us from doing the more general version later.

@SiegeLord
Copy link

Well, that's where my question comes in. I managed to create a fully functional library without HKT today (perhaps it could be more functional with HKT?) but it's not clear to me that HKT is necessary.

Here are the relevant traits and implementations from my library:

pub trait MatrixSlice
{
    fn slice(self, start: uint, end: uint) -> Slice<Self>;
}

impl<T: MatrixShape>
MatrixSlice for
T
{
    fn slice(self, start: uint, end: uint) -> Slice<T>
    {
        Slice::new(self, start, end)
    }
}

pub trait MatrixShape
{
    fn ncol(&self) -> uint;
    fn nrow(&self) -> uint;
}

impl<'l>
MatrixShape for
&'l Matrix
{
    ...
}

You'll note that I did specialize my Slice trait to have a specific return type, but I just tried using a generic return type and everything seems to still work:

pub trait MatrixSlice2<T>
{
    fn slice2(self, start: uint, end: uint) -> T;
}

impl<T: MatrixShape + Collection>
MatrixSlice2<Slice<T>> for
T
{
    fn slice2(self, start: uint, end: uint) -> Slice<T>
    {
        Slice::new(self, start, end)
    }
}

@aturon
Copy link
Member Author

aturon commented Aug 14, 2014

@SiegeLord That works because you're taking self by value. (See my first comment to Huon.) This is effectively a way of encoding HKT by "lifting" the lifetime parameter to the type implementing the trait.

In general, though, we want indexing and slicing operations to take &self or &mut self, and in that case the lifetime of the slice returned needs to depend on the lifetime of the self parameter.

@SiegeLord
Copy link

It has to be self in order to support chaining, as far as I understand. I.e. today this doesn't pass borrow-check:

let a = vec![1u];
let s = a.as_slice().as_slice();

My by value self varieties work fine, however (aside from the occasional explicit borrow that you mentioned to Huon).

@aturon
Copy link
Member Author

aturon commented Aug 14, 2014

@SiegeLord That's a problem with the rules for borrowing and temporaries, which will be fixed when the implementation of @kballard's RFC lands.

@aturon
Copy link
Member Author

aturon commented Aug 14, 2014

@SiegeLord To be more clear, if we used self for Index and Slice, you'd have to write things like:

let a = vec![1u];
let x = (&a)[0];
let s = (&a)[];

which seems less than ideal. I think with the HKT-based traits and better rules for temporaries (which we've already committed to doing), we should be able to have good ergonomics and still handle your and @huonw's examples.

@SiegeLord
Copy link

Alright, I'll defer until those things are implemented and maybe try to port my library away from self to see how it works out.

@glaebhoerl
Copy link
Contributor

Thinking about it, I don't see why we couldn't have .. be an inclusive range. Scrolling through the discussion I see some strongly-worded claims that it would be confusing, but not much justification.

Naively I would expect that if foo[0] is one element, then foo[0..1] is two elements.

Then we're also consistent with match, don't need to think about adding ..., and so on. If the issue is that it would be inconsistent with range(), then I think that's a smaller problem, because range() is a library function and doesn't use .. syntax.

@SiegeLord
Copy link

Exclusive ranges are a lot easier to deal with:

let slice = some_vec[a..a + 2];
assert_eq!(slice.len(), 2);
let m = some_vec.len() / 2;
// These two slices have no overlap
let slice1 = some_vec[..m];
let slice2 = some_vec[m..];

You'd have to sprinkle lots of + 1 or - 1 everywhere if that syntax was inclusive. Having used slices extensively in D and Python (where they are exclusive), having them be exclusive is very natural and convenient. Having used them in Octave/MATLAB where they are inclusive, I consider that to be one of the worst mistakes made in MATLAB (right next to 1-based indexing).

@lilyball
Copy link
Contributor

I think slicing pretty much requires exclusive ranges. It's confusing otherwise.

Honestly, maybe it's not really that bad to have .. in a slice be an exclusive range and .. in a match arm be an inclusive range. It's weird, sure, and I think it makes more sense to adjust match arms to use ..., but if there's some reason people don't like that, we could just use .. in both places. The contexts are different, and the semantics of the .. are different (one is basically shorthand for a bunch of alternate arms, the other is defining a range to index over).

@pcwalton
Copy link
Contributor

Agreed with @kballard.

@CloudiDust
Copy link
Contributor

@kballard @pcwalton, it is still better to have more consistency, and honestly I don't see why one would oppose this change (other than maybe he/she prefers the Ruby way, then he/she will oppose the slice notation too). This is not Rust's first breaking change after all. :P

@lilyball
Copy link
Contributor

@CloudiDust At this point, I think we should modify match. However, that can be done separately to introducing slicing. My claim right now is that we can proceed with the RFC for overloaded slice notation without modifying match, and a separate RFC can be introduced later to change the match syntax to use ... instead.

@CloudiDust
Copy link
Contributor

@kballard, I agree with you.

@liigo
Copy link
Contributor

liigo commented Aug 26, 2014

agree with @kballard (his latest comment).

@SiegeLord
Copy link

Incidentally, I want to mention an important use case for exclusive ranges in matches (i.e. I'm suggesting allowing both .. and ... in them with the former being exclusive like in slices, and the latter being inclusive): floating point ranges:

// .. means exclusive here
match 1.0f32
{
    0.0 .. 1.0 => (),
    1.0 .. 2.0 => (),
    _ => ()
}

It is basically impossible to write the above match with inclusive ranges, so while we support floating point ranges, there is some impetus to providing exclusive ranges with matches.

That said, floating point ranges are kind of... strange anyway, so an alternate solution would be the non-solution of nuking this problematic feature.

@aturon
Copy link
Member Author

aturon commented Aug 29, 2014

I've amended the RFC to clarify some points discussed in the comments, and to change the match notation to ... (for inclusive matching).

@lilyball
Copy link
Contributor

👍, although I think the Summary should also include a mention of the match change.

@aturon
Copy link
Member Author

aturon commented Aug 29, 2014

@kballard Thanks, updated.

@alexcrichton
Copy link
Member

Discussed a few weeks ago and the decision was to merge this.

@John-Nagle
Copy link

Did this make it into the pre-1.0 Alpha? The reference manual doesn't mention it. Currently, where s is a "&str", the old way works but is deprecated:

256:34 warning: use of deprecated item: use slice notation [a..] instead, #[warn(deprecated)] on by default
src/parsedatetimestd.rs:256     let sdigits = s.slice_from(1);  

The new way doesn't work yet.

src/parsedatetimestd.rs:257:9: 257:16 error: the trait `core::marker::Sized` is not implemented for the type `str` [E0277]
src/parsedatetimestd.rs:257     let sdigits = s[1..];  

Something seems to be out of sync.

@sfackler
Copy link
Member

@John-Nagle let sdigits = &s[1..];

@John-Nagle
Copy link

OK. Still needs to go in the reference manual, though.

@Centril Centril added A-syntax Syntax related proposals & ideas A-slice Slice related proposals & ideas A-patterns Pattern matching related proposals & ideas A-ranges Proposals relating to ranges. labels Nov 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-patterns Pattern matching related proposals & ideas A-ranges Proposals relating to ranges. A-slice Slice related proposals & ideas A-syntax Syntax related proposals & ideas
Projects
None yet
Development

Successfully merging this pull request may close these issues.