Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Range-based indexing #149

Closed
Andlon opened this issue Feb 11, 2017 · 9 comments
Closed

Range-based indexing #149

Andlon opened this issue Feb 11, 2017 · 9 comments

Comments

@Andlon
Copy link
Collaborator

Andlon commented Feb 11, 2017

Currently, to take a slice of a matrix, one needs to use e.g. the sub_slice function. I propose to add implementations of Index<(Range<usize>, Range<usize>)>. See the following example:

let x = matrix![ 1.0, 2.0, 3.0;
                 4.0, 5.0, 6.0;
                 7.0, 8.0, 9.0];

// Want to take the 2x2 lower-right corner
// Current way:
let corner = x.sub_slice([1, 1], 2, 2);

// Range-based indexing:
let corner = x[(1 .. 3, 1 .. 3)];

// Or, more succinctly:
let corner = x[(1 .., 1 ..)];

// More flexible indexing, let's take the two first columns
let two_last_cols = x[( .. 2, ..)]

Similarly, we can slice rows or columns (returning Row or Column) using a mix of usize and Range<usize>:

let second_row = x[(1, ..)];
let second_col = x[(.., 1)];

// Only parts of a given row, still returns a `Row` struct 
// (so we can extract a raw slice, which is useful for performance)
let parts_of_second_row = x[(1, 1 .. 3 )];

The range-based indexing can be implemented in terms of Index<(R1, R2)> where R1, R2 are any of the following concrete types from the stdlib: Range, RangeFull, RangeTo, RangeFrom.

Any thoughts on this? I think it would really empower matrix slices.

@AtheMathmo
Copy link
Owner

AtheMathmo commented Feb 11, 2017

This is something that has been somewhat overlooked so far. First I'll explain why it doesn't exist right now.

Currently our index trait looks like this: Index<[usize; 2]>. I opted for this because I much preferred the mat[[i,j]] syntax to mat[(i, j)] (though both are worse than mat[i,j]). Sadly we cannot keep the same syntax for Range based indexing, because [T] must have a consistent type. This corresponds to R1 and R2 being the same in your example above. We could get around this by using the tuple syntax as you suggest but the inconsistency is undesirable.

To resolve this I see a few solutions:

  1. Ignore the inconsistency and have Range indexing use tuples. (A bad idea I think).
  2. Replace current indexing with tuples. (Open to discussion but I'm not keen on the look and it's a pretty nasty breaking change).
  3. Use Index<[R; 2]> and only allow the same type on each axis.
  4. Use inspiration from ndarray and provide a trait for indexing types. We'd implement this trait for [usize; 2], and hopefully this presents some nice ways around some of the issues described above (though not all).

@Andlon
Copy link
Collaborator Author

Andlon commented Feb 11, 2017

Initially, I also found x[[i, j]] to look better than x[(i, j)]. However, after having written a lot of indexing lately, I've realized that the latter is vastly more readable, because the square brackets tend to be easily confused (when glancing over the code) with common indexing letters that are straight, such as i or l.

I wanted to write up a thread on internal.rust-lang.org to see if there's any interest in providing syntax sugar for tuple indexing, i.e. x[i, j] would be the same as x[(i, j)], and if there are any reason this would not be a good idea.

In any case, that's not going to happen any time soon, so let's consider our options.

  1. Agree, this is not good.
  2. I'm open to this. We could also provide it alongside the existing indexing and try to gradually phase out array-based indexing.
  3. I think this is a very unfortunate and artificial limitation.
  4. This is an interesting idea, but it seems it doesn't really impact this particular problem though. That is, you still face the problem of which types to implement this trait for. It seems ndarray basically implements it for all of them, allowing both tuple-based and array-based indexing. This is also acceptable to me, I'm mostly concerned with being able to do flexible matrix slices much more simply.

@AtheMathmo
Copy link
Owner

I should have expanded a little on why I suggested the 4th option. I was indeed thinking that we could implement the trait for [usize; 2], and then tuple indexing such as: (usize, usize), (Range, RangeTo), etc. We could even impl OurIndex for usize and have this return a row - though this feels a bit like taking some overly strong moral stance...

Anyway, what I was trying to say was that we could do this to avoid making a strong decision right now. We get fancy range indexing with tuple syntax but don't break everyone elses indexing code (yet).

I would be interested in learning more about any official movement on syntactic sugar for indexing. If you do get around to making that post please let me know :).

@Andlon
Copy link
Collaborator Author

Andlon commented Feb 11, 2017

@AtheMathmo: that seems reasonable, and would probably make the code a whole lot cleaner as opposed to direct implementations of Index<(R1, R2)> for each combination of R1 and R2, as well as [R; 2]. I like it. Personally, I think single-number indexing of matrices should be avoided. In that case, one might simply just write x[(i, ..)] which is almost as short, and completely consistent with matrix indexing.

A bit of a side-note: One thing I've been thinking about lately, is the fact that in some places we use e.g. offsets or pointer arithmetic. These are inherently isize-based. In practice, this means that our indexing might be broken for values that lie in the interval (max(isize), max(usize)). For a 64-bit system, you would never get matrices or vectors this big, but I think it's at least conceivably possible to build such a large vector on a 32-bit system... I'm just wondering if this could cause us any problems. Most likely not, I expect!

@Andlon
Copy link
Collaborator Author

Andlon commented Feb 11, 2017

It was just pointed out to me on IRC that we actually cannot return a MatrixSlice, because Index returns a reference to its Output. There doesn't seem to be any way around that currently (afaik).

Hence my point here is mostly moot!

@AtheMathmo
Copy link
Owner

Ah I totally forgot about this part of the problem! Before we can do this (and many other cool things) we need custom DSTs. Sadly this PR was closed recently and although I'd love to keep pushing it forward myself I lack the technical know-how.

@Andlon
Copy link
Collaborator Author

Andlon commented Feb 11, 2017

Thanks, that was an interesting read!

@Andlon
Copy link
Collaborator Author

Andlon commented Feb 11, 2017

Oh, and by the way, I decided to post on internals about tuple syntax sugaring: https://internals.rust-lang.org/t/opinions-on-syntax-sugar-for-tuple-based-indexing/4776

Will be interesting to see if anyone has anything to say on the topic!

@Andlon
Copy link
Collaborator Author

Andlon commented Feb 12, 2017

Since we're not going to be able to do this for a potentially very long time, I will close this issue for now. I hope we can revisit it in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants