Returning MatrixSlice when requesting rows and columns #87
I want to put a little more momentum into this to see if we can come to a decision. After thinking a little more I'm coming around to @Andlon's thinking; I expect that most of our users would prefer `MatrixSlice` here.

I want to run a few tests to check that we don't see any obvious regressions from this modification. If all looks good then hopefully we can get this change in and push in a lot of the other PRs which are waiting on 0.4.
One more disadvantage of switching: [...]. Maybe we can play around with [...].
@AtheMathmo: I'm probably missing something basic here, but just based on what you said, isn't it just a matter of implementing [...]?
Yes, that would work and is much easier :). Thanks!
A related issue: does it actually make sense to have functions that return an `Option`? Unless there are obvious situations where this is truly useful, I think the API would be a lot cleaner if it simply panicked when the index is out of bounds:

```rust
mat.row_mut(i) += 2.0;
```

vs.

```rust
mat.get_row_mut(i).unwrap() += 2.0;
```

(Here I've used the proposed feature where these functions return a `MatrixSlice`.)
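To make the trade-off concrete, here is a minimal self-contained sketch of the two access styles being debated. The toy `Matrix` type and the method names `row`/`get_row` are hypothetical stand-ins, not rulinalg's actual API:

```rust
// Toy row-major matrix, just to contrast panicking vs. Option-returning access.
struct Matrix {
    rows: usize,
    cols: usize,
    data: Vec<f64>,
}

impl Matrix {
    // Panics on out-of-bounds, like indexing a Vec.
    fn row(&self, i: usize) -> &[f64] {
        assert!(i < self.rows, "row index {} out of bounds", i);
        &self.data[i * self.cols..(i + 1) * self.cols]
    }

    // Checked variant returning Option, for callers that want to recover.
    fn get_row(&self, i: usize) -> Option<&[f64]> {
        if i < self.rows {
            Some(&self.data[i * self.cols..(i + 1) * self.cols])
        } else {
            None
        }
    }
}

fn main() {
    let mat = Matrix { rows: 2, cols: 2, data: vec![1.0, 2.0, 3.0, 4.0] };
    assert_eq!(mat.row(1), &[3.0, 4.0][..]);
    assert!(mat.get_row(2).is_none());
    println!("ok");
}
```

Offering both (a panicking default plus a checked `get_*` variant) mirrors what `Vec` does with indexing and `get`.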
I also just realized an ambiguity in the proposal. Consider the following:

```rust
// Rows
let mat: Matrix<f64> = other_mat.row_iter().collect();

// Columns
let mat: Matrix<f64> = other_mat.col_iter().collect();

// Slice. Very contrived, but just to illustrate my point
let mat: Matrix<f64> = (0..5).map(|i| mat.sub_slice([i, 0], 2, 1))
                             .collect();
```

In particular, given a set of `MatrixSlice`s, should `collect()` treat them as rows or as columns? This would be solved by custom `Row` and `Column` structs:

```rust
// Wouldn't be easily possible with custom `Row` struct,
// as we don't want to reimplement all binary operators
mat.row_mut(i) += 2.0;

// However, this works if we implement Deref
let row_doubled = mat.row(5) * 2.0;

// We could restore the top example with:
mat.row_mut(i).as_slice() += 2.0;
```

The end result here is not as nice as the above example, but I suppose it's bearable, and you don't need to provide [...].
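A minimal self-contained sketch of the `Deref` trade-off described above. The `RowVec` and `Row` types are hypothetical stand-ins (with `RowVec` playing the role of `MatrixSlice`): method calls auto-deref through the wrapper, but binary operators do not, so an explicit deref is needed on the receiver:

```rust
use std::ops::{Deref, Mul};

// Hypothetical stand-in for a slice-like matrix row; not rulinalg's API.
#[derive(Debug, Clone, PartialEq)]
struct RowVec(Vec<f64>);

// Scalar multiplication implemented once, on the underlying type.
impl Mul<f64> for &RowVec {
    type Output = RowVec;
    fn mul(self, rhs: f64) -> RowVec {
        RowVec(self.0.iter().map(|x| x * rhs).collect())
    }
}

// Thin newtype wrapper; Deref lets us reuse RowVec's methods without
// reimplementing them.
struct Row(RowVec);

impl Deref for Row {
    type Target = RowVec;
    fn deref(&self) -> &RowVec { &self.0 }
}

fn main() {
    let row = Row(RowVec(vec![1.0, 2.0, 3.0]));
    // `&row * 2.0` would not compile: the receiver of a binary operator
    // is not auto-dereferenced. An explicit deref works:
    let doubled = &*row * 2.0;
    assert_eq!(doubled, RowVec(vec![2.0, 4.0, 6.0]));
    println!("ok");
}
```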
@AtheMathmo I would strongly prefer these references (particularly [...]). Couldn't you just [...]? That would mean that you could keep the [...].
There's a lot to comment on here (and I will jump in to say more soon), but in response to @andrewcsmith: we cannot guarantee that the underlying data is contiguous.

I also agree that returning a [...].
@AtheMathmo Oh right, I was thinking specifically of the current implementation of [...].

It does seem that there are a few places where we can guarantee contiguous data at compile time, and if I understand correctly you don't want to lose that compile-time optimization. We might then create two structs: [...]*

I'm taking some inspiration in these thoughts from Niko Matsakis's series on parallel iterators in Rayon, which splits the functionality of what can be parallelized into a few different categories. And I think for such a performance-sensitive library as rulinalg, increasing type complexity is worth the (potential) performance gains.

*Obviously these names are terrible.
@andrewcsmith: that idea certainly has merit, and in many ways would be ideal. One could also leverage [...].

EDIT: Actually, when considering only the functionality you mentioned in your example, it might perhaps work with generic binary operators on the [...].
@Andlon Right, that gets complicated quickly. Well, depending on the available time and the urgency of the issue, it does seem like separate structs and traits leverage Rust's type system and low-level power in the best possible way. And of course macros could help. Does anyone know anything about whether a custom [...]?

Edit: It seems to me there is one instance where Macros 1.1 could help, and that is to create the macro [...].
Sorry, I forgot to properly respond to the comments here.

I think I agree with @Andlon in that we probably don't need the [...]. I also agree that having [...].

As for the explicit contiguous and non-contiguous [...]. Right now we've tried to move a lot of functionality into the [...].

I've probably missed some things. Please feel free to poke me on those.
Regarding the complexity of traits and specialized impls for non-/contiguous data, here's my thought: we already have PR #83 for sparse matrices. Conceivably, whatever implementations we develop allowing interaction between contiguous and non-contiguous storage schemes should also allow interoperability with sparse matrices.

I do think it would be good to have a fairly granular trait interface, as long as that granularity yields compile-time optimizations. In the case of non-/contiguous data, that's definitely possible. I'm not totally clear on what is vague, aside from maybe outlining the exact functions that will fall under each trait.

On having to re-implement all arithmetic operations: we could provide a [...].
@andrewcsmith: Currently, matrix multiplication in rulinalg is implemented in terms of the [...].

@AtheMathmo: We can let [...]:

```rust
// Through Deref, this should work
let doubled_row = 2.0 * &mat.row(i);

// But I think this does not, because the receiver (left-hand side) is not auto-dereferenced
let doubled_row = &mat.row(i) * 2.0;

// However, I _think_ this works
let doubled_row = *mat.row(i) * 2.0;
```

Furthermore, I'm not sure if this actually works, but perhaps implementing [...] would let us write:

```rust
*mat.row(i) += 2.0;
```

It is my hope that the above would work because it first dereferences to a `MatrixSlice`.

In any case, it's not ideal, but I thought I'd explore some options. I think we're running into some present constraints of the Rust language at the moment. Hence I think we might want to settle for something that is not too complex for now. Once specialization lands I think a much better solution will be available.

Should we perhaps compile a list of desired properties of the solution?
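The compound-assignment idea above can be checked with a minimal self-contained sketch. The `RowVec`/`RowMut` types here are hypothetical stand-ins, not rulinalg's API; the point is that `DerefMut` plus an explicit deref on the receiver does make `+=` work through a wrapper:

```rust
use std::ops::{AddAssign, Deref, DerefMut};

// Hypothetical underlying row type (playing the role of `MatrixSlice`).
#[derive(Debug, PartialEq)]
struct RowVec(Vec<f64>);

// Compound assignment implemented once, on the underlying type.
impl AddAssign<f64> for RowVec {
    fn add_assign(&mut self, rhs: f64) {
        for x in &mut self.0 {
            *x += rhs;
        }
    }
}

// Mutable wrapper, standing in for a `RowMut` returned by `row_mut`.
struct RowMut(RowVec);

impl Deref for RowMut {
    type Target = RowVec;
    fn deref(&self) -> &RowVec { &self.0 }
}

impl DerefMut for RowMut {
    fn deref_mut(&mut self) -> &mut RowVec { &mut self.0 }
}

fn main() {
    let mut row = RowMut(RowVec(vec![1.0, 2.0]));
    // Explicit deref on the place expression invokes DerefMut,
    // then RowVec's AddAssign applies:
    *row += 2.0;
    assert_eq!(*row, RowVec(vec![3.0, 4.0]));
    println!("ok");
}
```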
Just posting to say that I'm going to try and get a draft version of this working over the weekend, just to give us an idea of what this actually looks like. This seems like a helpful idea to me, as there are a lot of things floating around right now. I'll focus on getting a working API for now, but will want to get benchmarks etc. sorted before merging.

Thank you both for your comments - they are going to be very helpful!
I have made some progress on this, but have now run into some problems. Changing the [...].

We can get around each of these by using the [...]. This makes me want to move to the custom structs:

```rust
pub struct Row<'a, T>(&'a [T]);
pub struct Column<'a, T>(MatrixSlice<'a, T>);
```

And equivalents for [...].

Unless there are any objections I'll move in this direction. To handle operator overloading for now I'm going to try using [...].
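One concrete payoff of the newtype direction is that it resolves the `collect()` ambiguity raised earlier: a `FromIterator<Row>` impl can only mean "stack these as rows". A minimal self-contained sketch (toy `Matrix` and `Row` types, hypothetical, not rulinalg's actual definitions):

```rust
// Toy row-major matrix, just to show orientation-unambiguous collection.
#[derive(Debug, PartialEq)]
struct Matrix {
    rows: usize,
    cols: usize,
    data: Vec<f64>,
}

// Newtype marking "this slice is a row", as in the proposal above.
struct Row<'a>(&'a [f64]);

impl<'a> std::iter::FromIterator<Row<'a>> for Matrix {
    fn from_iter<I: IntoIterator<Item = Row<'a>>>(iter: I) -> Matrix {
        let mut data = Vec::new();
        let mut rows = 0;
        let mut cols = 0;
        for row in iter {
            cols = row.0.len();
            data.extend_from_slice(row.0);
            rows += 1;
        }
        Matrix { rows, cols, data }
    }
}

fn main() {
    let raw = [[1.0, 2.0], [3.0, 4.0]];
    // Unambiguous: items are Rows, so they stack as rows.
    let mat: Matrix = raw.iter().map(|r| Row(&r[..])).collect();
    assert_eq!(mat.rows, 2);
    assert_eq!(mat.cols, 2);
    assert_eq!(mat.data, vec![1.0, 2.0, 3.0, 4.0]);
    println!("ok");
}
```

A mirror-image `FromIterator<Column>` impl could then interleave the data column-wise, with no guessing on the library's part.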
@AtheMathmo: I think this sounds like a very promising direction! Completely agree with everything. Internally wrapping a regular slice for the `Row` iterator sounds like a very good approach.

Did you mean [...]?

Don't forget to implement [...]!
I did mean [...]. However, we will also need [...].
I hope so! I always find it hard to tell whether I'm missing an obvious way to do things or we really are right at the boundary of what is currently possible. Fortunately that boundary moves pretty quickly!
@AtheMathmo: ah. [...]
It looks like we won't be able to have deref coercion in our arithmetic expressions. I think that having to explicitly deref isn't too painful though. For me [...].
Yes, this is what I concluded in earlier efforts too: auto-dereferencing does not take place on the receiver for binary operators (and likely won't for any foreseeable future, as it would cause potential combinatorial explosions for the Rust compiler). I also think this is acceptable, provided the docs are clear on this point, as you say.
Well, as I probably should have expected, getting `Deref` working was not straightforward:

```rust
impl<'a, T: 'a> Deref for Row<'a, T> {
    type Target = MatrixSlice<'a, T>;

    fn deref(&self) -> &MatrixSlice<'a, T> {
        unsafe {
            let raw = &MatrixSlice::from_raw_parts(self.0.as_ptr(),
                                                   1,
                                                   self.0.len(),
                                                   self.0.len()) as *const _;
            &*raw
        }
    }
}
```

Though I haven't even begun to think about the safety implications there. We cannot get [...]. I don't think there is a way around this one, sadly.

EDIT: The way around it I'm adopting is to have [...].
@AtheMathmo: I otherwise agree with your edit that this sounds like a good idea. The unsafe example you mentioned will have issues with lifetimes. Also, you could alternatively store both a matrix slice and a reference to a slice, if this helps...? (This avoids having to implement the utility function on [...].)
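The "store both" suggestion avoids the lifetime problem in the unsafe `Deref` above: the `deref` returns a reference to a field the `Row` owns, instead of a reference to a temporary. A minimal self-contained sketch (toy `MatrixSlice` and `Row` types, hypothetical, not rulinalg's definitions):

```rust
use std::ops::Deref;

// Toy non-owning view, standing in for rulinalg's MatrixSlice.
struct MatrixSlice<'a> {
    data: &'a [f64],
    rows: usize,
    cols: usize,
}

// Row keeps the raw slice AND a pre-built slice view, so deref never
// needs to manufacture a temporary.
struct Row<'a> {
    raw: &'a [f64],
    as_slice: MatrixSlice<'a>,
}

impl<'a> Row<'a> {
    fn new(raw: &'a [f64]) -> Row<'a> {
        Row {
            raw,
            as_slice: MatrixSlice { data: raw, rows: 1, cols: raw.len() },
        }
    }
}

impl<'a> Deref for Row<'a> {
    type Target = MatrixSlice<'a>;
    fn deref(&self) -> &MatrixSlice<'a> {
        // Safe: we return a reference to a field we own, not a temporary.
        &self.as_slice
    }
}

fn main() {
    let data = [1.0, 2.0, 3.0];
    let row = Row::new(&data);
    // Field access auto-derefs through to the stored MatrixSlice:
    assert_eq!(row.rows, 1);
    assert_eq!(row.cols, 3);
    assert_eq!(row.raw.len(), row.data.len());
    println!("ok");
}
```

The cost is a little redundant state in each `Row`, but the view is cheap to build and the borrow checker is satisfied without any `unsafe`.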
This issue is to move discussion from #84 and #78 in one place. I'll fill out this issue properly soon, but primarily we want to discuss how best to handle the return values on the row and column access functions.