Generalize matrix function signatures #1470

Closed
5 of 6 tasks
t4c1 opened this issue Nov 29, 2019 · 43 comments

@t4c1
Contributor

t4c1 commented Nov 29, 2019

Description

Matrix functions can be generalized to work with Eigen expressions, not just matrices. By propagating matrix expressions around we can avoid creating temporary matrices, improving performance. This could also make adding support for sparse matrices easier.

There was some discussion about this on Discourse, as well as an abandoned PR #1309. Contrary to that PR, I intend to split this into smaller chunks:

  • element-wise functions (ones that use apply_scalar_unary). The prim implementations of these can in many cases use Eigen implementations, which can be faster than std.
  • operator-like functions (add, subtract, multiply, divide, elt_multiply, elt_divide)
  • size and view functions (col, row, sub_col, sub_row, head, tail, diagonal, transpose, cols, rows, num_elements, dims)
  • remainder of */fun
  • */prob
  • allow functions to return expressions

Example

Instead of:

Eigen::MatrixXd f(const Eigen::MatrixXd& x)

We can do:

template<typename T>
auto f(const T& x)
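
For instance (a sketch with a made-up function, not an actual Math signature), a generic signature lets a call like f(a + b) consume the sum as an expression instead of materializing it first:

#include <Eigen/Dense>

// Hypothetical element-wise function templated on any dense Eigen type.
template <typename T>
auto twice_exp(const T& x) {
  // x is read element by element; the input expression is not evaluated
  // into a temporary matrix first.
  return (2 * x.array().exp()).matrix().eval();
}

int main() {
  Eigen::MatrixXd a = Eigen::MatrixXd::Random(3, 3);
  Eigen::MatrixXd b = Eigen::MatrixXd::Random(3, 3);
  // a + b is passed as an expression, not as a temporary MatrixXd.
  Eigen::MatrixXd c = twice_exp(a + b);
  return 0;
}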

Expected Output

More general and possibly faster code.

Current Version:

v3.0.0

@bob-carpenter
Contributor

+1 for this. We also need to do this to allow support for sparse matrices. @SteveBronder has been working on that along with perfect forwarding so we can reduce copying and exploit move semantics wherever possible.

@andrjohns
Collaborator

We'd need to start coding a bit more defensively after implementing this, otherwise there's the risk of accidentally evaluating the expression multiple times. Take log_sum_exp for example:

template <typename T>
auto log_sum_exp(T& x) {
  if (x.size() == 0) {
    return -std::numeric_limits<double>::infinity();
  }

  const double max = x.maxCoeff();
  if (!std::isfinite(max)) {
    return max;
  }
  return max + std::log((x.array() - max).exp().sum());
}

If log_sum_exp is passed an expression, something like log_sum_exp(m1.array().exp().matrix() * m2), then that expression will get evaluated three times: x.size(), x.maxCoeff(), and the final return statement. So we'd need to start evaluating the expression at the beginning of the function, and probably also update the documentation so that other/new developers know why.
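
For example, the defensive version might look something like this (just a sketch using Eigen's eval(); there may be nicer options):

template <typename T>
double log_sum_exp(const T& x) {
  // Evaluate the expression once up front; everything below reads from the
  // result. For an input that is already a plain matrix, eval() just
  // returns a const reference, so nothing extra is copied.
  const auto& x_eval = x.eval();
  if (x_eval.size() == 0) {
    return -std::numeric_limits<double>::infinity();
  }
  const double max = x_eval.maxCoeff();
  if (!std::isfinite(max)) {
    return max;
  }
  return max + std::log((x_eval.array() - max).exp().sum());
}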

@t4c1
Contributor Author

t4c1 commented Nov 30, 2019

Right. I plan to do that in each function as I generalize its signature. At that time I can also make sure an expression is not used twice. Although I don't think just taking the size of an expression evaluates it.

I also agree with the need to document this. Any idea where the doc should be put?

@bob-carpenter
Contributor

bob-carpenter commented Dec 1, 2019 via email

@bob-carpenter
Contributor

bob-carpenter commented Dec 1, 2019 via email

@SteveBronder
Collaborator

I have a little info on adding new docs to the stan site here

https://github.com/stan-dev/math/blob/develop/doxygen/howto_add_page.md

I wonder if for some functions we use so much info about the matrices that it would be better to leave their template arguments as Matrix<T, R, C> just to force evaluation.

@t4c1
Contributor Author

t4c1 commented Dec 2, 2019

If a function needs to evaluate an argument and the template params for that argument are known (not templated), we can just leave that argument as is - const Matrix<T, R, C>&. However, that does not work if we need template deduction at the same time. In that case I intend to use what Bob suggested.
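
A quick sketch of what I mean (hypothetical function names):

// Argument type fully known: an expression passed here is implicitly
// evaluated into the temporary the reference binds to, so the body can
// safely read x more than once.
inline double sum_twice(const Eigen::MatrixXd& x) { return x.sum() + x.sum(); }

// sum_twice(a + b);  // a + b is evaluated into a MatrixXd before the call

// This does not work if the scalar type has to be deduced: template argument
// deduction fails when an expression is passed as const Matrix<T, -1, -1>&,
// so such functions need a different approach.
// template <typename T>
// T sum_twice(const Eigen::Matrix<T, -1, -1>& x);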

@andrjohns
Collaborator

If a function needs to evaluate an argument and the template params for that argument are known (not templated), we can just leave that argument as is - const Matrix<T, R, C>&

You could also use the Ref<> functionality to maintain generality of input types, which (AFAIK) will take expressions as inputs and evaluate them into temporaries as needed. Using an example from Eigen's doc:

// read-only const argument:
void foo2(const Ref<const VectorXf>& x);

MatrixXf A;
VectorXf a;

foo2(A.row());              // Compilation error because A.row() is a 1xN object while foo2 is expecting a Nx1 object
foo2(A.row().transpose());  // The row is copied into a contiguous temporary
foo2(2*a);                  // The expression is evaluated into a temporary
foo2(A.col().segment(2,4)); // No temporary

Although some of the copies might be too expensive to be worth it.

@bob-carpenter
Contributor

First, I thought Ref didn't work with slicing, segmenting, etc., but it seems to work OK. Second, I thought it was type, not size, that determined passing a VectorXd, but the following compiles and runs.

What are the rules for evaluating into temporaries? I wouldn't have thought any of those template expressions would cause an intermediate.

void foo2(const Eigen::Ref<const Eigen::VectorXd>& x) {
  std::cout << x << std::endl;
}

TEST(foo, bar) {
  Eigen::MatrixXd A(2, 2);
  A << 1, 2, 3, 4;
  foo2(A.row(0).transpose());

  Eigen::VectorXd a(2);
  a << 10, 100;
  foo2(2 * a);
  foo2(A.col(0).segment(0, 1));

  Eigen::MatrixXd B(2, 1);
  foo2(B);
}

@t4c1
Contributor Author

t4c1 commented Dec 4, 2019

This page says:

By default, a Ref can reference any dense vector expression of float having a contiguous memory layout. Likewise, a Ref can reference any column-major dense matrix expression of float whose column's elements are contiguously stored with the possibility to have a constant space in-between each column, i.e. the inner stride must be equal to 1, but the outer stride (or leading dimension) can be greater than the number of rows.

In the const case, if the input expression does not match the above requirement, then it is evaluated into a temporary before being passed to the function.

@bob-carpenter
Contributor

bob-carpenter commented Dec 5, 2019 via email

@t4c1
Contributor Author

t4c1 commented Feb 19, 2020

I am running out of ideas for how to group the remaining functions into PRs, so I will just go in alphabetical order.

@SteveBronder
Collaborator

I am running out of ideas for how to group the remaining functions into PRs, so I will just go in alphabetical order.

Can we make a list of the remaining ones as an issue? I think you put one up somewhere a while ago though I couldn't find it. Would be good so we could divvy up work if anyone else is interested. I started picking apart some of the stuff for read_corr/cov_L etc.

@t4c1
Contributor Author

t4c1 commented Feb 19, 2020

No, I never made such a list. If you have a list of all functions in Stan Math I am happy to check off the ones I have already completed.

@rok-cesnovar
Member

I think Steve is referring to the list of functions we made for the kernel generator.

@SteveBronder
Collaborator

Ah yeah nvm wrong one. Yeah I can look through and slam an issue up. We can use it as the issue for the remaining ones

@t4c1
Contributor Author

t4c1 commented Feb 20, 2020

I don't think we need another issue for this, as we have this one. Feel free to either post here or edit the top-level comment once you make the list.

@t4c1
Contributor Author

t4c1 commented Nov 10, 2020

We are very close to having all functions accept expressions, so I am thinking about the details of returning expressions. Whenever a function returns an expression referencing a local variable it needs to use holder (see #1775).

When a function returns an expression referencing a function argument, it gets a bit less clear whether a holder is necessary. In the majority of use cases (including all possible uses from the Stan language) everything should work just fine without a holder. However, in C++ a function can be used in a way that the returned expression is not evaluated before its rvalue argument(s) get destroyed. For example, this happens if the resulting expression is stored in an auto variable. Example (assuming holder is not used in softmax):

Eigen::MatrixXd a = ...
Eigen::MatrixXd b = ...
auto c = softmax(a+b); // after this statement the sum expression is destructed and the softmax expression is left with a dangling reference
Eigen::MatrixXd d = c+1; // now the softmax expression c is evaluated

So the question is, should we use holder in such cases (to make the code safer) or not (avoiding the overhead of creating a holder and simplifying the code)? @SteveBronder @bbbales2 @andrjohns

@bbbales2
Member

My first reaction is holders cause it's less to worry about then.

How does this code compare to something written with Eigen? Like:

#include <Eigen/Core>
#include <iostream>

int main() {
  Eigen::MatrixXd a = Eigen::MatrixXd::Random(2, 2);
  Eigen::MatrixXd b = Eigen::MatrixXd::Random(2, 2);
  auto c = (a + b).array().abs().sqrt();

  std::cout << c + 1.0 << std::endl;

  return 0;
}

Is Eigen having to allocate holder things along the way?

@t4c1
Contributor Author

t4c1 commented Nov 10, 2020

This example would work with Eigen. However, if you replaced the addition with a function that returns a plain matrix (not an expression), it would have the same issue.

@t4c1
Contributor Author

t4c1 commented Nov 10, 2020

Also, Eigen does not use anything like holder. In Eigen functions where such dangers could arise, they usually evaluate the result of the function instead of returning an expression.

@bbbales2
Member

Oh interesting. Adding an eval causes a memory problem here (had to turn on the sanitizer to get it). Weird: https://godbolt.org/z/M6sdGc

And what is the overhead of allocating a holder? Is it a malloc and a free, basically? Or is it something more?

@t4c1
Contributor Author

t4c1 commented Nov 10, 2020

Oh, I overlooked eval in your example. Yeah that is why it fails too.

About overhead: more or less just malloc and free.

@bbbales2
Member

Nah nah, I added the eval after you said the plain type would cause it to fail. Tricky!

@bbbales2
Member

should we use holder for such cases (to make code safer)

So using a holder in this case means our softmax would allocate a holder for its argument? Or where would the holder appear in the code?

@t4c1
Contributor Author

t4c1 commented Nov 10, 2020

Yes, softmax would allocate it (if the argument is an rvalue). You can look at the implementation of value_of to see how holder is used.
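
Roughly, the usage pattern looks like this (a sketch with a made-up function, not the actual value_of source):

#include <stan/math/prim.hpp>
#include <utility>

// make_holder calls the lambda on the forwarded argument and, if that
// argument is an rvalue, moves it into the returned holder object, so the
// wrapped expression never references a destroyed temporary.
template <typename T>
auto scale_by_two(T&& x) {
  return stan::math::make_holder(
      [](const auto& v) { return (2 * v.array()).matrix(); },
      std::forward<T>(x));
}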

@t4c1
Contributor Author

t4c1 commented Nov 11, 2020

I've been thinking about this some more. If we use holder for arguments, we need to do perfect forwarding on these arguments, as make_holder() needs to know whether it is dealing with lvalues or rvalues. But for these functions to actually know whether they have an lvalue or an rvalue, we need to forward arguments to them. So we need to use perfect forwarding in all functions that can call any function using holder. In other words, in almost all functions.
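
Something like this (hypothetical functions) is what I mean:

// The rvalue-ness of the argument has to be preserved through every call in
// the chain, otherwise make_holder cannot take ownership of a temporary.
template <typename T>
auto inner(T&& x) {
  return make_holder([](const auto& v) { return v.array().exp().matrix(); },
                     std::forward<T>(x));
}

template <typename T>
auto outer(T&& x) {
  // If this were inner(x), x would always be passed on as an lvalue and a
  // temporary argument would not be moved into the holder.
  return inner(std::forward<T>(x));
}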

On the other hand, avoiding auto is not that hard, so I am leaning towards not using holder for arguments.

@SteveBronder
Collaborator

I've been thinking about this some more. If we use holder for arguments, we need to do perfect forwarding on these arguments, as make_holder() needs to know whether it is dealing with lvalues or rvalues.

My cargo culting was not all wrong! 😆

If we really wanted to go this route I think I could come up with a regex that searches over function signatures and replaces const (TemplateName)& with (TemplateName)&&. Though then we'd have to start dealing with C++'s memory semantics, which adds a non-trivial layer of complexity. The overhead of that could probably be mitigated a bit by good documentation. Though then we would use make_holder(), adding one more level of complexity. We'd need like a Kurzgesagt-style YouTube infographic video to explain all of that.

Or alternatively, could we have a make_holder that takes in a type, like make_holder<T_A>(A), and for plain types does nothing but for expressions does the alloc/free? My thought here is that when we see something like an Addition<Mat, Mat> come into a function it's almost always going to be an rvalue. Though then there are Qs about Block<> types and dealing with things like Block<Addition<Mat1, Mat2>>.

It feels like kind of a bummer to ban auto. What happens with something like softmax(softmax(a + b))? Would a+b's expr still get erased here?

@t4c1
Contributor Author

t4c1 commented Nov 11, 2020

I am still trying to avoid littering all the code with calls to std::forward and make_holder, which, even ignoring the work to get there, would make the code less readable.

That second paragraph is more or less how Eigen expressions work. That works in most cases, but fails as soon as it gets an rvalue plain type.

It feels like kind of a bummer to ban auto

It would not get completely banned. Just banned unless you know what you are doing. :D

What happens with something like softmax(softmax(a + b))? Would a+b's expr still get erased here?

No. Temporaries are destructed after everything in a line of code gets executed. If an expression referencing temporaries is evaluated in the same line as those temporaries are created, everything works fine.
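
For example (still assuming softmax does not use holder for its argument):

Eigen::MatrixXd a = ...
Eigen::MatrixXd b = ...
// OK: the nested expression is evaluated in the same statement that creates
// the a + b temporary, so nothing is left dangling.
Eigen::MatrixXd d = softmax(softmax(a + b));
// Not OK: c outlives the a + b temporary it references.
auto c = softmax(a + b);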

@bbbales2
Member

It would not get completely banned.

Could we have our eval function handle this?

auto x = eval(...);

Like if you're using stan::math and you're assigning to an auto variable, eval. That's how I understand Eigen lol.

I guess that means the difference between us and Eigen then is that we can't have expressions as l-values and Eigen can?

@t4c1
Contributor Author

t4c1 commented Nov 11, 2020

Could we have our eval function handle this?

It can be used like this if you want.

I guess that means the difference between us and Eigen then is that we can't have expressions as l-values and Eigen can?

Nope. We are using Eigen expressions anyway.

@bbbales2
Member

Well Eigen can do:

auto c = (a + b).array().sqrt(); // somehow eigen's sum survives
auto d = c + 1.0;

And we cannot do:

auto c = softmax(a + b); // our sum gets destructed
auto d = c + 1.0;

@SteveBronder
Collaborator

@t4c1 can we start removing the code like

    return ret_type(ret);

In functions that use reverse_pass_callback()?

https://github.com/stan-dev/math/blob/develop/stan/math/rev/fun/elt_multiply.hpp#L42

If it's safe to pass around arena matrices now, I can also put up a lil PR in stanc3 that tests making stuff in the parameters block into arena matrices.

@SteveBronder
Collaborator

SteveBronder commented Nov 11, 2020

Actually wrt stanc3 here's a lil brutal version that uses arena_t as the matrix type when the scalar type of the matrix is a var

stan-dev/stanc3@master...SteveBronder:feature/arena-var-matrix

We could take off the ret_type() yada from the stuff that uses reverse_pass_callback rn and get an eyeball on how much we save there. Though imo the bigger thing is going to be doing some form of arena_t<> with the data.

@bbbales2
Member

can we start removing the code like

return ret_type(ret);

I'm still a bit nervous about this unless something has happened and we're talking about var<mat> and not mat<var>.

Check this code out:

Eigen::Matrix<stan::math::var, -1, -1> myfunc1() {
  Eigen::MatrixXd res_val(1, 1);
  res_val << 1.0;

  stan::arena_t<Eigen::Matrix<stan::math::var, -1, -1>> res = res_val;
  stan::math::reverse_pass_callback([res]() mutable {
      std::cout << "myfunc1" << std::endl;
      std::cout << res.val() << std::endl;
    });

  return res;
}

auto myfunc2() {
  Eigen::MatrixXd res_val(1, 1);
  res_val << 1.0;

  stan::arena_t<Eigen::Matrix<stan::math::var, -1, -1>> res = res_val;
  stan::math::reverse_pass_callback([res]() mutable {
      std::cout << "myfunc2" << std::endl;
      std::cout << res.val() << std::endl;
    });

  return res;
}

TEST(AgradRevMatrix, test_arena_output) {
  auto a = myfunc1();
  a(0) = 2.0;
  auto b = myfunc2();
  b(0) = 3.0;

  stan::math::grad();
}

In both cases the function sets the result value to 1, but this is the output:

myfunc2
3
myfunc1
1

The arena_t<Matrix<var, -1, -1>> is playing the same role as the vari** in the old varis, and if we don't make a copy of that on the output we're in danger of overwriting things.

@SteveBronder
Collaborator

Oof, yeah, now that I think about this, any assign() statement could cause this. The only thing I can think of to make it work would be to do the same tricky stuff I'm working on right now for var<mat> assign, where we save the previous slice and re-apply it as a callback in the reverse pass.

@t4c1
Contributor Author

t4c1 commented Nov 12, 2020

Well Eigen can do: And we cannot do:

The difference is that softmax needs to create a local matrix, which is then referenced by the returned expression, while the sqrt in your example only needs to reference the given expression.

@t4c1 can we start removing the code like

Yeah, Ben is right. We can do it for var<mat>. For mat<var> we need the copy. But mat<var> will eventually be gone anyway.

@t4c1
Contributor Author

t4c1 commented Nov 12, 2020

Also I am assuming you are ok with not using holder for arguments.

@bbbales2
Member

The difference is that softmax needs to create a local matrix

Yeah and is that not true for all our functions?

I think what we'd write to document this is:

If a Stan expression is going to be used as an lvalue, it shouldn't depend on any rvalue temporaries. Practically this means that an expression saved as an lvalue should only use one function call, because the intermediates allocated in a sequence of function calls will get destructed.

Now that I write that it doesn't sound that clear, but that's what I'm thinking.

I feel like Eigen must have something special going on that we don't, because its expressions merge into bigger and bigger expressions that don't necessarily go away. The reason I'm making the direct comparison to Eigen is it would be cool if we could say "Stan expressions have the same limitations as Eigen expressions", which I don't think is the case because our functions don't build giant expression types in the same way.

But mat will eventually be gone anyway.

The var<mat> design doc needs an update. Assigning to a sub-matrix needs to be talked about, and this seems like a big change of plans too.

@bbbales2
Member

Also I am assuming you are ok with not using holder for arguments.

Yeah, I'm cool with this. It's the same as with any oddity. It's fine for the language since we can hide it, and it needs to be documented for Math so someone doing C++ doesn't get confused and mad.

@t4c1
Contributor Author

t4c1 commented Nov 12, 2020

Yeah and is that not true for all our functions?

Not for all, but for many it is true. What I am trying to say is that it is not that we are doing things in a different way in our functions - we are doing different things.

it would be cool if we could say "Stan expressions have the same limitations as Eigen expressions"

Actually we are better - we have fewer limitations! In cases such as softmax that need internal matrices, we will use holder, so that will not be an issue anymore. Eigen's way of doing it would be to simply give up on expressions and eval the result.

I guess softmax was not the best example to start this discussion with, as it will need holder anyway.

@bbbales2
Member

In cases such as softmax that need internal matrices, we will use holder, so that will not be an issue anymore
I guess softmax was not the best example to start this discussion with, as it will need holder anyway.

Oh okay so we are going to add some holders, just not holders everywhere. Yeah @ me when you put up this pull so I can stare at it a bit.

@t4c1
Contributor Author

t4c1 commented Nov 12, 2020

It will be up in 5 min :)
