
Refactor unconstraining to use deserializer interface #872

Merged 54 commits into stan-dev:master on Aug 13, 2021

Conversation

rybern
Collaborator

@rybern rybern commented Apr 5, 2021

This PR moves the codegen for unconstraining variables to use the new deserializer backend interface (see e.g. stan-dev/stan#3018).

This follows up on PR #856.

The approach is to create a new internal function, ReadUnconstrainData, that generates a function call from the new interface.

@rybern
Collaborator Author

rybern commented Apr 5, 2021

@SteveBronder This is a WIP, but can you tell how close the codegen is? Should transform_init_impl be getting passed a deserializer object now?

@SteveBronder
Contributor

It looks pretty darn close! I can take a deeper dive into it today and tomorrow.

@SteveBronder
Contributor

> Should transform_init_impl be getting passed a deserializer object now?

It should just get handed a vector, and the deserializer can be made in the impl. We could make the deserializer and pass it to the impl, but I always kind of worry about passing things into functions that hold references. I can write up the bit for how things should be passed to the impl.

@SteveBronder
Contributor

So I think what we need to do is take the hardcoded function here in transform_inits

    inline void transform_inits(const stan::io::var_context& context,
                                std::vector<int>& params_i,
                                std::vector<double>& vars,
                                std::ostream* pstream = nullptr) const final {
      transform_inits_impl(context, params_i, vars, pstream);
    }

and serialize all the data in the context into one flat std::vector. I could do some funny stuff in this PR, but it might actually be easier if I add a flatten() method to the context (sketched at the end of this comment) that returns an std::vector<double> of the data, so then we can just change this to

    inline void transform_inits(const stan::io::var_context& context,
                                std::vector<int>& params_i,
                                std::vector<double>& vars,
                                std::ostream* pstream = nullptr) const final {
      std::vector<double> flat_context = context.flatten();
      transform_inits_impl(flat_context, params_i, vars, pstream);
    }

I can take a shot at just doing it in the compiler here, and if it doesn't look too bad we'll do that; otherwise I'll add the flatten() method.

and we can also just make a

    inline void transform_inits(std::vector<double>& params_r,
                                std::vector<int>& params_i,
                                std::vector<double>& vars,
                                std::ostream* pstream = nullptr) const final {
      transform_inits_impl(params_r, params_i, vars, pstream);
    }

so that future API entry points don't have to pay for the flattening.
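A minimal sketch of what that flatten() could look like as a var_context member, under the assumption that it just concatenates vals_r() for every real-valued name (whether names_r() hands the names back in the order the impl expects is exactly the detail we'd need to pin down):

    // Hypothetical sketch only -- var_context does not have this today.
    std::vector<double> flatten() const {
      std::vector<std::string> names;
      names_r(names);  // real-valued variable names in this context
      std::vector<double> flat;
      for (const auto& name : names) {
        const std::vector<double> vals = vals_r(name);
        flat.insert(flat.end(), vals.begin(), vals.end());
      }
      return flat;
    }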

@rybern
Collaborator Author

rybern commented Apr 6, 2021

That sounds good Steve, let me know if I can help. I'll do some cleanup in the meantime.

@SteveBronder
Contributor

Alright, so I'm using this as the example program and got everything compiling:

data {
  int<lower=0> N;
  int M;
}
parameters {
  matrix<lower=0>[N, N] x1;
  matrix<lower=0, upper = 1>[N, N] x2;
  matrix[N, N] x3;
  matrix<lower=0, upper = 1>[N, N] x4[M];
}

Here's the C++ right now

https://gist.github.com/SteveBronder/ad4004530f9f9126f23ec76269c47f12

And we can use the serializer to do this

https://gist.github.com/SteveBronder/d02822989daf434a54cba4e3f3aa51e7

Does that work for you? I think we can do something similar in write_array as well if we want to clean that up

@SteveBronder
Contributor

Also notice the signature change: out_vars__ is now the output we write to. We make a local vars__ vector filled with zeros, fill it up using the serializer, then just assign vars__ to out_vars__ at the end.
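Roughly, the pattern described here is the following (a hedged sketch with assumed names, not copied from the gist):

      // Local buffer, fully sized to the number of unconstrained parameters and
      // zero-initialized, that the serializer writes into.
      std::vector<double> vars__(num_params_r__, 0.0);
      stan::io::serializer<double> out__(vars__);
      // ... one out__.write(...) / out__.write_free_*(...) call per variable ...
      // Hand the finished buffer back through the new out-parameter.
      out_vars__ = vars__;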

@SteveBronder
Contributor

Also a quick thing to take note of: we need to make sure the vector that goes into the serializer is fully initialized, i.e. in our case it has the correct size and its values are all initialized to zero.

@rybern
Collaborator Author

rybern commented Apr 6, 2021

This all makes sense. Nice looking code! I'll work on that output

@rybern
Collaborator Author

rybern commented Apr 6, 2021

Looks like it'll be easy to do this for write_array as well, if you want to handle the signatures again

@SteveBronder
Contributor

SteveBronder commented Apr 6, 2021

Nice! Yeah, if it's easy peasy then let's do it. The only thing we need to change with the serializer is to do

    std::decay_t<VecVar> vars__(params_r__.size(), 0);

instead of

    std::decay_t<VecVar> vars__;
    vars__.reserve(params_r__.size());

I can write the scheme for write_array as well. I'll put it on a separate branch so you can work on this one and merge it in when it's done.

For write_array let's use

data {
  int<lower=0> N;
  int M;
}
parameters {
  matrix<lower=0>[N, N] x1;
  matrix<lower=0, upper = 1>[N, N] x2;
  matrix[N, N] x3;
  matrix<lower=0, upper = 1>[N, N] x4[M];
}
transformed parameters {
  matrix<lower=0>[N, N] x1_tp = multiply(x1, x1);
  matrix<lower=0, upper = 1>[N, N] x2_tp = multiply(x2, x2);
  matrix[N, N] x3_tp = multiply(x3, x2);
  matrix<lower=0, upper = 1>[N, N] x4_tp[M];
}

write_array works on both parameters and transformed parameters. For parameters we'll do the same serialize scheme, but for the transformed parameters we need to do the ops on them first and then make the call to the serializer.
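For the x1/x1_tp pair above, I'd expect the body of the generated write_array to be shaped roughly like this (a sketch only; the read_constrain_*/write calls are assumed to look like the other examples in this thread, and jacobian__ is the usual template flag):

      // Read and constrain the parameter from the unconstrained input.
      auto x1 = in__.template read_constrain_lb<
          Eigen::Matrix<local_scalar_t__, -1, -1>, jacobian__>(0, lp__, N, N);
      // Do the transformed-parameter op on the constrained value...
      auto x1_tp = multiply(x1, x1);
      // ...then serialize parameters first and transformed parameters after,
      // in declaration order.
      out__.write(x1);
      out__.write(x1_tp);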

@rybern
Collaborator Author

rybern commented Apr 6, 2021

Okay, changed the decay_t. You're welcome to use this branch if you want; you can just comment this paragraph here and uncomment the paragraph above it to switch over to the serializer version in write_array.

@SteveBronder
Contributor

SteveBronder commented Apr 6, 2021

Got everything looking nice, but I was wrong about being able to inline deserialize() calls into serialize calls. I goofed and totally forgot you can have parameters with constraints depending on other parameters, so this fails:

parameters {
  real p_real;
  real<lower=p_real> p_upper;
}

But I think once that's in then we are good!

@rybern
Collaborator Author

rybern commented Apr 7, 2021

Good call, I put back the assignments and did a little cleanup. We could probably stand to do some more cleanup once we're sure things are settled.

Side note, maybe I'm just tired - how does the following work?

parameters {
  real<lower=5> p_real;
  real<lower=p_real> p_upper;
}

Won't we be passing the unconstrained version of p_real as the constraint argument of p_upper, when we should be passing the constrained version?

@SteveBronder
Contributor

Ah shoot, okay looking at this I think what we need is something like the below

https://gist.github.com/SteveBronder/a575c5e1089c8a40ea9a3c4df4480ca9#file-ex_new-hpp

In my brain I thought free'ing variables would be carried forward to other parameters' constraints, but that doesn't make sense lol. That gist has how we currently do it, how this one does it, and what we should be doing. I think I can just add the free methods to the serializer, which I should be able to do today, and then this should work just fine. Essentially it's just doing the read() on the input variable, then calling the free equivalent on the serializer:

        p_real = in__.read<local_scalar_t__>();
        out__.write_free_lb(5, p_real);

Then the constraints are correct for parameters depending on other parameters.
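For the p_real/p_upper example above, the full pair of calls would then look something like this (a sketch; the point is that the write_free_lb for p_upper gets the constrained p_real read straight from the inits):

        p_real = in__.read<local_scalar_t__>();
        out__.write_free_lb(5, p_real);
        // p_upper's lower bound is the constrained p_real we just read,
        // so freeing it here uses the correct constraint value.
        p_upper = in__.read<local_scalar_t__>();
        out__.write_free_lb(p_real, p_upper);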

@seantalts can you check out the above link to make sure that makes sense?

@seantalts
Member

Sure thing - the ex_curr and ex_new examples look like they're doing the right thing at a high level to me (I didn't look into implementations).

@rybern
Collaborator Author

rybern commented Apr 12, 2021

@SteveBronder Output should match your new example now, let me know what you think of the tests

@SteveBronder
Contributor

@rybern the fix was actually very simple: we just forgot to do the sizing for the write_array() that accepts vectors and the write_array() that accepts Eigen matrices.

@SteveBronder
Contributor

Ack, there must be an off-by-one error in one of these. I'll have a look.

@SteveBronder
Contributor

SteveBronder commented May 17, 2021

Okay, I've spotted a difference, but I'm not sure who's right here. For the magnesium model, and any model that uses arrays of arrays, we are doing a row-major-order access pattern in the current transform_inits but a column-major-order access pattern in this PR.

here in the new C++ we are doing

      assign(theta,
        in__.read<std::vector<std::vector<local_scalar_t__>>>(N_priors,
          N_studies), "assigning variable theta");

Which is going to fill things in like

theta[1, 1]
theta[1, 2]
theta[1, 3]
...
theta[2, 1]
theta[2, 2]
theta[2, 3]

But in the current code we are doing

      {
        std::vector<local_scalar_t__> theta_flat__;
        current_statement__ = 2;
        theta_flat__ = context__.vals_r("theta");
        current_statement__ = 2;
        pos__ = 1;
        current_statement__ = 2;
        for (int sym1__ = 1; sym1__ <= N_studies; ++sym1__) {
          current_statement__ = 2;
          for (int sym2__ = 1; sym2__ <= N_priors; ++sym2__) {
            current_statement__ = 2;
            assign(theta, theta_flat__[(pos__ - 1)],
              "assigning variable theta", index_uni(sym2__),
                                            index_uni(sym1__));
            current_statement__ = 2;
            pos__ = (pos__ + 1);
          }
        }
      }

Which, if I'm reading the sym2__ and sym1__ correctly, is actually going to go

theta[1, 1]
theta[2, 1]
theta[3, 1]
...
theta[1, 2]
theta[2, 2]
theta[3, 2]

EDIT: I'm not sure if column-major or row-major order is the right term here. But in the old code we are writing all of the inner arrays' first elements, then each array's second element, etc., whereas in the new code we are writing all of the first array, then all of the second array, etc.

They are both assigning from a flattened vector, but the current one is assigning those values by rows and the new one by columns. Does the init file come in column- or row-major order? It's not clear to me whether this is a bug or intended.
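To make the difference concrete, here's a made-up example (not from the magnesium data): say N_priors = 2, N_studies = 3, and the flattened inits are [1, 2, 3, 4, 5, 6]. Then

    current (loop) code, first index ticking fastest:  theta = {{1, 3, 5}, {2, 4, 6}}
    new (in__.read) code, last index ticking fastest:  theta = {{1, 2, 3}, {4, 5, 6}}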

The magnesium init file is not terribly helpful to sort this out as all the values are the same

https://github.com/stan-dev/example-models/blob/6f0ed2269f0b8a39022ac1202e7622fd644904fe/bugs_examples/vol1/magnesium/magnesium.init.R

What's also confusing is that when we go to write to the output in the current code, we then switch back to writing things in column-major order (i.e. going from N_studies as the outer loop and N_priors as the inner loop to N_priors as the outer loop and N_studies as the inner loop):

      for (int sym1__ = 1; sym1__ <= N_priors; ++sym1__) {
        for (int sym2__ = 1; sym2__ <= N_studies; ++sym2__) {
          vars__.emplace_back(theta[(sym1__ - 1)][(sym2__ - 1)]);
        }
      }

Tagging @bob-carpenter, was this the behavior in the old compiler as well?

@SteveBronder
Contributor

For the record, I can fix this just fine in the new code to do what the old code does, but I wanted to check that this is intended behavior.

@SteveBronder
Contributor

Actually, reading this again, I think I totally misread things. I think the current behavior is correct, since everything comes in column-major order and it wants to write the columns first.

@SteveBronder
Contributor

Alright, I think I know how to handle this in a not-too-bad way, but I need to check whether matrices in the init file are handled the same way as well.

@SteveBronder
Contributor

Ack, actually I tried to fix this and my brain hurt real bad. @rybern, so the issue is that everything comes in column-major order, even arrays of arrays. The deserializer assumes the data it receives for arrays of arrays is coming in as if we had serialized an std::vector<std::vector<local_scalar>>, i.e. if this were a matrix we'd assume it came in row-major order where each row is the inner vector. So in transform_inits, the inits for, say, a std::vector<std::vector<std::vector<local_scalar>>> of size {2, 3, 2} come in like

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

and we want to assign from the serialized array of arrays to the array of arrays like (using zero indexing)

0 -> (0, 0, 0)
1 -> (1, 0, 0)
2 -> (0, 1, 0)
3 -> (1, 1, 0)
4 -> (0, 2, 0)
5 -> (1, 2, 0)
6 -> (0, 0, 1)
7 -> (1, 0, 1)
8 -> (0, 1, 1)
9 -> (1, 1, 1)
10 -> (0, 2, 1)
11 -> (1, 2, 1)

i.e. ticking through the first index, then the second index, then the third index. That gets us an output of arrays that looks like the below, where the {} represent an array:

{{{1, 7},
  {2, 8},
  {3, 9}}, 
 {{4, 10}, 
  {5, 11},
  {6, 12}}}

If we still have the code for writing these arrays as loops, then for an array like

  real test_var_arr3[N_priors, N_studies, N_dim1];

Would it be fine with you to still generate the loop like

      {
        for (int sym1__ = 1; sym1__ <= N_dim1; ++sym1__) {
          for (int sym2__ = 1; sym2__ <= N_studies; ++sym2__) {
            for (int sym3__ = 1; sym3__ <= N_priors; ++sym3__) {
              assign(test_var_arr3, in__.read<local_scalar_t>(),
                     "assigning variable test_var_arr3", index_uni(sym3__),
                     index_uni(sym2__), index_uni(sym1__));
            }
          }
        }
      }

We only need to loop over the array portion. So for arrays of matrices like

  matrix<lower=0>[N_priors, N_studies] test_mat_arr[N_dim1, N_dim2];

since matrices are already in column-major order, we would do

      {
        for (int sym1__ = 1; sym1__ <= N_dim2; ++sym1__) {
          current_statement__ = 4;
          for (int sym2__ = 1; sym2__ <= N_dim1; ++sym2__) {
            current_statement__ = 4;
            assign(test_mat_arr, in__.read<Eigen::Matrix<local_scalar_t__, -1, -1>>(N_priors, N_studies),
                   "assigning variable test_mat_arr", index_uni(sym2__), index_uni(sym1__));
            current_statement__ = 4;
            pos__ = (pos__ + 1);
          }
        }
      }

Does that work for you?

@rybern
Collaborator Author

rybern commented May 18, 2021

@SteveBronder yep I think that should be fine. We have to write loops anyway for tuples

@rybern
Collaborator Author

rybern commented May 20, 2021

@SteveBronder Lmk if that looks right

@SteveBronder
Contributor

Ack, so looking at seeds.stan (hpp files) we also need to loop over the vectors and matrices inside arrays in reverse order as well. I misread what the original code was doing. So, for b in transform_inits, the original actually iterates over I and then K

      {
        std::vector<local_scalar_t__> b_flat__;
        current_statement__ = 6;
        b_flat__ = context__.vals_r("b");
        current_statement__ = 6;
        pos__ = 1;
        current_statement__ = 6;
        for (int sym1__ = 1; sym1__ <= K; ++sym1__) {
          current_statement__ = 6;
          for (int sym2__ = 1; sym2__ <= I; ++sym2__) {
            current_statement__ = 6;
            assign(b, b_flat__[(pos__ - 1)],
              "assigning variable b", index_uni(sym2__), index_uni(sym1__));
            current_statement__ = 6;
            pos__ = (pos__ + 1);
          }
        }
      }

instead of doing I then K for these cases

      for (int sym1__ = 1; sym1__ <= I; ++sym1__) {
        assign(b, in__.read<Eigen::Matrix<local_scalar_t__, -1, 1>>(K),
          "assigning variable b", index_uni(sym1__));
      }

And I think we need to do the same thing in write_array. Looking at SIR (hpp files) it looks like it uses the iterators going from outer to inner here as well.

One alternative: we could go back to just doing out__.write(y); for everything, but add a template parameter to write, like RowMajor | ColMajor, and call it like out__.write<RowMajor>(y);, which just tells it to write in whatever order we want. That would be pretty easy for me to post up (though it has to wait till the 2.27 RC is finished to merge it).
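Purely as a sketch of that alternative (this overload doesn't exist in the serializer; the order enum and the internal buf_/pos_ members are made up), it might look like:

      enum class write_order { row_major, col_major };

      // Hypothetical serializer member: write an Eigen matrix in a chosen order.
      template <write_order Order = write_order::col_major, typename EigMat>
      inline void write(const EigMat& x) {
        if (Order == write_order::col_major) {
          for (Eigen::Index j = 0; j < x.cols(); ++j)
            for (Eigen::Index i = 0; i < x.rows(); ++i)
              buf_[pos_++] = x.coeff(i, j);
        } else {
          for (Eigen::Index i = 0; i < x.rows(); ++i)
            for (Eigen::Index j = 0; j < x.cols(); ++j)
              buf_[pos_++] = x.coeff(i, j);
        }
      }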

@SteveBronder
Contributor

I think it kind of makes sense to change the serializer because it doesn't seem to do the pattern we want

@SteveBronder
Contributor

@rybern I finally got this all sorted out! Sadly we can't really use the serializer like we do elsewhere, because the data for transform_inits comes in with a weird flattened style for arrays. Essentially it flattens multi-dimensional matrices so that the outermost index is the one that changes the most, which forces us to do the reverse loop style.

But I think this is good! I'll give the OCaml one more read-through and then I think we can merge!

https://github.com/stan-dev/stan/blob/develop/src/stan/io/dump.hpp#L63

@SteveBronder SteveBronder merged commit eccafc0 into stan-dev:master Aug 13, 2021