Refactor unconstraining to use deserializer interface #872
@SteveBronder This is a WIP, but can you tell how close the codegen is? Should transform_init_impl be getting passed a |
It looks pretty darn close! I can take a deeper dive into it today and tomorrow |
It should just get handed a vector and the deserializer can be made in the |
So I think what we need to do is take the hardcoded function here in transform_inits
inline void transform_inits(const stan::io::var_context& context,
                            std::vector<int>& params_i,
                            std::vector<double>& vars,
                            std::ostream* pstream = nullptr) const final {
  transform_inits_impl(context, params_i, vars, pstream);
}
and serialize all the data in the context into one flat std::vector. I could do some funny stuff in this PR, but it might actually be easier if I add a
inline void transform_inits(const stan::io::var_context& context,
                            std::vector<int>& params_i,
                            std::vector<double>& vars,
                            std::ostream* pstream = nullptr) const final {
  std::vector<double> flat_context = context.flatten();
  transform_inits_impl(flat_context, params_i, vars, pstream);
}
I can take a shot at just doing it in the compiler here, and if it doesn't look too bad we will do that. Otherwise I'll add the flatten() method, and we can also just make a
inline void transform_inits(std::vector<double>& params_r,
                            std::vector<int>& params_i,
                            std::vector<double>& vars,
                            std::ostream* pstream = nullptr) const final {
  transform_inits_impl(params_r, params_i, vars, pstream);
}
So future API points don't have to pay for the flattening |
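A minimal sketch of the flattening idea in the comment above, under stated assumptions. ToyContext and this flatten() are hypothetical stand-ins written for illustration; they are not the stan::io::var_context API, and the real ordering of the flat vector would be decided in Stan itself.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a var_context: maps a variable name to its
// real values, already flattened per variable.
struct ToyContext {
  std::map<std::string, std::vector<double>> reals;

  // Loosely mirrors the idea of vals_r(name); this toy version asserts
  // that the name exists instead of reporting an error.
  const std::vector<double>& vals_r(const std::string& name) const {
    auto it = reals.find(name);
    assert(it != reals.end());
    return it->second;
  }
};

// Hypothetical flatten(): concatenates the values of each parameter, in the
// order the caller lists them, into one flat vector.
inline std::vector<double> flatten(const ToyContext& context,
                                   const std::vector<std::string>& names) {
  std::vector<double> flat;
  for (const auto& name : names) {
    const auto& vals = context.vals_r(name);
    flat.insert(flat.end(), vals.begin(), vals.end());
  }
  return flat;
}
```

With something like this, the context-based overload pays for the copy once and the flat-vector overload can skip it entirely.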
That sounds good Steve, let me know if I can help. I'll do some cleanup in the meantime. |
Alright so I'm using this as the example program and got everything compiling data {
int<lower=0> N;
int M;
}
parameters {
matrix<lower=0>[N, N] x1;
matrix<lower=0, upper = 1>[N, N] x2;
matrix[N, N] x3;
matrix<lower=0, upper = 1>[N, N] x4[M];
} Here's the C++ right now https://gist.github.com/SteveBronder/ad4004530f9f9126f23ec76269c47f12 And we can use the serializer to do this https://gist.github.com/SteveBronder/d02822989daf434a54cba4e3f3aa51e7 Does that work for you? I think we can do something similar in |
Also notice the signature change to have a name |
Also a quick thing to take note of: we need to make sure the vector that goes into the serializer is fully initialized, i.e. for our case it has the correct size and its values are all initialized to zero |
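To illustrate why the sizing matters, here is a small sketch. ToyWriter and make_output are hypothetical names for this example, not the real stan::io::serializer; the assumption is only that the writer stores values by position into a vector it does not grow.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical positional writer: it writes into storage it does not own
// and never resizes it.
class ToyWriter {
 public:
  explicit ToyWriter(std::vector<double>& out) : out_(out), pos_(0) {}

  void write(double x) {
    // Only valid if the vector was sized up front.
    assert(pos_ < out_.size());
    out_[pos_++] = x;
  }

  std::size_t written() const { return pos_; }

 private:
  std::vector<double>& out_;
  std::size_t pos_;
};

// Builds the output buffer with the correct size and every element zero.
inline std::vector<double> make_output(std::size_t n) {
  return std::vector<double>(n, 0.0);
}
```

A vector that was only given reserve(n) still has size() == 0, so indexed writes into it would be out of bounds even though the capacity is there.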
This all makes sense. Nice looking code! I'll work on that output |
Looks like it'll be easy to do this for write_array as well, if you want to handle the signatures again |
…rite in write_array
Nice! Yeah, if it's easy peasy then let's do it. The only thing we need to change with the serializer is to do std::decay_t<VecVar> vars__(params_r__.size(), 0); instead of std::decay_t<VecVar> vars__;
vars__.reserve(params_r__.size()); I can write the scheme for write_array as well. I'll put it on a separate branch so you can work on this one and merge it in when it's done. For write_array let's use data {
int<lower=0> N;
int M;
}
parameters {
matrix<lower=0>[N, N] x1;
matrix<lower=0, upper = 1>[N, N] x2;
matrix[N, N] x3;
matrix<lower=0, upper = 1>[N, N] x4[M];
}
transformed parameters {
matrix<lower=0>[N, N] x1_tp = multiply(x1, x1);
matrix<lower=0, upper = 1>[N, N] x2_tp = multiply(x2, x2);
matrix[N, N] x3_tp = multiply(x3, x2);
matrix<lower=0, upper = 1>[N, N] x4_tp[M];
} Since write_array works on both parameters and transformed parameters. For parameters we'll do the same serialize scheme, but for transformed parameters we need to do the ops on the transformed parameters and then make the call to serial |
Okay, changed decay_t. You're welcome to use this branch if you want; you can just comment this paragraph here and uncomment the paragraph above it to switch over to the serializer version in write_arrays |
Got everything looking nice, but I was wrong about being able to inline parameters {
real p_real;
real<lower=p_real> p_upper;
} But I think that's in then we are good! |
Good call, I put back the assignments and did a little cleanup. Could probably stand to do some more cleanup when we're sure things are settled. Side note, maybe I'm just tired - how does the following work?
Won't we be passing the unconstrained version of p_real as the constraint argument of p_upper, but we should be passing the constrained version? |
Ah shoot, okay looking at this I think what we need is something like the below https://gist.github.com/SteveBronder/a575c5e1089c8a40ea9a3c4df4480ca9#file-ex_new-hpp In my brain I thought free'ing variables would be carried forward to other parameters constraints but that doesn't make sense lol. That gist has how we currently do it, how this one does it, and what we should be doing. I think I can just add the free methods to the serializer which I should be able to do today and then this should work just fine. Essentially it's just doing the
Then the constraints are correct for parameters depending on other parameters. @seantalts can you check out the above link to make sure that makes sense? |
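The dependency between constraints can be sketched with a slightly different, hypothetical pair of parameters, real<lower=0> a; real<lower=a> b;, where the constrained and unconstrained values of the first parameter actually differ. The toy_lb_* helpers below are written for this example using the usual log transform for lower bounds; they are not calls into Stan's math library.

```cpp
#include <cassert>
#include <cmath>

// Toy lower-bound transforms: constrain maps an unconstrained x to a value
// above lb, and free is its inverse.
inline double toy_lb_constrain(double x, double lb) { return std::exp(x) + lb; }

inline double toy_lb_free(double y, double lb) {
  assert(y > lb);
  return std::log(y - lb);
}

struct Pair {
  double a;
  double b;
};

// Unconstrains (a, b). The bound used for b is the CONSTRAINED a,
// not the freed value log(a).
inline Pair unconstrain(const Pair& constrained) {
  Pair out;
  out.a = toy_lb_free(constrained.a, 0.0);
  out.b = toy_lb_free(constrained.b, constrained.a);
  return out;
}

// Inverse direction: a is constrained first so that its constrained value
// is available as the bound for b.
inline Pair constrain(const Pair& unconstrained) {
  Pair out;
  out.a = toy_lb_constrain(unconstrained.a, 0.0);
  out.b = toy_lb_constrain(unconstrained.b, out.a);
  return out;
}
```

Passing the freed a as the bound for b would break the round trip, which is the concern raised above.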
Sure thing - the |
@SteveBronder Output should match your new example now, let me know what you think of the tests |
@rybern fix was actually v simple we just forgot to do the size thing for the |
Ack there must be an off by one error in one of these I'll have a look |
Okay I've spotted a difference, but I'm not sure who's right here. For the here in the new C++ we are doing assign(theta,
in__.read<std::vector<std::vector<local_scalar_t__>>>(N_priors,
N_studies), "assigning variable theta"); Which is going to fill things in like
But in the current code we are doing {
std::vector<local_scalar_t__> theta_flat__;
current_statement__ = 2;
theta_flat__ = context__.vals_r("theta");
current_statement__ = 2;
pos__ = 1;
current_statement__ = 2;
for (int sym1__ = 1; sym1__ <= N_studies; ++sym1__) {
current_statement__ = 2;
for (int sym2__ = 1; sym2__ <= N_priors; ++sym2__) {
current_statement__ = 2;
assign(theta, theta_flat__[(pos__ - 1)],
"assigning variable theta", index_uni(sym2__),
index_uni(sym1__));
current_statement__ = 2;
pos__ = (pos__ + 1);
}
}
} Which if I'm reading the
EDIT: I'm not sure if column major or row major order is the right term here. But in the old code we are writing all of the inner arrays' first elements, then writing each array's second elements, etc., whereas in the new code we are writing all of the first array, all of the second array, etc. They are both assigning from a flattened vector, but the current one is assigning those values by rows and the new one by columns. Does the init file come in column or row major order? It's not clear to me whether this is a bug or intended. The What's also confusing is that when we go to write to the output in the current code we then switch back to writing things in column major order (aka going from for (int sym1__ = 1; sym1__ <= N_priors; ++sym1__) {
for (int sym2__ = 1; sym2__ <= N_studies; ++sym2__) {
vars__.emplace_back(theta[(sym1__ - 1)][(sym2__ - 1)]);
}
} Tagging @bob-carpenter, was this the behavior in the old compiler as well? |
ftr I can fix this just fine in the new code to do what the old code does, but wanted to check that this is intended behavior |
Actually reading this I think I totally misread things, I think the current behavior is correct since everything comes in column major order and it wants to write the columns first |
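The two fill orders being compared can be sketched on a plain nested std::vector. The function names are made up for this example, and the mapping to the generated code is my reading of the loops quoted above rather than a statement about the deserializer's actual behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Array2D = std::vector<std::vector<double>>;

// Fills theta[n_priors][n_studies] so that the FIRST index changes fastest,
// matching the nested loops in the current generated code.
inline Array2D fill_first_index_fastest(const std::vector<double>& flat,
                                        std::size_t n_priors,
                                        std::size_t n_studies) {
  assert(flat.size() == n_priors * n_studies);
  Array2D theta(n_priors, std::vector<double>(n_studies, 0.0));
  std::size_t pos = 0;
  for (std::size_t j = 0; j < n_studies; ++j) {
    for (std::size_t i = 0; i < n_priors; ++i) {
      theta[i][j] = flat[pos++];
    }
  }
  return theta;
}

// Fills one inner array at a time, so the LAST index changes fastest.
inline Array2D fill_last_index_fastest(const std::vector<double>& flat,
                                       std::size_t n_priors,
                                       std::size_t n_studies) {
  assert(flat.size() == n_priors * n_studies);
  Array2D theta(n_priors, std::vector<double>(n_studies, 0.0));
  std::size_t pos = 0;
  for (std::size_t i = 0; i < n_priors; ++i) {
    for (std::size_t j = 0; j < n_studies; ++j) {
      theta[i][j] = flat[pos++];
    }
  }
  return theta;
}
```

For the flat input 1..6 with n_priors = 2 and n_studies = 3, the first version gives theta[0] = {1, 3, 5} and theta[1] = {2, 4, 6}, while the second gives theta[0] = {1, 2, 3} and theta[1] = {4, 5, 6}.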
Alright I think I know how to handle this in not too bad of a way but need to check if matrices are handled in the init file in the same way as well |
Ack actually I tried to fix this and my brain hurt real bad. @rybern so the issue is that everything comes in as column major order, even arrays of arrays. The deserializer assumes the data it receives is coming in for arrays of arrays as if we serialized an [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12] and we want to assign from the serialized array of arrays to the array of arrays like (as zero indexing)
aka ticking by the first index, then second index, then third index. That lets us to an output of arrays that looks like the below where the
If we still have the code for writing these arrays as loops, then for an array like real test_var_arr3[N_priors, N_studies, N_dim1]; Would it be fine with you to generate the loop still like {
for (int sym1__ = 1; sym1__ <= N_dim1; ++sym1__) {
for (int sym2__ = 1; sym2__ <= N_studies; ++sym2__) {
for (int sym3__ = 1; sym3__ <= N_priors; ++sym3__) {
assign(test_var_arr3, in__.read<local_scalar_t__>(),
"assigning variable test_var_arr3", index_uni(sym3__),
index_uni(sym2__), index_uni(sym1__));
}
}
}
} We only need to loop over the array portion. So like for arrays of matrices like matrix<lower=0>[N_priors, N_studies] test_mat_arr[N_dim1, N_dim2]; since matrices are column major order already then we would do {
for (int sym1__ = 1; sym1__ <= N_dim2; ++sym1__) {
current_statement__ = 4;
for (int sym2__ = 1; sym2__ <= N_dim1; ++sym2__) {
current_statement__ = 4;
assign(test_mat_arr, in__.read<Eigen::Matrix<local_scalar_t__, -1, -1>>(N_priors, N_studies),
"assigning variable test_mat_arr", index_uni(sym2__), index_uni(sym1__));
current_statement__ = 4;
pos__ = (pos__ + 1);
}
}
} Does that work for you? |
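The reversed-loop pattern for a three-dimensional array can be sketched as follows. ToyReader is a hypothetical stand-in for a sequential scalar read and read_array3 is an illustrative name; neither is taken from the Stan codebase.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sequential reader: each call returns the next value of the
// flat input, in the order the init data arrives.
class ToyReader {
 public:
  explicit ToyReader(const std::vector<double>& in) : in_(in), pos_(0) {}

  double read() {
    assert(pos_ < in_.size());
    return in_[pos_++];
  }

 private:
  const std::vector<double>& in_;
  std::size_t pos_;
};

using Array3D = std::vector<std::vector<std::vector<double>>>;

// Reads arr[d1][d2][d3] with the loops in reverse declaration order, so the
// first index is the innermost loop and therefore changes fastest.
inline Array3D read_array3(ToyReader& in, std::size_t d1, std::size_t d2,
                           std::size_t d3) {
  Array3D arr(d1, std::vector<std::vector<double>>(
                      d2, std::vector<double>(d3, 0.0)));
  for (std::size_t k = 0; k < d3; ++k) {
    for (std::size_t j = 0; j < d2; ++j) {
      for (std::size_t i = 0; i < d1; ++i) {
        arr[i][j][k] = in.read();
      }
    }
  }
  return arr;
}
```

With zero-based indices, element arr[i][j][k] ends up holding the flat value at position i + d1 * (j + d2 * k).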
@SteveBronder yep I think that should be fine. We have to write loops anyway for tuples |
… Also updated mkfor interface, added array type util
@SteveBronder Lmk if that looks right |
Ack so looking at {
std::vector<local_scalar_t__> b_flat__;
current_statement__ = 6;
b_flat__ = context__.vals_r("b");
current_statement__ = 6;
pos__ = 1;
current_statement__ = 6;
for (int sym1__ = 1; sym1__ <= K; ++sym1__) {
current_statement__ = 6;
for (int sym2__ = 1; sym2__ <= I; ++sym2__) {
current_statement__ = 6;
assign(b, b_flat__[(pos__ - 1)],
"assigning variable b", index_uni(sym2__), index_uni(sym1__));
current_statement__ = 6;
pos__ = (pos__ + 1);
}
}
} instead of doing I then K for these cases
And I think we need to do the same thing in in One alt, we could go back to just doing |
I think it kind of makes sense to change the serializer because it doesn't seem to do the pattern we want |
@rybern I finally got this all sorted out! Sadly we can't really use serializer like we do elsewhere because the data for transform inits comes in with a weird flattened array style for arrays. Essentially it flattens multi-dimensional matrices so that the outermost index is the one that changes the most, which causes us to have to do the reverse loop style. But I think this is good! I'll give the ocaml one more read through and then I think we can merge! https://github.com/stan-dev/stan/blob/develop/src/stan/io/dump.hpp#L63 |
This PR moves the codegen for unconstraining variables to use the new deserializer backend interface (see e.g. stan-dev/stan#3018).
This follows up on PR #856.
The approach is to create a new internal function, ReadUnconstrainData, that generates a function call from the new interface.