Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Composable transforms #947

Closed
wants to merge 29 commits into from

Conversation

WardBrian
Copy link
Member

@WardBrian WardBrian commented Aug 18, 2021

Overview

This PR refactors our internal representation of transformations to include a concept of 'chaining' several in a row. The idea is roughly based on TFP's Chain bijector.

The code generation is based on turning something which currently would be codgen'd as

      y = in__.template read_constrain_lb<local_scalar_t__, jacobian__>(0,  lp__);

into

      y = lb_constrain<jacobian__>(in__.template read<local_scalar_t__>(),
                                                     0, lp);

with the obvious extension that you can add additional reassignments to add additional constraints, e.g.

      y = lb_constrain<jacobian__>(
              offset_multiplier_constrain<jacobian__>(
                   in__.template read<local_scalar_t__>(), 1, 3, lp__),
              0, lp__);

The same is done (in reverse order) for freeing.

There are several areas of the code which are not yet completed (hence the WIP). They are all marked with the string "TR TODO" in either a comment or a failwith statement.


Remaining work here

  • Variables with transforms can also be 'checked' that they are properly constrained. The way this is currently implemented is not correct in general for multiple composed transforms, as you need to unconstrain the outermost value (peel off a transform) before applying the next check
  • Much of the code in Transform_Mir, Mir_Utils, and Ast_to_Mir makes assumptions about sizes/dimensions based on the transform associated with the declaration, which still needs to be expanded or rethought
  • stan::math functions for constraining need to take a Jacobian template parameter, like those in stan::io::deserializer. This should be an easy change which @SteveBronder has agreed to do

Future work

As a follow on PR, the parser will be updated to parse declarations like

real <offset=1, multiplier=3, lower=0> variable;

As brought up in #659. Further extensions to allow more general compositions are left for others to consider.


Submission Checklist

  • Run unit tests
  • Documentation
    • If a user-facing facing change was made, the documentation PR is here:
    • OR, no user-facing changes were made

Release notes

Internal refactor to allow composing multiple transforms of a single variable

Copyright and Licensing

By submitting this pull request, the copyright holder is agreeing to
license the submitted work under the BSD 3-clause license (https://opensource.org/licenses/BSD-3-Clause)

@SteveBronder
Copy link
Contributor

Variables with transforms can also be 'checked' that they are properly constrained. The way this is currently implemented is not correct in general for multiple composed transforms, as you need to unconstrain the outermost value (peel off a transform) before applying the next check

Can you show me an example of this? Or do you mean users literally call a function to check this? tmk in the compiler only transformed parameters do these checks where constraints don't actually apply but just checks. In that case wouldn't we just call multiple checks?

@WardBrian
Copy link
Member Author

Can you show me an example of this? Or do you mean users literally call a function to check this? tmk in the compiler only transformed parameters do these checks where constraints don't actually apply but just checks. In that case wouldn't we just call multiple checks?

Does data not get checked as input?

This was a comment @bob-carpenter made to me when discussing it in front of a whiteboard, so he may be able to weigh in better. My understanding is that checking if something meets multiple constraint requirements would require doing the freeing transform in the middle. For example, a hypothetical value declared with constraints <lower=0><upper=1> (that is, separate, not a lower-upper constraint), the check for upper=1 would check that values are less than one, but the check for lower=0 would have to first un-apply the upper transform (as the result of 1-exp(x_unc) is almost certainly below zero for most of its values)

@bob-carpenter
Copy link
Contributor

bob-carpenter commented Aug 25, 2021

In the blocks transformed data, transformed parameters, and generated quantities, the constraints act as validity checks that are executed at the end of the block (because they may be reassigned within the block on the way to satisfying the constraints).

The constraints on the parameters variables define the mapping from unconstrained to constrained. Each of those constraints provides a range of values it maps to. For instance, real takes values in (-inf, inf), but real<lower = 0> takes values in (0, inf).

The problems arise when applying constraints to already constrained values. So if I write

parameters {
  real<lower = 1><upper = 2> alpha;
}

then what do I know about alpha? I'm going to assume transforms apply by proximity, so we first map (-inf, inf) to (1, inf) with the <lower = 1> constraint. But then what happens when we map (1, inf) with <upper = 2>? Well, we know the transform is f(u) = 2 - exp(u), so we that we know that f(inf) = -inf and f(1) = 2 - exp(1) * exp(exp(1))), or roughly (-inf, -39).

So if we want to match what happens with parameter transforms when checking bounds in the other blocks, we'd have to unwind the transforms step-by-step and check each step. That is, for

data {
  real<lower = 1><upper = 2> alpha;
}

we want to validate as follows.

  1. check that alpha satisfies <upper = 2>, i.e., alpha < 2,
  2. transform alpha using <upper = 2> (the inverse transform that does the constraining is f(u) = 2 - exp(u) and we need to apply f-inverse, then
  3. check that the transformed alpha satisfies <lower = 1>.

Alternatively, I think it'd make sense to only allow a single constraint on variables other than parameters.

@WardBrian
Copy link
Member Author

If we were only interested in allowing multiple transforms on parameters, I could probably tackle this with a slightly different strategy that might make some stuff simpler, but it would very tightly couple that idea (read: if someone later decided that was a poor choice and wanted MT other places, it would be as difficult as the first time around).

The other option is maintain my current approach but just assert that the data block (..etc) cannot have anything other than a primitive (single) transform. That means the debug data generation and constraint check system etc could all remain mostly unchanged, and the only open issue to resolve here would be the dimension issues.

Thoughts? I'm personally happy to put a nice error message inside debug data gen and constraint checking such that it gets punted to if/when someone decides later to implement that. Then when crafting the parser I can either disallow multiple constraints for non-parameters, or catch it during the semantic check.

@bob-carpenter
Copy link
Contributor

I'm trying to think of ways multiple transforms in a block other than parameters might be useful. Maybe for pinning/clamping/fixing parameters during model development. There you'd want to check that your data is constrained to the range of the constraining transform of the parameter. But that's a very edge use case.

@SteveBronder
Copy link
Contributor

I'm good with

The other option is maintain my current approach but just assert that the data block (..etc) cannot have anything other than a primitive (single) transform. That means the debug data generation and constraint check system etc could all remain mostly unchanged, and the only open issue to resolve here would be the dimension issues.

This starts delving into feature creep a bit, but I'd like to think about how user supplied constraints would interact here. I think what we are saying here is all well and good as long as in the future for constraining transformed parameters we do something where the main jist is to just expose the _constrain functions. Then in the C++ when we see those we place the jacobian__ template parameter after the function signature and add the lp parameter. So when a user writes

matrix[N, M] X = lb_constrain(X_other);

We generate something like

Eigen::Matrix<local_scalar_t__, -1, -1> X = lb_constrain<jacobian__>(X_other, lp);

And then I think in transformed_inits when we see a _constrain() function somewhere a user has added we have to call the associated _free function before it's written to the output vector.

@WardBrian
Copy link
Member Author

That is how I would imagine it yes. Nothing I'm doing should make that any harder, and if anything might make it a little easier.

The biggest issue from that standpoint I think is for transforms that change the dimensionality (like the matrix constraints we've discussed over email, or simplex). Trying to solve in-general what the correct dimensions are is going to be a big headache.

If you want to look at the current output I think I've nested everything as calls correctly, rather than repeated assignments.

I still need to change the template parameters in the read call for special cases, but it's getting much closer now.

@bob-carpenter
Copy link
Contributor

bob-carpenter commented Aug 30, 2021

In general, a transform consists of several parts:

  1. a transform from "constrained" to "unconstrained"
  2. the inverse of the transform
  3. the log absolute Jacobian determinant function for the inverse transform
  4. dimensionality information (e.g. that simplex maps K - 1 unconstrained parameters to a K-vector)

We use (1) for transforming inits and (2)+(3) for transforming parameters in the log_prob function, and then just (2) for transforming parameters when writing them out in write_array.

There's no reason in principle these can't be chained if the dimensions conform, but if you want to keep track of the ranges beyond the last transform, that's much trickier.

The real question I have is whether we should try to encapsulate (1)-(4) into what TensorFlow Probability calls a "bijector" (that's not quite right as our unit vectors don't use a bijective transform).

@WardBrian
Copy link
Member Author

Currently getting CI failures about template substitution

See Error:

n file included from /mnt/nvme1n1/workspace/stanc3_PR-947/performance-tests-cmdstan/cmdstan/stan/lib/stan_math/stan/math/prim/fun.hpp:240:
/mnt/nvme1n1/workspace/stanc3_PR-947/performance-tests-cmdstan/cmdstan/stan/lib/stan_math/stan/math/prim/fun/ordered_constrain.hpp:83:12: error: no matching function for call to 'ordered_constrain'
    return ordered_constrain(x, lp);
           ^~~~~~~~~~~~~~~~~
../../test/integration/good/declarations.hpp:3280:16: note: in instantiation of function template specialization 'stan::math::ordered_constrain<false, std::vector<Eigen::Matrix<double, -1, 1, 0, -1, 1>, std::allocator<Eigen::Matrix<double, -1, 1, 0, -1, 1> > > >' requested here
      par_g1 = ordered_constrain<jacobian__>(
               ^
/mnt/nvme1n1/workspace/stanc3_PR-947/performance-tests-cmdstan/cmdstan/stan/lib/stan_math/stan/math/prim/fun/ordered_constrain.hpp:81:13: note: candidate template ignored: couldn't infer template argument 'Jacobian'
inline auto ordered_constrain(const T& x, return_type_t<T>& lp) {
            ^
/mnt/nvme1n1/workspace/stanc3_PR-947/performance-tests-cmdstan/cmdstan/stan/lib/stan_math/stan/math/prim/fun/ordered_constrain.hpp:54:13: note: candidate template ignored: requirement 'is_eigen_col_vector<vector<Matrix<double, -1, 1, 0, -1, 1>, allocator<Matrix<double, -1, 1, 0, -1, 1> > > >::value' was not satisfied [with EigVec = std::vector<Eigen::Matrix<double, -1, 1, 0, -1, 1>, std::allocator<Eigen::Matrix<double, -1, 1, 0, -1, 1> > >]
inline auto ordered_constrain(const EigVec& x, value_type_t<EigVec>& lp) {
            ^
/mnt/nvme1n1/workspace/stanc3_PR-947/performance-tests-cmdstan/cmdstan/stan/lib/stan_math/stan/math/prim/fun/ordered_constrain.hpp:26:29: note: candidate function template not viable: requires single argument 'x', but 2 arguments were provided
inline plain_type_t<EigVec> ordered_constrain(const EigVec& x) {
                            ^
/mnt/nvme1n1/workspace/stanc3_PR-947/performance-tests-cmdstan/cmdstan/stan/lib/stan_math/stan/math/prim/fun/ordered_constrain.hpp:83:12: error: no matching function for call to 'ordered_constrain'
    return ordered_constrain(x, lp);
           ^~~~~~~~~~~~~~~~~
../../test/integration/good/declarations.hpp:3285:16: note: in instantiation of function template specialization 'stan::math::ordered_constrain<false, std::vector<std::vector<Eigen::Matrix<double, -1, 1, 0, -1, 1>, std::allocator<Eigen::Matrix<double, -1, 1, 0, -1, 1> > >, std::allocator<std::vector<Eigen::Matrix<double, -1, 1, 0, -1, 1>, std::allocator<Eigen::Matrix<double, -1, 1, 0, -1, 1> > > > > >' requested here
      par_g2 = ordered_constrain<jacobian__>(
               ^
/mnt/nvme1n1/workspace/stanc3_PR-947/performance-tests-cmdstan/cmdstan/stan/lib/stan_math/stan/math/prim/fun/ordered_constrain.hpp:81:13: note: candidate template ignored: couldn't infer template argument 'Jacobian'
inline auto ordered_constrain(const T& x, return_type_t<T>& lp) {
            ^
/mnt/nvme1n1/workspace/stanc3_PR-947/performance-tests-cmdstan/cmdstan/stan/lib/stan_math/stan/math/prim/fun/ordered_constrain.hpp:54:13: note: candidate template ignored: requirement 'is_eigen_col_vector<vector<vector<Matrix<double, -1, 1, 0, -1, 1>, allocator<Matrix<double, -1, 1, 0, -1, 1> > >, allocator<vector<Matrix<double, -1, 1, 0, -1, 1>, allocator<Matrix<double, -1, 1, 0, -1, 1> > > > > >::value' was not satisfied [with EigVec = std::vector<std::vector<Eigen::Matrix<double, -1, 1, 0, -1, 1>, std::allocator<Eigen::Matrix<double, -1, 1, 0, -1, 1> > >, std::allocator<std::vector<Eigen::Matrix<double, -1, 1, 0, -1, 1>, std::allocator<Eigen::Matrix<double, -1, 1, 0, -1, 1> > > > >]
inline auto ordered_constrain(const EigVec& x, value_type_t<EigVec>& lp) {
            ^
/mnt/nvme1n1/workspace/stanc3_PR-947/performance-tests-cmdstan/cmdstan/stan/lib/stan_math/stan/math/prim/fun/ordered_constrain.hpp:26:29: note: candidate function template not viable: requires single argument 'x', but 2 arguments were provided
inline plain_type_t<EigVec> ordered_constrain(const EigVec& x) {
                            ^
../../test/integration/good/declarations.hpp:3290:16: error: no matching function for call to 'ordered_constrain'
      par_g3 = ordered_constrain<jacobian__>(
               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/mnt/nvme1n1/workspace/stanc3_PR-947/performance-tests-cmdstan/cmdstan/stan/lib/stan_math/stan/math/prim/fun/ordered_constrain.hpp:81:13: note: candidate template ignored: substitution failure [with Jacobian = false, T = std::vector<Eigen::Matrix<double, -1, 1, 0, -1, 1>, std::allocator<Eigen::Matrix<double, -1, 1, 0, -1, 1> > >]
inline auto ordered_constrain(const T& x, return_type_t<T>& lp) {
            ^
/mnt/nvme1n1/workspace/stanc3_PR-947/performance-tests-cmdstan/cmdstan/stan/lib/stan_math/stan/math/prim/fun/ordered_constrain.hpp:54:13: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'EigVec'
inline auto ordered_constrain(const EigVec& x, value_type_t<EigVec>& lp) {
            ^
/mnt/nvme1n1/workspace/stanc3_PR-947/performance-tests-cmdstan/cmdstan/stan/lib/stan_math/stan/math/rev/fun/ordered_constrain.hpp:73:6: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'VarVec'
auto ordered_constrain(const VarVec& x, scalar_type_t<VarVec>& lp) {
     ^
/mnt/nvme1n1/workspace/stanc3_PR-947/performance-tests-cmdstan/cmdstan/stan/lib/stan_math/stan/math/prim/fun/ordered_constrain.hpp:26:29: note: candidate function template not viable: requires single argument 'x', but 2 arguments were provided
inline plain_type_t<EigVec> ordered_constrain(const EigVec& x) {
                            ^
/mnt/nvme1n1/workspace/stanc3_PR-947/performance-tests-cmdstan/cmdstan/stan/lib/stan_math/stan/math/rev/fun/ordered_constrain.hpp:25:13: note: candidate function template not viable: requires single argument 'x', but 2 arguments were provided
inline auto ordered_constrain(const T& x) {
            ^
../../test/integration/good/declarations.hpp:3295:16: error: no matching function for call to 'ordered_constrain'
      par_g4 = ordered_constrain<jacobian__>(
               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/mnt/nvme1n1/workspace/stanc3_PR-947/performance-tests-cmdstan/cmdstan/stan/lib/stan_math/stan/math/prim/fun/ordered_constrain.hpp:81:13: note: candidate template ignored: substitution failure [with Jacobian = false, T = std::vector<std::vector<Eigen::Matrix<double, -1, 1, 0, -1, 1>, std::allocator<Eigen::Matrix<double, -1, 1, 0, -1, 1> > >, std::allocator<std::vector<Eigen::Matrix<double, -1, 1, 0, -1, 1>, std::allocator<Eigen::Matrix<double, -1, 1, 0, -1, 1> > > > >]
inline auto ordered_constrain(const T& x, return_type_t<T>& lp) {
            ^
/mnt/nvme1n1/workspace/stanc3_PR-947/performance-tests-cmdstan/cmdstan/stan/lib/stan_math/stan/math/prim/fun/ordered_constrain.hpp:54:13: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'EigVec'
inline auto ordered_constrain(const EigVec& x, value_type_t<EigVec>& lp) {
            ^
/mnt/nvme1n1/workspace/stanc3_PR-947/performance-tests-cmdstan/cmdstan/stan/lib/stan_math/stan/math/rev/fun/ordered_constrain.hpp:73:6: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'VarVec'
auto ordered_constrain(const VarVec& x, scalar_type_t<VarVec>& lp) {
     ^
/mnt/nvme1n1/workspace/stanc3_PR-947/performance-tests-cmdstan/cmdstan/stan/lib/stan_math/stan/math/prim/fun/ordered_constrain.hpp:26:29: note: candidate function template not viable: requires single argument 'x', but 2 arguments were provided
inline plain_type_t<EigVec> ordered_constrain(const EigVec& x) {
                            ^
/mnt/nvme1n1/workspace/stanc3_PR-947/performance-tests-cmdstan/cmdstan/stan/lib/stan_math/stan/math/rev/fun/ordered_constrain.hpp:25:13: note: candidate function template not viable: requires single argument 'x', but 2 arguments were provided
inline auto ordered_constrain(const T& x) {
            ^

@WardBrian
Copy link
Member Author

Okay at least some of them are because I need to figure out how to pass the dimension args to the cholesky constraints. Not sure why ordered_constrain is failing though

@SteveBronder
Copy link
Contributor

SteveBronder commented Aug 31, 2021

Oh, looking those over we need the lp parameter to have it's own template parameter in the math library since sometimes that lp can be a var but the matrix can have a scalar double type. So right now the lp type is based on the scalar of the input x, but that's wrong here in log_prob. I can fix this upstream pretty easily (I'll make a PR with it today)

@WardBrian
Copy link
Member Author

I think I have a decent abstraction for FnReadParam for all of the transforms we currently implement, but I'm baking in some pretty strong assumptions by doing so that will have to change if we let users define arbitrary constraints (particularly if they change dimension)

I'm starting to wonder if there is merit toward reintroducing an internal FnConstrain which handles that sort of stuff. I know #872 only just got rid of that.

@WardBrian
Copy link
Member Author

WardBrian commented Sep 1, 2021

Okay, so I think there are 3 main things needed before the compiler work here is done.

Once these are done we can start worrying about how to actually instantiate this/the language design for it. I think I'll want at least one working version before this gets merged so we can test some things, even if it ends up in a separate PR.

@WardBrian
Copy link
Member Author

I just wanted to ping @rybern on this -- this is the WIP PR for composable transforms that touches a lot of that codegen. We should chat about how that could affect the tuples PR

@WardBrian
Copy link
Member Author

I think we've decided #971 supersedes this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants