
Reuse intermediate computations in distributions part 2 #1752

Closed
wants to merge 9 commits

Conversation

@mcol (Contributor) commented Feb 28, 2020

This is the second of a few PRs that aim to clean up the distributions so that we make better use of intermediate computations and composed functions.

This covers the following distributions:

  • chi_square
  • gamma
  • inv_chi_square
  • inv_gamma
  • scaled_inv_chi_square

Tests

None, this is just cleanup.

Side Effects

None.

Release Notes

Cleaned up the implementations of chi_square, gamma, inv_chi_square, inv_gamma, and scaled_inv_chi_square to make more use of intermediate calculations.

Checklist

  • Math issue: Make use of composed functions and temporaries #1230

  • Copyright holder: Marco Colombo

The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:
    - Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
    - Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

  • the basic tests are passing

    • unit tests pass (to run, use: ./runTests.py test/unit)
    • header checks pass, (make test-headers)
    • dependencies checks pass, (make test-math-dependencies)
    • docs build, (make doxygen)
    • code passes the built in C++ standards checks (make cpplint)
  • the code is written in idiomatic C++ and changes are documented in the doxygen

  • the new changes are tested

@stan-buildbot (Contributor)


Name Old Result New Result Ratio Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan 4.89 4.84 1.01 1.05% faster
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 1.0 0.2% faster
eight_schools/eight_schools.stan 0.09 0.09 0.98 -1.73% slower
gp_regr/gp_regr.stan 0.22 0.22 0.99 -1.04% slower
irt_2pl/irt_2pl.stan 6.06 6.11 0.99 -0.74% slower
performance.compilation 89.74 87.15 1.03 2.88% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 7.41 7.42 1.0 -0.24% slower
pkpd/one_comp_mm_elim_abs.stan 20.77 20.21 1.03 2.72% faster
sir/sir.stan 98.25 94.46 1.04 3.85% faster
gp_regr/gen_gp_data.stan 0.05 0.05 0.99 -0.71% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan 2.96 2.98 0.99 -0.7% slower
pkpd/sim_one_comp_mm_elim_abs.stan 0.31 0.34 0.92 -9.24% slower
arK/arK.stan 1.73 1.74 0.99 -0.77% slower
arma/arma.stan 0.66 0.65 1.01 0.68% faster
garch/garch.stan 0.59 0.59 1.0 -0.44% slower
Mean result: 0.997970841634

Jenkins Console Log
Blue Ocean
Commit hash: 0abd9f1


Machine information ProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

@bbbales2 self-assigned this Mar 2, 2020

@bbbales2 (Member) left a comment

This is only a partial review, but I wanted to stop and ask a question before finishing it.

There seem to be a lot of logic changes with the templates. Presumably before we spent an unnecessary amount of time computing things that don't matter.

Is this all being tested somewhere?

Like are the gradients of gamma_lcdf(y, a, b) being tested for all argument combinations? Like:

gamma_lcdf(double, double, double);
gamma_lcdf(double, double, var);
..
gamma_lcdf(var, var, var);

Etc. Does the distribution test framework do that?

Resolved review threads on:
  • stan/math/prim/prob/chi_square_lccdf.hpp
  • stan/math/prim/prob/chi_square_lcdf.hpp
  • stan/math/prim/prob/chi_square_lpdf.hpp (two threads)
  • stan/math/prim/prob/gamma_cdf.hpp
@mcol (Contributor Author) commented Mar 2, 2020

All these are tested in the distribution tests. For chi_square_lcdf, for example, all the following are being tested:

double, std::vector<double>
double, Eigen::Matrix<double, Eigen::Dynamic, 1>
double, var
double, std::vector<var>
double, Eigen::Matrix<var, Eigen::Dynamic, 1>
std::vector<double>, double
std::vector<double>, std::vector<double>
std::vector<double>, Eigen::Matrix<double, Eigen::Dynamic, 1>
std::vector<double>, var
std::vector<double>, std::vector<var>
std::vector<double>, Eigen::Matrix<var, Eigen::Dynamic, 1>
Eigen::Matrix<double, Eigen::Dynamic, 1>, double
Eigen::Matrix<double, Eigen::Dynamic, 1>, std::vector<double>
Eigen::Matrix<double, Eigen::Dynamic, 1>, Eigen::Matrix<double, Eigen::Dynamic, 1>
Eigen::Matrix<double, Eigen::Dynamic, 1>, var
Eigen::Matrix<double, Eigen::Dynamic, 1>, std::vector<var>
Eigen::Matrix<double, Eigen::Dynamic, 1>, Eigen::Matrix<var, Eigen::Dynamic, 1>
var, double
var, std::vector<double>
var, Eigen::Matrix<double, Eigen::Dynamic, 1>
var, var
var, std::vector<var>
var, Eigen::Matrix<var, Eigen::Dynamic, 1>
std::vector<var>, double
std::vector<var>, std::vector<double>
std::vector<var>, Eigen::Matrix<double, Eigen::Dynamic, 1>
std::vector<var>, var
std::vector<var>, std::vector<var>
std::vector<var>, Eigen::Matrix<var, Eigen::Dynamic, 1>
Eigen::Matrix<var, Eigen::Dynamic, 1>, double
Eigen::Matrix<var, Eigen::Dynamic, 1>, std::vector<double>
Eigen::Matrix<var, Eigen::Dynamic, 1>, Eigen::Matrix<double, Eigen::Dynamic, 1>
Eigen::Matrix<var, Eigen::Dynamic, 1>, var
Eigen::Matrix<var, Eigen::Dynamic, 1>, std::vector<var>
Eigen::Matrix<var, Eigen::Dynamic, 1>, Eigen::Matrix<var, Eigen::Dynamic, 1>

Same for fvar<double>, fvar<fvar<double>>, fvar<var> and fvar<fvar<var>>, so I think we are well covered.

I'm going to push a further commit to ensure that the size_zero checks happen after we've checked the consistency of the other arguments, along with a few more cleanups that I noticed while having another look at this.

Thanks for looking at this PR too!

@bbbales2 (Member) commented Mar 2, 2020

All these are tested in the distribution tests.

Thanks for checking!

@stan-buildbot (Contributor)


Name Old Result New Result Ratio Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan 4.84 4.92 0.98 -1.79% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 1.01 0.75% faster
eight_schools/eight_schools.stan 0.09 0.09 1.02 2.11% faster
gp_regr/gp_regr.stan 0.22 0.22 1.01 0.55% faster
irt_2pl/irt_2pl.stan 6.11 6.18 0.99 -1.12% slower
performance.compilation 88.32 87.25 1.01 1.22% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 7.68 7.67 1.0 0.17% faster
pkpd/one_comp_mm_elim_abs.stan 21.04 20.32 1.04 3.45% faster
sir/sir.stan 90.69 96.77 0.94 -6.69% slower
gp_regr/gen_gp_data.stan 0.05 0.05 1.0 0.17% faster
low_dim_gauss_mix/low_dim_gauss_mix.stan 2.95 2.97 0.99 -0.5% slower
pkpd/sim_one_comp_mm_elim_abs.stan 0.33 0.33 0.99 -0.7% slower
arK/arK.stan 1.74 1.75 0.99 -0.5% slower
arma/arma.stan 0.66 0.66 1.0 0.22% faster
garch/garch.stan 0.52 0.58 0.89 -11.8% slower
Mean result: 0.991612615195

Jenkins Console Log
Blue Ocean
Commit hash: 0680496



@SteveBronder (Collaborator)

Should this branch be tested against all the models in the cmdstan performance test suite? I only say that because I've tried one of these before and it led to a revert:

https://github.com/stan-dev/performance-tests-cmdstan

@bbbales2 (Member) commented Mar 4, 2020

The code here looks good (tests pass and looking over it there's nothing obviously wrong), but I'm worried (as Steve points out) that we've made changes to these distribution functions before and it's broken stuff downstream unexpectedly.

Regarding the scope and objectives of this pull request, and how to get it merged, I have a few questions:

  1. This is part of a larger pull request broken into pieces (including one that was already merged: #1744, "Reuse intermediate computations in distributions part 1"). How many pieces are there? Are they just touching all the distributions?

  2. I don't think issue #1230 ("Make use of composed functions and temporaries") justifies the changes here.

It looks like the point of these changes is to make the lpdfs faster in situations where not all of the arguments are autodiff variables (and so there are optional calculations). Is that closer to the mark?

I guess the additional question here is: how will I know the problem is fixed?

  3. What are the specific issues where this came up and what were the fixes? Specifically, what broke for you @SteveBronder? I remember the glms have broken before as well. I'd like to know this before figuring out if we're tested enough or need more tests.

I'll look into question 3 myself (feel free to add links if you know about stuff). Questions 1 and 2 are for you @mcol.

@mcol (Contributor Author) commented Mar 4, 2020

My plan is to cover the large majority of distributions, excluding the multivariate distributions and the glms (for both these types of distributions, the code is quite different from the rest, and I'm not familiar with that). If we go with PRs of sizes such as this and #1744 (which seems a reasonable size to me, but let me know if smaller would be desirable), then we will have a part 3 and a part 4. Note that #1758 will reduce the size of these PRs, so I will wait until it goes in before I pick this one up again.

This set of PRs does indeed do more than what #1230 required. I can open a new issue to describe that, if that's what you'd like to see. Overall, this is not motivated by performance gains, as I expect them to be generally tiny if measurable at all, but by the attempt to reuse some intermediate computations and simplify existing ones. In a few cases some computations can be avoided for non-autodiff types, but those are the minority.

@bbbales2 (Member) commented Mar 4, 2020

I can open a new issue to describe that, if that's what you'd like to see

Just talk them out here for now. We can just copy-paste them over to an issue easily.

Overall, this is not motivated by performance gains, as I expect them to be generally tiny if measurable at all, but from the attempt to reuse some intermediate computations and simplify existing ones.

Oh, is it for numerical reasons? I don't think simplifying these things just for the sake of simplification really justifies the work or dangers, honestly.

@mcol (Contributor Author) commented Mar 4, 2020

This is an example of what I mean by simplification, from stan/math/prim/prob/chi_square_cdf.hpp (the first one in the diff):

-    
-      ops_partials.edge1_.partials_[n] += beta_dbl * exp(-beta_dbl * y_dbl)
-                                          * pow(beta_dbl * y_dbl, alpha_dbl - 1)
-                                          / tgamma(alpha_dbl) / Pn;
+      ops_partials.edge1_.partials_[n] += 0.5 * exp(-half_y_dbl)
+                                          * pow(half_y_dbl, half_nu_dbl - 1)
+                                          / (tgamma_vec[n] * Pn);

What's going on is the following:

  • beta_dbl is a constant of value 0.5 defined a few lines above this block, so I just replace it with 0.5
  • beta_dbl * y_dbl is already used in two places in this block, so I precompute it in the variable half_y_dbl
  • alpha_dbl is defined a few lines above this block as value_of(nu_vec[n]) * 0.5 (would you have guessed it?), and for clarity I rename it half_nu_dbl
  • tgamma(alpha_dbl) was actually precomputed in a VectorBuilder but not used here, so I replace it with the VectorBuilder version, tgamma_vec[n]
  • instead of the two divisions / tgamma(alpha_dbl) / Pn, I follow the suggestion from issue #596 ("a / (b * c) is more efficient than a / b / c") and rewrite it as / (tgamma_vec[n] * Pn)

In this example, only the last point could potentially change the numerics, and only skipping the extra tgamma call could potentially help performance. The rest is pretty trivial stuff that makes the code a bit more clear.

There may be cases in which what's going on is a bit more involved than this: I could go and find an example if you'd find it helpful.

@bbbales2 (Member) commented Mar 5, 2020

Yeah the numbers make sense and weren't too bad to check.

The things that scare me are the template logic, the VectorBuilder logic, and the loop and if placement. That stuff I have trouble following.

Looking at the pull request that got reverted for Steve, I think the culprit line is: https://github.com/stan-dev/math/pull/1331/files#diff-565dac80cbb935e5597e584799cc9220R76

if (include_summand<propto>::value) should be if (include_summand<propto, T_size1, T_size2>::value), I think (didn't test this though, so quite likely wrong). I will look more tomorrow to understand what happened.

@mcol (Contributor Author) commented Mar 5, 2020

That line is part of the problem. In the case of Steve's bug, I think what happened is that the normalizing constants were originally computed in three blocks:

  1. the binomial coefficient term if include_summand<propto>::value
  2. the lbeta numerator if include_summand<propto, T_size1, T_size2>::value
  3. the lbeta denominator again if include_summand<propto, T_size1, T_size2>::value

Further down, those terms were summed up again respecting those include_summand conditions.

In the buggy version, they got computed only if include_summand<propto>::value, as you pointed out. Which means that for propto=true (that is using the ~ notation) the contribution of the lbeta terms was lost, even if T_size1 and T_size2 were autodiff types.

In general, the rule for the include_summand or !is_constant_all checks is that if you add a type to their parameter list, that block will run in more situations (whenever any of the listed types is an autodiff type): so adding a type is in general not dangerous (it may be detrimental to performance, but not to correctness). What is potentially more risky is the removal of a type, and that should be double-checked against the rest of the changes.

For example, looking at this code block (again, the first I found):

-  if (!is_constant_all<T_dof>::value) {
-    for (size_t i = 0; i < stan::math::size(nu); i++) {
-      const T_partials_return alpha_dbl = value_of(nu_vec[i]) * 0.5;
-      gamma_vec[i] = tgamma(alpha_dbl);
-      digamma_vec[i] = digamma(alpha_dbl);
-      digamma_vec(size_nu);
+  if (!is_constant_all<T_y, T_dof>::value) {
+    for (size_t i = 0; i < size_nu; i++) {
+      const T_partials_return half_nu_dbl = 0.5 * value_of(nu_vec[i]);
+      tgamma_vec[i] = tgamma(half_nu_dbl);
+      if (!is_constant_all<T_dof>::value) {
+        digamma_vec[i] = digamma(half_nu_dbl);
+      }

Before, it was run only if T_dof is an autodiff type, but now if either T_y or T_dof is. This change was made so that tgamma_vec could be reused in a term further down that is computed if T_y is not constant. On the other hand, the digamma_vec term is only ever needed if T_dof is an autodiff type, hence the second, inner check. Overall, digamma_vec is computed as often as before, and tgamma_vec potentially more often.

As for moving these checks out of for loops, that should be neutral. Both examples below do the same amount of work, as the include_summand conditions are compile-time checks and will be removed by the compiler if they always evaluate to false. I tend to prefer the second form (I take it Steve does too), as that's what would be used if those ifs were evaluated at runtime.

  // check inside the loop:
  for (size_t n = 0; n < size_beta; n++) {
    if (include_summand<propto, T_shape, T_inv_scale>::value) {
      log_beta[n] = log(value_of(beta_vec[n]));
    }
  }

  // check hoisted out of the loop:
  if (include_summand<propto, T_shape, T_inv_scale>::value) {
    for (size_t n = 0; n < size_beta; n++) {
      log_beta[n] = log(value_of(beta_vec[n]));
    }
  }

Sorry for the long message, especially because I'm probably telling you stuff you already know by heart!

@bbbales2 (Member) commented Mar 5, 2020

so the addition of a term in general is not dangerous

Good point.

Our problems are with propto = true

  1. The autodiff framework won't work since it assumes value_of(f(var(1.0))) == f(1.0). This won't work for propto = true

  2. We can't even expect f<propto = true>(var(1.0)) == f<propto = false>(1.0) (binomial has non-parameter terms that can still get dropped).

  3. Because we have the true densities (propto = false), we can write a test to verify that the normalizing constant is not a function of any of the autodiff variables.

So if f(x) is a function proportional to the true distribution g(x) we have:

log(g(x)) = K + log(f(x))

So we can compute K at a couple different values of x and make sure they're the same. This would have caught Steve's bug (f in that case was constant, but g would have changed).

  4. The gradient templating of the log density functions can be checked by verifying the gradients of all combinations of autodiff variables against the finite difference versions with propto = false

  5. The cdfs and ccdfs don't deal with propto, so the autodiff tester would be fine for them (not sure we've had any problems with cdfs and ccdfs though).

I think there is a bug in the probability test framework in handling check 3 (#1763). I think the probability test framework should be doing check 4, but I didn't check.

I'm probably telling you stuff you already know by heart!

I don't :D. This is why I'm so scared to move forward with this if it's purely code-style changes. We're really playing with fire.

Do you know how the bug you found in #1662 got through the testing framework? I see that the glms aren't tested in the testing framework so it's no surprise that bugs made it through there. That's a problem.

@mcol (Contributor Author) commented Mar 6, 2020

#1662 was caused by VectorBuilder loops that ran the wrong number of times when one of the parameters was a vector and the rest were not. The distribution tests admittedly contain a test for this case, but perhaps there's a subtle bug there too?

@stan-buildbot (Contributor)


Name Old Result New Result Ratio Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan 4.89 4.8 1.02 1.77% faster
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 1.0 -0.06% slower
eight_schools/eight_schools.stan 0.09 0.09 1.01 1.06% faster
gp_regr/gp_regr.stan 0.22 0.22 1.01 0.72% faster
irt_2pl/irt_2pl.stan 6.45 6.5 0.99 -0.72% slower
performance.compilation 88.79 86.33 1.03 2.77% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 7.52 7.56 0.99 -0.6% slower
pkpd/one_comp_mm_elim_abs.stan 21.04 20.71 1.02 1.6% faster
sir/sir.stan 93.7 94.55 0.99 -0.91% slower
gp_regr/gen_gp_data.stan 0.05 0.05 0.97 -2.79% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan 2.96 3.01 0.98 -1.82% slower
pkpd/sim_one_comp_mm_elim_abs.stan 0.31 0.32 0.96 -4.13% slower
arK/arK.stan 1.9 1.74 1.09 8.63% faster
arma/arma.stan 0.65 0.66 1.0 -0.35% slower
garch/garch.stan 0.52 0.52 1.0 0.03% faster
Mean result: 1.00430571747

Jenkins Console Log
Blue Ocean
Commit hash: 6cd5d46



@bbbales2 (Member) commented Mar 7, 2020

#1662 was caused by VectorBuilder loops that ran the wrong number of times when one of the parameters was a vector and the rest were not

@mcol if you get a chance, mind taking a look at the test framework to see if this is something it tests for? If it doesn't, that's okay, just report back and we'll make an issue. If it does, then presumably there's another bug with the test framework?

@mcol (Contributor Author) commented Mar 8, 2020

@bbbales2 There's a test_repeat_as_vector function that presumably should do that. But (after only a quick skim) it only seems to repeat the same value multiple times, while the bug in #1662 would only appear if different values were evaluated (if a vector is not filled up correctly and only the first value is used, that is the same as repeating the first value multiple times).

@bbbales2 (Member) commented Mar 8, 2020

@mcol is there an obvious way to change it so that it goes through multiple values?

And if so, does it catch the #1662 error?

You can run specific probability tests with code like:

./runTests.py test/prob/neg_binomial/neg_binomial_cdf_00000_generated_v_test

which is much faster than running them all.

@syclik (Member) commented Apr 28, 2020

@bbbales2, mind taking a few minutes to see if your comments have been addressed? (Either confirm the PR shouldn't be in, approve the PR, or remove the review.)

@bbbales2 (Member)

@syclik yeah sorry.

I originally hesitated on this pull request because it touches a lot of difficult code, and so I was scared that a cleanup for the sake of a cleanup is kinda risky because something might break.

The fear was that shuffling things around might break something that the tests don't catch. Even though we think our tests are fairly comprehensive, we found a bug in the test framework in the process of reviewing this code: #1763

But now I'm thinking the reverse logic applies as well. If someone goes through and shuffles around the code they might also find bugs we didn't notice before. And it is useful when people do this. So I think I should accept this if tests pass and the code looks good.

What's your opinion on this? I think I'll accept it if it passes tests (and this goes through) and I review it -- I think I only made it part way through the first round.

@bbbales2 (Member) left a comment

Sorry for the delay @mcol, finished looking through this. Looks good, a couple of questions though. The only thing I'm really concerned about is the negative infinity gradients at infinity. I assume those should be zero.

The half_nu vs half_nu_dbl thing just looked weird to me but I suspect I'm missing something.

Ignore the half_nu comments. There's nothing wrong with what's there.

I'm gonna take the question here: #1752 (comment) to a separate thread so it doesn't hold this up.

Resolved review threads on:
  • stan/math/prim/prob/inv_chi_square_cdf.hpp (two threads)
  • stan/math/prim/prob/inv_chi_square_lccdf.hpp (two threads)
  • stan/math/prim/prob/inv_gamma_lccdf.hpp
  • stan/math/prim/prob/inv_chi_square_lcdf.hpp
@bbbales2 (Member) commented Jul 17, 2020

@mcol any chance you could take a look at the y_dbl == INFTY things?

Edit: I just don't think the gradients of any CDF-like thing will be infinity at infinity because the CDF limits to a constant 1.0 or 0.0 here.

@mcol (Contributor Author) commented Jul 18, 2020

All those infinity cases were pre-existing; I only touched those lines for variable renames or similar cleanups. Since the ones you pointed out are lccdfs, at infinity they take on the value of log(0), which is negative infinity. At least, that's how I justified that code to myself and kept it around. If that's not correct, then there must be something that we are not testing or not testing correctly.

@bbbales2 (Member)

@mcol oh yeah you are right on all counts

@bbbales2 (Member)

Merged in the new develop since this has been sitting a while (just wanna make sure nothing went funky). I'll merge when the tests pass, thanks!

@stan-buildbot (Contributor)


Name Old Result New Result Ratio Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan 4.13 4.15 1.0 -0.46% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 0.96 -4.64% slower
eight_schools/eight_schools.stan 0.09 0.09 0.99 -1.44% slower
gp_regr/gp_regr.stan 0.19 0.19 1.01 1.01% faster
irt_2pl/irt_2pl.stan 5.31 5.34 0.99 -0.58% slower
performance.compilation 87.4 86.09 1.02 1.5% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 8.53 8.52 1.0 0.02% faster
pkpd/one_comp_mm_elim_abs.stan 26.42 29.37 0.9 -11.16% slower
sir/sir.stan 110.08 112.73 0.98 -2.41% slower
gp_regr/gen_gp_data.stan 0.05 0.04 1.01 0.99% faster
low_dim_gauss_mix/low_dim_gauss_mix.stan 3.29 3.3 1.0 -0.49% slower
pkpd/sim_one_comp_mm_elim_abs.stan 0.38 0.39 0.97 -2.8% slower
arK/arK.stan 1.87 1.83 1.03 2.53% faster
arma/arma.stan 0.65 0.71 0.92 -8.2% slower
garch/garch.stan 0.53 0.52 1.01 0.66% faster
Mean result: 0.984513127964

Jenkins Console Log
Blue Ocean
Commit hash: 60a574d



@wds15 (Contributor) commented Jul 20, 2020

@bbbales2 you have reviewed this thoroughly and want to merge it? I am not sure if you can, given that you pushed commits, so let me know if you need a hand.

@bbbales2 (Member)

@wds15 what I did was merge in develop and add a missing using std::pow (here).

If you wanna click the merge button I'm fine with that. The rest of this is reviewed (as best I could). The discussion in here is mostly about two problems with the distribution testing framework that we dug up while working on this.

@SteveBronder (Collaborator)

Should we wait for the fixes to the testing framework before approving this? (i.e. this would wait till the next release most likely)

@bbbales2 (Member)

@SteveBronder well I was thinking really conservatively on this back in March and that stalled the pull request quite a bit (though we did find a couple problems with the test framework).

I don't think we'll ever be totally happy with the distribution tests. There's at least one fix (#1764) that'll probably make it into this release (and, if it does, will presumably catch any bugs this introduces).

But there are also problems/weaknesses with the tests that won't be fixed this release: #1978 and #1976

And that's just the stuff I know about. I'm now leaning more towards accepting what passes our tests; otherwise it's kind of an impossible moving bar to hop over.

@wds15 (Contributor) commented Jul 20, 2020

You guys decide on this one, please.

@SteveBronder (Collaborator)

imo I'd rather wait until the distribution tests are beefier before we merge this

@bbbales2 (Member)

Sure, but we should decide what beefier is so we don't just leave it hanging.

@SteveBronder (Collaborator)

By "beefier" I just meant merging in #1764. Ben, I can look this over if you want another set of eyes, but otherwise I'm fine with it (though I'd classify this as not a bugfix, so it would wait till the next release).

@bbbales2 (Member)

We can wait. I reran the tests after #1764 merged and it's a green light. We can press the merge button next Tuesday.

@stan-buildbot (Contributor)


Name Old Result New Result Ratio Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan 4.24 4.25 1.0 -0.12% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 0.97 -3.53% slower
eight_schools/eight_schools.stan 0.09 0.09 1.01 0.87% faster
gp_regr/gp_regr.stan 0.19 0.2 0.99 -1.26% slower
irt_2pl/irt_2pl.stan 5.34 5.34 1.0 -0.03% slower
performance.compilation 86.71 85.96 1.01 0.87% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 8.57 8.51 1.01 0.69% faster
pkpd/one_comp_mm_elim_abs.stan 28.57 27.19 1.05 4.85% faster
sir/sir.stan 124.33 115.03 1.08 7.48% faster
gp_regr/gen_gp_data.stan 0.04 0.05 0.91 -10.06% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan 3.47 3.28 1.06 5.32% faster
pkpd/sim_one_comp_mm_elim_abs.stan 0.38 0.4 0.93 -7.13% slower
arK/arK.stan 1.81 1.84 0.99 -1.38% slower
arma/arma.stan 0.69 0.62 1.1 9.45% faster
garch/garch.stan 0.52 0.53 1.0 -0.28% slower
Mean result: 1.00630478923

Jenkins Console Log
Blue Ocean
Commit hash: 60a574d



@stan-buildbot (Contributor)


Name Old Result New Result Ratio Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan 4.01 4.08 0.98 -1.64% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 0.99 -1.17% slower
eight_schools/eight_schools.stan 0.09 0.09 1.05 4.46% faster
gp_regr/gp_regr.stan 0.2 0.19 1.02 2.08% faster
irt_2pl/irt_2pl.stan 5.32 5.33 1.0 -0.12% slower
performance.compilation 86.81 85.48 1.02 1.54% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 7.58 7.59 1.0 -0.21% slower
pkpd/one_comp_mm_elim_abs.stan 27.33 26.56 1.03 2.82% faster
sir/sir.stan 110.81 111.54 0.99 -0.66% slower
gp_regr/gen_gp_data.stan 0.04 0.05 0.98 -1.7% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan 3.06 2.99 1.02 2.3% faster
pkpd/sim_one_comp_mm_elim_abs.stan 0.4 0.41 0.98 -2.02% slower
arK/arK.stan 1.72 1.73 0.99 -0.62% slower
arma/arma.stan 0.59 0.59 1.0 0.03% faster
garch/garch.stan 0.59 0.59 1.0 -0.16% slower
Mean result: 1.00363017664

Jenkins Console Log
Blue Ocean
Commit hash: 60a574d



@rok-cesnovar (Member)

Is this good to go now?

@bbbales2 (Member) commented Aug 1, 2020

I'm working on the test framework now so we can just merge it after #1989 is through.

I think it would pass everything fine, but I am working on the test framework (and I'm finding problems with it), so this may as well wait. I'm trying to get that done asap.

@rok-cesnovar (Member)

Cool cool. Test away!

@bbbales2 (Member) commented Oct 3, 2020

@mcol The distributions in this pull request ended up all getting rewritten before I got the tests in place (still open: #2085).

I guess this in particular didn't go through because my testing standards lowered between July and now. Apologies for that.

Mind if I close this (I'll go ahead and do it in a few days if I don't hear anything back)?

@mcol (Contributor Author) commented Oct 5, 2020

No problem, let's close this now!

@mcol closed this Oct 5, 2020
8 participants