
Reuse intermediate computations in distributions part 2 #1752

Closed
wants to merge 9 commits

Conversation

@mcol (Contributor) commented Feb 28, 2020

This is the second of a few PRs that aim to clean up the distributions so that we make better use of intermediate computations and composed functions.

This covers the following distributions:

  • chi_square
  • gamma
  • inv_chi_square
  • inv_gamma
  • scaled_inv_chi_square

Tests

None, this is just cleanup.

Side Effects

None.

Release Notes

Cleaned up the implementations of chi_square, gamma, inv_chi_square, inv_gamma, and scaled_inv_chi_square to make more use of intermediate calculations.

Checklist

  • Math issue: Make use of composed functions and temporaries #1230

  • Copyright holder: Marco Colombo

The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:
    - Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
    - Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

  • the basic tests are passing

    • unit tests pass (to run, use: ./runTests.py test/unit)
    • header checks pass, (make test-headers)
    • dependencies checks pass, (make test-math-dependencies)
    • docs build, (make doxygen)
    • code passes the built in C++ standards checks (make cpplint)
  • the code is written in idiomatic C++ and changes are documented in the doxygen

  • the new changes are tested

@stan-buildbot (Contributor)


Name Old Result New Result Ratio Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan 4.89 4.84 1.01 1.05% faster
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 1.0 0.2% faster
eight_schools/eight_schools.stan 0.09 0.09 0.98 -1.73% slower
gp_regr/gp_regr.stan 0.22 0.22 0.99 -1.04% slower
irt_2pl/irt_2pl.stan 6.06 6.11 0.99 -0.74% slower
performance.compilation 89.74 87.15 1.03 2.88% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 7.41 7.42 1.0 -0.24% slower
pkpd/one_comp_mm_elim_abs.stan 20.77 20.21 1.03 2.72% faster
sir/sir.stan 98.25 94.46 1.04 3.85% faster
gp_regr/gen_gp_data.stan 0.05 0.05 0.99 -0.71% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan 2.96 2.98 0.99 -0.7% slower
pkpd/sim_one_comp_mm_elim_abs.stan 0.31 0.34 0.92 -9.24% slower
arK/arK.stan 1.73 1.74 0.99 -0.77% slower
arma/arma.stan 0.66 0.65 1.01 0.68% faster
garch/garch.stan 0.59 0.59 1.0 -0.44% slower
Mean result: 0.997970841634

Jenkins Console Log
Blue Ocean
Commit hash: 0abd9f1


Machine information ProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

@bbbales2 self-assigned this Mar 2, 2020

@bbbales2 (Member) left a comment

This is only a partial review, but I wanted to stop and ask a question before finishing it.

There seem to be a lot of logic changes with the templates. Presumably before we spent an unnecessary amount of time computing things that don't matter.

Is this all being tested somewhere?

Like are the gradients of gamma_lcdf(y, a, b) being tested for all argument combinations? Like:

gamma_lcdf(double, double, double);
gamma_lcdf(double, double, var);
..
gamma_lcdf(var, var, var);

Etc. Does the distribution test framework do that?

Resolved review threads on:
  • stan/math/prim/prob/chi_square_lccdf.hpp
  • stan/math/prim/prob/chi_square_lcdf.hpp
  • stan/math/prim/prob/chi_square_lpdf.hpp (two threads)
  • stan/math/prim/prob/gamma_cdf.hpp
@mcol (Contributor Author) commented Mar 2, 2020

All these are tested in the distribution tests. For chi_square_lcdf, for example, all the following are being tested:

double, std::vector<double>
double, Eigen::Matrix<double, Eigen::Dynamic, 1>
double, var
double, std::vector<var>
double, Eigen::Matrix<var, Eigen::Dynamic, 1>
std::vector<double>, double
std::vector<double>, std::vector<double>
std::vector<double>, Eigen::Matrix<double, Eigen::Dynamic, 1>
std::vector<double>, var
std::vector<double>, std::vector<var>
std::vector<double>, Eigen::Matrix<var, Eigen::Dynamic, 1>
Eigen::Matrix<double, Eigen::Dynamic, 1>, double
Eigen::Matrix<double, Eigen::Dynamic, 1>, std::vector<double>
Eigen::Matrix<double, Eigen::Dynamic, 1>, Eigen::Matrix<double, Eigen::Dynamic, 1>
Eigen::Matrix<double, Eigen::Dynamic, 1>, var
Eigen::Matrix<double, Eigen::Dynamic, 1>, std::vector<var>
Eigen::Matrix<double, Eigen::Dynamic, 1>, Eigen::Matrix<var, Eigen::Dynamic, 1>
var, double
var, std::vector<double>
var, Eigen::Matrix<double, Eigen::Dynamic, 1>
var, var
var, std::vector<var>
var, Eigen::Matrix<var, Eigen::Dynamic, 1>
std::vector<var>, double
std::vector<var>, std::vector<double>
std::vector<var>, Eigen::Matrix<double, Eigen::Dynamic, 1>
std::vector<var>, var
std::vector<var>, std::vector<var>
std::vector<var>, Eigen::Matrix<var, Eigen::Dynamic, 1>
Eigen::Matrix<var, Eigen::Dynamic, 1>, double
Eigen::Matrix<var, Eigen::Dynamic, 1>, std::vector<double>
Eigen::Matrix<var, Eigen::Dynamic, 1>, Eigen::Matrix<double, Eigen::Dynamic, 1>
Eigen::Matrix<var, Eigen::Dynamic, 1>, var
Eigen::Matrix<var, Eigen::Dynamic, 1>, std::vector<var>
Eigen::Matrix<var, Eigen::Dynamic, 1>, Eigen::Matrix<var, Eigen::Dynamic, 1>

Same for fvar<double>, fvar<fvar<double>>, fvar<var> and fvar<fvar<var>>, so I think we are well covered.

I'm going to push a further commit to ensure that the size_zero checks happen after we've checked the consistency of the other arguments, along with a few more cleanups that I noticed while having another look at this.

Thanks for looking at this PR too!

@bbbales2 (Member) commented Mar 2, 2020

All these are tested in the distribution tests.

Thanks for checking!

@stan-buildbot (Contributor)


Name Old Result New Result Ratio Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan 4.84 4.92 0.98 -1.79% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 1.01 0.75% faster
eight_schools/eight_schools.stan 0.09 0.09 1.02 2.11% faster
gp_regr/gp_regr.stan 0.22 0.22 1.01 0.55% faster
irt_2pl/irt_2pl.stan 6.11 6.18 0.99 -1.12% slower
performance.compilation 88.32 87.25 1.01 1.22% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 7.68 7.67 1.0 0.17% faster
pkpd/one_comp_mm_elim_abs.stan 21.04 20.32 1.04 3.45% faster
sir/sir.stan 90.69 96.77 0.94 -6.69% slower
gp_regr/gen_gp_data.stan 0.05 0.05 1.0 0.17% faster
low_dim_gauss_mix/low_dim_gauss_mix.stan 2.95 2.97 0.99 -0.5% slower
pkpd/sim_one_comp_mm_elim_abs.stan 0.33 0.33 0.99 -0.7% slower
arK/arK.stan 1.74 1.75 0.99 -0.5% slower
arma/arma.stan 0.66 0.66 1.0 0.22% faster
garch/garch.stan 0.52 0.58 0.89 -11.8% slower
Mean result: 0.991612615195

Jenkins Console Log
Blue Ocean
Commit hash: 0680496



@SteveBronder (Collaborator)

Should this branch be tested against all the models in the cmdstan performance test suite? I only say that because I've tried one of these before and it led to a revert:

https://github.com/stan-dev/performance-tests-cmdstan

@bbbales2 (Member) commented Mar 4, 2020

The code here looks good (tests pass and looking over it there's nothing obviously wrong), but I'm worried (as Steve points out) that we've made changes to these distribution functions before and it's broken stuff downstream unexpectedly.

Regarding the scope and objectives of this pull request, and how to get it merged, I have a few questions:

  1. This is part of a larger pull request broken into pieces (including one that was already merged: #1744, "Reuse intermediate computations in distributions part 1"). How many pieces are there? Are they just touching all the distributions?

  2. I don't think issue #1230 ("Make use of composed functions and temporaries") justifies the changes here.

It looks like the point of these changes is to make the lpdfs faster in situations where not all of the arguments are autodiff variables (and so there are optional calculations). Is that closer to the mark?

I guess the additional question here is: how will I know the problem is fixed?

  3. What are the specific issues where this came up and what were the fixes? Specifically, what broke for you @SteveBronder? I remember the glms have broken before as well. I'd like to know this before figuring out if we're tested enough or need more tests.

I'll look into question 3 myself (feel free to add links if you know about stuff). Questions 1 and 2 are for you @mcol.

@mcol (Contributor Author) commented Mar 4, 2020

My plan is to cover the large majority of distributions, excluding the multivariate distributions and the glms (for both these types of distributions, the code is quite different from the rest, and I'm not familiar with that). If we go with PRs of sizes such as this and #1744 (which seems a reasonable size to me, but let me know if smaller would be desirable), then we will have a part 3 and a part 4. Note that #1758 will reduce the size of these PRs, so I will wait until it goes in before I pick this one up again.

This set of PRs does indeed do more than what #1230 required. I can open a new issue to describe that, if that's what you'd like to see. Overall, this is not motivated by performance gains, as I expect them to be generally tiny if measurable at all, but by the attempt to reuse some intermediate computations and simplify existing ones. In a few cases some computations can be avoided for non-autodiff types, but those are the minority.

@bbbales2 (Member) commented Mar 4, 2020

I can open a new issue to describe that, if that's what you'd like to see

Just talk them out here for now. We can just copy-paste them over to an issue easily.

Overall, this is not motivated by performance gains, as I expect them to be generally tiny if measurable at all, but from the attempt to reuse some intermediate computations and simplify existing ones.

Oh, is it for numerical reasons? I don't think simplifying these things just for the sake of simplification really justifies the work or dangers, honestly.

@mcol (Contributor Author) commented Mar 4, 2020

This is an example of what I mean by simplification, from stan/math/prim/prob/chi_square_cdf.hpp (the first one in the diff):

-    
-      ops_partials.edge1_.partials_[n] += beta_dbl * exp(-beta_dbl * y_dbl)
-                                          * pow(beta_dbl * y_dbl, alpha_dbl - 1)
-                                          / tgamma(alpha_dbl) / Pn;
+      ops_partials.edge1_.partials_[n] += 0.5 * exp(-half_y_dbl)
+                                          * pow(half_y_dbl, half_nu_dbl - 1)
+                                          / (tgamma_vec[n] * Pn);

What's going on is the following:

  • beta_dbl is a constant of value 0.5 defined a few lines above this block, so I just replace it with 0.5
  • beta_dbl * y_dbl is already used in two places in this block, so I precompute it in the variable half_y_dbl
  • alpha_dbl is defined a few lines above this block as value_of(nu_vec[n]) * 0.5 (would you have guessed it?), and for clarity I rename it half_nu_dbl
  • tgamma(alpha_dbl) was actually precomputed in a VectorBuilder but not used here, so I replace it with the VectorBuilder version, tgamma_vec[n]
  • instead of the two divisions / tgamma(alpha_dbl) / Pn, I follow the suggestion from issue #596 ("a / (b * c) is more efficient than a / b / c") and rewrite it as / (tgamma_vec[n] * Pn)

In this example, only the last point could potentially change the numerics, and only skipping the extra tgamma call could potentially help performance. The rest is pretty trivial stuff that makes the code a bit more clear.

There may be cases in which what's going on is a bit more involved than this: I could go and find an example if you'd find it helpful.

@bbbales2 (Member) commented Mar 5, 2020

Yeah the numbers make sense and weren't too bad to check.

The things that scare me are the template logic, the VectorBuilder logic, and the loop and if placement. That stuff I have trouble following.

Looking at the pull request that got reverted for Steve, I think the culprit line is: https://github.com/stan-dev/math/pull/1331/files#diff-565dac80cbb935e5597e584799cc9220R76

if (include_summand<propto>::value) should be if (include_summand<propto, T_size1, T_size2>::value), I think (didn't test this though, so quite likely wrong). I will look more tomorrow to understand what happened.

@mcol (Contributor Author) commented Mar 5, 2020

That line is part of the problem. In the case of Steve's bug, I think what happened is that the normalizing constants were originally computed in three blocks:

  1. the binomial coefficient term if include_summand<propto>::value
  2. the lbeta numerator if include_summand<propto, T_size1, T_size2>::value
  3. the lbeta denominator again if include_summand<propto, T_size1, T_size2>::value

Further down, those terms were summed up again respecting those include_summand conditions.

In the buggy version, they got computed only if include_summand<propto>::value, as you pointed out. Which means that for propto=true (that is using the ~ notation) the contribution of the lbeta terms was lost, even if T_size1 and T_size2 were autodiff types.

In general, the rule for the include_summand or !is_constant_all checks is that if you add a type to their parameter list, that block will run in more situations (whenever any of the listed types is an autodiff type): so adding a type is in general not dangerous (it may be detrimental to performance, but not to correctness). What is potentially more risky is the removal of a type, and that should be double-checked against the rest of the changes.

For example, looking at this code block (again, the first I found):

-  if (!is_constant_all<T_dof>::value) {
-    for (size_t i = 0; i < stan::math::size(nu); i++) {
-      const T_partials_return alpha_dbl = value_of(nu_vec[i]) * 0.5;
-      gamma_vec[i] = tgamma(alpha_dbl);
-      digamma_vec[i] = digamma(alpha_dbl);
-      digamma_vec(size_nu);
+  if (!is_constant_all<T_y, T_dof>::value) {
+    for (size_t i = 0; i < size_nu; i++) {
+      const T_partials_return half_nu_dbl = 0.5 * value_of(nu_vec[i]);
+      tgamma_vec[i] = tgamma(half_nu_dbl);
+      if (!is_constant_all<T_dof>::value) {
+        digamma_vec[i] = digamma(half_nu_dbl);
+      }

Before, it was run only if T_dof is an autodiff type, but now if either T_y or T_dof is. This change was made so that tgamma_vec could be reused in a term further down that is computed if T_y is not constant. On the other hand, the digamma_vec term is only ever needed if T_dof is an autodiff type, hence the second, inner check. Overall, digamma_vec is computed as often as before, and tgamma_vec potentially more often.

As for moving these checks out of for loops, that should be neutral. Both examples below do the same amount of work, as the include_summand conditions are compile-time checks and will be removed by the compiler if they always evaluate to false. I tend to prefer the second form (I take it Steve does too), as that's what would be used if those ifs were evaluated at runtime.

  // check inside the loop:
  for (size_t n = 0; n < size_beta; n++) {
    if (include_summand<propto, T_shape, T_inv_scale>::value) {
      log_beta[n] = log(value_of(beta_vec[n]));
    }
  }

  // check hoisted out of the loop:
  if (include_summand<propto, T_shape, T_inv_scale>::value) {
    for (size_t n = 0; n < size_beta; n++) {
      log_beta[n] = log(value_of(beta_vec[n]));
    }
  }

Sorry for the long message, especially because I'm probably telling you stuff you already know by heart!

@bbbales2 (Member) commented Mar 5, 2020

so the addition of a term in general is not dangerous

Good point.

Our problems are with propto = true

  1. The autodiff framework won't work since it assumes value_of(f(var(1.0))) == f(1.0). This won't work for propto = true

  2. We can't even expect f<propto = true>(var(1.0)) == f<propto = false>(1.0) (binomial has non-parameter terms that can still get dropped).

  3. Because we have the true densities (propto = false), we can write a test to verify that the normalizing constant is not a function of any of the autodiff variables.

So if f(x) is a function proportional to the true distribution g(x) we have:

log(g(x)) = K + log(f(x))

So we can compute K at a couple different values of x and make sure they're the same. This would have caught Steve's bug (f in that case was constant, but g would have changed).

  4. The gradient templating of the log density functions can be checked by verifying the gradients of all combinations of autodiff variables against the finite difference versions with propto = false

  5. The cdfs and ccdfs don't deal with propto, so the autodiff tester would be fine for them (not sure we've had any problems with cdfs and ccdfs though).

I think there is a bug in the probability test framework in handling check 3 (#1763). I think the probability test framework should be doing check 4, but I didn't check.

I'm probably telling you stuff you already know by heart!

I don't :D. This is why I'm so scared to move forward with this if it's purely code-style changes. We're really playing with fire.

Do you know how the bug you found in #1662 got through the testing framework? I see that the glms aren't tested in the testing framework so it's no surprise that bugs made it through there. That's a problem.

@mcol (Contributor Author) commented Mar 6, 2020

#1662 was caused by VectorBuilder loops that ran the wrong number of times when one of the parameters was a vector and the rest were not. The distribution tests admittedly contain a test for this case, but perhaps there's a subtle bug there too?

@stan-buildbot (Contributor)


Name Old Result New Result Ratio Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan 4.89 4.8 1.02 1.77% faster
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 1.0 -0.06% slower
eight_schools/eight_schools.stan 0.09 0.09 1.01 1.06% faster
gp_regr/gp_regr.stan 0.22 0.22 1.01 0.72% faster
irt_2pl/irt_2pl.stan 6.45 6.5 0.99 -0.72% slower
performance.compilation 88.79 86.33 1.03 2.77% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 7.52 7.56 0.99 -0.6% slower
pkpd/one_comp_mm_elim_abs.stan 21.04 20.71 1.02 1.6% faster
sir/sir.stan 93.7 94.55 0.99 -0.91% slower
gp_regr/gen_gp_data.stan 0.05 0.05 0.97 -2.79% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan 2.96 3.01 0.98 -1.82% slower
pkpd/sim_one_comp_mm_elim_abs.stan 0.31 0.32 0.96 -4.13% slower
arK/arK.stan 1.9 1.74 1.09 8.63% faster
arma/arma.stan 0.65 0.66 1.0 -0.35% slower
garch/garch.stan 0.52 0.52 1.0 0.03% faster
Mean result: 1.00430571747

Jenkins Console Log
Blue Ocean
Commit hash: 6cd5d46



@bbbales2 (Member) commented Mar 7, 2020

#1662 was caused by VectorBuilder loops that ran the wrong number of times when one of the parameters was a vector and the rest were not

@mcol if you get a chance, mind taking a look at the test framework to see if this is something it tests for? If it doesn't, that's okay, just report back and we'll make an issue. If it does, then presumably there's another bug with the test framework?

@mcol (Contributor Author) commented Mar 8, 2020

@bbbales2 There's a test_repeat_as_vector function that presumably should do that. But (after only a quick skim) it only seems to repeat the same value multiple times, while the bug in #1662 would only appear if different values were evaluated (if a vector is not filled up correctly and only the first value is used, that is the same as repeating the first value multiple times).

@bbbales2 (Member) commented Mar 8, 2020

@mcol is there an obvious way to change it so that it goes through multiple values?

And if so, does it catch the #1662 error?

You can run specific probability tests with code like:

./runTests.py test/prob/neg_binomial/neg_binomial_cdf_00000_generated_v_test

which is much faster than running them all.

@syclik (Member) commented Apr 28, 2020

@bbbales2, mind taking a few minutes to see if your comments have been addressed? (Either confirm the PR shouldn't be in, approve the PR, or remove the review.)

@bbbales2 (Member)

@syclik yeah sorry.

I originally hesitated on this pull request because it touches a lot of difficult code, and so I was scared that a cleanup for the sake of a cleanup is kinda risky because something might break.

The fear was that shuffling things around might break something that the tests don't catch. Even though we think our tests are fairly comprehensive, we found a bug in the test framework in the process of reviewing this code: #1763

But now I'm thinking the reverse logic applies as well. If someone goes through and shuffles around the code they might also find bugs we didn't notice before. And it is useful when people do this. So I think I should accept this if tests pass and the code looks good.

What's your opinion on this? I think I'll accept it if it passes tests (and this goes through) and I review it -- I think I only made it part way through the first round.

@bbbales2 (Member) left a comment

Sorry for the delay @mcol, finished looking through this. Looks good, a couple of questions though. The only thing I'm really concerned about is the negative infinity gradients at infinity. I assume those should be zero.

The half_nu vs half_nu_dbl thing just looked weird to me but I suspect I'm missing something.

Ignore the half_nu comments. There's nothing wrong with what's there.

I'm gonna take the question here: #1752 (comment) to a separate thread so it doesn't hold this up.

Resolved review threads on:
  • stan/math/prim/prob/inv_chi_square_cdf.hpp (two threads)
  • stan/math/prim/prob/inv_chi_square_lccdf.hpp (two threads)
  • stan/math/prim/prob/inv_gamma_lccdf.hpp
  • stan/math/prim/prob/inv_chi_square_lcdf.hpp
@bbbales2 (Member) commented Jul 17, 2020

@mcol any chance you could take a look at the y_dbl == INFTY things?

Edit: I just don't think the gradients of any CDF-like thing will be infinity at infinity because the CDF limits to a constant 1.0 or 0.0 here.

@mcol (Contributor Author) commented Jul 18, 2020

All those infinity cases were pre-existing; I only touched those lines for variable renames or similar cleanups. Since the ones you pointed out are lccdfs, at infinity they take on the value of log(0), which is negative infinity. At least, that's how I justified that code to myself and kept it around. If that's not correct, then there must be something that we are not testing or not testing correctly.

@bbbales2 (Member)

@mcol oh yeah you are right on all counts

@bbbales2 (Member)

Merged in the new develop since this has been sitting a while (just wanna make sure nothing went funky). I'll merge when the tests pass, thanks!

@stan-buildbot (Contributor)


Name Old Result New Result Ratio Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan 4.13 4.15 1.0 -0.46% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 0.96 -4.64% slower
eight_schools/eight_schools.stan 0.09 0.09 0.99 -1.44% slower
gp_regr/gp_regr.stan 0.19 0.19 1.01 1.01% faster
irt_2pl/irt_2pl.stan 5.31 5.34 0.99 -0.58% slower
performance.compilation 87.4 86.09 1.02 1.5% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 8.53 8.52 1.0 0.02% faster
pkpd/one_comp_mm_elim_abs.stan 26.42 29.37 0.9 -11.16% slower
sir/sir.stan 110.08 112.73 0.98 -2.41% slower
gp_regr/gen_gp_data.stan 0.05 0.04 1.01 0.99% faster
low_dim_gauss_mix/low_dim_gauss_mix.stan 3.29 3.3 1.0 -0.49% slower
pkpd/sim_one_comp_mm_elim_abs.stan 0.38 0.39 0.97 -2.8% slower
arK/arK.stan 1.87 1.83 1.03 2.53% faster
arma/arma.stan 0.65 0.71 0.92 -8.2% slower
garch/garch.stan 0.53 0.52 1.01 0.66% faster
Mean result: 0.984513127964

Jenkins Console Log
Blue Ocean
Commit hash: 60a574d



@wds15 (Contributor) commented Jul 20, 2020

@bbbales2 you have reviewed this thoroughly and want to merge it? I am not sure if you can, given that you pushed commits, so let me know if you need a hand.

@bbbales2 (Member)

@wds15 what I did was merge in develop and add a missing using std::pow (here).

If you wanna click the merge button I'm fine with that. The rest of this is reviewed (as best I could). The discussion in here is mostly about two problems with the distribution testing framework that we dug up while working on this.

@SteveBronder (Collaborator)

Should we wait for the fixes to the testing framework before approving this? (i.e. this would wait till the next release most likely)

@bbbales2 (Member)

@SteveBronder well I was thinking really conservatively on this back in March and that stalled the pull request quite a bit (though we did find a couple problems with the test framework).

I don't think we'll ever be totally happy with the distribution tests. There's at least one fix (#1764) that'll probably make it into this release (and, if it does, will presumably catch any bugs this introduces).

But there are also problems/weaknesses with the tests that won't be fixed this release: #1978 and #1976

And that's just the stuff I know about. I'm now leaning more towards accepting what passes our tests; otherwise it's kind of an impossible moving bar to hop over.

@wds15 (Contributor) commented Jul 20, 2020

You guys decide on this one, please.

@SteveBronder (Collaborator)

imo I'd rather wait until the distribution tests are beefier before we merge this

@bbbales2 (Member)

Sure, but we should decide what beefier is so we don't just leave it hanging.

@SteveBronder (Collaborator)

By "beefier" I just meant merging in #1764. Ben, I can look this over if you want another set of eyes, but otherwise I'm fine with it (though I'd classify this as not a bugfix, so it would wait till the next release).

@bbbales2 (Member)

We can wait. I reran the tests after #1764 merged and it's a green light. We can press the merge button next Tuesday.

@stan-buildbot (Contributor)


Name Old Result New Result Ratio Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan 4.24 4.25 1.0 -0.12% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 0.97 -3.53% slower
eight_schools/eight_schools.stan 0.09 0.09 1.01 0.87% faster
gp_regr/gp_regr.stan 0.19 0.2 0.99 -1.26% slower
irt_2pl/irt_2pl.stan 5.34 5.34 1.0 -0.03% slower
performance.compilation 86.71 85.96 1.01 0.87% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 8.57 8.51 1.01 0.69% faster
pkpd/one_comp_mm_elim_abs.stan 28.57 27.19 1.05 4.85% faster
sir/sir.stan 124.33 115.03 1.08 7.48% faster
gp_regr/gen_gp_data.stan 0.04 0.05 0.91 -10.06% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan 3.47 3.28 1.06 5.32% faster
pkpd/sim_one_comp_mm_elim_abs.stan 0.38 0.4 0.93 -7.13% slower
arK/arK.stan 1.81 1.84 0.99 -1.38% slower
arma/arma.stan 0.69 0.62 1.1 9.45% faster
garch/garch.stan 0.52 0.53 1.0 -0.28% slower
Mean result: 1.00630478923

Jenkins Console Log
Blue Ocean
Commit hash: 60a574d



@stan-buildbot (Contributor)


Name Old Result New Result Ratio Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan 4.01 4.08 0.98 -1.64% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 0.99 -1.17% slower
eight_schools/eight_schools.stan 0.09 0.09 1.05 4.46% faster
gp_regr/gp_regr.stan 0.2 0.19 1.02 2.08% faster
irt_2pl/irt_2pl.stan 5.32 5.33 1.0 -0.12% slower
performance.compilation 86.81 85.48 1.02 1.54% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 7.58 7.59 1.0 -0.21% slower
pkpd/one_comp_mm_elim_abs.stan 27.33 26.56 1.03 2.82% faster
sir/sir.stan 110.81 111.54 0.99 -0.66% slower
gp_regr/gen_gp_data.stan 0.04 0.05 0.98 -1.7% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan 3.06 2.99 1.02 2.3% faster
pkpd/sim_one_comp_mm_elim_abs.stan 0.4 0.41 0.98 -2.02% slower
arK/arK.stan 1.72 1.73 0.99 -0.62% slower
arma/arma.stan 0.59 0.59 1.0 0.03% faster
garch/garch.stan 0.59 0.59 1.0 -0.16% slower
Mean result: 1.00363017664

Jenkins Console Log
Blue Ocean
Commit hash: 60a574d



@rok-cesnovar (Member)

Is this good to go now?

@bbbales2 (Member) commented Aug 1, 2020

I'm working on the test framework now so we can just merge it after #1989 is through.

I think it would pass everything fine, but I am working on the test framework (and I'm finding problems with it), so this may as well wait. I'm trying to get that done asap.

@rok-cesnovar (Member)

Cool cool. Test away!

@bbbales2 (Member) commented Oct 3, 2020

@mcol The distributions in this pull request ended up all getting rewritten before I got the tests in place (still open: #2085).

I guess this in particular didn't go through because my testing standards lowered between July and now. Apologies for that.

Mind if I close this (I'll go ahead and do it in a few days if I don't hear anything back)?

@mcol (Contributor Author) commented Oct 5, 2020

No problem, let's close this now!

@mcol closed this Oct 5, 2020
8 participants