-
-
Notifications
You must be signed in to change notification settings - Fork 187
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add probability tests to check vectorized/scalar lpmfs/lpdfs are the same #1989
Conversation
…g them in a non-vectorized way (Issue #1861)
…4.1 (tags/RELEASE_600/final)
…ng in as_scalars_vs_as_vectors to be like repeat_as_vectors (Issue #1861)
….com/stan-dev/math into bugfix/issue-1861-scalars-vs-vectors
…4.1 (tags/RELEASE_600/final)
I also intend to fix: #1978 with this pull request. And similarly there should be tests for the lccdfs and the cdfs. These should basically be copy-paste to add once the lpdf/lpmf code is there. |
….com/stan-dev/math into bugfix/issue-1861-scalars-vs-vectors
…4.1 (tags/RELEASE_600/final)
….com/stan-dev/math into bugfix/issue-1861-scalars-vs-vectors
…4.1 (tags/RELEASE_600/final)
….com/stan-dev/math into bugfix/issue-1861-scalars-vs-vectors
…4.1 (tags/RELEASE_600/final)
…work with things near zero better (Issue #1861)
….com/stan-dev/math into bugfix/issue-1861-scalars-vs-vectors
…4.1 (tags/RELEASE_600/final)
….com/stan-dev/math into bugfix/issue-1861-scalars-vs-vectors
… expect_near_rel from test/unit (Issue #1861)
…4.1 (tags/RELEASE_600/final)
….com/stan-dev/math into bugfix/issue-1861-scalars-vs-vectors
…4.1 (tags/RELEASE_600/final)
} | ||
plus[n] += e; | ||
minus[n] -= e; | ||
auto f_wrap = [&](const Eigen::VectorXd& e) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The stan math finite difference function is higher order and easier to use, so I defer to it.
@@ -257,8 +260,8 @@ class AgradCcdfLogTestFixture : public ::testing::Test { | |||
// works for <var> | |||
double calculate_gradients_1storder(vector<double>& grad, var& ccdf_log, | |||
vector<var>& x) { | |||
stan::math::set_zero_all_adjoints(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These gradient functions get called a lot in a sequence. If we do stan::math::recover_memory
we clear the autodiff stack and then the tests aren't meaningful. I switched the recover_memory
s to set_zero_all_adjoint
s and put recover_memory
calls in the tests that use the calculate_gradients_*
functions.
@@ -273,7 +276,7 @@ class AgradCcdfLogTestFixture : public ::testing::Test { | |||
// works for fvar<double> | |||
double calculate_gradients_1storder(vector<double>& grad, | |||
fvar<double>& ccdf_log, vector<var>& x) { | |||
x.push_back(ccdf_log.d_); | |||
grad.push_back(ccdf_log.d_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pushing stuff into x doesn't do anything. I think this was a bug. grad is the thing that gets checked.
<< " grads: " << gradients; | ||
|
||
stan::test::expect_near_rel(stream.str(), finite_dif[i], gradients[i], | ||
stan::test::relative_tolerance(1e-4, 1e-7)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1e-4 relative error (this is what the unit tests use for gradients, see here: https://github.com/stan-dev/math/blob/develop/test/unit/math/ad_tolerances.hpp)
I think I used 1e-7 for the minimum error tolerance cause 1e-8 didn't work for some function. I'm fuzzy on this.
test_gradients_equal(expected_gradients1, gradients1); | ||
test_gradients_equal(expected_gradients2, gradients2); | ||
test_gradients_equal(expected_gradients3, gradients3); | ||
test_gradients_equal(expected_gradients1, gradients1, 1e-3); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reference implementation gradients are quite bad. I had to use a relative tolerance of 1e-3 to get them to pass (finite difference worked with 1e-4).
It's stuff like gamma_p
and gamma_q
I think (#2006).
add_vars(s2, p0_, p1_, p2_, p3_, p4_, p5_); | ||
add_vars(s3, p0_, p1_, p2_, p3_, p4_, p5_); | ||
vector<var> scalar_vars; | ||
add_vars(scalar_vars, p0_, p1_, p2_, p3_, p4_, p5_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s1, s2, and s3 seemed like duplicates so I simplified things.
calculate_gradients_1storder(multiple_gradients3, multiple_ccdf_log, x1); | ||
calculate_gradients_1storder(multiple_gradients1, multiple_ccdf_log, | ||
vector_vars); | ||
calculate_gradients_2ndorder(multiple_gradients2, multiple_ccdf_log, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously we were only computing 1st order gradients.
} | ||
} | ||
|
||
void test_as_scalars_vs_as_vector() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
} | ||
} | ||
|
||
void test_as_scalars_vs_as_vector() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -3,6 +3,7 @@ | |||
|
|||
#include <stan/math/rev.hpp> | |||
#include <test/prob/utility.hpp> | |||
#include <test/unit/math/expect_near_rel.hpp> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the changes in this file are similar to the equivalent ones in the lccdf file.
@@ -3,6 +3,7 @@ | |||
|
|||
#include <stan/math/rev.hpp> | |||
#include <test/prob/utility.hpp> | |||
#include <test/unit/math/expect_near_rel.hpp> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should fix #1978. The changes in this file are similar to the ones in the lccdf and cdf checks.
T_return_type cdf | ||
= TestClass.template cdf<Scalar0, Scalar1, Scalar2, Scalar3, Scalar4, | ||
Scalar5>(p0_, p1_, p2_, p3_, p4_, p5_); | ||
T_return_type single_cdf = pow( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'll notice a pow here. In the old version we were comparing gradients of something like grad(x)
with the gradients of something like grad(x^r)
. To do this there was extra compare logic in test_multiple_gradient_values
.
As I recall this was confusing for higher order things, so I just added a pow here so we're comparing grad(x^r)
directly against grad(x^r)
computed another way and there's no confusion.
@@ -3,6 +3,7 @@ | |||
|
|||
#include <stan/math/mix.hpp> | |||
#include <test/prob/utility.hpp> | |||
#include <test/unit/math/expect_near_rel.hpp> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should basically be the same as the lccdf, cdf, and lcdf checks.
} | ||
} | ||
|
||
void test_as_scalars_vs_as_vector() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should fix #1861.
@@ -54,13 +54,69 @@ std::ostream& operator<<(std::ostream& os, const vector<var>& param) { | |||
} // namespace std | |||
|
|||
// ------------------------------------------------------------ | |||
|
|||
template <typename T> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved these definitions up here so they are visible to get_params
.
// default template handles Eigen::Matrix | ||
template <typename T> | ||
T get_params(const vector<vector<double>>& parameters, const size_t p) { | ||
T param(parameters.size()); | ||
for (size_t n = 0; n < parameters.size(); n++) | ||
if (p < parameters[0].size()) | ||
param(n) = parameters[n][p]; | ||
param(n) = get_param<stan::scalar_type_t<T>>(parameters[n], p); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For higher order autodiff types, we need to initialize each of the params with something more than just casting up a double.
For vars, we assign the derivative term to 1.0, for instance.
If we don't do this code that depends on get_params
and get_param
getting the same params fails.
@@ -37,9 +37,9 @@ class AgradDistributionVonMises : public AgradDistributionTest { | |||
|
|||
param[0] = boost::math::constants::third_pi<double>(); | |||
param[1] = boost::math::constants::sixth_pi<double>(); | |||
param[2] = 1e-8; | |||
param[2] = 1e-2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Larger test value to avoid finite difference out of range errors
return stan::math::frechet_lpdf<false>(y, alpha, beta); | ||
}; | ||
|
||
stan::test::expect_ad(f, 2.0, 1.0, 1.0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this test since the higher order Frechet stuff was failing. I think I could remove it but it seems fine to me.
@@ -1,68 +1,10 @@ | |||
#include <stan/math/prim.hpp> | |||
#include <gtest/gtest.h> | |||
#include <test/unit/util.hpp> | |||
#include <test/unit/math/prim/functor/ode_test_functors.hpp> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the changes in the ODE files were an alternate fix to a problem that popped up here: #1993 (comment)
They don't change behavior. They just rearrange test code a bit so that the jumbo tests (#1965) build correctly.
This is ready to review. There's a ton of different sorts of changes in here. I went through I tried to explain each of them, cause some of them probably look pretty weird. When I got into the testing framework and started pulling threads I ended up working on a lot more things than I intended to. |
@syclik you think you'd have a chance to review this in like the next week or so? If not I'll grab someone else. |
@t4c1 yo can you review this? There were a few of problems with the testing framework. I wanna get these in before we make all the expression-compatibility changes. |
This is a huge PR. Could you split it into 2 or 3 smaller ones? There is a lot of math stuff in here. I am not sure I feel comfortable reviewing that. |
Closed by #2085 |
Summary
This addresses the missing tests from #1861
Edit: Breaking up this pull into pieces. I'm gonna leave this here until it's done.
So far #2039, #2041, and #2042
Tests
Side Effects
Release notes
Added extra tests to check lpdfs/lpmfs evaluated with vectors produce the same results as with scalars
Checklist
Math issue Probability test framework didn't catch bug with vectorization #1861
Copyright holder: Columbia University
The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to the license the submitted work under the following licenses:
- Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
- Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
the basic tests are passing
./runTests.py test/unit
)make test-headers
)make test-math-dependencies
)make doxygen
)make cpplint
)the code is written in idiomatic C++ and changes are documented in the doxygen
the new changes are tested