Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved behavior of expect_near_rel #1657

Conversation

martinmodrak
Copy link
Contributor

@martinmodrak martinmodrak commented Jan 29, 2020

Summary

Improved the behavior of expect_near_rel as proposed in #1656.

Wherever a tolerance of type double was used, there is a new struct relative_tolerance that gathers relative (tol) and minimal tolerance (tol_min). The final tolerance is computed as std::max(tol * fabs(x), tol_min) where x is either an exact value to be tested against (I will use this in the precomputed tests) or average of the two values to be compared (this is what expect_near_rel uses).

To make the transition easy relative_tolerance has an implicit constructor from double (which is treated as relative tolerance), I instructed the linter to be quiet about it.

Default tol_rel is 1e-8 and default tol_min is std::max(tol_ * tol_, 1e-14). The tol_min default is slightly weird but it is a compromise between a desire to be strict by default while acknowledging that 1e-16 tolerance is almost impossible to achieve in many cases.

Tests

There is a test for relative_tolerance - since the tolerance code is decoupled from expect_near_rel it can be tested more directly than in the previous version.

I had to tweak the tolerances for a bunch of tests.

Side Effects

Most tests got stricter or stayed the same for all/most values, but some test/value combinations got a bit more relaxed. Specifically:

  • Any test that had tol > 1e-7 and the tolerance was not explicitly touched in this PR is stricter for fabs(value) < tol and the same for fabs(value) > tol.
  • Any test that had 1e-14 < tol < 1e-7 and the tolerance was not explicitly touched in this PR is stricter for fabs(value) < tol, slightly relaxed for tol < fabs(value) < 1e-14/tol and the same for fabs(value) > 1e-14/tol.
  • In tests I touched I always aimed to got stricter somewhere, i.e. if I had to increase tol_min above it's default, I would decrease tol or vice versa. Mostly tests that modified their tolerance actually needed to modify tol_min and not tol.
  • The default relative tolerance for hessian_hessian_ and hessian_fvar_hessian_ used to be 1e-3. I noticed that this was necessary primarily because of bad behaviour around zero. The new default is tol = 1e-4, tol_min = 1e-3, so those tests got stricter above 1e-1 and weaker between 1e-3 and 1e-1.

Most places where I had to tweak the tolerance already had tolerance tweaked before. The exceptions are

  • test/unit/math/mix/fun/matrix_exp_test.cpp
  • test/unit/math/mix/fun/quad_form_diag_test.cpp

where I had to slightly increase tolerances for at least one test_ad call that used default tolerance before. This was due to the change in default tolerance described above.

I also removed functions relative_diff and expect_near_relative from test/unit/math/rev/fun/util.hpp and replaced their use (only in 1 file) by expect_near_rel.

Checklist

By submitting this pull request, the copyright holder is agreeing to the license the submitted work under the following licenses:
- Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
- Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

  • the basic tests are passing

    • unit tests pass (to run, use: ./runTests.py test/unit)
    • header checks pass, (make test-headers)
    • dependencies checks pass, (make test-math-dependencies)
    • docs build, (make doxygen)
    • code passes the built in C++ standards checks (make cpplint)
  • the code is written in idiomatic C++ and changes are documented in the doxygen

  • the new changes are tested

@martinmodrak martinmodrak changed the title [WIP] Improved behavior of expect_near_rel Improved behavior of expect_near_rel Jan 31, 2020
@martinmodrak
Copy link
Contributor Author

@bob-carpenter could you review this? Or should I ask someone else?

@bob-carpenter
Copy link
Contributor

I can review it. Thanks for documenting so clearly what's going on.

Copy link
Contributor

@bob-carpenter bob-carpenter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than naming of member values in the new class, this all looks great.

I don't want to lose the discussion of the issue. Could you put the developer-facing doc on how to use the new tolerances on the wiki here: https://github.com/stan-dev/stan/wiki/Testing-framework

I found the graphs particularly helpful.

test/unit/math/relative_tolerance.hpp Outdated Show resolved Hide resolved
test/unit/math/ad_tolerances.hpp Show resolved Hide resolved
@martinmodrak
Copy link
Contributor Author

So I checked the coding guidelines and I realized relative_tolerance should have really been a class, so I changed it to one. Also added some accessor/modifier functions while I was at it. I briefly documented the changes at https://github.com/stan-dev/stan/wiki/Testing-framework but I mostly link to code as this will be less stale.

@stan-buildbot
Copy link
Contributor


Name Old Result New Result Ratio Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan 4.86 4.84 1.0 0.48% faster
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 1.01 1.23% faster
eight_schools/eight_schools.stan 0.09 0.09 0.96 -4.1% slower
gp_regr/gp_regr.stan 0.22 0.22 1.0 0.23% faster
irt_2pl/irt_2pl.stan 6.09 6.04 1.01 0.81% faster
performance.compilation 88.19 87.18 1.01 1.15% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 7.4 7.3 1.01 1.35% faster
pkpd/one_comp_mm_elim_abs.stan 20.76 21.39 0.97 -3.04% slower
sir/sir.stan 92.13 90.7 1.02 1.54% faster
gp_regr/gen_gp_data.stan 0.05 0.05 1.01 1.09% faster
low_dim_gauss_mix/low_dim_gauss_mix.stan 3.0 3.03 0.99 -0.96% slower
pkpd/sim_one_comp_mm_elim_abs.stan 0.33 0.31 1.07 6.28% faster
arK/arK.stan 1.76 1.77 0.99 -0.68% slower
arma/arma.stan 0.81 0.8 1.02 1.66% faster
garch/garch.stan 0.62 0.63 1.0 -0.49% slower
Mean result: 1.00490436256

Jenkins Console Log
Blue Ocean
Commit hash: f54c4a0


Machine information ProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

@martinmodrak
Copy link
Contributor Author

@bob-carpenter tests are passing, should be ready for re-review

@bob-carpenter bob-carpenter merged commit 77248cf into stan-dev:develop Feb 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make the relative tolerance in expect_near_rel continuous
4 participants