Clang tidy cleanup and using std algorithms #1373
…d-statements,performance-unnecessary-value-param
…, and std::inner_product when multiply two standard vectors
…stable/2017-11-14)
@@ -34,7 +34,7 @@ class accumulator {
  /**
   * Destroy an accumulator.
   */
  ~accumulator() {}
Is there a reason for defining the accumulator destructor as empty here? To my knowledge this still calls the destructors of all the accumulator's members.
This one is OK to leave as default---as is, it's not virtual and breaks the rule of 3(5).
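A small sketch of how an empty-bodied destructor differs from a defaulted or undeclared one. The struct names are hypothetical stand-ins, not the real accumulator class; if the real class holds non-trivial members, the trait below would be false either way. Separately, any user-declared destructor (including `= default`) suppresses the implicitly generated move operations, which is where the rule of 3/5 comes in.

```cpp
#include <type_traits>

// Hypothetical stand-ins for the class in the diff above.
struct empty_body_dtor {
  double total_{0.0};
  ~empty_body_dtor() {}  // user-provided: type is not trivially destructible
};

struct defaulted_dtor {
  double total_{0.0};
  ~defaulted_dtor() = default;  // defaulted on first declaration: stays trivial
};

struct implicit_dtor {
  double total_{0.0};  // no destructor declared at all
};

static_assert(!std::is_trivially_destructible<empty_body_dtor>::value,
              "an empty body counts as user-provided");
static_assert(std::is_trivially_destructible<defaulted_dtor>::value,
              "= default keeps the destructor trivial");
static_assert(std::is_trivially_destructible<implicit_dtor>::value,
              "implicit destructor is trivial");
```

In all three cases the members' destructors still run; the difference is only in what the compiler is allowed to assume about the type.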
… of github.com:stan-dev/math into clang-tidy/braces-defaults_constructor-range_for_loops
…e file to another
…stable/2017-11-14)
…gs/RELEASE_500/final)
@wds15 @rok-cesnovar this PR has a bunch of tbb stuff in it now (i.e. lib/tbb/libtbbmalloc_proxy.so.2), what do we need to add to the .gitignore so this stuff is not pushed?
lib/tbb/* should be ignored altogether as it's a build folder. We should add that to the integrate PR.
(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 0.98)
(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 0.99)
Cool! I really like seeing this kind of code cleanup.
The only thing I'm curious about is the efficiency of log-sum-exp expanded as it is. And one request to capture by value. Everything else is a comment or optional. Of the optional stuff, it'd be particularly great to vectorize the checks so that they can deal with indexing in the error message and we can remove a lot of boilerplate.
double max_val = *std::max_element(x.begin(), x.end());
double sum = std::accumulate(
    x.begin(), x.end(), 0.0,
    [&max_val](auto& acc, auto&& x_i) { return acc + exp(x_i - max_val); });
Will this generate code that's as efficient as before? It will come down to how efficiently it can compile that closure.
How do we test?
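One way to approach the question: put a hand-written loop next to the algorithm version, check that they agree numerically, then compare the generated assembly or benchmark them. This is only a sketch; `log_sum_exp_loop` and `log_sum_exp_algo` are hypothetical names, the loop is a stand-in rather than the library's previous code, and both assume a non-empty input.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Hand-written loop version (a stand-in, not the library's actual code).
inline double log_sum_exp_loop(const std::vector<double>& x) {
  double max_val = x[0];
  for (std::size_t i = 1; i < x.size(); ++i) {
    if (x[i] > max_val) {
      max_val = x[i];
    }
  }
  double sum = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    sum += std::exp(x[i] - max_val);
  }
  return max_val + std::log(sum);
}

// Algorithm version in the style of the snippet under review.
inline double log_sum_exp_algo(const std::vector<double>& x) {
  const double max_val = *std::max_element(x.begin(), x.end());
  const double sum = std::accumulate(
      x.begin(), x.end(), 0.0, [max_val](double acc, double x_i) {
        return acc + std::exp(x_i - max_val);
      });
  return max_val + std::log(sum);
}
```

Both functions visit the elements in the same order, so on ordinary inputs the results should match to within rounding. Whether the closure is inlined at -O2 or -O3 is something to confirm per compiler rather than assume.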
Just did this on godbolt. On the left-hand side is the code: the bottom (labeled editor 1) is the current version and the top (editor 2) is the new one. The middle is the output from the new stuff and the far right is the output from the current stuff. You can highlight certain instructions and it usually pops up a little 'here's what this does'. You can click 'add' in the top right to get a diff view of the two outputs, though it usually looks wonky at O3. You can click and drag any of the tabs for each little block to move stuff. If you right-click the highlighted code on the left-hand side, it should have an option to take you to where that line is happening in whichever of the bottom two outputs, though it's not always exact.
I like to look at -O0 to see where stuff is, then look at -O3. Around lines 40-60 is where the loop and exp calculation happen. The code is super similar; the lambda version removes a compare and a few moves. But those are mostly because we don't do the if statement in there anymore. I can look tomorrow at just removing that check there with the old version.
godbolt is pretty neat! I learned last night you can also get a real graph of the call graph!
There's a way to make a PR on their repo so we can get Stan up there, would like to find time for that in the next week or so
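Since the if statement came up: the old loop is not shown in this thread, so the following assumes the removed check was there to handle infinite entries. Under IEEE 754 arithmetic, `inf - inf` is NaN, so the plain max-then-accumulate form returns NaN when the maximum itself is infinite. A sketch with hypothetical function names; the guard is one possible choice, not what this PR does.

```cpp
#include <algorithm>
#include <cmath>
#include <numeric>
#include <vector>

// Unguarded: the max_element + accumulate form, non-empty input assumed.
inline double lse_unguarded(const std::vector<double>& x) {
  const double max_val = *std::max_element(x.begin(), x.end());
  const double sum = std::accumulate(
      x.begin(), x.end(), 0.0, [max_val](double acc, double x_i) {
        return acc + std::exp(x_i - max_val);
      });
  return max_val + std::log(sum);
}

// Hypothetical guard: if the maximum is +inf or -inf, return it directly
// so that inf - inf is never evaluated.
inline double lse_guarded(const std::vector<double>& x) {
  const double max_val = *std::max_element(x.begin(), x.end());
  return std::isinf(max_val) ? max_val : lse_unguarded(x);
}
```

A single -inf entry among finite values is harmless in both versions because exp(-inf) is 0. The versions differ only when every entry is -inf or when any entry is +inf.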
Another cool internet benchmark thing!
double max_val = *std::max_element(x.begin(), x.end());
double sum = std::accumulate(
    x.begin(), x.end(), 0.0,
    [&max_val](auto& acc, auto&& x_i) { return acc + exp(x_i - max_val); });
I think the rules for capture are like argument passing, so that primitives like max_val should be captured by value, not by reference.
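To make the difference concrete, here is a minimal sketch (function names are hypothetical). A by-value capture copies the double when the closure is created; a by-reference capture keeps pointing at the original variable and sees later changes.

```cpp
// By reference: the closure reads max_val at call time.
inline double shift_by_reference() {
  double max_val = 1.0;
  auto shift = [&max_val](double x_i) { return x_i - max_val; };
  max_val = 10.0;      // visible to the closure
  return shift(10.0);  // 10.0 - 10.0
}

// By value: the closure holds its own copy taken at creation time.
inline double shift_by_value() {
  double max_val = 1.0;
  auto shift = [max_val](double x_i) { return x_i - max_val; };
  max_val = 10.0;      // not visible to the closure, which still holds 1.0
  return shift(10.0);  // 10.0 - 1.0
}
```

For a small primitive the copy costs nothing extra, and the closure cannot end up with a dangling reference if it outlives the variable.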
}

return max + log(sum);
double max_val = *std::max_element(x.begin(), x.end());
This is very neat!
@@ -275,11 +275,11 @@ gp_exp_quad_cov(const std::vector<T_x1> &x1, const std::vector<T_x2> &x2,
    return cov;
  }

  for (size_t i = 0; i < x1_size; ++i) {
    check_not_nan(function_name, "x1", x1[i]);
  for (auto &&x1_i : x1) {
As is, I think these can be const.
These should be using a vectorized check_not_nan so that the index can also be printed and we don't have all this boilerplate looping.
Another alternative would be a for-each loop, which doesn't actually simplify things here, especially with explicit capture of the function name.
std::for_each(x1.begin(), x1.end(),
              [&function_name](double x) { return check_not_nan(function_name, "x", x); });
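A rough sketch of what an index-aware vectorized check could look like. `check_not_nan_indexed` is a hypothetical stand-in, not the Stan Math signature or its message format, and the 0-based index in the message is an arbitrary choice for the example.

```cpp
#include <cmath>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical vectorized check: throws on the first NaN and reports
// which element failed, so call sites need no explicit loop.
inline void check_not_nan_indexed(const char* function, const char* name,
                                  const std::vector<double>& x) {
  for (std::size_t i = 0; i < x.size(); ++i) {
    if (std::isnan(x[i])) {
      std::ostringstream msg;
      msg << function << ": " << name << "[" << i << "] is nan";
      throw std::domain_error(msg.str());
    }
  }
}
```

A call site would then shrink to a single line such as `check_not_nan_indexed(function_name, "x1", x1);`.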
"These should be using a vectorized check_not_nan so that the index can also be printed and we don't have all this boilerplate looping."
Agree this should use a vectorized check_not_nan, but the vectorized version of check_not_nan does not work for vectors of Eigen matrices at the moment :-(
After Andrew and I sort out the more generic templating discussion in #1425, I'm going to come back to these check functions and clean them up so we can do that.
}
}
return max + log(sum);
double max_val = std::max_element(x.begin(), x.end())->val();
[optional]
This is soooo close to the double version, the only difference being the ->val() pulling out the double based value. Could the (recursive?) value_of for max_val computation allow these to be combined into a single implementation? Maybe not worth it given again how complicated the indirection would be.
Ack, it's so close! I think for a very, very clean version of this we need a vectorized value_of. Then in the constructor for log_sum_exp_vector_vari we could just call op_vector_vari(log_sum_exp(value_of(x)), x).
I put a comment above log_sum_exp_as_double about this and can do those value_of's in a separate PR.
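A toy sketch of the idea, covering only the value computation. `toy_var` is a hypothetical stand-in for an autodiff scalar, the `value_of` overloads and `log_sum_exp_value` are illustrative and not the library's signatures, and nothing here models the vari or gradient side.

```cpp
#include <algorithm>
#include <cmath>
#include <numeric>
#include <vector>

// Hypothetical autodiff-like scalar; only the value part is modeled.
struct toy_var {
  double val_;
  double val() const { return val_; }
};

// Scalar value_of: identity for double, .val() for toy_var.
inline double value_of(double x) { return x; }
inline double value_of(const toy_var& x) { return x.val(); }

// Vectorized value_of: maps the scalar overloads over a std::vector.
template <typename T>
std::vector<double> value_of(const std::vector<T>& x) {
  std::vector<double> out;
  out.reserve(x.size());
  for (const auto& x_i : x) {
    out.push_back(value_of(x_i));
  }
  return out;
}

// The double-only implementation, written once (non-empty input assumed).
inline double log_sum_exp(const std::vector<double>& x) {
  const double max_val = *std::max_element(x.begin(), x.end());
  const double sum = std::accumulate(
      x.begin(), x.end(), 0.0, [max_val](double acc, double x_i) {
        return acc + std::exp(x_i - max_val);
      });
  return max_val + std::log(sum);
}

// Value of log_sum_exp for any element type that has a value_of overload.
template <typename T>
double log_sum_exp_value(const std::vector<T>& x) {
  return log_sum_exp(value_of(x));
}
```

The cost of this shape is an extra temporary vector of doubles per call, which is the kind of thing worth measuring before combining the two implementations.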
The |
I like how the std algorithms look but you make a good point. Wonder if we could even get away with a single more general implementation
You've definitely done some neat work with the std code, so I'm not in a hurry to wipe that away! I don't think you should do anything to this pull - I'll have a look into this and create an issue with some ideas and performance testing
I would be cautious with going all Eigen... weren't these slower than the non-Eigen implementations due to memory layout stuff? But harmonizing things is a good thought, of course.
It wouldn't be a blind change, I'm planning on comparing performance with the perf-math repo to make sure things scale well - just to make sure there aren't any surprises
@SteveBronder, there's a merge conflict. It should be a quick fix (I looked at it briefly and didn't know which direction to go on first glance).
Yes, apologies, getting over a cold this week and back to the jobby job. I'll update this tonight. I think I'm going to remove the changes to log_sum_exp since there's a lot of stuff going on there and it probably needs a bigger discussion on refactoring (if it even needs to be)
…y value for log_sum_exp's accumulator with max_val
…stable/2017-11-14)
@bob-carpenter at work right now but I have two PRs which don't touch subtract but it looks like
(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 1.0)
@SteveBronder: there are code conflicts. Can you update your branch and reopen?
Summary
This includes a few automated refactors and some hand made ones I'll review below
make clang-tidy-fix files=./test/unit/math/mix/mat/eigen_plugins_test* \
  tidy_checks=modernize-use-bool-literals,performance-for-range-copy,modernize-use-equals-default,readability-braces-around-statements,performance-unnecessary-value-param
Links below to what each of these does:
- modernize-use-equals-default
- readability-braces-around-statements
- performance-unnecessary-value-param
- If the value of the container is not primitive, we do a range-based for loop with a forwarding reference: (auto&& x_i : x)
- If the value is primitive and never modified, we do (const auto x_i : x)
- std::inner_product instead of a for loop for vector dot products

Reading the above it looks like if any value is -inf or +inf the end result will still be +-inf. If that's the only edge case we were focusing on with the above I think the below change satisfies that a little cleaner
- We have a bunch of default constructors we are declaring that are just the default, so I set those to explicitly use the default. accumulator declares a destructor that's also the default. Should we just remove those and use the implicitly generated constructors?
- promote_elements for vectors uses braced initializers to construct the output vector, while promote_elements for Eigen uses a Mat.cast<T>
- sum now uses an std::accumulate
- In a few places we now use x.coeffRef(i) to avoid bounds checking when using operator[] on Eigen matrices
- log_sum_exp_test was running a test on an uninitialized vector so I set the vector to a size of 0.
- There were a few places we had the C++03 style of > > at the end of a template which got cleaned up.

Tests
This is a refactor, so I don't think new tests are needed? Happy to add any if the current stuff was missing tests.
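If extra coverage were wanted, one option is an equivalence check between a hand-written loop and the algorithm it was replaced with. A sketch with hypothetical helper names, not the library's functions; it also shows the two range-for forms from the summary.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Reference loop for the dot product (equal-length inputs assumed).
inline double dot_loop(const std::vector<double>& a,
                       const std::vector<double>& b) {
  double result = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    result += a[i] * b[i];
  }
  return result;
}

// std::inner_product version; the 0.0 initial value fixes the
// accumulator type to double (an int 0 would truncate).
inline double dot_algo(const std::vector<double>& a,
                       const std::vector<double>& b) {
  return std::inner_product(a.begin(), a.end(), b.begin(), 0.0);
}

// std::accumulate version of sum.
inline double sum_algo(const std::vector<double>& x) {
  return std::accumulate(x.begin(), x.end(), 0.0);
}

// Primitive elements that are never modified: iterate by const value.
inline double sum_range_for(const std::vector<double>& x) {
  double total = 0.0;
  for (const auto x_i : x) {
    total += x_i;
  }
  return total;
}

// Non-primitive elements: auto&& binds to each element without copying.
inline std::size_t total_size(const std::vector<std::vector<double>>& xs) {
  std::size_t n = 0;
  for (auto&& x_i : xs) {
    n += x_i.size();
  }
  return n;
}
```

The loop and algorithm versions add the terms in the same order, so comparing their results directly is a reasonable test for small inputs.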
Side Effects
I don't think so!
Checklist
Math issue Update internals to use more modern c++ #1308
Copyright holder: Steve Bronder
The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:
- Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
- Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
- the basic tests are passing
  - ./runTests.py test/unit
  - make test-headers
  - make doxygen
  - make cpplint
- the code is written in idiomatic C++ and changes are documented in the doxygen
- the new changes are tested