Printing stats for template sets #17

ben-craig · 2019-10-06T17:47:39Z

Sometimes, you don't just want to know the most expensive instantiation of a template (i.e. tuple<int, float>), you want to know which templates as a whole are expensive (i.e. tuple). This new analysis does that.

Example output:

**** Template sets that took longest to instantiate:
175188 ms: std::tuple<$> (17428 times, avg 10 ms)
70354 ms: testing::internal::FunctionMocker<$>::Invoke (6166 times, avg 11 ms)
61916 ms: testing::internal::FunctionMocker<$> (4371 times, avg 14 ms)
61186 ms: testing::internal::FunctionMockerBase<$> (4017 times, avg 15 ms)
59616 ms: testing::internal::MockSpec<$> (3936 times, avg 15 ms)

Similarly, you don't always want to know which specific function took a long time to optimize, but which collection of template functions.
Example output:

**** Function sets that took longest to compile / optimize:
16710 ms: public: static void __cdecl testing::internal::TuplePrefix<$>::ExplainMatchFailuresTo<$>(class std::tuple<$> const &, class std::tuple<$> const &, class std::basic_ostream<$> *) (6484 times, avg 2 ms)
11430 ms: private: void __thiscall testing::internal::TypedExpectation<$>::ExplainMatchResultTo(class std::tuple<$> const &, class std::basic_ostream<$> *) const (2999 times, avg 3 ms)
11161 ms: private: class testing::Action<$> const & __thiscall testing::internal::TypedExpectation<$>::GetCurrentAction(class testing::internal::FunctionMockerBase<$> const *, class std::tuple<$> const &) const (2999 times, avg 3 ms)
5071 ms: public: static bool __cdecl testing::internal::TypeParameterizedTest<$>::Register(char const *, struct testing::internal::CodeLocation, char const *, char const *, int) (999 times, avg 5 ms)
4025 ms: private: virtual class testing::internal::ExpectationBase const * __thiscall testing::internal::FunctionMockerBase<$>::UntypedFindMatchingExpectation(void const *, void const **, bool *, class std::basic_ostream<$> *, class std::basic_ostream<$> *) (2999 times, avg 1 ms)

ben-craig · 2019-10-08T13:56:25Z

Found a bug, so this particular revision shouldn't be merged.

This revision doesn't analyze recursive templates correctly. The duration that is in instantiations is a "total" time not a "self" time. Adding up all the "total" times will cause the rolled up time to be significantly inflated.

I'll see if I can pull off a more heavy duty analysis where I avoid this issue. I'm not sure if it extends to the back-end times as well.

ben-craig · 2019-10-18T02:17:23Z

My apologies for leaving this in the failed state for so long, but I haven't been able to find the time to fix the PR yet.

Is the general idea something that is of interest to the maintainer?

aras-p · 2019-10-18T06:20:47Z

Is the general idea something that is of interest to the maintainer?

yeah, the idea sounds really good!

Now, we count the total time for a template instantation (e.g. tuple<A, B, C>), but we don't count any time that comes from another instation of the same template if it is a child (e.g. tuple<B, C>). This seemed to give more actionable results than counting self-time. This change bundles in some refactors. If requested, they can be separated out. Times now stay as microseconds up until the moment we output them. This reduces the amount that we bias our sums against large numbers of small times. EventIndex and DetailIndex have been added to provide more type safety when accessing the various data vectors. It was hard for me to keep straight which index went where.

ben-craig · 2019-10-19T19:09:14Z

Let me know if you want me to split this PR up. I can imagine having different PRs for the ms -> us change, the EventIndex + DetailIndex change, the change where instantiations track event index instead of detail index, and finally, a change that adds the feature.

Or we can go with the big thing that I've got now. Your call.

aras-p · 2019-10-28T16:43:42Z

Very nice, thanks!

Ben Craig added 5 commits October 6, 2019 12:43

Printing stats for template sets

ff34e8e

Reporting back end function set times

d1a9bda

updating hlsl tests

0085850

updating clang test

b3ea4ab

Making the sorting more deterministic

74a4121

instantiation of template sets now using self time

30fd86a

Ben Craig added 3 commits October 19, 2019 13:48

Adding missing namespace, thanks MSVC

d81c7d3

even more std::

20794b5

aras-p merged commit 37983c3 into aras-p:master Oct 28, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Printing stats for template sets #17

Printing stats for template sets #17

ben-craig commented Oct 6, 2019 •

edited

Loading

ben-craig commented Oct 8, 2019

ben-craig commented Oct 18, 2019

aras-p commented Oct 18, 2019

ben-craig commented Oct 19, 2019

aras-p commented Oct 28, 2019

Printing stats for template sets #17

Printing stats for template sets #17

Conversation

ben-craig commented Oct 6, 2019 • edited Loading

ben-craig commented Oct 8, 2019

ben-craig commented Oct 18, 2019

aras-p commented Oct 18, 2019

ben-craig commented Oct 19, 2019

aras-p commented Oct 28, 2019

ben-craig commented Oct 6, 2019 •

edited

Loading