Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Printing stats for template sets #17

Merged
merged 9 commits into from
Oct 28, 2019
Merged

Conversation

ben-craig
Copy link

@ben-craig ben-craig commented Oct 6, 2019

Sometimes, you don't just want to know the most expensive instantiation of a template (i.e. tuple<int, float>), you want to know which templates as a whole are expensive (i.e. tuple). This new analysis does that.

Example output:

**** Template sets that took longest to instantiate:
175188 ms: std::tuple<$> (17428 times, avg 10 ms)
70354 ms: testing::internal::FunctionMocker<$>::Invoke (6166 times, avg 11 ms)
61916 ms: testing::internal::FunctionMocker<$> (4371 times, avg 14 ms)
61186 ms: testing::internal::FunctionMockerBase<$> (4017 times, avg 15 ms)
59616 ms: testing::internal::MockSpec<$> (3936 times, avg 15 ms)

Similarly, you don't always want to know which specific function took a long time to optimize, but which collection of template functions.
Example output:

**** Function sets that took longest to compile / optimize:
16710 ms: public: static void __cdecl testing::internal::TuplePrefix<$>::ExplainMatchFailuresTo<$>(class std::tuple<$> const &, class std::tuple<$> const &, class std::basic_ostream<$> *) (6484 times, avg 2 ms)
11430 ms: private: void __thiscall testing::internal::TypedExpectation<$>::ExplainMatchResultTo(class std::tuple<$> const &, class std::basic_ostream<$> *) const (2999 times, avg 3 ms)
11161 ms: private: class testing::Action<$> const & __thiscall testing::internal::TypedExpectation<$>::GetCurrentAction(class testing::internal::FunctionMockerBase<$> const *, class std::tuple<$> const &) const (2999 times, avg 3 ms)
5071 ms: public: static bool __cdecl testing::internal::TypeParameterizedTest<$>::Register(char const *, struct testing::internal::CodeLocation, char const *, char const *, int) (999 times, avg 5 ms)
4025 ms: private: virtual class testing::internal::ExpectationBase const * __thiscall testing::internal::FunctionMockerBase<$>::UntypedFindMatchingExpectation(void const *, void const **, bool *, class std::basic_ostream<$> *, class std::basic_ostream<$> *) (2999 times, avg 1 ms)

@ben-craig
Copy link
Author

Found a bug, so this particular revision shouldn't be merged.

This revision doesn't analyze recursive templates correctly. The duration that is in instantiations is a "total" time not a "self" time. Adding up all the "total" times will cause the rolled up time to be significantly inflated.

I'll see if I can pull off a more heavy duty analysis where I avoid this issue. I'm not sure if it extends to the back-end times as well.

@ben-craig
Copy link
Author

My apologies for leaving this in the failed state for so long, but I haven't been able to find the time to fix the PR yet.

Is the general idea something that is of interest to the maintainer?

@aras-p
Copy link
Owner

aras-p commented Oct 18, 2019

Is the general idea something that is of interest to the maintainer?

yeah, the idea sounds really good!

Ben Craig added 3 commits October 19, 2019 13:48
Now, we count the total time for a template instantation (e.g.
tuple<A, B, C>), but we don't count any time that comes from another
instation of the same template if it is a child (e.g. tuple<B, C>).
This seemed to give more actionable results than counting self-time.

This change bundles in some refactors.  If requested, they can be
separated out.

Times now stay as microseconds up until the moment we output them.
This reduces the amount that we bias our sums against large numbers of
small times.

EventIndex and DetailIndex have been added to provide more type safety
when accessing the various data vectors.  It was hard for me to keep
straight which index went where.
@ben-craig
Copy link
Author

Let me know if you want me to split this PR up. I can imagine having different PRs for the ms -> us change, the EventIndex + DetailIndex change, the change where instantiations track event index instead of detail index, and finally, a change that adds the feature.

Or we can go with the big thing that I've got now. Your call.

@aras-p aras-p merged commit 37983c3 into aras-p:master Oct 28, 2019
@aras-p
Copy link
Owner

aras-p commented Oct 28, 2019

Very nice, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants