
[READY] Add benchmark infrastructure #769

Merged: 1 commit merged into ycm-core:master on Jun 5, 2017

Conversation

micbou (Collaborator) commented Jun 2, 2017

This PR sets up the infrastructure for adding benchmarks through the Google benchmark library and for automatically running them on Travis and AppVeyor. They can also be run locally with the benchmark.py script. The library is included in the repository to make compilation easier. Benchmarks are run on all platforms because optimizations may be platform-dependent.

For now, there is only one benchmark, based on the output of Program 2 in ycm-core/YouCompleteMe#2668. It measures the filter-and-sort algorithm in a worst-case scenario: all identifiers share a common prefix and the query is part of that prefix. In that case, no identifiers are filtered out and, since they all have the same weight, the algorithm falls back to lexicographic sorting. This scenario is not uncommon in practice. For instance, C libraries often use a common prefix when naming variables and functions to simulate namespaces.
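For context, here is a minimal, self-contained sketch of what such a worst-case benchmark can look like with the Google benchmark library. It is illustrative only, not the benchmark added by this PR: the generated identifiers and plain std::sort stand in for ycmd's real filter-and-sort code, and the benchmark name and ranges are assumptions borrowed from the output below.

```cpp
// Illustrative sketch only; not the benchmark added by this PR.
#include <benchmark/benchmark.h>

#include <algorithm>
#include <string>
#include <vector>

static void CandidatesWithCommonPrefix_bench( benchmark::State& state ) {
  // All candidates share the prefix "a_a_a_", so a query contained in the
  // prefix filters nothing and sorting degenerates to lexicographic order.
  std::vector< std::string > candidates;
  for ( int i = 0; i < state.range( 0 ); ++i )
    candidates.push_back( "a_a_a_" + std::to_string( i ) );

  // Only the body of this loop is measured.
  while ( state.KeepRunning() ) {
    std::vector< std::string > sorted( candidates );
    std::sort( sorted.begin(), sorted.end() );
    benchmark::DoNotOptimize( sorted.data() );
  }

  // Tell the library what N is so it can fit the complexity curve reported
  // as the _BigO and _RMS lines in the output below.
  state.SetComplexityN( state.range( 0 ) );
}

BENCHMARK( CandidatesWithCommonPrefix_bench )
  ->RangeMultiplier( 2 )
  ->Range( 1, 1 << 16 )
  ->Complexity( benchmark::oNLogN );

BENCHMARK_MAIN();
```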

Here's the output of the benchmark on my configuration:

```
------------------------------------------------------------------------------------------
Benchmark                                       Time           CPU Iterations
------------------------------------------------------------------------------------------
CandidatesWithCommonPrefix_bench/1           1955 ns       1898 ns     345165
CandidatesWithCommonPrefix_bench/2          11563 ns      11681 ns      64102
CandidatesWithCommonPrefix_bench/4          30761 ns      30594 ns      22436
CandidatesWithCommonPrefix_bench/8          69551 ns      69532 ns      11218
CandidatesWithCommonPrefix_bench/16        143963 ns     143924 ns       4986
CandidatesWithCommonPrefix_bench/32        292668 ns     290603 ns       2362
CandidatesWithCommonPrefix_bench/64        862766 ns     869571 ns        897
CandidatesWithCommonPrefix_bench/128      2205099 ns    2191318 ns        299
CandidatesWithCommonPrefix_bench/256      8895499 ns    8840057 ns         90
CandidatesWithCommonPrefix_bench/512     17704787 ns   17680113 ns         45
CandidatesWithCommonPrefix_bench/1024    45564517 ns   45760293 ns         15
CandidatesWithCommonPrefix_bench/2048    96960893 ns   98057771 ns          7
CandidatesWithCommonPrefix_bench/4096   217881085 ns  218401400 ns          3
CandidatesWithCommonPrefix_bench/8192   481444392 ns  483603100 ns          2
CandidatesWithCommonPrefix_bench/16384 1005462405 ns  982806300 ns          1
CandidatesWithCommonPrefix_bench/32768 1805209871 ns 1809611600 ns          1
CandidatesWithCommonPrefix_bench/65536 4215533125 ns 4212027000 ns          1
CandidatesWithCommonPrefix_bench_BigO     3979.06 NlgN    3974.50 NlgN
CandidatesWithCommonPrefix_bench_RMS           10 %          9 %
```

As you can see, performance becomes unacceptable starting from around 16,000 identifiers, which is not a lot. A great feature of Google benchmark is that it can estimate the algorithmic complexity. As expected, we have O(n log n) complexity, where n is the number of candidates (we are using std::sort to sort our candidates).

Thanks to this benchmark, I was able to improve the performance on this particular case by a factor of 60. I'll send the changes once this PR is merged.



codecov-io commented Jun 2, 2017

Codecov Report

Merging #769 into master will increase coverage by <.01%.
The diff coverage is 100%.

```diff
@@            Coverage Diff             @@
##           master     #769      +/-   ##
==========================================
+ Coverage   94.52%   94.52%   +<.01%     
==========================================
  Files          79       79              
  Lines        5276     5278       +2     
  Branches      158      158              
==========================================
+ Hits         4987     4989       +2     
  Misses        246      246              
  Partials       43       43
```

bstaletic (Collaborator) commented:

I tested this last night and confirmed that it works as expected. I did leave a minor comment, but this is :lgtm: anyway.


Reviewed 9 of 59 files at r1.
Review status: 9 of 59 files reviewed at latest revision, 1 unresolved discussion.


cpp/ycm/CMakeLists.txt, line 231 at r1 (raw file):

                             benchmarks/*.h benchmarks/*.cpp
                             CMakeFiles/*.cpp
                             *client* )

*client* ) could be joined with the line above. Or every glob in this file() statement can be on a different line.

No strong opinion, feel free to disregard this comment.



Valloric (Member) commented Jun 2, 2017

I've used Google's C++ benchmarking library extensively when I worked there; the API seems to have changed a bit, but it's largely the same thing. The library was great back then and I don't expect that to have changed.

@micbou Thanks for setting this up! I never really benchmarked the filter/sort part; it was plenty fast for all the use-cases I could come up with years ago. But if others have found cases where it's slow and we could make it faster, great!

The complexity of the algo being O(NlogN) is indeed expected. Back in 2011 when I was still thinking about starting YCM, I remember coming up with a proof that it's not possible to perform subsequence matching on a list of strings (the strings being known ahead of time, but not which subset of the strings or their order) faster than O(NlogN). That's just Big Oh, so plenty of constant factors to optimize.

:lgtm:


Review status: 9 of 59 files reviewed at latest revision, 2 unresolved discussions.


cpp/ycm/benchmarks/IdentifierCompleter_bench.cpp, line 28 at r1 (raw file):

    benchmark::State& state ) {

  // Generate a list of candidates of the form a_a_a_[a-z]{5}.

You might want to put state.PauseTiming(); before this block and state.ResumeTiming(); right after it. Otherwise the setup time will be included in your benchmark run. It will get spread over all the iterations, but still.
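For reference, here is a hedged, self-contained sketch of that pattern in general (an illustrative example, not this PR's code). In the library, PauseTiming()/ResumeTiming() are called from inside the KeepRunning() loop to exclude per-iteration setup from the measured time, which is relevant to the question below about where measurement actually happens.

```cpp
// Hypothetical example of the PauseTiming()/ResumeTiming() pattern:
// per-iteration setup is excluded from the measured time.
#include <benchmark/benchmark.h>

#include <algorithm>
#include <numeric>
#include <vector>

static void SortWithUntimedSetup_bench( benchmark::State& state ) {
  while ( state.KeepRunning() ) {
    state.PauseTiming();
    // Untimed setup: build a reverse-sorted vector for this iteration.
    std::vector< int > data( static_cast< size_t >( state.range( 0 ) ) );
    std::iota( data.rbegin(), data.rend(), 0 );
    state.ResumeTiming();

    // Timed work.
    std::sort( data.begin(), data.end() );
    benchmark::DoNotOptimize( data.data() );
  }
}

BENCHMARK( SortWithUntimedSetup_bench )->Range( 1 << 10, 1 << 16 );

BENCHMARK_MAIN();
```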



micbou (Collaborator, Author) commented Jun 3, 2017

I added a method to CandidateRepository for clearing the candidates. This is needed to isolate the benchmarks (and the CandidateRepository tests). If we don't clear the candidates, measurements are influenced by previous runs.
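To illustrate the isolation point, here is a hedged sketch using a hypothetical process-wide singleton; Repository, Add() and Clear() below are placeholders for illustration, not ycmd's CandidateRepository API.

```cpp
// Hypothetical sketch of benchmark isolation with a process-wide singleton.
// Repository, Add() and Clear() are placeholders, not ycmd's real API.
#include <benchmark/benchmark.h>

#include <set>
#include <string>

struct Repository {
  static Repository& Instance() {
    static Repository repo;
    return repo;
  }
  void Add( const std::string& candidate ) { candidates_.insert( candidate ); }
  void Clear() { candidates_.clear(); }

 private:
  std::set< std::string > candidates_;
};

static void RepositoryInsert_bench( benchmark::State& state ) {
  while ( state.KeepRunning() ) {
    // Clearing first isolates each run: without it, candidates left behind
    // by previous runs (or other benchmarks) would skew the measurement.
    Repository::Instance().Clear();
    for ( int i = 0; i < state.range( 0 ); ++i )
      Repository::Instance().Add( "a_a_a_" + std::to_string( i ) );
  }
}

BENCHMARK( RepositoryInsert_bench )->Range( 1, 1 << 10 );

BENCHMARK_MAIN();
```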

Also, I changed the benchmark to use a query containing both a lowercase and an uppercase letter. This makes the benchmark more relevant: if we want to improve performance on it, we need a solution that deals with both cases, not just lowercase.


Reviewed 59 of 59 files at r1, 1 of 1 files at r2, 3 of 3 files at r3, 1 of 1 files at r4, 3 of 3 files at r5.
Review status: all files reviewed at latest revision, 3 unresolved discussions.


cpp/ycm/CMakeLists.txt, line 231 at r1 (raw file):

Previously, bstaletic wrote…

*client* ) could be joined with the line above. Or every glob in this file() statement can be on a different line.

No strong opinion, feel free to disregard this comment.

Done.


cpp/ycm/benchmarks/IdentifierCompleter_bench.cpp, line 28 at r1 (raw file):

Previously, Valloric (Val Markovic) wrote…

You might want to put state.PauseTiming(); before this block and state.ResumeTiming(); right after it. Otherwise the setup time will be included in your benchmark run. It will get spread over all the iterations, but still.

Isn't the measurement only occurring in the KeepRunning loop?


cpp/ycm/benchmarks/IdentifierCompleter_bench.cpp, line 43 at r4 (raw file):

  }

  IdentifierCompleter completer( candidates );

Moved this out of the main loop because we don't want to measure the addition of candidates to the repository.



vheon (Contributor) commented Jun 4, 2017

I know it is easier for us, but are we sure we want to keep the benchmark library in the source tree? I'm only saying this because people already complain about the size of the project when they have to download it, and this only increases the burden 😕 I understand that it would be more fragile, but we could clone the library "on demand" on the first invocation of the benchmark.py script if it is not already there.


Reviewed 6 of 59 files at r1, 1 of 1 files at r2, 1 of 3 files at r3, 1 of 1 files at r4, 1 of 3 files at r5.
Review status: all files reviewed at latest revision, 2 unresolved discussions.



micbou (Collaborator, Author) commented Jun 4, 2017

The benchmark folder only takes 336 KB. That's negligible compared to the ~370 MB that ycmd takes after installation. In addition, like the gmock folder, we will never update it, so it will only add to the Git history once.


Review status: all files reviewed at latest revision, 2 unresolved discussions.



Valloric (Member) commented Jun 4, 2017

@vheon Complaints about YCM on-disk size have always been silly to me. It's a few hundred MB, who cares. Disks are both huge and cheap.


Review status: all files reviewed at latest revision, 2 unresolved discussions.


cpp/ycm/benchmarks/IdentifierCompleter_bench.cpp, line 28 at r1 (raw file):

Previously, micbou wrote…

Isn't the measurement only occurring in the KeepRunning loop?

Why would it be? The API has changed since the last time I used it, but unless the first call to KeepRunning mutates some internal state so that measurement starts (quite possible), the whole benchmark time will be measured from the start of the function.

That's the way it worked before at least.



bstaletic (Collaborator) commented:

I measured YCM/ycmd disk usage today. It's a bit over 200MB. You decide if that is big.


Reviewed 1 of 1 files at r2, 2 of 3 files at r3, 1 of 1 files at r4.
Review status: all files reviewed at latest revision, 2 unresolved discussions.



Valloric (Member) commented Jun 4, 2017

@bstaletic It's not something we should worry about at all IMO. If it's less than 1GB, I don't care.


Review status: all files reviewed at latest revision, 2 unresolved discussions.



puremourning (Member) commented:

Reviewed 1 of 59 files at r1.
Review status: all files reviewed at latest revision, 2 unresolved discussions.



puremourning (Member) commented:

:lgtm:


Reviewed 56 of 59 files at r1, 1 of 1 files at r2, 1 of 3 files at r3, 1 of 1 files at r4, 3 of 3 files at r5.
Review status: all files reviewed at latest revision, 3 unresolved discussions.


benchmark.py, line 25 at r5 (raw file):

  parser.add_argument( '--msvc', type = int, choices = [ 12, 14, 15 ],
                       help = 'Choose the Microsoft Visual '
                       'Studio version. (default: 15).' )

Maybe I'm being thick, but what ensures the default is 15? It might be implicit, but I think I would prefer default=15 if that's a thing.


cpp/ycm/benchmarks/IdentifierCompleter_bench.cpp, line 28 at r1 (raw file):

Previously, Valloric (Val Markovic) wrote…

Why would it be? The API has changed since the last time I used it, but unless the first call to KeepRunning mutates some internal state so that measurement starts (quite possible), the whole benchmark time will be measured from the start of the function.

That's the way it worked before at least.

As I recall from using it before, the first call to KeepRunning starts the timing for an iteration, and subsequent calls end/restart. Seems to be backed up here: https://github.com/google/benchmark/blob/93bfabc8b89270c00816c3cafb601475eaf85364/include/benchmark/benchmark_api.h#L334



micbou (Collaborator, Author) commented Jun 4, 2017

Reviewed 2 of 3 files at r6, 1 of 3 files at r8.
Review status: 62 of 64 files reviewed at latest revision, 3 unresolved discussions.


benchmark.py, line 25 at r5 (raw file):

Previously, puremourning (Ben Jackson) wrote…

Maybe I'm being thick, but what ensures the default is 15? It might be implicit, but I think I would prefer default=15 if that's a thing.

Copy-paste from the run_tests.py script. I've changed it in both scripts.



vheon (Contributor) commented Jun 4, 2017

:lgtm:


Reviewed 1 of 59 files at r1, 3 of 3 files at r8.
Review status: all files reviewed at latest revision, 3 unresolved discussions.



micbou (Collaborator, Author) commented Jun 4, 2017

I've squashed the commits. PR is ready.


Reviewed 1 of 3 files at r8.
Review status: all files reviewed at latest revision, 2 unresolved discussions.



micbou changed the title from [RFC] Add benchmark infrastructure to [READY] Add benchmark infrastructure on Jun 4, 2017
bstaletic (Collaborator) commented:

@zzbot r+


Reviewed 1 of 3 files at r5, 2 of 3 files at r6, 2 of 3 files at r8.
Review status: all files reviewed at latest revision, 2 unresolved discussions.



zzbot (Contributor) commented Jun 5, 2017

📌 Commit dff0884 has been approved by bstaletic

zzbot (Contributor) commented Jun 5, 2017

⌛ Testing commit dff0884 with merge 64ddce4...

zzbot added a commit that referenced this pull request Jun 5, 2017
[READY] Add benchmark infrastructure

zzbot (Contributor) commented Jun 5, 2017

☀️ Test successful - status-travis
Approved by: bstaletic
Pushing 64ddce4 to master...

zzbot merged commit dff0884 into ycm-core:master on Jun 5, 2017
micbou deleted the google-benchmark branch on June 14, 2017 at 08:49