Auto merge of #769 - micbou:google-benchmark, r=bstaletic

[READY] Add benchmark infrastructure

This PR sets up the infrastructure for adding benchmarks through [the Google benchmark library](https://github.com/google/benchmark) and for automatically running them on Travis and AppVeyor. They can also be run locally with the `benchmark.py` script. The library is included in the repository for ease of compilation. Benchmarks are run on all platforms because optimizations may be platform-dependent.

For now, there is only one benchmark, based on the output of *Program 2* in ycm-core/YouCompleteMe#2668. It measures the filter and sort algorithm on a worst-case scenario: all identifiers share a common prefix and the query is part of the prefix. In that case, no identifiers are filtered out and, since they all have the same weight, the algorithm falls back to lexicographic sorting. This scenario is not uncommon in practice. For instance, C libraries often use a common prefix for naming variables and functions to simulate namespaces.

Here's the output of the benchmark on my configuration:
```
--------------------------------------------------------------------------------
Benchmark                                          Time           CPU Iterations
--------------------------------------------------------------------------------
CandidatesWithCommonPrefix_bench/1              1955 ns       1898 ns     345165
CandidatesWithCommonPrefix_bench/2             11563 ns      11681 ns      64102
CandidatesWithCommonPrefix_bench/4             30761 ns      30594 ns      22436
CandidatesWithCommonPrefix_bench/8             69551 ns      69532 ns      11218
CandidatesWithCommonPrefix_bench/16           143963 ns     143924 ns       4986
CandidatesWithCommonPrefix_bench/32           292668 ns     290603 ns       2362
CandidatesWithCommonPrefix_bench/64           862766 ns     869571 ns        897
CandidatesWithCommonPrefix_bench/128         2205099 ns    2191318 ns        299
CandidatesWithCommonPrefix_bench/256         8895499 ns    8840057 ns         90
CandidatesWithCommonPrefix_bench/512        17704787 ns   17680113 ns         45
CandidatesWithCommonPrefix_bench/1024       45564517 ns   45760293 ns         15
CandidatesWithCommonPrefix_bench/2048       96960893 ns   98057771 ns          7
CandidatesWithCommonPrefix_bench/4096      217881085 ns  218401400 ns          3
CandidatesWithCommonPrefix_bench/8192      481444392 ns  483603100 ns          2
CandidatesWithCommonPrefix_bench/16384    1005462405 ns  982806300 ns          1
CandidatesWithCommonPrefix_bench/32768    1805209871 ns 1809611600 ns          1
CandidatesWithCommonPrefix_bench/65536    4215533125 ns 4212027000 ns          1
CandidatesWithCommonPrefix_bench_BigO      3979.06 NlgN  3974.50 NlgN
CandidatesWithCommonPrefix_bench_RMS               10 %           9 %
```
As you can see, performance becomes unacceptable starting from about 16000 identifiers (roughly one second per query), which is not a lot. A great feature of Google benchmark is that it can calculate the algorithm complexity. As expected, we have an `O(n log n)` complexity where `n` is the number of candidates (we are using `std::sort` to sort our candidates).

Thanks to this benchmark, I was able to improve the performance on this particular case by a factor of 60. I'll send the changes once this PR is merged.

---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/valloric/ycmd/769)
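
The `BigO` row can be sanity-checked against the timings in the benchmark output. The following Python sketch is not part of this PR: it assumes the coefficient comes from a least-squares fit of `time = c * n * log2(n)` with no constant term, and `fit_nlogn_coefficient` is a hypothetical helper name.

```python
import math

# "Time" column (in ns) from the benchmark output, keyed by number of candidates.
TIMES_NS = {
    1: 1955, 2: 11563, 4: 30761, 8: 69551, 16: 143963, 32: 292668,
    64: 862766, 128: 2205099, 256: 8895499, 512: 17704787,
    1024: 45564517, 2048: 96960893, 4096: 217881085, 8192: 481444392,
    16384: 1005462405, 32768: 1805209871, 65536: 4215533125,
}


def fit_nlogn_coefficient(times_ns):
    """Least-squares fit of time = c * n * log2(n), without a constant term.

    Assumption: this mirrors how the reported coefficient is obtained.
    """
    numerator = 0.0
    denominator = 0.0
    for n, time_ns in times_ns.items():
        g = n * math.log2(n)  # value of the complexity curve at n
        numerator += time_ns * g
        denominator += g * g
    return numerator / denominator


coefficient = fit_nlogn_coefficient(TIMES_NS)
print(f"{coefficient:.2f} ns per n*log2(n)")
```

Under that assumption, the fitted coefficient for the `Time` column should land close to the reported `3979.06 NlgN`.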
ycm-core · Jun 5, 2017 · 64ddce4
2 parents 7618a8d + dff0884
commit 64ddce4

Showing 64 changed files with 7,005 additions and 33 deletions.
diff --git a/.travis.yml b/.travis.yml
@@ -12,9 +12,10 @@ install:
   - source ci/travis/travis_install.sh
 compiler:
   - gcc
-script: ./run_tests.py
+script:
+  - ci/travis/travis_script.sh
 after_success:
-  - bash <(curl -s https://codecov.io/bash)
+  - if [ "${COVERAGE}" == "true" ]; then bash <(curl -s https://codecov.io/bash); fi
 env:
   global:
     # Travis can run out of RAM, so we need to be careful here.
@@ -25,6 +26,7 @@ env:
     - USE_CLANG_COMPLETER=true YCMD_PYTHON_VERSION=2.6
     - USE_CLANG_COMPLETER=true YCMD_PYTHON_VERSION=2.7
     - USE_CLANG_COMPLETER=true YCMD_PYTHON_VERSION=3.3
+    - YCM_BENCHMARK=true YCMD_PYTHON_VERSION=3.3 COVERAGE=false
 matrix:
   exclude:
     - os: osx

diff --git a/appveyor.yml b/appveyor.yml
@@ -20,12 +20,17 @@ environment:
   - APPVEYOR_BUILD_WORKER_IMAGE: Visual Studio 2017
     arch: 64
     python: 36
+  - APPVEYOR_BUILD_WORKER_IMAGE: Visual Studio 2017
+    arch: 64
+    python: 36
+    YCM_BENCHMARK: true
+    COVERAGE: false
 install:
   - ci\appveyor\appveyor_install.bat
 build_script:
   - ci\appveyor\appveyor_build.bat
 after_build:
-  - codecov
+  - if %COVERAGE% == true ( codecov )
 # Disable automatic tests
 test: off
 cache:

diff --git a/benchmark.py b/benchmark.py
@@ -0,0 +1,51 @@
+#!/usr/bin/env python
+
+from __future__ import print_function
+from __future__ import division
+from __future__ import unicode_literals
+from __future__ import absolute_import
+
+import os
+import os.path as p
+import subprocess
+import sys
+
+DIR_OF_THIS_SCRIPT = p.dirname( p.abspath( __file__ ) )
+DIR_OF_THIRD_PARTY = p.join( DIR_OF_THIS_SCRIPT, 'third_party' )
+
+sys.path.insert( 1, p.abspath( p.join( DIR_OF_THIRD_PARTY, 'argparse' ) ) )
+
+import argparse
+
+
+def ParseArguments():
+  parser = argparse.ArgumentParser()
+  parser.add_argument( '--msvc', type = int, choices = [ 12, 14, 15 ],
+                       default = 15, help = 'Choose the Microsoft Visual '
+                       'Studio version (default: %(default)s).' )
+
+  return parser.parse_known_args()
+
+
+def BuildYcmdLibsAndRunBenchmark( args, extra_args ):
+  build_cmd = [
+    sys.executable,
+    p.join( DIR_OF_THIS_SCRIPT, 'build.py' ),
+    '--clang-completer'
+  ] + extra_args
+
+  os.environ[ 'YCM_BENCHMARK' ] = '1'
+
+  if args.msvc:
+    build_cmd.extend( [ '--msvc', str( args.msvc ) ] )
+
+  subprocess.check_call( build_cmd )
+
+
+def Main():
+  args, extra_args = ParseArguments()
+  BuildYcmdLibsAndRunBenchmark( args, extra_args )
+
+
+if __name__ == "__main__":
+  Main()
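
`parse_known_args` is what lets `benchmark.py` forward options it does not recognize to `build.py`. Below is a minimal sketch of that flow with the subprocess call left out; `make_build_command` is a hypothetical name and the script path is shortened for illustration.

```python
import argparse


def make_build_command(argv):
    """Return the build.py command line that would be run for argv."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--msvc', type=int, choices=[12, 14, 15], default=15)

    # Unlike parse_args, parse_known_args does not fail on unknown options;
    # it returns them as a list next to the parsed namespace.
    args, extra_args = parser.parse_known_args(argv)

    build_cmd = ['build.py', '--clang-completer'] + extra_args
    # The default is 15, so this branch is taken even when --msvc is omitted.
    if args.msvc:
        build_cmd.extend(['--msvc', str(args.msvc)])
    return build_cmd


print(make_build_command(['--msvc', '14', '--enable-debug']))
print(make_build_command([]))
```

One consequence visible in the sketch: since `--msvc` has a default, the option is forwarded to `build.py` on every platform, not only on Windows.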
diff --git a/build.py b/build.py
@@ -330,14 +330,29 @@ def RunYcmdTests( build_dir ):
   if OnWindows():
     # We prepend the folder of the ycm_core_tests executable to the PATH
     # instead of overwriting it so that the executable is able to find the
-    # python35.dll library.
+    # Python library.
     new_env[ 'PATH' ] = DIR_OF_THIS_SCRIPT + ';' + new_env[ 'PATH' ]
   else:
     new_env[ 'LD_LIBRARY_PATH' ] = DIR_OF_THIS_SCRIPT
 
   CheckCall( p.join( tests_dir, 'ycm_core_tests' ), env = new_env )
 
 
+def RunYcmdBenchmarks( build_dir ):
+  benchmarks_dir = p.join( build_dir, 'ycm', 'benchmarks' )
+  new_env = os.environ.copy()
+
+  if OnWindows():
+    # We prepend the folder of the ycm_core_benchmarks executable to the PATH
+    # instead of overwriting it so that the executable is able to find the
+    # Python library.
+    new_env[ 'PATH' ] = DIR_OF_THIS_SCRIPT + ';' + new_env[ 'PATH' ]
+  else:
+    new_env[ 'LD_LIBRARY_PATH' ] = DIR_OF_THIS_SCRIPT
+
+  CheckCall( p.join( benchmarks_dir, 'ycm_core_benchmarks' ), env = new_env )
+
+
 # On Windows, if the ycmd library is in use while building it, a LNK1104
 # fatal error will occur during linking. Exit the script early with an
 # error message if this is the case.
@@ -390,20 +405,27 @@ def BuildYcmdLib( args ):
 
     CheckCall( [ 'cmake' ] + full_cmake_args, exit_message = exit_message )
 
-    build_target = ( 'ycm_core' if 'YCM_TESTRUN' not in os.environ else
-                     'ycm_core_tests' )
+    build_targets = [ 'ycm_core' ]
+    if 'YCM_TESTRUN' in os.environ:
+      build_targets.append( 'ycm_core_tests' )
+    if 'YCM_BENCHMARK' in os.environ:
+      build_targets.append( 'ycm_core_benchmarks' )
 
-    build_command = [ 'cmake', '--build', '.', '--target', build_target ]
     if OnWindows():
       config = 'Debug' if args.enable_debug else 'Release'
-      build_command.extend( [ '--config', config ] )
+      build_config = [ '--config', config ]
     else:
-      build_command.extend( [ '--', '-j', str( NumCores() ) ] )
+      build_config = [ '--', '-j', str( NumCores() ) ]
 
-    CheckCall( build_command, exit_message = exit_message )
+    for target in build_targets:
+      build_command = ( [ 'cmake', '--build', '.', '--target', target ] +
+                        build_config )
+      CheckCall( build_command, exit_message = exit_message )
 
     if 'YCM_TESTRUN' in os.environ:
       RunYcmdTests( build_dir )
+    if 'YCM_BENCHMARK' in os.environ:
+      RunYcmdBenchmarks( build_dir )
   finally:
     os.chdir( DIR_OF_THIS_SCRIPT )
 

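
The target selection in `BuildYcmdLib` depends only on whether the variables are present in the environment, not on their values. Here is a small sketch of that logic, detached from CMake; the function names `select_build_targets` and `make_build_commands` are made up for illustration.

```python
def select_build_targets(environ):
    """List the CMake targets to build for a given environment mapping."""
    targets = ['ycm_core']
    if 'YCM_TESTRUN' in environ:
        targets.append('ycm_core_tests')
    if 'YCM_BENCHMARK' in environ:
        targets.append('ycm_core_benchmarks')
    return targets


def make_build_commands(environ, on_windows, enable_debug=False, num_cores=1):
    """Build one 'cmake --build' command per target, as the loop in the diff does."""
    if on_windows:
        build_config = ['--config', 'Debug' if enable_debug else 'Release']
    else:
        build_config = ['--', '-j', str(num_cores)]
    return [['cmake', '--build', '.', '--target', target] + build_config
            for target in select_build_targets(environ)]


for command in make_build_commands({'YCM_BENCHMARK': '1'},
                                   on_windows=False, num_cores=4):
    print(' '.join(command))
```

Because only presence is tested, a value such as `YCM_BENCHMARK=false` would still add the benchmark target.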
diff --git a/ci/appveyor/appveyor_build.bat b/ci/appveyor/appveyor_build.bat
@@ -7,4 +7,8 @@ if %msvc% == 2013 (
   set msvc=15
 )
 
-python run_tests.py --msvc %msvc%
+if defined YCM_BENCHMARK (
+  python benchmark.py --msvc %msvc%
+) else (
+  python run_tests.py --msvc %msvc%
+)
diff --git a/ci/travis/travis_script.sh b/ci/travis/travis_script.sh
@@ -0,0 +1,5 @@
+if [ "${YCM_BENCHMARK}" == "true" ]; then
+  ./benchmark.py
+else
+  ./run_tests.py
+fi
diff --git a/codecov.yml b/codecov.yml
@@ -14,12 +14,13 @@ coverage:
     changes: true
 
   # We don't want statistics for the tests themselves and certainly not for the
-  # boost libraries. Note that while we exclude the gcov data for these patterns
-  # in the codecov call (codecov --gcov-glob ...), the fact that our code
-  # references these areas also means we need to tell codecov itself to ignore
-  # them from the stats.
+  # benchmarks and boost libraries. Note that while we exclude the gcov data for
+  # these patterns in the codecov call (codecov --gcov-glob ...), the fact that
+  # our code references these areas also means we need to tell codecov itself to
+  # ignore them from the stats.
   ignore:
   - .*/tests/.*
+  - .*/benchmarks/.*
   - .*/BoostParts/.*
 
 comment:

diff --git a/cpp/ycm/CMakeLists.txt b/cpp/ycm/CMakeLists.txt
@@ -222,10 +222,11 @@ endif()
 
 file( GLOB_RECURSE SERVER_SOURCES *.h *.cpp )
 
-# The test sources are a part of a different target, so we remove them
-# The CMakeFiles cpp file is picked up when the user creates an in-source build,
-# and we don't want that. We also remove client-specific code
-file( GLOB_RECURSE to_remove tests/*.h tests/*.cpp CMakeFiles/*.cpp *client* )
+# The test and benchmark sources are a part of a different target, so we remove
+# them. The CMakeFiles cpp file is picked up when the user creates an in-source
+# build, and we don't want that. We also remove client-specific code.
+file( GLOB_RECURSE to_remove tests/*.h tests/*.cpp benchmarks/*.h
+                             benchmarks/*.cpp CMakeFiles/*.cpp *client* )
 
 if( to_remove )
   list( REMOVE_ITEM SERVER_SOURCES ${to_remove} )
@@ -467,3 +468,4 @@ if( SYSTEM_IS_OPENBSD OR SYSTEM_IS_FREEBSD )
 endif()
 
 add_subdirectory( tests )
+add_subdirectory( benchmarks )
diff --git a/cpp/ycm/CandidateRepository.cpp b/cpp/ycm/CandidateRepository.cpp
@@ -85,6 +85,11 @@ std::vector< const Candidate * > CandidateRepository::GetCandidatesForStrings(
 }
 
 
+void CandidateRepository::ClearCandidates() {
+  candidate_holder_.clear();
+}
+
+
 CandidateRepository::~CandidateRepository() {
   for ( const CandidateHolder::value_type & pair : candidate_holder_ ) {
     delete pair.second;

diff --git a/cpp/ycm/CandidateRepository.h b/cpp/ycm/CandidateRepository.h
@@ -53,6 +53,9 @@ class CandidateRepository {
   YCM_DLL_EXPORT std::vector< const Candidate * > GetCandidatesForStrings(
     const std::vector< std::string > &strings );
 
+  // This should only be used to isolate tests and benchmarks.
+  YCM_DLL_EXPORT void ClearCandidates();
+
 private:
   CandidateRepository() {};
   ~CandidateRepository();

diff --git a/cpp/ycm/benchmarks/CMakeLists.txt b/cpp/ycm/benchmarks/CMakeLists.txt
@@ -0,0 +1,60 @@
+# Copyright (C) 2017 ycmd contributors
+#
+# This file is part of ycmd.
+#
+# ycmd is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+#
+# ycmd is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with ycmd.  If not, see <http://www.gnu.org/licenses/>.
+
+project( ycm_core_benchmarks )
+cmake_minimum_required( VERSION 2.8 )
+
+# We don't want to test the benchmark library.
+set( BENCHMARK_ENABLE_TESTING
+     OFF CACHE BOOL "Enable testing of the benchmark library" )
+set( BUILD_SHARED_LIBS OFF )
+
+add_subdirectory( benchmark )
+set( BENCHMARK_INCLUDE_DIRS ${benchmark_SOURCE_DIR}/include )
+set( BENCHMARK_LIBRARIES benchmark )
+
+include_directories( ${ycm_core_SOURCE_DIR}
+                     ${BENCHMARK_INCLUDE_DIRS} )
+
+file( GLOB_RECURSE SOURCES *.h *.cpp )
+
+# We don't want benchmark sources in this target.
+file( GLOB_RECURSE to_remove benchmark/*.h benchmark/*.cpp CMakeFiles/*.cpp )
+
+if( to_remove )
+  list( REMOVE_ITEM SOURCES ${to_remove} )
+endif()
+
+add_executable( ${PROJECT_NAME}
+                ${SOURCES} )
+
+if( MSVC )
+  # Build benchmark and ycm_core_benchmarks targets in cmake ycm/benchmarks
+  # folder.
+  foreach( OUTPUTCONFIG ${CMAKE_CONFIGURATION_TYPES} )
+    string( TOUPPER ${OUTPUTCONFIG} OUTPUTCONFIG )
+    set_target_properties( ${BENCHMARK_LIBRARIES} PROPERTIES
+      RUNTIME_OUTPUT_DIRECTORY_${OUTPUTCONFIG} ${PROJECT_BINARY_DIR} )
+    set_target_properties( ${PROJECT_NAME} PROPERTIES
+      RUNTIME_OUTPUT_DIRECTORY_${OUTPUTCONFIG} ${PROJECT_BINARY_DIR} )
+  endforeach()
+endif()
+
+target_link_libraries( ${PROJECT_NAME}
+                       ${Boost_LIBRARIES}
+                       ycm_core
+                       ${BENCHMARK_LIBRARIES} )
diff --git a/cpp/ycm/benchmarks/IdentifierCompleter_bench.cpp b/cpp/ycm/benchmarks/IdentifierCompleter_bench.cpp
@@ -0,0 +1,59 @@
+// Copyright (C) 2017 ycmd contributors
+//
+// This file is part of ycmd.
+//
+// ycmd is free software: you can redistribute it and/or modify
+// it under the terms of the GNU General Public License as published by
+// the Free Software Foundation, either version 3 of the License, or
+// (at your option) any later version.
+//
+// ycmd is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with ycmd.  If not, see <http://www.gnu.org/licenses/>.
+
+#include "benchmark/benchmark_api.h"
+#include "CandidateRepository.h"
+#include "IdentifierCompleter.h"
+
+namespace YouCompleteMe {
+
+class IdentifierCompleterFixture : public benchmark::Fixture {
+public:
+  void SetUp( const benchmark::State& state ) {
+    CandidateRepository::Instance().ClearCandidates();
+  }
+};
+
+
+BENCHMARK_DEFINE_F( IdentifierCompleterFixture, CandidatesWithCommonPrefix )(
+    benchmark::State& state ) {
+  // Generate a list of candidates of the form a_A_a_[a-z]{5}.
+  std::vector< std::string > candidates;
+  for ( int i = 0; i < state.range( 0 ); ++i ) {
+    std::string candidate = "";
+    int letter = i;
+    for ( int pos = 0; pos < 5; letter /= 26, ++pos ) {
+      candidate = std::string( 1, letter % 26 + 'a' ) + candidate;
+    }
+    candidate = "a_A_a_" + candidate;
+    candidates.push_back( candidate );
+  }
+
+  IdentifierCompleter completer( candidates );
+
+  while ( state.KeepRunning() )
+    completer.CandidatesForQuery( "aA" );
+
+  state.SetComplexityN( state.range( 0 ) );
+}
+
+BENCHMARK_REGISTER_F( IdentifierCompleterFixture, CandidatesWithCommonPrefix )
+    ->RangeMultiplier( 2 )
+    ->Range( 1, 1 << 16 )
+    ->Complexity();
+
+} // namespace YouCompleteMe
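
To see why this input is a worst case, the candidate generation can be mirrored in Python. This is an illustrative sketch, not code from the PR: `make_candidate` follows the loop in the benchmark, while `is_subsequence` is a simplified stand-in for ycmd's real matching rules, which are assumed here to be at least as permissive for this query.

```python
import string


def make_candidate(index):
    """Build 'a_A_a_' plus the index written as five base-26 letters."""
    suffix = ''
    letter = index
    for _ in range(5):
        # The least significant letter ends up last, as in the C++ loop.
        suffix = string.ascii_lowercase[letter % 26] + suffix
        letter //= 26
    return 'a_A_a_' + suffix


def is_subsequence(query, text):
    """Check that the characters of query appear in text in the same order."""
    remaining = iter(text)
    return all(char in remaining for char in query)


candidates = [make_candidate(i) for i in range(1 << 16)]
matching = [c for c in candidates if is_subsequence('aA', c)]
print(candidates[:3], len(matching))
```

Every candidate is distinct and every one contains the query inside the shared `a_A_a_` prefix, so nothing is filtered out and the whole list has to be sorted.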
diff --git a/cpp/ycm/benchmarks/benchmark/AUTHORS b/cpp/ycm/benchmarks/benchmark/AUTHORS
@@ -0,0 +1,38 @@
+# This is the official list of benchmark authors for copyright purposes.
+# This file is distinct from the CONTRIBUTORS files.
+# See the latter for an explanation.
+#
+# Names should be added to this file as:
+#	Name or Organization <email address>
+# The email address is not required for organizations.
+#
+# Please keep the list sorted.
+
+Albert Pretorius <pretoalb@gmail.com>
+Arne Beer <arne@twobeer.de>
+Christopher Seymour <chris.j.seymour@hotmail.com>
+David Coeurjolly <david.coeurjolly@liris.cnrs.fr>
+Dominic Hamon <dma@stripysock.com>
+Eric Fiselier <eric@efcs.ca>
+Eugene Zhuk <eugene.zhuk@gmail.com>
+Evgeny Safronov <division494@gmail.com>
+Felix Homann <linuxaudio@showlabor.de>
+Google Inc.
+International Business Machines Corporation
+Ismael Jimenez Martinez <ismael.jimenez.martinez@gmail.com>
+Joao Paulo Magalhaes <joaoppmagalhaes@gmail.com>
+JianXiong Zhou <zhoujianxiong2@gmail.com>
+Jussi Knuuttila <jussi.knuuttila@gmail.com>
+Kaito Udagawa <umireon@gmail.com>
+Lei Xu <eddyxu@gmail.com>
+Matt Clarkson <mattyclarkson@gmail.com>
+Maxim Vafin <maxvafin@gmail.com>
+Nick Hutchinson <nshutchinson@gmail.com>
+Oleksandr Sochka <sasha.sochka@gmail.com>
+Paul Redmond <paul.redmond@gmail.com>
+Radoslav Yovchev <radoslav.tm@gmail.com>
+Shuo Chen <chenshuo@chenshuo.com>
+Yusuke Suzuki <utatane.tea@gmail.com>
+Dirac Research 
+Zbigniew Skowron <zbychs@gmail.com>
+Dominik Czarnota <dominik.b.czarnota@gmail.com>