Optimize series key parsing on startup #6743

Merged
merged 5 commits into from
May 27, 2016

jwilder
Contributor

@jwilder jwilder commented May 27, 2016

Required for all non-trivial PRs
  • Rebased/mergeable
  • Tests pass
  • CHANGELOG.md updated

This optimizes the ParseKey func to speed up startup for shards with many keys or very long keys. It should help #6250 in some cases. ParseKey is called for every key when reloading the WAL and TSM file indexes.

The main change is to avoid calling the unescape funcs for tag keys and values, which always returned a new string even when nothing was escaped. We now quickly scan the key to see whether anything might need unescaping and skip all of those calls if not.

The remaining allocation time is spent creating the tags map. That map is really just a temporary holder that allows the tags to be indexed. We could rework this further to avoid the map altogether and parse the key directly into the index structures.
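The "scan first, unescape only if needed" idea can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `needsUnescape` and `unescapeTag` are hypothetical names, and the sketch assumes that a backslash followed by a comma, equals sign, or space is the only escape form that matters for tag keys and values.

```go
package main

import (
	"bytes"
	"fmt"
)

// needsUnescape reports whether b contains a backslash, the only byte that
// can start an escape sequence in this sketch (hypothetical helper).
func needsUnescape(b []byte) bool {
	return bytes.IndexByte(b, '\\') != -1
}

// unescapeTag drops the backslash from `\,`, `\=` and `\ ` (hypothetical
// helper). When the input has no backslash it returns the input slice
// itself, so the common case performs no allocation.
func unescapeTag(b []byte) []byte {
	if !needsUnescape(b) {
		return b // fast path: nothing to do
	}
	out := make([]byte, 0, len(b))
	for i := 0; i < len(b); i++ {
		if b[i] == '\\' && i+1 < len(b) {
			switch b[i+1] {
			case ',', '=', ' ':
				i++ // skip the backslash, keep the escaped byte
			}
		}
		out = append(out, b[i])
	}
	return out
}

func main() {
	fmt.Printf("%s\n", unescapeTag([]byte("server01")))
	fmt.Printf("%s\n", unescapeTag([]byte(`us\ west`)))
}
```

The design point is that the cheap scan is a single pass over bytes that have to be read anyway, while the slow path is only paid for keys that actually contain an escape.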

benchmark               old ns/op     new ns/op     delta
BenchmarkParseKey-8     5621          2050          -63.53%

benchmark               old allocs     new allocs     delta
BenchmarkParseKey-8     25             24             -4.00%

benchmark               old bytes     new bytes     delta
BenchmarkParseKey-8     1318          1030          -21.85%
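For readers who want to reproduce this kind of measurement, a benchmark along these lines reports time, allocations, and bytes per operation. It is only a sketch: `parseKey` here is a simplified, hypothetical stand-in for the real ParseKey (it ignores escaping), and the sample key is made up, so the numbers it prints are not comparable to the ones above.

```go
package main

import (
	"bytes"
	"fmt"
	"testing"
)

// parseKey is a hypothetical, simplified stand-in for ParseKey. It splits
// "measurement,k1=v1,k2=v2" into a name and a temporary tags map and does
// not handle escaped characters.
func parseKey(key []byte) (string, map[string]string) {
	parts := bytes.Split(key, []byte(","))
	tags := make(map[string]string, len(parts)-1)
	for _, p := range parts[1:] {
		if i := bytes.IndexByte(p, '='); i > 0 {
			tags[string(p[:i])] = string(p[i+1:])
		}
	}
	return string(parts[0]), tags
}

// benchmarkParseKey parses the same made-up key b.N times.
func benchmarkParseKey(b *testing.B) {
	key := []byte("cpu,host=server01,region=us-west,dc=1")
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		parseKey(key)
	}
}

func main() {
	// testing.Benchmark runs the function outside of "go test".
	r := testing.Benchmark(benchmarkParseKey)
	fmt.Printf("%d ns/op, %d allocs/op, %d B/op\n",
		r.NsPerOp(), r.AllocsPerOp(), r.AllocedBytesPerOp())
}
```

In a sketch like this, the map and the strings stored in it account for a large share of the allocations, which is consistent with the note above that the tags map is the remaining cost.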

@jwilder jwilder added this to the 1.0.0 beta milestone May 27, 2016
@joelegasse
Contributor

LGTM 👍

@jwilder jwilder merged commit dd58101 into master May 27, 2016
@jwilder jwilder deleted the jw-parse-key branch May 27, 2016 21:00
@daviesalex
Contributor

We just moved our server (~2T of InfluxDB data) from the nightly of 23rd May to last night's nightly (with this in it). Startup time went from over an hour to just over 10 minutes. System load during startup roughly doubled (though there are still boatloads of CPUs doing nothing). The output of "perf top" shows totally different limiting factors.

Major CPU users now:

   7.37%  influxd                       [.] runtime.scanobject
   6.02%  influxd                       [.] runtime.heapBitsForObject
   5.57%  [kernel]                      [k] _raw_spin_lock
   5.51%  influxd                       [.] runtime.greyobject
   4.72%  influxd                       [.] sync.(*RWMutex).Lock
   4.55%  influxd                       [.] runtime.mallocgc
   4.14%  influxd                       [.] runtime.heapBitsSweepSpan
   3.78%  influxd                       [.] runtime.mapassign1
   3.37%  influxd                       [.] type..hash.[8]github.com/influxdata/influxdb/monitor/diagnostics.Client

This is a huge win. The next win is to just do more of startup in different goroutines, but that's a totally different kettle of fish.

Thank you!!
