[dbnode] Decode ReadBits improvements #2197

Merged
merged 21 commits into from
Mar 11, 2020

Collaborator

@rallen090 rallen090 commented Mar 6, 2020

What this PR does / why we need it:

Changes to ReadBits and PeekBits include:

  • Change the numBits parameter type to uint, avoiding casts to uint
  • Remove the err state from the stream, avoiding repeated err checks

The benchmark shows a ~10% improvement:
BEFORE
BenchmarkNextIteration/series_10-12                 6442           2025060 ns/op
AFTER
BenchmarkNextIteration/series_10-12                 6620           1827805 ns/op
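
For readers following along, here is a minimal sketch of why an unsigned numBits removes conversions in a bit reader. It is an illustration under assumptions, not the M3 code: `bitReader`, `newBitReader`, and the field names are hypothetical, and the real ReadBits and PeekBits may be structured differently.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
)

// bitReader is a hypothetical, simplified stand-in for the istream type
// discussed in this PR. It is not the M3 implementation.
type bitReader struct {
	r         *bufio.Reader
	current   byte // unread bits, left-aligned in the byte
	remaining uint // number of unread bits left in current
}

func newBitReader(data []byte) *bitReader {
	return &bitReader{r: bufio.NewReader(bytes.NewReader(data))}
}

// ReadBits reads the next numBits bits (at most 64), most significant first.
// numBits and remaining are both uint, so the comparisons and shift counts
// below need no conversions between integer types.
func (b *bitReader) ReadBits(numBits uint) (uint64, error) {
	var res uint64
	for numBits > 0 {
		if b.remaining == 0 {
			next, err := b.r.ReadByte()
			if err != nil {
				return 0, err
			}
			b.current, b.remaining = next, 8
		}
		n := numBits
		if n > b.remaining {
			n = b.remaining
		}
		// Move the top n bits of current into the low bits of res.
		res = (res << n) | uint64(b.current>>(8-n))
		b.current <<= n
		b.remaining -= n
		numBits -= n
	}
	return res, nil
}

func main() {
	r := newBitReader([]byte{0xA5, 0xF0}) // 1010 0101 1111 0000
	for _, n := range []uint{3, 7, 6} {
		v, err := r.ReadBits(n)
		fmt.Println(n, v, err)
	}
}
```

The only conversion left is the widening of the byte to uint64 for the result; the bit counts themselves stay in one type throughout.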

Special notes for your reviewer:

Does this PR introduce a user-facing and/or backwards incompatible change?:


Does this PR require updating code package or user-facing documentation?:


@rallen090 rallen090 changed the title from "Changed numBits to uint to avoid casts and removed err state" to "Decode ReadBits improvements" Mar 6, 2020
@@ -29,9 +29,8 @@ import (
 // istream encapsulates a readable stream.
 type istream struct {
 	r *bufio.Reader // encoded stream
-	err error // error encountered
Collaborator

Why are we removing the error? Shouldn't ReadBit still fail if the stream has failed?

Collaborator

+1

Collaborator Author

This error is set from the result of readByteFromStream, but in all of the cases we call this, the parent call also returns that error (e.g. ReadBit). So I see no reason to keep the error stored as state on the stream object itself if we already would have returned it. Is there a reason to keep it, though, that I'm missing? The value of removing it is that we can then remove these if-conditions that are in multiple methods to check if the error is set.

Collaborator

@arnikola arnikola Mar 9, 2020

Is there a noticeable difference on the flame graphs? Branch prediction should help reduce the cost of these checks when there's no error.

I'm mostly concerned about losing some future-proofing if the actual impact turns out to be very small.

Collaborator Author

@rallen090 rallen090 Mar 9, 2020

Yeah, it is small but noticeable on the flamegraph. Most of ReadBits is attributable to readByte, and that share increases when we remove these err checks (meaning less CPU work in the rest of the ReadBits func, as you can see on the far right of ReadBits in the graphs).

BEFORE (readByte is 10.62%)
image

AFTER (readByte is 13.31%)
image

Collaborator

I think I'd still lean on the side of caution here; otherwise there's a bit of a weird reliance on is.r.Byte() erroring in the expected fashion, and if the underlying reader changes you may get weird behaviour.
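
To make the concern in this thread concrete, here is a small sketch of how stored error state changes behaviour when the underlying reader misbehaves. All names (`flakyReader`, `stickyReader`, `readTwice`) are hypothetical and this is not the istream code; the sketch only relies on the standard library behaviour that a `bufio.Reader` reports a pending read error once and then tries the underlying reader again.

```go
package main

import (
	"bufio"
	"bytes"
	"errors"
	"fmt"
	"io"
)

// flakyReader is a hypothetical io.Reader that fails on its first Read and
// serves data afterwards.
type flakyReader struct {
	failed bool
	data   io.Reader
}

func (f *flakyReader) Read(p []byte) (int, error) {
	if !f.failed {
		f.failed = true
		return 0, errors.New("transient failure")
	}
	return f.data.Read(p)
}

// byteReader is the small interface both variants satisfy.
type byteReader interface {
	ReadByte() (byte, error)
}

// stickyReader remembers the first error and returns it on every later call.
type stickyReader struct {
	r   byteReader
	err error
}

func (s *stickyReader) ReadByte() (byte, error) {
	if s.err != nil { // the kind of guard this thread is debating
		return 0, s.err
	}
	b, err := s.r.ReadByte()
	if err != nil {
		s.err = err
	}
	return b, err
}

// readTwice attempts two byte reads and reports the error from each attempt.
func readTwice(sticky bool) (first, second error) {
	var r byteReader = bufio.NewReader(&flakyReader{data: bytes.NewReader([]byte{1, 2})})
	if sticky {
		r = &stickyReader{r: r}
	}
	_, first = r.ReadByte()
	_, second = r.ReadByte()
	return first, second
}

func main() {
	for _, sticky := range []bool{false, true} {
		first, second := readTwice(sticky)
		fmt.Printf("sticky=%v first=%v second=%v\n", sticky, first, second)
	}
}
```

Without the stored error, the second read succeeds and decoding would continue from a stream that has already failed once. With it, every later call keeps failing no matter how the underlying reader behaves.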

Collaborator Author

Ok sounds good to me. I'll update to only include the casting changes.

@arnikola
Collaborator

arnikola commented Mar 6, 2020

Are there large changes on the flame graphs?

@robskillington robskillington changed the title from "Decode ReadBits improvements" to "[dbnode] Decode ReadBits improvements" Mar 6, 2020
@@ -30,7 +30,7 @@ import (
 const (
 	defaultDefaultTimeUnit = xtime.Second
 	defaultByteFieldDictLRUSize = 4
-	defaultIStreamReaderSizeM3TSZ = 16
+	defaultIStreamReaderSizeM3TSZ = 8 * 2
Collaborator Author

Also, FYI, I was playing around with this setting since it drives how large a buffer we keep in this stream. It does seem to have some effect on the flamegraphs: with a larger buffer, time goes to the buffer fill less often. But the runtimes still seemed pretty variable, so it didn't seem like a change we definitely want to make.
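
As a rough illustration of the buffer-fill effect described above, the sketch below counts how often a `bufio.Reader` has to refill from its source at different buffer sizes. It assumes the constant is ultimately passed to something like `bufio.NewReaderSize`, which this thread does not confirm; `countingReader` and `fillsFor` are hypothetical helper names.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
)

// countingReader counts how many times the buffered reader refills from it.
type countingReader struct {
	r     io.Reader
	reads int
}

func (c *countingReader) Read(p []byte) (int, error) {
	c.reads++
	return c.r.Read(p)
}

// fillsFor drains payload one byte at a time through a bufio.Reader of the
// given size and reports how many underlying Read calls that required.
func fillsFor(payload []byte, size int) int {
	src := &countingReader{r: bytes.NewReader(payload)}
	br := bufio.NewReaderSize(src, size)
	for {
		if _, err := br.ReadByte(); err != nil {
			break
		}
	}
	return src.reads
}

func main() {
	payload := make([]byte, 4096)
	for _, size := range []int{16, 64, 256} {
		fmt.Printf("buffer=%d bytes: %d underlying reads\n", size, fillsFor(payload, size))
	}
}
```

Fewer refills is only one side of the trade-off. A larger buffer also means more memory per stream, and the count above says nothing about the run-to-run variance mentioned in the comment.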

Collaborator

Interesting, yeah, this might be worth playing around with, potentially in a way that only affects query rather than changing dbnode as well.

	setupProf(usePools, iterations)
}

func setupProf(usePools bool, iterations int) stop {
Collaborator Author

After the latest refactor, this file no longer supported profiling because the defer p.Stop() ran inside the setupBlocks subfunction, so the profiler stopped as soon as that function returned. This change fixes that.
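
The defer scoping issue described above can be sketched as follows. This is a hedged illustration, not the benchmark file itself: `profiler`, `startProfiler`, `setupBroken`, and `setupFixed` are hypothetical, and only the idea of returning a `stop` function to the caller mirrors the `setupProf` signature shown in the diff.

```go
package main

import "fmt"

// profiler is a hypothetical stand-in for a real profiling handle. It records
// events in a log so the ordering is visible.
type profiler struct{ log *[]string }

func (p *profiler) Stop() { *p.log = append(*p.log, "stop") }

func startProfiler(log *[]string) *profiler {
	*log = append(*log, "start")
	return &profiler{log: log}
}

// setupBroken defers Stop inside the helper, so profiling ends when the
// helper returns, before the caller does any work.
func setupBroken(log *[]string) {
	p := startProfiler(log)
	defer p.Stop()
}

// stop is the function the caller is expected to defer.
type stop func()

// setupFixed hands Stop back to the caller instead of deferring it locally.
func setupFixed(log *[]string) stop {
	p := startProfiler(log)
	return p.Stop
}

func runBroken() []string {
	var log []string
	setupBroken(&log)
	log = append(log, "work")
	return log
}

func runFixed() []string {
	var log []string
	func() {
		defer setupFixed(&log)() // starts now, stops when this scope exits
		log = append(log, "work")
	}()
	return log
}

func main() {
	fmt.Println("broken:", runBroken())
	fmt.Println("fixed: ", runFixed())
}
```

In the broken variant the profiler stops before the work runs, so the profile would cover nothing useful. Returning the stop function lets the caller decide which scope is being profiled.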

@codecov

codecov bot commented Mar 9, 2020

Codecov Report

Merging #2197 into master will not change coverage.
The diff coverage is n/a.

@@          Coverage Diff           @@
##           master   #2197   +/-   ##
======================================
  Coverage    71.7%   71.7%           
======================================
  Files        1022    1022           
  Lines       88943   88943           
======================================
  Hits        63779   63779           
  Misses      20817   20817           
  Partials     4347    4347           
Flag           Coverage Δ
#aggregator    82.1% <0.0%> (ø)
#cluster       85.3% <0.0%> (ø)
#collector     82.8% <0.0%> (ø)
#dbnode        77.8% <0.0%> (ø)
#m3em          74.4% <0.0%> (ø)
#m3ninx        72.4% <0.0%> (ø)
#m3nsch        51.1% <0.0%> (ø)
#metrics       31.0% <0.0%> (ø)
#msg           74.1% <0.0%> (ø)
#query         68.4% <0.0%> (ø)
#x             83.5% <0.0%> (ø)

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8f6d216...653a6b4. Read the comment docs.

@rallen090
Collaborator Author

> Are there large changes on the flame graphs?

The flamegraphs look basically the same now since the casts we are removing don't show up as independent operations. However, we can see the improvement when we inspect the ReadBits and PeekBits source:

Before (cumulative 4.56s)
image

After (cumulative 4.22s)
image
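
For reference, the relative improvements implied by the figures quoted in this PR can be checked with a few lines of Go. The inputs are the benchmark ns/op values from the description and the cumulative times above; `improvement` is a hypothetical helper name.

```go
package main

import "fmt"

// improvement returns the relative reduction from before to after, in percent.
func improvement(before, after float64) float64 {
	return (before - after) / before * 100
}

func main() {
	// BenchmarkNextIteration/series_10-12, ns/op before and after.
	fmt.Printf("benchmark:  %.1f%%\n", improvement(2025060, 1827805))
	// Cumulative time in ReadBits and PeekBits, seconds before and after.
	fmt.Printf("cumulative: %.1f%%\n", improvement(4.56, 4.22))
}
```

The benchmark figure works out to just under 10%, in line with the "~10%" in the description, while the cumulative time in ReadBits and PeekBits drops by a somewhat smaller fraction.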

@rallen090 rallen090 merged commit 07e7df8 into master Mar 11, 2020
@rallen090 rallen090 deleted the ra/decode-perf-benchmarking branch March 11, 2020 19:40