Improve 'rate() by ()' queries performance #3719

Merged
merged 7 commits into from
May 30, 2024

@mapno commented May 28, 2024

What this PR does:

This PR contains two performance improvements for rate() by () queries:

  • Stores the last used RangeAggregator in lastSeries so that the series map is accessed as rarely as possible. Accessing a map with a non-string key is very expensive, especially in a path as hot as this one.
  • Uses an array whose length depends on the number of by () grouping attributes.
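The two changes above can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: `counter`, `grouper`, `key1`, `key2`, and the `lookups` field are hypothetical names chosen for illustration, and the sketch assumes that the array in question is the key of the series map and that consecutive spans frequently share the same group key.

```go
package main

import "fmt"

// counter stands in for a per-series range aggregator (hypothetical type).
type counter struct{ n int }

func (c *counter) observe() { c.n++ }

// key1 and key2 are fixed-length, comparable array keys. The length matches
// the number of by() attributes, so a one-attribute query only hashes and
// compares one value.
type key1 [1]string
type key2 [2]string

// grouper maps a group key to its counter and remembers the last series used.
type grouper[K comparable] struct {
	series  map[K]*counter
	lastKey K
	last    *counter
	lookups int // number of map accesses, kept only for illustration
}

func newGrouper[K comparable]() *grouper[K] {
	return &grouper[K]{series: make(map[K]*counter)}
}

// observe records one span for the given group key.
func (g *grouper[K]) observe(k K) {
	// Fast path: if the key equals the previous one, reuse the cached
	// series and skip the map access entirely.
	if g.last != nil && k == g.lastKey {
		g.last.observe()
		return
	}
	// Slow path: look up (or create) the series, then cache it.
	g.lookups++
	c, ok := g.series[k]
	if !ok {
		c = &counter{}
		g.series[k] = c
	}
	g.lastKey, g.last = k, c
	c.observe()
}

func main() {
	g := newGrouper[key1]()
	for _, svc := range []string{"a", "a", "a", "b", "b", "a"} {
		g.observe(key1{svc})
	}
	// Six spans were observed, but the map was only accessed when the key changed.
	fmt.Println(g.series[key1{"a"}].n, g.series[key1{"b"}].n, g.lookups)
}
```

In this sketch the cache only helps when equal keys arrive back to back; with fully interleaved keys every call falls through to the map, so the benefit depends on how the input is ordered.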

Below are the results for the benchmarks. new.txt uses only the first improvement and new_3.txt uses both.

Benchmark results
goos: darwin
goarch: arm64
pkg: github.com/grafana/tempo/tempodb/encoding/vparquet3
                                                                             │   old.txt   │              new.txt               │             new_3.txt              │
                                                                             │   sec/op    │   sec/op     vs base               │   sec/op     vs base               │
BackendBlockQueryRange/{}_|_rate()/5-12                                         1.742 ± 5%    1.755 ± 3%        ~ (p=0.394 n=6)    1.780 ± 3%        ~ (p=0.132 n=6)
BackendBlockQueryRange/{}_|_rate()/7-12                                         1.787 ± 2%    1.782 ± 0%        ~ (p=0.180 n=6)    1.815 ± 2%        ~ (p=0.093 n=6)
BackendBlockQueryRange/{}_|_rate()_by_(name)/5-12                               2.671 ± 3%    2.493 ± 3%   -6.64% (p=0.002 n=6)    1.976 ± 3%  -26.00% (p=0.002 n=6)
BackendBlockQueryRange/{}_|_rate()_by_(name)/7-12                               3.226 ± 1%    2.900 ± 1%  -10.11% (p=0.002 n=6)    2.108 ± 2%  -34.65% (p=0.002 n=6)
BackendBlockQueryRange/{}_|_rate()_by_(resource.service.name)/5-12              2.118 ± 1%    1.370 ± 3%  -35.30% (p=0.002 n=6)    1.293 ± 3%  -38.96% (p=0.002 n=6)
BackendBlockQueryRange/{}_|_rate()_by_(resource.service.name)/7-12              2.585 ± 1%    1.518 ± 1%  -41.25% (p=0.002 n=6)    1.376 ± 1%  -46.77% (p=0.002 n=6)
BackendBlockQueryRange/{}_|_rate()_by_(span.http.url)/5-12                      3.142 ± 1%    2.449 ± 2%  -22.06% (p=0.002 n=6)    2.357 ± 1%  -24.98% (p=0.002 n=6)
BackendBlockQueryRange/{}_|_rate()_by_(span.http.url)/7-12                      3.603 ± 0%    2.578 ± 1%  -28.47% (p=0.002 n=6)    2.444 ± 1%  -32.16% (p=0.002 n=6)
BackendBlockQueryRange/{resource.service.name=`loki-ingester`}_|_rate()/5-12   299.3m ± 1%   300.4m ± 1%        ~ (p=0.394 n=6)   298.9m ± 1%        ~ (p=0.818 n=6)
BackendBlockQueryRange/{resource.service.name=`loki-ingester`}_|_rate()/7-12   302.4m ± 1%   303.0m ± 1%        ~ (p=0.699 n=6)   301.4m ± 1%        ~ (p=0.485 n=6)
BackendBlockQueryRange/{status=error}_|_rate()/5-12                            385.2m ± 0%   385.9m ± 0%        ~ (p=0.485 n=6)   382.8m ± 0%   -0.62% (p=0.002 n=6)
BackendBlockQueryRange/{status=error}_|_rate()/7-12                            387.2m ± 0%   387.0m ± 1%        ~ (p=0.818 n=6)   385.2m ± 3%        ~ (p=0.093 n=6)
geomean                                                                         1.296         1.123       -13.32%                  1.052       -18.84%

                                                                             │  old.txt   │              new.txt               │             new_3.txt              │
                                                                             │  MB_IO/op  │  MB_IO/op   vs base                │  MB_IO/op   vs base                │
BackendBlockQueryRange/{}_|_rate()/5-12                                        34.95 ± 0%   34.95 ± 0%       ~ (p=1.000 n=6) ¹   34.95 ± 0%       ~ (p=1.000 n=6) ¹
BackendBlockQueryRange/{}_|_rate()/7-12                                        34.95 ± 0%   34.95 ± 0%       ~ (p=1.000 n=6) ¹   34.95 ± 0%       ~ (p=1.000 n=6) ¹
BackendBlockQueryRange/{}_|_rate()_by_(name)/5-12                              43.58 ± 0%   43.58 ± 0%       ~ (p=1.000 n=6) ¹   43.58 ± 0%       ~ (p=1.000 n=6) ¹
BackendBlockQueryRange/{}_|_rate()_by_(name)/7-12                              43.58 ± 0%   43.58 ± 0%       ~ (p=1.000 n=6) ¹   43.58 ± 0%       ~ (p=1.000 n=6) ¹
BackendBlockQueryRange/{}_|_rate()_by_(resource.service.name)/5-12             35.54 ± 0%   35.54 ± 0%       ~ (p=1.000 n=6) ¹   35.54 ± 0%       ~ (p=1.000 n=6) ¹
BackendBlockQueryRange/{}_|_rate()_by_(resource.service.name)/7-12             35.54 ± 0%   35.54 ± 0%       ~ (p=1.000 n=6) ¹   35.54 ± 0%       ~ (p=1.000 n=6) ¹
BackendBlockQueryRange/{}_|_rate()_by_(span.http.url)/5-12                     38.35 ± 0%   38.35 ± 0%       ~ (p=1.000 n=6) ¹   38.35 ± 0%       ~ (p=1.000 n=6) ¹
BackendBlockQueryRange/{}_|_rate()_by_(span.http.url)/7-12                     38.35 ± 0%   38.35 ± 0%       ~ (p=1.000 n=6) ¹   38.35 ± 0%       ~ (p=1.000 n=6) ¹
BackendBlockQueryRange/{resource.service.name=`loki-ingester`}_|_rate()/5-12   35.54 ± 0%   35.54 ± 0%       ~ (p=1.000 n=6) ¹   35.54 ± 0%       ~ (p=1.000 n=6) ¹
BackendBlockQueryRange/{resource.service.name=`loki-ingester`}_|_rate()/7-12   35.54 ± 0%   35.54 ± 0%       ~ (p=1.000 n=6) ¹   35.54 ± 0%       ~ (p=1.000 n=6) ¹
BackendBlockQueryRange/{status=error}_|_rate()/5-12                            37.64 ± 0%   37.64 ± 0%       ~ (p=1.000 n=6) ¹   37.64 ± 0%       ~ (p=1.000 n=6) ¹
BackendBlockQueryRange/{status=error}_|_rate()/7-12                            37.64 ± 0%   37.64 ± 0%       ~ (p=1.000 n=6) ¹   37.64 ± 0%       ~ (p=1.000 n=6) ¹
geomean                                                                        37.49        37.49       +0.00%                   37.49       +0.00%
¹ all samples are equal

                                                                             │   old.txt   │               new.txt               │              new_3.txt              │
                                                                             │  spans/op   │  spans/op    vs base                │  spans/op    vs base                │
BackendBlockQueryRange/{}_|_rate()/5-12                                        5.173M ± 0%   5.173M ± 0%       ~ (p=1.000 n=6) ¹   5.173M ± 0%       ~ (p=1.000 n=6) ¹
BackendBlockQueryRange/{}_|_rate()/7-12                                        7.516M ± 0%   7.516M ± 0%       ~ (p=1.000 n=6) ¹   7.516M ± 0%       ~ (p=1.000 n=6) ¹
BackendBlockQueryRange/{}_|_rate()_by_(name)/5-12                              5.173M ± 0%   5.173M ± 0%       ~ (p=1.000 n=6) ¹   5.173M ± 0%       ~ (p=1.000 n=6) ¹
BackendBlockQueryRange/{}_|_rate()_by_(name)/7-12                              7.516M ± 0%   7.516M ± 0%       ~ (p=1.000 n=6) ¹   7.516M ± 0%       ~ (p=1.000 n=6) ¹
BackendBlockQueryRange/{}_|_rate()_by_(resource.service.name)/5-12             5.173M ± 0%   5.173M ± 0%       ~ (p=1.000 n=6) ¹   5.173M ± 0%       ~ (p=1.000 n=6) ¹
BackendBlockQueryRange/{}_|_rate()_by_(resource.service.name)/7-12             7.516M ± 0%   7.516M ± 0%       ~ (p=1.000 n=6) ¹   7.516M ± 0%       ~ (p=1.000 n=6) ¹
BackendBlockQueryRange/{}_|_rate()_by_(span.http.url)/5-12                     5.173M ± 0%   5.173M ± 0%       ~ (p=1.000 n=6) ¹   5.173M ± 0%       ~ (p=1.000 n=6) ¹
BackendBlockQueryRange/{}_|_rate()_by_(span.http.url)/7-12                     7.516M ± 0%   7.516M ± 0%       ~ (p=1.000 n=6) ¹   7.516M ± 0%       ~ (p=1.000 n=6) ¹
BackendBlockQueryRange/{resource.service.name=`loki-ingester`}_|_rate()/5-12   787.3k ± 0%   787.3k ± 0%       ~ (p=1.000 n=6) ¹   787.3k ± 0%       ~ (p=1.000 n=6) ¹
BackendBlockQueryRange/{resource.service.name=`loki-ingester`}_|_rate()/7-12   1.159M ± 0%   1.159M ± 0%       ~ (p=1.000 n=6) ¹   1.159M ± 0%       ~ (p=1.000 n=6) ¹
BackendBlockQueryRange/{status=error}_|_rate()/5-12                            551.0k ± 0%   551.0k ± 0%       ~ (p=1.000 n=6) ¹   551.0k ± 0%       ~ (p=1.000 n=6) ¹
BackendBlockQueryRange/{status=error}_|_rate()/7-12                            790.3k ± 0%   790.3k ± 0%       ~ (p=1.000 n=6) ¹   790.3k ± 0%       ~ (p=1.000 n=6) ¹
geomean                                                                        3.137M        3.137M       +0.00%                   3.137M       +0.00%
¹ all samples are equal

                                                                             │   old.txt   │              new.txt               │             new_3.txt              │
                                                                             │   spans/s   │   spans/s    vs base               │   spans/s    vs base               │
BackendBlockQueryRange/{}_|_rate()/5-12                                        2.970M ± 4%   2.947M ± 3%        ~ (p=0.394 n=6)   2.907M ± 3%        ~ (p=0.132 n=6)
BackendBlockQueryRange/{}_|_rate()/7-12                                        4.206M ± 2%   4.217M ± 0%        ~ (p=0.180 n=6)   4.142M ± 2%        ~ (p=0.093 n=6)
BackendBlockQueryRange/{}_|_rate()_by_(name)/5-12                              1.937M ± 3%   2.075M ± 3%   +7.12% (p=0.002 n=6)   2.617M ± 3%  +35.13% (p=0.002 n=6)
BackendBlockQueryRange/{}_|_rate()_by_(name)/7-12                              2.330M ± 1%   2.592M ± 1%  +11.25% (p=0.002 n=6)   3.565M ± 2%  +53.01% (p=0.002 n=6)
BackendBlockQueryRange/{}_|_rate()_by_(resource.service.name)/5-12             2.443M ± 1%   3.776M ± 3%  +54.57% (p=0.002 n=6)   4.002M ± 3%  +63.83% (p=0.002 n=6)
BackendBlockQueryRange/{}_|_rate()_by_(resource.service.name)/7-12             2.908M ± 1%   4.950M ± 1%  +70.22% (p=0.002 n=6)   5.464M ± 1%  +87.88% (p=0.002 n=6)
BackendBlockQueryRange/{}_|_rate()_by_(span.http.url)/5-12                     1.647M ± 1%   2.113M ± 2%  +28.30% (p=0.002 n=6)   2.195M ± 1%  +33.30% (p=0.002 n=6)
BackendBlockQueryRange/{}_|_rate()_by_(span.http.url)/7-12                     2.086M ± 0%   2.916M ± 1%  +39.79% (p=0.002 n=6)   3.075M ± 1%  +47.42% (p=0.002 n=6)
BackendBlockQueryRange/{resource.service.name=`loki-ingester`}_|_rate()/5-12   2.631M ± 1%   2.621M ± 1%        ~ (p=0.394 n=6)   2.634M ± 1%        ~ (p=0.818 n=6)
BackendBlockQueryRange/{resource.service.name=`loki-ingester`}_|_rate()/7-12   3.833M ± 1%   3.825M ± 1%        ~ (p=0.699 n=6)   3.845M ± 1%        ~ (p=0.485 n=6)
BackendBlockQueryRange/{status=error}_|_rate()/5-12                            1.430M ± 0%   1.428M ± 0%        ~ (p=0.485 n=6)   1.439M ± 0%   +0.63% (p=0.002 n=6)
BackendBlockQueryRange/{status=error}_|_rate()/7-12                            2.041M ± 0%   2.042M ± 1%        ~ (p=0.818 n=6)   2.052M ± 3%        ~ (p=0.093 n=6)
geomean                                                                        2.421M        2.793M       +15.36%                 2.983M       +23.22%

                                                                             │    old.txt    │               new.txt               │              new_3.txt              │
                                                                             │     B/op      │     B/op       vs base              │     B/op       vs base              │
BackendBlockQueryRange/{}_|_rate()/5-12                                        262.8Mi ± 16%   267.9Mi ± 15%       ~ (p=1.000 n=6)   279.5Mi ± 11%       ~ (p=0.132 n=6)
BackendBlockQueryRange/{}_|_rate()/7-12                                        267.7Mi ±  7%   269.5Mi ±  6%       ~ (p=0.818 n=6)   273.9Mi ±  4%       ~ (p=0.132 n=6)
BackendBlockQueryRange/{}_|_rate()_by_(name)/5-12                              84.59Mi ± 58%   71.22Mi ± 56%       ~ (p=0.589 n=6)   70.62Mi ± 55%       ~ (p=0.589 n=6)
BackendBlockQueryRange/{}_|_rate()_by_(name)/7-12                              66.76Mi ± 31%   66.17Mi ± 41%       ~ (p=0.699 n=6)   76.18Mi ± 44%       ~ (p=0.699 n=6)
BackendBlockQueryRange/{}_|_rate()_by_(resource.service.name)/5-12             82.42Mi ± 34%   80.11Mi ± 35%       ~ (p=0.818 n=6)   86.05Mi ± 22%       ~ (p=0.394 n=6)
BackendBlockQueryRange/{}_|_rate()_by_(resource.service.name)/7-12             78.87Mi ± 19%   73.84Mi ± 53%       ~ (p=0.394 n=6)   73.59Mi ± 22%       ~ (p=0.394 n=6)
BackendBlockQueryRange/{}_|_rate()_by_(span.http.url)/5-12                     317.6Mi ± 15%   317.5Mi ± 17%       ~ (p=1.000 n=6)   313.4Mi ± 15%       ~ (p=0.589 n=6)
BackendBlockQueryRange/{}_|_rate()_by_(span.http.url)/7-12                     328.3Mi ±  8%   331.8Mi ±  2%       ~ (p=0.310 n=6)   319.5Mi ±  8%       ~ (p=0.485 n=6)
BackendBlockQueryRange/{resource.service.name=`loki-ingester`}_|_rate()/5-12   42.57Mi ± 20%   42.75Mi ±  7%       ~ (p=0.818 n=6)   39.77Mi ± 10%       ~ (p=0.699 n=6)
BackendBlockQueryRange/{resource.service.name=`loki-ingester`}_|_rate()/7-12   41.02Mi ± 14%   41.05Mi ± 16%       ~ (p=0.937 n=6)   40.96Mi ± 14%       ~ (p=0.937 n=6)
BackendBlockQueryRange/{status=error}_|_rate()/5-12                            30.15Mi ± 13%   30.59Mi ±  5%       ~ (p=0.818 n=6)   30.64Mi ±  5%       ~ (p=1.000 n=6)
BackendBlockQueryRange/{status=error}_|_rate()/7-12                            31.86Mi ± 12%   32.93Mi ±  4%       ~ (p=0.394 n=6)   31.24Mi ± 12%       ~ (p=0.937 n=6)
geomean                                                                        93.59Mi         92.15Mi        -1.54%                 92.79Mi        -0.86%

                                                                             │   old.txt    │              new.txt               │             new_3.txt              │
                                                                             │  allocs/op   │  allocs/op    vs base              │  allocs/op    vs base              │
BackendBlockQueryRange/{}_|_rate()/5-12                                        1.515M ±  4%   1.504M ±  5%       ~ (p=0.818 n=6)   1.541M ±  3%       ~ (p=0.132 n=6)
BackendBlockQueryRange/{}_|_rate()/7-12                                        1.492M ±  2%   1.515M ±  2%       ~ (p=0.310 n=6)   1.504M ±  3%       ~ (p=0.394 n=6)
BackendBlockQueryRange/{}_|_rate()_by_(name)/5-12                              848.2k ± 19%   779.8k ± 30%       ~ (p=0.589 n=6)   825.6k ± 22%       ~ (p=0.240 n=6)
BackendBlockQueryRange/{}_|_rate()_by_(name)/7-12                              759.4k ± 12%   759.4k ± 12%       ~ (p=0.485 n=6)   801.8k ± 12%       ~ (p=1.000 n=6)
BackendBlockQueryRange/{}_|_rate()_by_(resource.service.name)/5-12             838.7k ± 19%   838.2k ± 20%       ~ (p=1.000 n=6)   839.2k ± 19%       ~ (p=0.589 n=6)
BackendBlockQueryRange/{}_|_rate()_by_(resource.service.name)/7-12             817.8k ±  9%   797.1k ± 12%       ~ (p=0.394 n=6)   797.1k ± 12%       ~ (p=0.132 n=6)
BackendBlockQueryRange/{}_|_rate()_by_(span.http.url)/5-12                     1.769M ±  9%   1.763M ±  7%       ~ (p=0.240 n=6)   1.743M ±  6%  -1.48% (p=0.026 n=6)
BackendBlockQueryRange/{}_|_rate()_by_(span.http.url)/7-12                     1.918M ±  2%   1.919M ±  3%       ~ (p=1.000 n=6)   1.883M ±  3%  -1.83% (p=0.026 n=6)
BackendBlockQueryRange/{resource.service.name=`loki-ingester`}_|_rate()/5-12   495.4k ±  4%   489.7k ±  3%       ~ (p=0.485 n=6)   479.2k ±  4%       ~ (p=0.240 n=6)
BackendBlockQueryRange/{resource.service.name=`loki-ingester`}_|_rate()/7-12   489.7k ±  3%   489.6k ±  3%       ~ (p=0.818 n=6)   486.0k ±  3%       ~ (p=0.485 n=6)
BackendBlockQueryRange/{status=error}_|_rate()/5-12                            412.6k ±  0%   412.6k ±  0%       ~ (p=0.734 n=6)   412.6k ±  0%       ~ (p=0.818 n=6)
BackendBlockQueryRange/{status=error}_|_rate()/7-12                            412.6k ±  3%   412.6k ±  1%       ~ (p=0.394 n=6)   412.6k ±  3%       ~ (p=0.331 n=6)
geomean                                                                        848.9k         840.6k        -0.97%                 845.5k        -0.40%

Which issue(s) this PR fixes:
Fixes #

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@mapno merged commit 81d095e into grafana:main May 30, 2024
14 checks passed
@mapno deleted the rate-by-queries-performance branch May 30, 2024 12:02