
Measure how efficient with_label_values() is #7568

Merged: 9 commits merged into master from nikurt-label-values on Sep 7, 2022

Conversation

@nikurt (Contributor) commented Sep 6, 2022

cc: @matklad

```
% cargo bench -p near-o11y
   Compiling near-o11y v0.0.0 (/storage/code/nearcore-master/core/o11y)
    Finished bench [optimized] target(s) in 1.05s
     Running unittests src/lib.rs (target/release/deps/near_o11y-ccbb448e66a5a4d9)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running benches/metrics.rs (target/release/deps/metrics-d531efabdee84e90)

running 3 tests
test inc_counter_vec_cached            ... bench:          21 ns/iter (+/- 1)
test inc_counter_vec_cached_str        ... bench:         183 ns/iter (+/- 1)
test inc_counter_vec_with_label_values ... bench:         506 ns/iter (+/- 10)

test result: ok. 0 passed; 0 failed; 0 ignored; 3 measured

Time per inc() call when caching counters: 0.050249520625 microseconds
Time per inc() call when caching label values: 0.624767583375 microseconds
Time per inc() call when caching strings: 0.659365745875 microseconds
Time per inc() call when using with_label_values(): 0.87889163825 microseconds
```
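The gap between the cached and uncached numbers above comes down to what each `inc()` has to do. As a rough, std-only model of the lookup cost (the real prometheus `CounterVec` differs in detail, and all names here are illustrative, not nearcore APIs): every `with_label_values()` call allocates owned label strings, hashes them, and takes a lock to find the child counter, while a cached handle skips all of that and does a single atomic add.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};

#[derive(Default)]
struct CounterVec {
    children: Mutex<HashMap<Vec<String>, Arc<AtomicU64>>>,
}

impl CounterVec {
    // The slow path: allocate owned labels, hash them, lock the map.
    fn with_label_values(&self, labels: &[&str]) -> Arc<AtomicU64> {
        let key: Vec<String> = labels.iter().map(|s| s.to_string()).collect();
        let mut map = self.children.lock().unwrap();
        Arc::clone(map.entry(key).or_default())
    }
}

fn main() {
    let vec = CounterVec::default();

    // Uncached: pays the allocation + hash + lock cost on every increment.
    for _ in 0..3 {
        vec.with_label_values(&["shard0"]).fetch_add(1, Ordering::Relaxed);
    }

    // Cached: resolve the child once; each increment is one atomic op.
    let cached = vec.with_label_values(&["shard0"]);
    for _ in 0..3 {
        cached.fetch_add(1, Ordering::Relaxed);
    }

    assert_eq!(cached.load(Ordering::Relaxed), 6);
    println!("total = {}", cached.load(Ordering::Relaxed));
}
```

This is only a sketch of why the benchmark spread looks the way it does, not a claim about prometheus internals.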

nikurt marked this pull request as ready for review September 7, 2022 10:02
nikurt requested a review from a team as a code owner September 7, 2022 10:02
nikurt requested a review from mm-near September 7, 2022 10:02
@matklad (Contributor) commented Sep 7, 2022

Do we actually want to merge this? I am skeptical of the value of benchmarks that we don't actually run continuously. I think the right approach is to do an ad-hoc investigation, document the outcomes at a point in time in some issue, link the code there, but not get it into master.

No particularly strong opinion though.

@nagisa (Collaborator) commented Sep 7, 2022

NB: this comment applies in general.

The unfortunate outcome of not documenting the results of an investigation is that they become folklore and are eventually lost to time. For investigations that are trivial to reproduce this might be fine, but ultimately some people leave, others forget details, and the same mistake that prompted the investigation in the first place gets made again.

While it isn't strictly necessary to merge benchmarks per se, I think it is worthwhile to spend some time documenting what the outcome of this investigation is. Should we use a specific pattern out of these three when implementing metrics? Add an example and recommendations to near_o11y, at the very least. Are all the variants fast enough for our use cases? Make a note somewhere. That way, even if we don't recall the details, maybe somebody will recall seeing the note, bring it up, and follow the breadcrumbs from there.

@mina86 (Contributor) commented Sep 7, 2022

I’ve taken the liberty of adding two more cases.

```
running 5 tests
test inc_counter_vec_cached                        ... bench:   15 ns/iter (+/- 0)
test inc_counter_vec_cached_str                    ... bench:  139 ns/iter (+/- 1)
test inc_counter_vec_with_label_values             ... bench:  391 ns/iter (+/- 5)
test inc_counter_vec_with_label_values_smartstring ... bench:  352 ns/iter (+/- 9)
test inc_counter_vec_with_label_values_stack       ... bench:  261 ns/iter (+/- 4)
```

@mina86 (Contributor) commented Sep 7, 2022

From that, I think a quick improvement would be to offer a `FormattedNum` type which is constructed from an integer and formats it into an internal buffer. The usage would be: `with_label_values(&[&FormattedNum::new(shard_id)])`.
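One way such a type could look (a hypothetical sketch; `FormattedNum` does not exist in nearcore): format the integer once into a fixed-size stack buffer, still going through std's formatting machinery but avoiding the per-call heap allocation of `format!`/`to_string`.

```rust
use std::io::Write as _;

struct FormattedNum {
    buf: [u8; 20], // u64::MAX needs at most 20 decimal digits
    len: usize,
}

impl FormattedNum {
    fn new(n: u64) -> Self {
        let mut buf = [0u8; 20];
        // io::Write for &mut [u8] advances the slice as bytes are written.
        let mut cursor = &mut buf[..];
        write!(cursor, "{n}").expect("20 bytes fit any u64");
        let len = 20 - cursor.len();
        FormattedNum { buf, len }
    }

    fn as_str(&self) -> &str {
        // Only ASCII digits were written, so the prefix is valid UTF-8.
        std::str::from_utf8(&self.buf[..self.len]).unwrap()
    }
}

fn main() {
    let shard_id: u64 = 12345;
    let label = FormattedNum::new(shard_id);
    // Usage sketch: with a Deref<Target = str> impl this could be passed
    // exactly as proposed; here we call as_str() explicitly, e.g.
    // counter.with_label_values(&[label.as_str()]).inc();
    assert_eq!(label.as_str(), "12345");
    println!("{}", label.as_str());
}
```

The proposed `&[&FormattedNum::new(shard_id)]` syntax would additionally require a `Deref<Target = str>` impl so the reference coerces to `&str`; that is a small addition left out of the sketch.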

@mina86 (Contributor) commented Sep 7, 2022

OK, one more. I always suspected Rust’s format is shite:

```
test inc_counter_vec_cached                            ... bench:   15 ns/iter (+/- 0)
test inc_counter_vec_cached_str                        ... bench:  143 ns/iter (+/- 3)
test inc_counter_vec_with_label_values                 ... bench:  397 ns/iter (+/- 8)
test inc_counter_vec_with_label_values_smartstring     ... bench:  344 ns/iter (+/- 5)
test inc_counter_vec_with_label_values_stack           ... bench:  264 ns/iter (+/- 14)
test inc_counter_vec_with_label_values_stack_no_format ... bench:  140 ns/iter (+/- 2)
```
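The `stack_no_format` variant above bypasses `core::fmt` entirely by converting the integer to decimal ASCII by hand in a stack buffer. A minimal std-only sketch of that idea (function and buffer names are illustrative, not the benchmark's actual code):

```rust
// Write n's decimal digits into the tail of a 20-byte stack buffer
// (u64::MAX has 20 digits) and return a &str view of them, with no
// heap allocation and no core::fmt machinery.
fn u64_to_str(mut n: u64, buf: &mut [u8; 20]) -> &str {
    let mut i = buf.len();
    loop {
        i -= 1;
        buf[i] = b'0' + (n % 10) as u8;
        n /= 10;
        if n == 0 {
            break;
        }
    }
    // Only ASCII digits were written, so this is valid UTF-8.
    std::str::from_utf8(&buf[i..]).unwrap()
}

fn main() {
    let mut buf = [0u8; 20];
    let label = u64_to_str(42, &mut buf);
    // label could then be passed on, e.g. counter.with_label_values(&[label])
    assert_eq!(label, "42");
    println!("{label}");
}
```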

@mina86 (Contributor) commented Sep 7, 2022

Can’t help myself… ;)

```
test inc_counter_vec_cached                            ... bench:   15 ns/iter (+/- 0)
test inc_counter_vec_cached_str                        ... bench:  140 ns/iter (+/- 1)
test inc_counter_vec_with_label_values                 ... bench:  388 ns/iter (+/- 5)
test inc_counter_vec_with_label_values_to_string       ... bench:  298 ns/iter (+/- 9)
test inc_counter_vec_with_label_values_smartstring     ... bench:  346 ns/iter (+/- 3)
test inc_counter_vec_with_label_values_stack           ... bench:  262 ns/iter (+/- 7)
test inc_counter_vec_with_label_values_stack_no_format ... bench:  137 ns/iter (+/- 3)
```

Mostly I was curious about the overhead of `format!("{val}")` compared to `val.to_string()`. Turns out it's noticeable.
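For clarity, the two call styles being compared produce identical strings; per the benchmarks above, `to_string()` simply skips roughly 100ns of formatting overhead per call for a single integer label:

```rust
fn main() {
    let shard_id: u64 = 7;

    // Goes through format_args! and the full core::fmt machinery.
    let via_format = format!("{shard_id}");

    // Same result; per the benchmarks above, measurably faster.
    let via_to_string = shard_id.to_string();

    assert_eq!(via_format, via_to_string);
    println!("{via_format}");
}
```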

@matklad (Contributor) commented Sep 7, 2022

A couple of crates to be aware of in this area:

@nikurt (Contributor, author) commented Sep 7, 2022

> outcome of not documenting any results of an investigation

This PR is an ongoing investigation, not the result of an investigation yet. @nagisa, based on the alternatives and benchmark results, I will document the outcomes and make recommendations.

> @mina86's test cases

Thank you! I simply wanted to compare the worst case, `.with_label_values()`, with the best case (cached counters). The plan was to investigate the alternatives and benchmark them. @mina86 happened to suggest a bunch of alternatives already; thank you very much!

near-bulldozer bot merged commit 4c32bf0 into master Sep 7, 2022
near-bulldozer bot deleted the nikurt-label-values branch September 7, 2022 13:41
mina86 added a commit to mina86/nearcore that referenced this pull request Sep 7, 2022
As per benchmarks in near#7568, using format! as opposed to calling to_string directly has ~100ns of overhead. There's no reason to keep that overhead, especially since using to_string is actually shorter to type.

Note that it may be beneficial to further optimise integer formatting
by using fixed-size buffer and custom conversion which doesn’t use
std.  This optimisation is outside the scope of this commit.
mina86 added a commit to mina86/nearcore that referenced this pull request Sep 7, 2022
nikurt added a commit that referenced this pull request Sep 7, 2022
near-bulldozer bot pushed a commit that referenced this pull request Sep 7, 2022
nikurt added a commit that referenced this pull request Nov 9, 2022
nikurt pushed a commit that referenced this pull request Nov 9, 2022