Add benchmark for libbeat diskqueue implementation #31594

leehinman · 2022-05-11T13:33:58Z

What does this PR do?

Adds a benchmark test to libbeat disqueue implementation. This benchmark produces, consumes & acks a million events to the queue.

Why is it important?

This benchmark will allow us to see how changes to queue implementation affect performance.

Checklist

My code follows the style guidelines of this project
I have commented my code, particularly in hard-to-understand areas
~~- [ ] I have made corresponding changes to the documentation~~
~~- [ ] I have made corresponding change to the default configuration files~~
I have added tests that prove my fix is effective or that my feature works
~~- [ ] I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.~~

How to test this PR locally

go test -bench=. -benchtime 10x -timeout 60m

Related issues

Relates [Meta][Feature] Implement encrypted disk queue elastic-agent-shipper#33

elasticmachine · 2022-05-11T13:34:01Z

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

elasticmachine · 2022-05-11T13:54:13Z

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS

Expand to view the summary

Build stats

Start Time: 2022-05-12T18:09:39.009+0000
Duration: 68 min 29 sec

Test stats 🧪

Test	Results
Failed	0
Passed	22558
Skipped	1938
Total	24496

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

/test : Re-trigger the build.
/package : Generate the packages and run the E2E tests.
/beats-tester : Run the installation tests with beats-tester.
run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

cmacknz · 2022-05-11T17:19:57Z

libbeat/publisher/queue/diskqueue/benchmark_test.go

+		Content: beat.Event{
+			Timestamp: eventTime,
+			Fields: mapstr.M{
+				"message": msgs[rand.Intn(len(msgs))],


Does using a random event make the benchmark results less reproducible? Is there a notable improvement vs just using a fixed sequence, or maybe the compromise of using the same seed in the random number generator for every run?

A static event vs a random event doesn't make much difference with the results. I get about a 200msec variance on -benchtime 10x runs either way.

I'll add the seed, that way order of benchmarks won't matter.

cmacknz · 2022-05-11T17:21:57Z

Can you summarize the results you get from this?

Also, does it run on every commit? It would be nice to get the output as report attached to each CI run, but that is maybe out of scope for this change.

fearful-symmetry · 2022-05-11T20:48:36Z

libbeat/publisher/queue/diskqueue/benchmark_test.go

+	}
+}
+
+//setup creates the disk queue, you must remove the directory it makes


Is there a reason we can't remove it after setup? If not, we might at least want to print the dir path so someone can manually clean it up without having to hunt around.

updated comments

libbeat/publisher/queue/diskqueue/benchmark_test.go

faec

My only question is similar to @cmacknz, how does this interact with CI? (Does it take long to run? Are the results surfaced anywhere when not running it manually?) If it doesn't have any expected "failure" outcome then should it be disabled on a default test run with a +build directive or something?

Also FYI, depending on which goes in first this may conflict with my pending PR that removes queue.Consumer from the queue API #31502 -- it should be an easy merge though, the only difference is that Get is now a method on the queue itself so there's no need to create a consumer explicitly.

leehinman · 2022-05-12T17:06:46Z

For results we get output like:

go test -bench=Benchmark1M_1k -benchtime 10x -timeout 60m
goos: darwin
goarch: amd64
pkg: github.com/elastic/beats/v7/libbeat/publisher/queue/diskqueue
cpu: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Benchmark1M_1k-16    	      10	38365574332 ns/op
PASS
ok  	github.com/elastic/beats/v7/libbeat/publisher/queue/diskqueue	424.300s

which is the average of 10 runs, that produced & consumed 1 million events each. or 26k events/sec.

go benchmark tests do NOT run during CI, and I don't think we want them to. For this to work it would be a fair amount of I/O and I'm pretty confident that CI schedulers would limit the I/O, making any kind of comparison useless.

leehinman · 2022-05-12T17:12:51Z

My only question is similar to @cmacknz, how does this interact with CI? (Does it take long to run? Are the results surfaced anywhere when not running it manually?) If it doesn't have any expected "failure" outcome then should it be disabled on a default test run with a +build directive or something?

gotestsum which runs all the go tests, does not run benchmarks. So you need to run this by hand. But I think that is reasonable. It took ~7 minutes to run the benchmark 10 times on my Mac, and really you need to run 50 or 100 times to get enough data to start doing comparisons. I don't think we can really justify that kind of energy consumption on every commit.

Also FYI, depending on which goes in first this may conflict with my pending PR that removes queue.Consumer from the queue API #31502 -- it should be an easy merge though, the only difference is that Get is now a method on the queue itself so there's no need to create a consumer explicitly.

Thanks, should be an easy fix.

* Add benchmark for libbeat diskqueue implementation

leehinman added enhancement Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team labels May 11, 2022

leehinman requested a review from a team as a code owner May 11, 2022 13:33

leehinman requested review from belimawr and fearful-symmetry and removed request for a team May 11, 2022 13:33

botelastic bot added needs_team Indicates that the issue/PR needs a Team:* label and removed needs_team Indicates that the issue/PR needs a Team:* label labels May 11, 2022

mergify bot assigned leehinman May 11, 2022

leehinman force-pushed the benchmark_disk_queue branch from 2d49f09 to 7e9643f Compare May 11, 2022 15:05

cmacknz requested a review from faec May 11, 2022 17:16

cmacknz reviewed May 11, 2022

View reviewed changes

fearful-symmetry reviewed May 11, 2022

View reviewed changes

libbeat/publisher/queue/diskqueue/benchmark_test.go Show resolved Hide resolved

leehinman force-pushed the benchmark_disk_queue branch from e28b105 to 985bb52 Compare May 12, 2022 00:15

faec reviewed May 12, 2022

View reviewed changes

leehinman added 4 commits May 12, 2022 12:33

Add benchmark for libbeat diskqueue implementation

6c93b02

skip some linter checks

835a0bc

incorporate feedback

7d89564

change to diskqueue.Get instead of consumer.Get

4a6cc54

leehinman force-pushed the benchmark_disk_queue branch from 1743618 to 4a6cc54 Compare May 12, 2022 18:09

cmacknz approved these changes May 16, 2022

View reviewed changes

leehinman merged commit aafa2d4 into elastic:main May 17, 2022

chrisberkhout pushed a commit that referenced this pull request Jun 1, 2023

Add benchmark for libbeat diskqueue implementation (#31594)

cd16847

* Add benchmark for libbeat diskqueue implementation

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add benchmark for libbeat diskqueue implementation #31594

Add benchmark for libbeat diskqueue implementation #31594

leehinman commented May 11, 2022

elasticmachine commented May 11, 2022

elasticmachine commented May 11, 2022 •

edited by jenkins-beats-ci bot

Loading

Build stats

Test stats 🧪

cmacknz May 11, 2022

leehinman May 12, 2022

cmacknz commented May 11, 2022

fearful-symmetry May 11, 2022

leehinman May 12, 2022

faec left a comment

leehinman commented May 12, 2022

leehinman commented May 12, 2022

Add benchmark for libbeat diskqueue implementation #31594

Add benchmark for libbeat diskqueue implementation #31594

Conversation

leehinman commented May 11, 2022

What does this PR do?

Why is it important?

Checklist

How to test this PR locally

Related issues

elasticmachine commented May 11, 2022

elasticmachine commented May 11, 2022 • edited by jenkins-beats-ci bot Loading

💚 Build Succeeded

Build stats

Test stats 🧪

💚 Flaky test report

🤖 GitHub comments

cmacknz May 11, 2022

Choose a reason for hiding this comment

leehinman May 12, 2022

Choose a reason for hiding this comment

cmacknz commented May 11, 2022

fearful-symmetry May 11, 2022

Choose a reason for hiding this comment

leehinman May 12, 2022

Choose a reason for hiding this comment

faec left a comment

Choose a reason for hiding this comment

leehinman commented May 12, 2022

leehinman commented May 12, 2022

elasticmachine commented May 11, 2022 •

edited by jenkins-beats-ci bot

Loading