Add CI for running benchmarks #326
Conversation
@kivikakk @digitalmoksha Please take a look when possible :)
@YJDoc2 this looks pretty good from my angle! Just had a few comments.
Makefile (outdated):

```makefile
bench-comrak: build-comrak-branch
	git clone https://github.com/progit/progit.git ${ROOT}/vendor/progit || true > /dev/null
	cd benches && \
	hyperfine --prepare 'sudo sync; echo 3 | sudo tee /proc/sys/vm/drop_caches' --warmup 3 --min-runs ${MIN_RUNS} -L binary comrak-${COMMIT} './bench.sh ./{binary}'
```
Hmm, do we really need to use `sudo`? I tried `make bench-comrak` locally, and hyperfine just hung during warmup with no indication why; it turned out `sudo` was asking for a password. I don't really feel comfortable giving a benchmark program `sudo` access.
This is looking amazing! Thank you so much. I echo @digitalmoksha's comments about the use of `sudo`.
Yes, that makes a lot of sense. I'm hopeful the relative data itself is accurate enough, often enough; the inter-run differences are probably caused by being scheduled onto hosts with vastly different loads at the time, but there's no predicting when that might happen during a run.
Hey @kivikakk @digitalmoksha, thanks for the review :)

I added the cache-clearing step because otherwise hyperfine gave a warning that the first run was an outlier, possibly an effect of the cache. Also, when I tried without cache clearing, the measurements were considerably different (consistent, but different). That said, I don't like the `sudo` use either, especially for benchmarks, so I'll try to figure out a workaround; maybe increasing the number of warmup runs will solve the issue. Will get back after updating this. Thanks :)
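For reference, a minimal sketch of what a sudo-free invocation could look like, leaning on extra warmup runs instead of dropping the page cache. The warmup count is illustrative, not a value settled on in this PR:

```sh
# Hypothetical sudo-free variant: skip the drop_caches --prepare step and
# let extra warmup runs bring the filesystem cache to a steady state.
cd benches && \
hyperfine --warmup 10 --min-runs "${MIN_RUNS}" \
    -L binary "comrak-${COMMIT}" './bench.sh ./{binary}'
```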
@kivikakk @digitalmoksha I have removed `sudo` and added markdown-it (Rust) to the comparisons. The benchmark values change now that the cache is no longer dropped, but they are consistent with each other, so it should be fine. I have also added a timestamp to the comment. Please take a look. I'll see why the CI is failing...
@YJDoc2 this looks good to me, nice job.
Co-authored-by: digitalMoksha <brett@digitalmoksha.com>
I'll look into the CI failure this afternoon with a view to merging this.
Hey, sorry I wasn't able to update on this last week.
Thanks a lot. Is it OK if I rebase it onto main and force-push? Maybe the CI is expecting some new things in master that aren't pulled into this branch.
@kivikakk I have updated the proc-macro2 version, as that was causing the build failure on nightly.
Looks excellent. Tested all locally — thank you so much for your effort! 🤍
Solves #324
As discussed, I have added CI for running benchmarks each time a PR is opened, and subsequently whenever anyone comments `/run-bench` on the PR. This runs the benchmarks and posts the result table as a comment on the PR; on subsequent runs, the same comment is edited and updated with the new results. A rough sketch of how such a trigger can be wired up is shown below.

Because GitHub needs the CI file to be present in the default branch before it will run, the bench CI won't run for this PR itself. The CI here is failing because the comment-posting action needs write permission on PRs, which the default GitHub token provided to CI in this repo doesn't have (?). I have opened a dummy PR here on my fork; you can see the posted results there, as well as try out the comment-based invocation of the benchmarks.
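For readers unfamiliar with comment-triggered workflows, this is roughly how such a trigger is commonly wired up in GitHub Actions. The job and step names are illustrative, and this is not the exact workflow from the PR:

```yaml
name: benchmark

on:
  pull_request:
  issue_comment:
    types: [created]

jobs:
  bench:
    # Plain pull_request events always run; for issue_comment events, only
    # react to comments made on a PR whose body is exactly /run-bench.
    if: >-
      github.event_name == 'pull_request' ||
      (github.event.issue.pull_request &&
       github.event.comment.body == '/run-bench')
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make bench-comrak
```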
I have also changed the benchmarking method as discussed. We used to simply clone cmark-gfm and run the bench from there; now I have copied the benchmark-running script from it and use hyperfine to run the benchmarks. We now also benchmark pulldown-cmark and cmark-gfm. markdown-rs was considered too, but it is a library-only repo, so we would need to set up a Cargo project with a simple binary that reads from stdin and runs the markdown-rs parser on it. That is not done in this PR; a sketch of what it might look like follows.
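As a rough illustration, the stdin-reading wrapper mentioned above could be as small as this. It is a sketch only, using the `to_html` entry point of the `markdown` crate (markdown-rs), and none of it is part of this PR:

```rust
// Hypothetical wrapper binary so markdown-rs can be driven by bench.sh
// the same way as the other parsers: read a document on stdin, print HTML.
use std::io::Read;

fn main() {
    let mut input = String::new();
    std::io::stdin()
        .read_to_string(&mut input)
        .expect("failed to read stdin");
    // markdown::to_html parses CommonMark and renders HTML in one call.
    print!("{}", markdown::to_html(&input));
}
```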
Some issues I have noticed:

- The `relative` section of the measurement is the most sensible for understanding differences and changes in speed. You can see the comment history of the measurement comment on my PR (linked above) to see what I mean by all this.