bench: add run outputs
This makes it easy to link to benchmarks when someone asks, but also
serves as a good way to archive benchmark data at defined points for
comparison later.

We also make a (feeble) attempt at putting a "pretty" version of a
subset of benchmarks in the README of each run directory.
BurntSushi committed Apr 30, 2021
1 parent 9b8f141 commit 7810d6c
Showing 9 changed files with 77,242 additions and 5 deletions.
2 changes: 1 addition & 1 deletion Cargo.toml
@@ -9,7 +9,7 @@ repository = "https://github.com/BurntSushi/rust-memchr"
readme = "README.md"
keywords = ["memchr", "char", "scan", "strchr", "string"]
license = "Unlicense/MIT"
-exclude = ["/ci/*", "/.travis.yml", "/Makefile", "/appveyor.yml"]
+exclude = ["/bench", "/.github", "/fuzz"]
edition = "2018"

[workspace]
44 changes: 44 additions & 0 deletions bench/README.md
@@ -0,0 +1,44 @@
This directory defines a large suite of benchmarks for both the memchr and
memmem APIs in this crate. A selection of "competitor" implementations is
included for comparison. In general, benchmarks are meant to be a tool for
optimization. That's why there are so many: we want to be sure we get enough
coverage such that our benchmarks approximate real world usage. When some
benchmarks look a bit slower than we expect (for one reason or another), we can
use profiling tools to look at codegen and attempt to improve that case.

Because there are so many benchmarks, if you run all of them, you might want to
step away for a cup of coffee (or two). Therefore, the typical way to run them
is to select a subset. For example,

```
$ cargo bench -- 'memmem/krate/.*never.*'
```

runs all benchmarks for the memmem implementation in this crate with searches
that never produce any matches. This will still take a bit, but perhaps only a
few minutes.
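
To get a feel for which benchmarks a filter would select before committing to
a run, the pattern can be tried against a list of names with `grep -E`. This is
only a rough sketch: it assumes the harness treats the filter as an unanchored
regular expression (as the `.*` in the pattern suggests), and every name below
except `memmem/krate/prebuiltiter/huge-en/common-one-space` is hypothetical and
made up for illustration.

```shell
# Hypothetical benchmark names; only the second one appears in this README.
names='memmem/krate/oneshot/huge-en/never-john-watson
memmem/krate/prebuiltiter/huge-en/common-one-space
memmem/libc/oneshot/huge-en/never-john-watson
memchr1/krate/huge/never'

# Keep only the names matching the pattern passed to `cargo bench` above.
selected=$(printf '%s\n' "$names" | grep -E 'memmem/krate/.*never.*')
printf '%s\n' "$selected"
```

Only the first name survives here: the second has no `never` in it, and the
last two do not contain `memmem/krate/`.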

Running a specific benchmark can be useful for profiling. For example, if you
want to see where `memmem/krate/prebuiltiter/huge-en/common-one-space` is
spending all of its time, you would first want to run it (to make sure the code
is compiled):

```
$ cargo bench -- memmem/krate/prebuiltiter/huge-en/common-one-space
```

And then run it under your profiling tool (I use `perf` on Linux):

```
$ perfr --callgraph cargo bench -- memmem/krate/prebuiltiter/huge-en/common-one-space --profile-time 3
```

Where
[`perfr` is my own wrapper around `perf`](https://github.com/BurntSushi/dotfiles/blob/master/bin/perfr),
and the `--profile-time 3` flag means, "just run the code for 3 seconds, but
don't do anything else." This makes the benchmark harness get out of the way,
which lets the profile focus as much as possible on the code being measured.

See the README in the `runs` directory for a bit more info on how to use
`critcmp` to look at benchmark data in a way that makes it easy to do
comparisons.
2 changes: 2 additions & 0 deletions bench/data/README.md
@@ -0,0 +1,2 @@
This directory contains benchmark corpora. Each sub-directory contains a README
documenting the corpus a bit more.
12 changes: 12 additions & 0 deletions bench/data/code/README.md
@@ -0,0 +1,12 @@
This data contains corpora generated from source code. These sorts of corpora
are important because code is something that is frequently searched.

This corpus was generated by running

```
$ find ./library/alloc -name '*.rs' -print0 \
| xargs -0 cat > .../memchr/bench/data/code/rust-library.rs
```

in a checkout of the https://github.com/rust-lang/rust repository at commit
78c963945aa35a76703bf62e024af2d85b2796e2.
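
As a small, self-contained illustration of what that pipeline does, the sketch
below runs the same `find ... -print0 | xargs -0 cat` shape against a throwaway
directory. The file names and contents are hypothetical. The `sort -z` step is
an addition that is not part of the command above: `find` does not promise any
particular order, so sorting is one way to make the concatenation repeatable,
assuming a `sort` that supports `-z`.

```shell
# Throwaway directory with hypothetical files (one name contains a space).
tmp=$(mktemp -d)
mkdir -p "$tmp/library/alloc/src"
printf 'fn a() {}\n' > "$tmp/library/alloc/src/a.rs"
printf 'fn b() {}\n' > "$tmp/library/alloc/src/b c.rs"
printf 'not rust\n' > "$tmp/library/alloc/notes.txt"

# Same pipeline shape as above. NUL-separated names keep "b c.rs" intact,
# and `sort -z` (an addition, see the note above) fixes the order.
corpus="$tmp/corpus.rs"
(cd "$tmp" && find ./library/alloc -name '*.rs' -print0 | sort -z | xargs -0 cat) > "$corpus"
cat "$corpus"
```

The resulting file holds the two `.rs` files back to back, and `notes.txt` is
left out because it does not match `-name '*.rs'`.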
146 changes: 146 additions & 0 deletions bench/runs/2021-04-30_initial/README.md

Large diffs are not rendered by default.
