Add bench.sh script to automate benchmarking DataFusion against itself #6131

alamb · 2023-04-26T20:29:03Z

Which issue does this PR close?

Closes #6127

Rationale for this change

TL;DR: make it easier to run the benchmarks included with DataFusion against a standard set of scenarios.

See #6127

What changes are included in this PR?

Add a bench.sh script that orchestrates creating the data files and running the benchmarks
Update benchmark documentation
Remove outdated tpch_dbgen.sh script

This script currently supports two benchmarks (tpch and tpch_mem), as shown in the usage instructions:

(arrow_dev) alamb@MacBook-Pro-8:~/Software/arrow-datafusion$ ./benchmarks/bench.sh
Error: unknown command:

Orchestrates running benchmarks against DataFusion checkouts

Usage:
./benchmarks/bench.sh data [benchmark]
./benchmarks/bench.sh run [benchmark]
./benchmarks/bench.sh compare <branch1> <branch2>

**********
Examples:
**********
# Create the datasets for all benchmarks in /Users/alamb/Software/arrow-datafusion/benchmarks/data
./bench.sh data

# Run the 'tpch' benchmark on the datafusion checkout in /source/arrow-datafusion
DATAFASION_DIR=/source/arrow-datafusion ./bench.sh run tpch

**********
* Commands
**********
data:         Generates data needed for benchmarking
run:          Runs the named benchmark
compare:      Comares results from benchmark runs

**********
* Benchmarks
**********
all(default): Data/Run/Compare for all benchmarks
tpch:         TPCH inspired benchmark on Scale Factor (SF) 1 (~1GB), single parquet file per table
tpch_mem:     TPCH inspired benchmark on Scale Factor (SF) 1 (~1GB), query from memory

**********
* Supported Configuration (Environment Variables)
**********
DATA_DIR        directory to store datasets
CARGO_COMMAND   command that runs the benchmark binary
DATAFASION_DIR  directory to use (default /Users/alamb/Software/arrow-datafusion/benchmarks/..)
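
For anyone who wants to drive the script from Python (for example from a notebook or a CI helper), here is a minimal sketch that builds and runs the invocations listed in the usage text. The helper names `bench_argv` and `run_bench` are hypothetical, the script path assumes you start from the repository root, and passing `all` explicitly is an assumption based on the `all(default)` entry. Only `DATA_DIR` is overridden here; the other variables could be passed the same way.

```python
import os
import subprocess

# Command and benchmark names copied from the usage text above.
COMMANDS = ("data", "run", "compare")
BENCHMARKS = ("all", "tpch", "tpch_mem")


def bench_argv(command, *args, script="./benchmarks/bench.sh"):
    """Build the argument list for one bench.sh invocation (hypothetical helper)."""
    if command not in COMMANDS:
        raise ValueError(f"unknown command: {command}")
    if command == "compare":
        # compare takes exactly two branch names
        if len(args) != 2:
            raise ValueError("compare needs <branch1> <branch2>")
    else:
        # data and run take an optional benchmark name
        if len(args) > 1:
            raise ValueError(f"{command} takes at most one benchmark")
        if args and args[0] not in BENCHMARKS:
            raise ValueError(f"unknown benchmark: {args[0]}")
    return [script, command, *args]


def run_bench(command, *args, data_dir=None):
    """Run bench.sh, optionally overriding DATA_DIR for this call only."""
    env = dict(os.environ)
    if data_dir is not None:
        env["DATA_DIR"] = data_dir
    return subprocess.run(bench_argv(command, *args), env=env, check=True)
```

With this, `run_bench("data")` followed by `run_bench("run", "tpch")` would mirror the two examples in the usage text. Validating the arguments in Python first just gives an earlier error than waiting for the script to reject them.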

Are these changes tested?

I tested them manually on an x86 Mac and an x86 Linux machine.

Are there any user-facing changes?

No, these are just development scripts.

alamb · 2023-04-26T21:10:26Z

Some interesting results already -- I ran a quick experiment to see how much link-time optimization ('lto') helps. The answer is "quite a bit": up to 1.35x faster on individual queries (tpch Query 2), as the tables below show.

alamb@aal-dev:~/arrow-datafusion2/benchmarks$ python compare.py results/alamb_bench/tpch.json results/alamb_bench_compare/tpch.json
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃ /home/alamb… ┃ /home/alamb… ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │    1269.30ms │    1097.60ms │ +1.16x faster │
│ QQuery 2     │     418.17ms │     309.01ms │ +1.35x faster │
│ QQuery 3     │     393.15ms │     365.30ms │ +1.08x faster │
│ QQuery 4     │     212.83ms │     214.36ms │     no change │
│ QQuery 5     │     534.56ms │     531.17ms │     no change │
│ QQuery 6     │     209.39ms │     184.41ms │ +1.14x faster │
│ QQuery 7     │    1037.82ms │     981.81ms │ +1.06x faster │
│ QQuery 8     │     550.99ms │     540.95ms │     no change │
│ QQuery 9     │     982.00ms │     984.53ms │     no change │
│ QQuery 10    │     613.48ms │     560.14ms │ +1.10x faster │
│ QQuery 11    │     272.45ms │     231.46ms │ +1.18x faster │
│ QQuery 12    │     319.91ms │     320.54ms │     no change │
│ QQuery 13    │    1127.46ms │    1087.70ms │     no change │
│ QQuery 14    │     286.89ms │     263.16ms │ +1.09x faster │
│ QQuery 15    │     255.63ms │     233.42ms │ +1.10x faster │
│ QQuery 16    │     302.94ms │     309.25ms │     no change │
│ QQuery 17    │    2891.05ms │    2628.59ms │ +1.10x faster │
│ QQuery 18    │    3123.23ms │    3154.47ms │     no change │
│ QQuery 19    │     511.97ms │     472.75ms │ +1.08x faster │
│ QQuery 20    │    1042.76ms │     938.75ms │ +1.11x faster │
│ QQuery 21    │    1567.78ms │    1611.91ms │     no change │
│ QQuery 22    │     182.56ms │     171.52ms │ +1.06x faster │
└──────────────┴──────────────┴──────────────┴───────────────┘
alamb@aal-dev:~/arrow-datafusion2/benchmarks$ python compare.py results/alamb_bench/tpch_mem.json results/alamb_bench_compare/tpch_mem.json
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃           -o ┃           -o ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │     876.15ms │     796.57ms │ +1.10x faster │
│ QQuery 2     │     265.87ms │     267.19ms │     no change │
│ QQuery 3     │     169.43ms │     164.82ms │     no change │
│ QQuery 4     │     110.07ms │     116.02ms │  1.05x slower │
│ QQuery 5     │     462.34ms │     449.41ms │     no change │
│ QQuery 6     │      44.40ms │      40.54ms │ +1.10x faster │
│ QQuery 7     │    1099.47ms │    1077.84ms │     no change │
│ QQuery 8     │     241.97ms │     247.20ms │     no change │
│ QQuery 9     │     584.01ms │     606.74ms │     no change │
│ QQuery 10    │     301.95ms │     299.14ms │     no change │
│ QQuery 11    │     239.17ms │     221.07ms │ +1.08x faster │
│ QQuery 12    │     153.73ms │     139.95ms │ +1.10x faster │
│ QQuery 13    │     793.76ms │     753.25ms │ +1.05x faster │
│ QQuery 14    │      59.50ms │      49.38ms │ +1.20x faster │
│ QQuery 15    │     103.03ms │      89.82ms │ +1.15x faster │
│ QQuery 16    │     216.38ms │     213.85ms │     no change │
│ QQuery 17    │    3356.99ms │    2866.85ms │ +1.17x faster │
│ QQuery 18    │    3017.82ms │    2910.50ms │     no change │
│ QQuery 19    │     161.19ms │     137.81ms │ +1.17x faster │
│ QQuery 20    │     924.38ms │     855.74ms │ +1.08x faster │
│ QQuery 21    │    1502.52ms │    1460.54ms │     no change │
│ QQuery 22    │     133.10ms │     128.34ms │     no change │
└──────────────┴──────────────┴──────────────┴───────────────┘
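
For readers wondering how the Change column is derived, here is a minimal sketch that reproduces the labels in the two tables. The 5% tolerance is inferred from the rows above (a 3.7% difference shows "no change" while a 5.4% difference is reported), and the function name `describe_change` is hypothetical. This is not necessarily how `compare.py` computes it.

```python
def describe_change(baseline_ms, comparison_ms, tolerance=0.05):
    """Label a timing change the way the Change column above appears to.

    The tolerance default is an assumption inferred from the tables.
    """
    if baseline_ms <= 0:
        return "incomparable"
    ratio = comparison_ms / baseline_ms
    # Differences inside the tolerance band are treated as noise.
    if (1 - tolerance) <= ratio <= (1 + tolerance):
        return "no change"
    if ratio < 1:
        # Comparison run took less time: report the speedup factor.
        return f"+{1 / ratio:.2f}x faster"
    # Comparison run took more time: report the slowdown factor.
    return f"{ratio:.2f}x slower"


print(describe_change(1269.30, 1097.60))  # tpch Query 1     -> +1.16x faster
print(describe_change(212.83, 214.36))    # tpch Query 4     -> no change
print(describe_change(110.07, 116.02))    # tpch_mem Query 4 -> 1.05x slower
```

Treating anything within the band as "no change" keeps run-to-run noise from showing up as a speedup or a regression.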

alamb · 2023-04-28T15:18:23Z

benchmarks/README.md

+
+# Benchmark Descriptions:
+
+## `tpch` Benchmark derived from TPC-H

 These benchmarks are derived from the [TPC-H][1] benchmark. And we use this repo as the source of tpch-gen and answers:


Next I hope / plan to review the other benchmarks and consolidate them, along with their data generation and runner scripts, into the bench.sh framework.

yjshen

Thanks @alamb!

yjshen · 2023-04-30T02:39:47Z

benchmarks/README.md

+# Gather baseline data for tpch benchmark
+./benchmarks/bench.sh run tpch
+
+# Switch to the branch the branch name is mybranch and gather data


👍 I was curious before about what the magic for comparing branches was

Thanks for the review @yjshen -- I am trying to reduce the amount of magic involved.

I am going to merge this in and we can continue to iterate (next I would like to increase the number of different tests supported)

alamb added the development-process Related to development process of DataFusion label Apr 26, 2023

github-actions bot removed the development-process Related to development process of DataFusion label Apr 26, 2023

alamb changed the title ~~Add bench script to benchmark datafusion against itself~~ Add bench.sh script to benchmark DataFusion against itself Apr 26, 2023

alamb force-pushed the alamb/bench branch from c4eba0b to dc2d426 Compare April 28, 2023 15:01

alamb changed the title ~~Add bench.sh script to benchmark DataFusion against itself~~ Add bench.sh script to automate benchmarking DataFusion against itself Apr 28, 2023

Add bench script to benchmark datafusion against itself

7a29fe6

alamb force-pushed the alamb/bench branch from 68f038f to 7a29fe6 Compare April 28, 2023 17:54

improve docs

3b28cd7

alamb marked this pull request as ready for review April 28, 2023 17:58

This was referenced Apr 28, 2023

Add "first impression" benchmark to bench.sh #6156

Open

[Epic] Improved DataFusion Benchmarking #5505

Open

alamb requested a review from andygrove April 28, 2023 20:46

yjshen approved these changes Apr 30, 2023

alamb merged commit 58d15c7 into apache:main Apr 30, 2023

chitralverma mentioned this pull request Apr 30, 2023

benchmark(python): Shift polars TPC-H benchmarks to main repo and include in CI pola-rs/polars#7740

Open

alamb mentioned this pull request May 1, 2023

Add parquet filter and sort to bench.sh #6172

Merged
