Add optional scatterplot to benchcomp output #3077

tautschnig · 2024-03-14T17:44:11Z

Scatterplots should make it easier to immediately spot performance trends (and indeed any differences) rather than having to process a (large) number of table rows.

Uses mermaid-js to produce markdown-embedded plots that will display on the job summary page. Scatterplots are not directly supported by mermaid-js at this point (xycharts only do line or bar charts), so quadrant plots are employed with various diagram items drawn in white to make them disappear.
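To make the quadrant-plot trick concrete, here is a minimal sketch of how such a chart could be emitted. It is not the code from this PR: the function name, the `results` layout (`{benchmark: (old, new)}`), the linear scaling, and the choice of theme variables are all assumptions made for illustration. It relies on Mermaid's `quadrantChart` syntax, which expects point coordinates between 0 and 1, and on the quadrant-chart theme variables to paint quadrant fills and borders white.

```python
import json


def scatterplot_as_quadrant_chart(results, title="old vs new"):
    """Render {benchmark: (old, new)} as Mermaid quadrantChart text.

    Hypothetical helper for illustration only; not benchcomp's API.
    Assumes benchmark names contain no double quotes.
    """
    values = [v for pair in results.values() for v in pair]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero if all values match

    def scale(value):
        # Map into [0.05, 0.95] so that no point sits on the chart border.
        return round(0.05 + 0.9 * (value - lo) / span, 3)

    white = "#ffffff"
    init = {
        "theme": "base",
        "themeVariables": {
            # Draw the quadrant decorations in white so they disappear.
            "quadrant1Fill": white,
            "quadrant2Fill": white,
            "quadrant3Fill": white,
            "quadrant4Fill": white,
            "quadrantInternalBorderStrokeFill": white,
            "quadrantExternalBorderStrokeFill": white,
        },
    }
    lines = [
        f"%%{{init: {json.dumps(init)}}}%%",
        "quadrantChart",
        f"    title {title}",
        '    x-axis "old"',
        '    y-axis "new"',
    ]
    for name, (old, new) in results.items():
        lines.append(f'    "{name}": [{scale(old)}, {scale(new)}]')
    return "\n".join(lines)


if __name__ == "__main__":
    print(scatterplot_as_quadrant_chart({"a": (1.0, 2.0), "b": (3.0, 1.0)}))
```

Wrapping the returned text in a `mermaid` code fence inside the job summary should make GitHub render it as a diagram.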

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.


tautschnig · 2024-03-15T15:25:05Z

See https://github.com/model-checking/kani/actions/runs/8296849261?pr=3077 for what those diagrams now look like.

jaisnan

From the images shown, can you leave a comment here or in documentation somewhere on how to interpret those values? What do the values on the top-right of the plot mean, what do the values on the bottom-left mean etc?


tautschnig · 2024-03-15T22:04:23Z

> From the images shown, can you leave a comment here or in documentation somewhere on how to interpret those values? What do the values on the top-right of the plot mean, what do the values on the bottom-left mean etc?

I am now adding text output to spell out the axis ranges, thank you for calling this out!
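As a rough idea of what such text output could look like, the sketch below formats the value range covered by the axes. The function name, the `{benchmark: (old, new)}` input layout, the shared range for both axes, and the wording of the message are assumptions for illustration, not the text this PR actually prints.

```python
def describe_axis_ranges(results, unit="s"):
    """Spell out the range that both plot axes cover.

    Hypothetical helper: assumes x shows the old value, y the new value,
    and that both axes share one linear range.
    """
    values = [v for pair in results.values() for v in pair]
    lo, hi = min(values), max(values)
    return (
        f"Both axes range from {lo:g} to {hi:g} {unit} "
        "(x: old, y: new)."
    )


if __name__ == "__main__":
    print(describe_axis_ranges({"a": (1.0, 2.0), "b": (3.0, 1.5)}))
```

Under these assumptions, a point in the top-right corner is a benchmark that is expensive in both runs, and a point in the bottom-left corner is one that is cheap in both.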


karkhaz

Thank you, just a few comments.

To be honest the latest version is still a bit hard to read. It's less the axis labels and more that I'm not confident that I would be able to spot a 20% increase or decrease from the y=x line, for example. Also if a particular point has moved far away from the y=x line, it's not that easy to determine which proof or benchmark that actually corresponds to.

As a totally different approach to quickly identifying outlying benchmarks---would it be easier if I actually implemented the benchcomp filter command, which allows you to filter benchmarks based on certain criteria before they get emitted to the Markdown table?

That way, we could print a "filtered" set of Markdown tables (containing only regressions) followed by the complete results. That might be superior to the current approach where regressions are highlighted in bold in the complete table.

So the action would look something like this:

      - name: Perf Regression Table
        run: |
          echo "# Regressions" >> "$GITHUB_STEP_SUMMARY"
          new/tools/benchcomp/bin/benchcomp \
            --config new/tools/benchcomp/configs/perf-regression.yaml \
            visualize --only dump_markdown_results_table >> "$GITHUB_STEP_SUMMARY"

      - name: Perf All Results Table
        run: |
          echo "# All Results" >> "$GITHUB_STEP_SUMMARY"
          new/tools/benchcomp/bin/benchcomp \
            --config new/tools/benchcomp/configs/perf-all-results.yaml \
            visualize --only dump_markdown_results_table >> "$GITHUB_STEP_SUMMARY"

      - name: Run other visualizations
        run: |
          new/tools/benchcomp/bin/benchcomp \
            --config new/tools/benchcomp/configs/perf-regression.yaml \
            visualize --except dump_markdown_results_table
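The "only regressions" idea can be sketched independently of benchcomp. The snippet below is not the `benchcomp filter` implementation (that command was added separately in #3105); the function name, the `{benchmark: (old, new)}` input layout, and the 20% default threshold are assumptions chosen to match the example in the comment above.

```python
def regressions(results, threshold=0.2):
    """Return {benchmark: relative_change} for entries that got worse
    by more than `threshold` (0.2 means 20%).

    Hypothetical helper for illustration only.
    """
    flagged = {}
    for name, (old, new) in results.items():
        if old <= 0:
            # Relative change is undefined without a positive baseline.
            continue
        change = (new - old) / old
        if change > threshold:
            flagged[name] = change
    return flagged


if __name__ == "__main__":
    sample = {"a": (10.0, 13.0), "b": (10.0, 11.0), "c": (0.0, 1.0)}
    for name, change in regressions(sample).items():
        print(f"{name}: {change:+.0%}")
```

Naming the flagged benchmarks directly addresses the concern that a point far from the y=x line is hard to map back to a specific proof or benchmark.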


tautschnig · 2024-03-18T14:45:28Z

> To be honest the latest version is still a bit hard to read. It's less the axis labels and more that I'm not confident that I would be able to spot a 20% increase or decrease from the y=x line, for example. Also if a particular point has moved far away from the y=x line, it's not that easy to determine which proof or benchmark that actually corresponds to.

Yes, I'd indeed love to be able to draw that diagonal line into the diagram, but I haven't been able to figure out a way to do this in Mermaid. Perhaps it's just about time to create a bunch of feature requests against Mermaid.

> As a totally different approach to quickly identifying outlying benchmarks---would it be easier if I actually implemented the benchcomp filter command, which allows you to filter benchmarks based on certain criteria before they get emitted to the Markdown table?

Perhaps we should have all of these tools available and then see what helps us the most?

karkhaz

Thank you, looking forward to seeing this in CBMC also!

@zhassan-aws

Updated version in all `Cargo.toml` files (via `find . -name Cargo.toml -exec sed -i 's/version = "0.48.0"/version = "0.49.0"/' {} \;`) and ran `cargo build-dev` to have `Cargo.lock` files updated.

GitHub generated release notes:

## What's Changed

* Upgrade Rust toolchain to 2024-03-14 by @zhassan-aws in #3081
* Disable removal of storage markers by @zhassan-aws in #3083
* Automatic cargo update to 2024-03-18 by @github-actions in #3086
* Bump tests/perf/s2n-quic from `1a7faa8` to `9e39ca0` by @dependabot in #3087
* Upgrade toolchain to nightly-2024-03-15 by @celinval in #3084
* Add optional scatterplot to benchcomp output by @tautschnig in #3077
* Benchcomp scatterplots: quote axis labels by @tautschnig in #3097
* Expand ${var} in benchcomp variant `env` by @karkhaz in #3090
* Add test for #3099 by @zhassan-aws in #3100
* Automatic cargo update to 2024-03-25 by @github-actions in #3103
* Bump tests/perf/s2n-quic from `1a7faa8` to `0a60ec1` by @dependabot in #3104
* Implement validity checks by @celinval in #3085
* Add `benchcomp filter` command by @karkhaz in #3105
* Add CI test for --use-local-toolchain by @jaisnan in #3074
* Upgrade Rust toolchain to `nightly-2024-03-21` by @adpaco-aws in #3102
* Use `intrinsic_name` to get the intrinsic name by @adpaco-aws in #3114
* Bump tests/perf/s2n-quic from `0a60ec1` to `2d5e891` by @dependabot in #3118
* Allow modifies clause for verification only by @feliperodri in #3098
* Automatic cargo update to 2024-04-01 by @github-actions in #3117
* Automatic cargo update to 2024-04-04 by @github-actions in #3122
* Remove bookrunner by @tautschnig in #3123
* Upgrade Rust toolchain to nightly-2024-03-29 by @feliperodri in #3116
* Remove unnecessary build step for some workflows by @zhassan-aws in #3124
* Ensure storage markers are kept in std code by @zhassan-aws in #3080

**Full Changelog**: kani-0.48.0...kani-0.49.0


tautschnig self-assigned this Mar 14, 2024

tautschnig added the Z-BenchCI Tag a PR to run benchmark CI label Mar 14, 2024

tautschnig force-pushed the benchcomp-scatterplot branch from d1069cb to f47914f Compare March 15, 2024 08:54

tautschnig added Z-BenchCI Tag a PR to run benchmark CI and removed Z-BenchCI Tag a PR to run benchmark CI labels Mar 15, 2024

tautschnig added 4 commits March 15, 2024 10:07

Quote benchmark name

a6ec298

Rounding, test fix

7769fff

Don't produce 1.0 value

37aea31

Fix init string

25d856a

tautschnig removed their assignment Mar 15, 2024

tautschnig marked this pull request as ready for review March 15, 2024 15:24

tautschnig requested a review from a team as a code owner March 15, 2024 15:24

tautschnig mentioned this pull request Mar 15, 2024

Run performance comparison in CI using Kani's Benchcomp diffblue/cbmc#8171

Merged

3 tasks

jaisnan reviewed Mar 15, 2024

tools/benchcomp/benchcomp/visualizers/__init__.py

tools/benchcomp/test/test_regression.py

Include axis range information

1490250

tautschnig added 4 commits March 15, 2024 22:07

Disable debug

099ec51

Revert "Disable debug"

bd3d562

This reverts commit 099ec51.

Fix whitespace

82f23fc

Make padding explicit

6ec598d

karkhaz reviewed Mar 18, 2024


tautschnig and others added 3 commits March 18, 2024 15:33

Update tools/benchcomp/benchcomp/visualizers/__init__.py

cd1dd67

Co-authored-by: Kareem Khazem <karkhaz@karkhaz.com>

Update tools/benchcomp/benchcomp/visualizers/__init__.py

b7e0317

Co-authored-by: Kareem Khazem <karkhaz@karkhaz.com>

Update tools/benchcomp/benchcomp/visualizers/__init__.py

df9234a

Co-authored-by: Kareem Khazem <karkhaz@karkhaz.com>

tautschnig added 3 commits March 19, 2024 08:50

Merge remote-tracking branch 'origin/main' into benchcomp-scatterplot

6dd86a7

Try different padding

7d25d36

Refine padding

426751a

tautschnig added 2 commits March 19, 2024 13:05

More whitespace

b7c3a14

More whitespace

735a5e8

karkhaz approved these changes Mar 19, 2024


tautschnig merged commit ec2b8b3 into model-checking:main Mar 20, 2024
20 checks passed

tautschnig deleted the benchcomp-scatterplot branch March 20, 2024 06:21

tautschnig mentioned this pull request Apr 5, 2024

Bump Kani version to 0.49.0 #3126

Merged
