SwingSet benchmarking tool, phase 1 #8239

FUDCo · 2023-08-23T02:13:05Z

This is the first pass on benchmarking. I labeled it "phase 1" because I think this is just the jumping off point for what I expect will be a lot of evolution. The contents of this PR are a complete unit of work, in the sense that what we have with this is a stable snapshot, but it's not really at the point where we can declare "we have benchmarking now". In other words, there will be much more to follow, but this here is suitable for merging.

I've added a bunch of names to the reviewer list, because a lot of you engaged with my lengthy Slack threads discussing some of the challenges that I was faced with in doing this work, but if you think other people have the matter better in hand or you just aren't that interested, please feel free to remove yourself as a reviewer.

Note that this PR is structured as a series of commits that are individually reviewable; the changes here are probably best understood by reading through the commits one at a time.

What these changes ultimately lead to is a version of the vault-opening benchmark that can be run from inside swingset-runner using swingset-runner's benchmarking facilities. Most of what's here is all the various tweaks that were needed in order to make that possible. One thing that is hard to distinguish at this stage is which changes/additions were necessary to run this particular benchmark in something resembling the chain environment vs. which are necessary to run such things in general -- it's hard to generalize from a sample of 1. In particular, I think it's quite plausible that there are things missing that will only be discovered when we try to benchmark other operations.

These changes fix the issues I identified with things that were blocking the bootstrap vat's bootstrap method from completing, and to make sure that the changes necessary for that don't break any of our other tests (which, to make a long story short, they did at first). In some sense this an academic question, since we have already booted the mainnet chain, but in another important sense they're not academic at all because we are constantly restarting and rerunning things when doing development and testing.

A non-comprehensive list of things that still need to be done:

Make it more straightforward to run swingset-runner from anywhere; right now it really likes to be run from the packages/swingset-runner directory.
Produce a few more benchmarks to better learn where the cowpaths are
Based on the above, try to package a lot of the supporting code for running these benchmarks into a library
Better tooling for harvesting measurement data and feeding it to somewhere it can be analyzed
Explicit exploitation of the various v8/xs perf measurement APIs, now that we have a place to stand to do that
More deliberate unpacking of @warner's notion of different levels of benchmarks (e.g., whether or not to incorporate requests over the bridge, whether or not to incorporate the rest of the chain, etc). In particular, it would be nice to not have to maintain slightly different versions of the same code that differ just so they can run in these different contexts.
Based on what we figure out about the above, more work on tools to drive actual benchmarking from some kind of table-driven configuration
Incorporate some portion of the above into our CI tooling so that things get profiled regularly or when things change

For the record, to run the vaults benchmark, go into the packages/swingset-runner directory and enter the incantation:

bin/runner --chain --sbench --benchmark ${ROUNDS} --config demo/vaultPerfTest/vaultPerfTest-config.json run

where ROUNDS is the number of rounds of the test you want to execute.
If you want to be extra amused, add the --verbose option also. Be forewarned that the output will be prodigious.

dckc

The changes in inter-protocol and boot look OK.

I didn't look closely at the rest.

packages/SwingSet/src/controller/initializeSwingset.js

packages/boot/test/bootstrapTests/supports.js

packages/inter-protocol/src/proposals/addAssetToVault.js

packages/inter-protocol/src/proposals/price-feed-proposal.js

packages/swingset-runner/src/main.js

packages/swingset-runner/demo/vaultPerfTest/vaultPerfTest-config.json

warner

swingset/swingset-runner changes look ok to me, modulo my discomfort with the (necessary, as currently written) dependency "inversion"

I rely upon others to review the changes to inter-protocol/ and boot/ , but they seem ok to me too.

warner · 2023-08-29T06:16:44Z

packages/swingset-runner/package.json

    "@agoric/ertp": "^0.16.2",
    "@agoric/internal": "^0.3.2",
+    "@agoric/inter-protocol": "^0.16.1",


I'm a bit weirded out by having swingset-runner sit above inter-protocol/deploy-script-support in the dependency graph. In general, swingset code is unaware of chains or the Agoric economic layers, and if we intend to move all swingset-related packages to their own repo some day, this will be a sticking point.. we may need to leave swingset-runner behind in agoric-sdk.

We've talked about having a separate package which provides the benchmarking support: when we do that, please consider moving swingset-runner/src/chain.js and the demo/vaultPerfTest/ stuff that provokes this dependency to that other package, and returning swingset-runner to only depend upon swingset stuff. Or maybe making a @agoric/swingset-runner-chain package to hold it.

We've talked about having a separate package which provides the benchmarking support: when we do that, please consider moving swingset-runner/src/chain.js and the demo/vaultPerfTest/ stuff that provokes this dependency to that other package

An existing candidate for such another package is packages/boot. It has some somewhat similar test materials.

@turadg now that you untangled the dependency hierarchy in agoric-sdk, is a dependency from swingset-runner to inter-protocol a serious regression? I don't think it breaks the "no cycles" rule, but perhaps you figured out some other structural organization that is important to preserve?

dependency from swingset-runner to inter-protocol a serious regression

I had this concern but this doesn't actually create any cycles. CI would prevents such a regression.

weirded out by having swingset-runner sit above inter-protocol/deploy-script-support in the dependency graph

That makes sense to me. The vaults benchmark being run would have to be high up the graph but the "runner" tool could be lower.

I share Brian's concern about contaminating swingset-runner with stuff that's not part of SwingSet (mainly chain.js -- I don't think demo/vaultPerfTest is as much of a concern as it's isolated), but given how things are structured right now there wasn't really any other graceful way to do it. The longer term approach, I think, is for swingset-runner to support some kind of plugin scheme, so that the chain-specific stuff (or other hypothetical future uses for SwingSet) that needs to be loaded outside the SwingSet can be accommodated with code that would live in a different package, be tested independently, etc. But that would involve playing various games with dynamic import and so on, and that's just not on the critical path at the moment. I think what's in this PR is the best we're going to do for now. However, I will create a separate issue to put in the backlog so we don't lose track.

Update: added #8278 to address this.

…ark stuff - Change the default parity of init vs. resume: The `--init` command line option is now redundant as it is now the default (though it is still supported for backwards compatibility). In its place is a new `--resume` option with the same meaning of the older (also still supported) `--noinit` option. What this means is that absent command line parameters to the contrary, the default behavior of a swingset-runner execution will be to re-initialize the kernel state at the start of a run. This is a UX simplification, a reflection of the fact that in practice we found we wanted `--init` about 99% of the time, while exercising swingset resumption was very rare. Various scripts that depended on the old default behavior have been updated for the new world order. - There is a new command line option `--usebundlecache` that if present will configure the swingset being run to use a bundle cache. This will noticeably speed up successive executions of swingsets to a lot of bundling (for example, if the have a lot of vats). The bundle cache will be placed in the same directory with the swingstore (which is also typically where the swingset's source files and config file (if it has one) are maintained).

Remove awaits that were blocking bootstrap from completing because they were waiting on events that can only happen after bootstrap is done, while taking care to log any rejections that these post-bootstrap activities yield.

…m bootstrap

In this mode, the configured swingset is loaded indirectly, interposing a swingset-runner-provided benchmark controller as the swingset's bootstrap vat. The benchmark author is expected to provide a benchmark driver vat in a file named `vat-benchmark.js` adjacent to the swingset's config file. The benchmark driver vat is expected to expose the methods `setup` and `runBenchmarkRound`. The controller bootstrap invokes the swingset's "real" bootstrap and then tells the benchmark driver vat to perform setup for the benchmark. It then serves as a controller for triggering the execution of the directed number of bootstrap rounds. See demo/vaultPerfTest/vat-benchmark.js for an example.

This mode is invoked via the `--chain` command line option. It adds bridge and timer devices, simulated chain storage, and kicks off bootstrap using the core proposals manifest mechanism. This is an attempt to provide a minimal emulation of the services provided by cosmic-swingset without actually putting cosmic-swingset itself into the mix. I expect the functionality here to evolve considerably as we add more chain benchmarks.

Note: This commit also includes the stripped version of the Ava-based vaults benchmark, which is like the regular Ava-based benchmark but has had all the timing and data collection code removed from it (making it much shorter and easier to read) in the expectation of external rather than internal measurement. This was an engineering stepping stone to getting the benchmark to run inside a swingset, but is being committed to the repo because we suspect it may be a useful reference in the future.

…n context

FUDCo requested review from turadg, warner, mhofman, dckc, michaelfig, gibson042 and toliaqat August 23, 2023 02:13

dckc reviewed Aug 23, 2023

View reviewed changes

FUDCo force-pushed the swingset-perf-tool branch 2 times, most recently from 8f82a9e to 5fa8795 Compare August 28, 2023 17:45

FUDCo added swingset-runner package: swingset-runner performance Performance related issues labels Aug 28, 2023

FUDCo self-assigned this Aug 28, 2023

warner approved these changes Aug 29, 2023

View reviewed changes

FUDCo added 9 commits August 29, 2023 13:13

fix: minor fixes and cleanups in preparation for benchmark support

1277176

fix: remove manifest declarations that things are consumed that aren't

9e0272c

fix: remove infelicitous awaits in proposals

838ed96

Remove awaits that were blocking bootstrap from completing because they were waiting on events that can only happen after bootstrap is done, while taking care to log any rejections that these post-bootstrap activities yield.

fix: make tests work again after some (bogus) awaits were removed fro…

ca0a3aa

…m bootstrap

feat: enable swingset config file to be anywhere relative to executio…

9777c3a

…n context

FUDCo force-pushed the swingset-perf-tool branch from 5fa8795 to 9777c3a Compare August 29, 2023 20:13

FUDCo added the automerge:rebase Automatically rebase updates, then merge label Aug 29, 2023

mergify bot merged commit d56e0f7 into master Aug 29, 2023
78 checks passed

mergify bot deleted the swingset-perf-tool branch August 29, 2023 20:36

FUDCo mentioned this pull request Aug 29, 2023

Change swingset-runner modularity so we can move chain stuff out of it #8278

Open

This was referenced Sep 7, 2023

Introducing The Benchmarkerator, a platform for standalone performance benchmarks #8312

Merged

Implement framework for benchmarking operations on chain #8327

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

SwingSet benchmarking tool, phase 1 #8239

SwingSet benchmarking tool, phase 1 #8239

FUDCo commented Aug 23, 2023

dckc left a comment

warner left a comment

warner Aug 29, 2023

dckc Aug 29, 2023

dckc Aug 29, 2023

turadg Aug 29, 2023

FUDCo Aug 29, 2023

FUDCo Aug 29, 2023

SwingSet benchmarking tool, phase 1 #8239

SwingSet benchmarking tool, phase 1 #8239

Conversation

FUDCo commented Aug 23, 2023

dckc left a comment

Choose a reason for hiding this comment

warner left a comment

Choose a reason for hiding this comment

warner Aug 29, 2023

Choose a reason for hiding this comment

dckc Aug 29, 2023

Choose a reason for hiding this comment

dckc Aug 29, 2023

Choose a reason for hiding this comment

turadg Aug 29, 2023

Choose a reason for hiding this comment

FUDCo Aug 29, 2023

Choose a reason for hiding this comment

FUDCo Aug 29, 2023

Choose a reason for hiding this comment