continuous testing - bot-like traffic to stress test reliability and availability #59
Follow up, look into:
@InoMurko @achiurizo
is it ok?
Before you do 2 and 3, we need to look at what
Versioning the tests repo or adding feature tags to tests
can you elaborate on that? 🤔
regarding
Ayrat works on /v2/block.get and adds tests into the cabbage branch ayrat/v2_block_get. If you merge your PR into childchain2, how does Arthur, in his PR, reference your newly added tests when it gets into childchain2 master? So... what I'm trying to avoid is having to maintain
As long as every stakeholder (@achiurizo @T-Dnzt) is aware of the associated cost and risk.
maintain a syncing your
I think for the current interim, this will probably be the case until we decommission
It really probably has to come to some form of branch management like this. Ignoring the API example above, contract changes alone (like the v2 contracts being worked on) will need some tracking branch on the other repositories to test the changes (so that would be
Now, I'll also mention that we are also hiring for a QA role 🔜. Hopefully this
@ayrat555 could we get a review of the current
We have watcher_info tests in perf: https://github.com/omgnetwork/elixir-omg/blob/master/priv/perf/apps/load_test/test/load_tests/runner/watcher_info_test.exs
The main logic is in this file: https://github.com/omgnetwork/elixir-omg/blob/master/priv/perf/apps/load_test/lib/scenario/account_transactions.ex
This one simulates a user using the watcher info APIs to create and send transactions (and randomly hitting other watcher info APIs). The script was tested to be able to run for a full day a while back (note: at the time, hackney retry was turned on). It might be a good one to use for simulating production load (plus this is more similar to how our users would hit our service).
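For anyone who hasn't opened those files, here's a rough sketch of what a Chaperon scenario along those lines looks like (module name, endpoint paths, and payloads below are placeholders, not the actual account_transactions.ex code):

```elixir
# Sketch only: a simulated user that repeatedly creates and submits a
# transaction through watcher-info-style endpoints.
defmodule LoadTest.Scenario.AccountTransactionsSketch do
  use Chaperon.Scenario

  def run(session) do
    # Repeat the user flow; Chaperon re-invokes the named scenario function.
    session
    |> repeat(:create_and_submit, 100)
  end

  def create_and_submit(session) do
    session
    |> post("/transaction.create", json: %{owner: "0x...", amount: 1})
    |> post("/transaction.submit_typed", json: %{signed_tx: "0x..."})
  end
end
```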
omg-js integration tests (org-mode format)

** addTokenTest
*** add token that the chain had
*** add token that the chain did not have
** amountTypesTest
*** validates types that approveToken accepts
** challengeExitTest
*** challenge a dishonest exit
** challengeInFlightExitInputSpentTest
*** challenge an in-flight exit as non-canonical and challenge an invalid input piggyback
** challengeInFlightExitOutputSpentTest
*** challenge output from a canonical invalid output piggyback
** createSubmitTypedTransactionTest
*** create and submit a single currency transaction using submitTyped
** createTransactionTest
*** create and submit a single currency transaction
*** create a single currency transaction
*** create and submit a mixed currency transaction
*** create an incomplete transaction
*** split a utxo into 4
** deleteNonPiggybackedInFlightExitTest
*** delete an IFE if not piggybacked after the first phase and return the IFE bond
** depositTest
*** deposit ETH to the Plasma contract
*** deposit ERC20 tokens to the Plasma contract
** exitWithoutWatcherTest
*** creates required exit data
** getExitQueueTest
*** processes exits and returns an empty queue
** inFlightExitChallengeResponseTest
*** invalid IFE challenge response
** inFlightExitChallengeTest
*** challenge an in-flight exit by providing a competitor
*** challenge an in-flight exit as non-canonical
** inFlightExitTest
*** exit a ChildChain transaction
*** exit a ChildChain transaction that is not included
** mergeUtxoTest
*** merge utxos to a single utxo
** metadataTest
*** checks if childchain adds different types of metadata
** simpleStartStandardExitTest
*** start standard exit for a ChildChain transaction
** standardExitTest
*** exit a deposit
*** exit a ChildChain transaction
*** exit ERC20 token
** transferTest
*** transfer ETH and ERC20 tokens on the childchain
*** transfer funds on the childchain with eth currency
*** transfer ERC20 tokens on the childchain
*** transfer ETH on the childchain
*** send a transaction with 4 inputs and 4 outputs
Do we really need to delete anything from omg-js integration tests? I think the main purpose of them is to test the correctness of omg-js, not elixir-omg.
I think it's possible to use
So my suggestion is to create separate cases which will cover the points in the issue.
No, I don't think we need to delete anything here - apart from testing the omg-js code itself, they also provide good example code for JS developers.
There's already deposit and transact code in there, and PR omgnetwork/elixir-omg#1732 (review needed 😄 ) introduces a test for starting standard exits. Exit processing, IFEs, challenges, etc still need to be added.
Yeah, the perf tests are just that - to measure perf/load/stress. It's assumed that correctness is already tested elsewhere. One thing I find cumbersome about Chaperon tests is how they're configured - if you want to change e.g. how many iterations to run, you have to edit the config file, recompile, and run. It would be great to be able to pass params on the command line. There are lots of configuration params though, so ideally we could set defaults in a file and override them on the command line. I don't know what's the best way of doing that in Elixir though... (a rough sketch of one approach below)
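One possible shape for that (the task name, option keys, and runner entry point are hypothetical): keep defaults in the config file as today, and have a mix task merge command-line flags over them, so changing the iteration count no longer needs a recompile.

```elixir
defmodule Mix.Tasks.LoadTest.Run do
  use Mix.Task

  # CLI flags allowed to override the file-based defaults.
  @switches [iterations: :integer, duration_s: :integer, tps: :integer]

  def run(argv) do
    {cli_opts, _args, _invalid} = OptionParser.parse(argv, strict: @switches)

    # File-based defaults win only where no flag was passed.
    defaults = Application.get_env(:load_test, :runner, [])
    opts = Keyword.merge(defaults, cli_opts)

    Mix.shell().info("Running with #{inspect(opts)}")
    # LoadTest.Runner.run(opts)  # hypothetical entry point
  end
end
```

Usage would then be e.g. `mix load_test.run --iterations 500 --tps 100`.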
So if we agree that omg-js and perf/ are two separate projects, not supposed to cover the same thing... I propose we do the following: let's make perf usable all around by:
Of course, the question is: why would we build a frontend if we have omg-js? I think we will see that the amount of pressure an Elixir node can create (in terms of IO and concurrency) would far exceed any other tooling. As this is considered developer tooling and testing, we will specify our requirements on the fly.
can you elaborate a bit more on why the frontend is needed?
To elaborate more on what's needed for the release pipeline: either perf is run directly on the release pipeline's agent and reports back whether the tests succeeded, or perf needs to call a webhook back to the release pipeline to continue or abort the automated promotion.
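A minimal sketch of the webhook variant, assuming hackney (already a dependency) plus Jason; the URL and payload shape would be whatever the pipeline expects, nothing is defined yet:

```elixir
defmodule LoadTest.PipelineWebhook do
  # Report pass/fail back to the release pipeline so it can continue or
  # abort the automated promotion.
  def report(webhook_url, result) when result in [:passed, :failed] do
    body = Jason.encode!(%{status: result})
    headers = [{"content-type", "application/json"}]

    case :hackney.request(:post, webhook_url, headers, body, []) do
      {:ok, status, _resp_headers, _ref} when status in 200..299 -> :ok
      {:ok, status, _resp_headers, _ref} -> {:error, {:unexpected_status, status}}
      {:error, reason} -> {:error, reason}
    end
  end
end
```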
In terms of usability, APIs are great for machine-to-machine collaboration. APIs are less suitable for observability and introspection, where humans are needed. And as we turn these tests into stateful services that might run for more than 24 hours, this kind of tooling will become useful. Imagine Datadog being just APIs...
I think I kind of get it, but I'm still not following well. I get that a UI is better for humans; I'm curious what you expect to show/use in the UI. From the description in the comment above, it seems to provide the ability to set config via a form UI? For the other part, if we need any inspection of the perf test while it's running, shouldn't we just get observability in Datadog? Or do you want to build a fancy UI showing how the load is hitting 😱😱
I was playing with LiveView over the weekend. It seems pretty neat because you can write interactive web pages almost without writing any JS code. But I have the same question as @boolafish: what kind of real-time information will be shown there (errors, rps, current test, progress, etc.)?
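For the sake of discussion, a minimal LiveView sketch (hypothetical module and topic names, assuming a recent LiveView with the ~H sigil): the test runner broadcasts progress over PubSub and the page re-renders on each message.

```elixir
defmodule LoadTestWeb.RunLive do
  use Phoenix.LiveView

  def mount(_params, _session, socket) do
    # Subscribe to runner updates once the websocket is connected.
    if connected?(socket), do: Phoenix.PubSub.subscribe(LoadTest.PubSub, "run:stats")
    {:ok, assign(socket, rps: 0, errors: 0, progress: 0)}
  end

  def handle_info({:stats, %{rps: rps, errors: errors, progress: progress}}, socket) do
    {:noreply, assign(socket, rps: rps, errors: errors, progress: progress)}
  end

  def render(assigns) do
    ~H"""
    <h1>Load test run</h1>
    <p>RPS: <%= @rps %> | Errors: <%= @errors %> | Progress: <%= @progress %>%</p>
    """
  end
end
```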
Will provide details by EOD! |
Because these tests are going to be stateful, we would display:
I think this is great. Thanks for looking into this! I approve. 👍 |
This is just a suggestion: while the form and UI do sound nice to work with, Datadog also provides synthetic testing, which allows us to create custom API/browser tests within Datadog for a service. A couple of nice things:
what can we do now as a proof of concept:
FYI, sharing some knowledge:
@boolafish I have a question about the third point. Do you suggest adding Datadog metrics to the current tests? Will it be useful?
ah... I was inspired by this: https://k6.io/blog/datadog-alerts-fail-load-test But something not going as far as 👆above would be, say, getting an API latency metric (e.g. P90/P99) and asserting on that. That is probably not that great if we use the metric from
I guess the above might be some basic metrics we can assert against (latency, TPS, error rate) for performance tests. Then we can have automated tests saying we can run 100 TPS for 30 minutes with P99 latency not exceeding X ms and an error rate < Y%; a rough sketch of such a gate is below. Currently... lol, it really depends on the release manager deciding what to look at after running the load test. (We run a load test on sandbox when there is a release, but there is no good interpretation at the moment.) For something fancier like memory leaks or CPU rate, we might apply the idea from the article.
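The sketch, as an ExUnit test (the runner API and thresholds are made up; real values would come from config):

```elixir
defmodule LoadTest.ThresholdTest do
  use ExUnit.Case

  @max_p99_ms 500       # assumed threshold (the X above)
  @max_error_rate 0.01  # assumed threshold (the Y above)

  test "sustained 100 TPS stays within latency and error budgets" do
    # Hypothetical runner returning per-request latencies and error counts.
    %{latencies_ms: latencies, errors: errors, total: total} =
      LoadTest.Runner.run(tps: 100, duration_s: 30 * 60)

    # P99: the latency below which 99% of requests fall.
    p99 = latencies |> Enum.sort() |> Enum.at(trunc(length(latencies) * 0.99))

    assert p99 <= @max_p99_ms
    assert errors / total <= @max_error_rate
  end
end
```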
Link to some pipeline requirements and random thoughts: #141 (comment)
An update on the release tooling: we have decided to go forward with Spinnaker, which has out-of-the-box support to trigger and monitor Kubernetes jobs. As a result, the way to hook perf into the release pipeline would be to trigger perf as a Kubernetes job.
Done with the performance and Spinnaker part. Thanks @ayrat555 @boolafish
We would like to stress test our services (childchain + vault + postgres specifically) and see how they work under continuous traffic.
The testing kit should do the following:
The backoffice testing kit will do accounting-like operations:
Update: task list based on #59 (comment) and #59 (comment):

- Create a module in the perf project that will create deposits and test balances on the child chain and the root chain. TPS, addresses, period, and the geth node URL have to be configurable.
- Create a simple UI which will show test parameters (status, configuration, balances, etc.). The UI can also start and configure tests.
- Integrate Datadog to collect stats (https://app.datadoghq.com/dashboard/pz5-694-h7e/app-metrics-vm-stats?from_ts=1598185098067&live=true&to_ts=1600777098067&tpl_var_dashboard_app_env=development) and metrics; see the sketch after this list.
- Prepare a Docker container/release so the app can be deployed.
- Create an API so CircleCI builds can trigger the perf app.
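For the Datadog item, a minimal sketch assuming the Statix DogStatsD client (the wrapper module and metric names are hypothetical; Statix itself needs agent host/port config):

```elixir
defmodule LoadTest.Metrics do
  use Statix

  # Wrap a request, recording its latency and outcome as DogStatsD metrics.
  def measure_request(name, fun) do
    {duration_us, result} = :timer.tc(fun)
    histogram("perf.request.latency_ms", div(duration_us, 1000), tags: ["req:#{name}"])

    case result do
      {:ok, _} -> increment("perf.request.success", 1, tags: ["req:#{name}"])
      {:error, _} -> increment("perf.request.error", 1, tags: ["req:#{name}"])
    end

    result
  end
end

# Usage, after LoadTest.Metrics.connect() at application start:
#   LoadTest.Metrics.measure_request("block_get", fn -> Client.get_block(hash) end)
```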