Use fast-check to test concurrency consistency invariants #382

Closed
CMCDragonkai opened this issue Jun 12, 2022 · 35 comments
Labels
development Standard development r&d:polykey:core activity 1 Secret Vault Sharing and Secret History Management r&d:polykey:core activity 3 Peer to Peer Federated Hierarchy

Comments

@CMCDragonkai
Member

CMCDragonkai commented Jun 12, 2022

Specification

We are likely to have a lot of concurrency consistency invariants to test. EFS showed us a lot of potential pitfalls and race-condition bugs. SI at least ensures that we always have consistency at the DB layer; however, testing concurrency is always complicated and involves some overhead. Simple expect tests are not enough.

Enter https://github.com/dubzzz/fast-check, a property-based testing library based on QuickCheck. This can be used for quick, lightweight model-based testing.

Primarily, QuickCheck-style libraries provide 2 things:

  1. A quick way of generating random data according to some schema, which is useful for any kind of fuzzing (and we saw how useful fuzzing was in solving our DB encoding issues)
  2. A way to specify models and test whether the software conforms to them.

While other model-based testing tools such as ModelJUnit [1] are perfectly valid alternatives, as a model-based testing tool, we selected QuickCheck [4] for the fact that it provides test case shrinking functionality whereby, upon detecting a failure, it searches for smaller test cases which also generate the fault to aid debugging. Although initially developed as a tool for the Haskell programming language, it has been implemented on a number of platforms and for this work, we utilise the Erlang implementation [2]. QuickCheck is primarily a random test data generation tool but it also supports model-based testing. Given a model and a set of pre- and postconditions, the tool is able to automatically generate test cases which are used to verify that the conditions hold in the context of different sequences of actions.

So we should see if fast-check can help us do concurrency testing in a number of domains:

  1. Our network domain has lots of concurrency issues
  2. Vaults domain - lots of reads/writes here
  3. Nodes domain - lots of management of nodes
  4. Gestalts and ACL domain

Additional context

Tasks

  1. Investigate how fast-check can help with concurrent tests and fuzz tests
  2. Investigate how to integrate fast-check into jest testing including timeouts and reporting
  3. Apply fast-check to EFS tests and PK tests
@CMCDragonkai CMCDragonkai added the development Standard development label Jun 12, 2022
@teebirdy teebirdy added r&d:polykey:core activity 1 Secret Vault Sharing and Secret History Management r&d:polykey:core activity 3 Peer to Peer Federated Hierarchy labels Jul 24, 2022
@tegefaulkes
Contributor

Ok, it looks like using a combination of fc.asyncProperty and fc.scheduler gives us finer control over the order of concurrent operations. It will run the test multiple times with a random ordering for the scheduler. This will help a lot in the EFS tests, where in many cases we run the test twice with two different orders, so we can reduce the lines of code for those tests.

One of the better features here is that we can specify a deterministic test for re-creating failures, or we can define our own test conditions that will be run every time. When a test fails it will log out the exact conditions that led to the failure. To make full use of this we will need our own custom arbitraries: https://github.com/dubzzz/fast-check/blob/main/packages/fast-check/documentation/AdvancedArbitraries.md

Here is an example of how to use this.

  test('concurrent trusting gestalts', async () => {
    const acl = await ACL.createACL({ db, logger });
    await fc.assert(
      fc.asyncProperty(fc.scheduler(), async (s) => {
        await Promise.all([
          s.schedule(acl.setNodesPerm([nodeIdG1First, nodeIdG1Second] as Array<NodeId>, {
            gestalt: {
              notify: null,
            },
            vaults: {
              [vaultId1]: { pull: null },
            },
          })),
          s.schedule(acl.setNodesPerm([nodeIdG1First, nodeIdG1Second] as Array<NodeId>, {
            gestalt: {
              notify: null,
            },
            vaults: {
              [vaultId1]: { pull: null },
            },
          })),
          s.waitAll(),
        ]);
      })
    );
  });

Example of the failure:


Error: Property failed after 1 tests
{ seed: -1604596825, path: "0", endOnFailure: true }
Counterexample: [schedulerFor()`
-> [task${2}] promise rejected with value new Error("")
-> [task${1}] promise resolved`]
Shrunk 0 time(s)
Got error: ErrorDBTransactionConflict

Stack trace: ErrorDBTransactionConflict: 
    at constructor_.commit (/home/faulkes/matrixcode/polykey/js-polykey/node_modules/@matrixai/db/src/DBTransaction.ts:461:17)
    at /home/faulkes/matrixcode/polykey/js-polykey/node_modules/@matrixai/db/src/DB.ts:210:15
    at withF (/home/faulkes/matrixcode/polykey/js-polykey/node_modules/@matrixai/resources/src/utils.ts:31:7)

Hint: Enable verbose mode in order to have the list of all failing values encountered during the run

    at asyncThrowIfFailed (/home/faulkes/matrixcode/polykey/js-polykey/node_modules/fast-check/lib/check/runner/utils/RunDetailsFormatter.js:133:11)

I think fast-check's utility is limited here. If we're just testing for transaction conflicts then the order of operations doesn't really matter much. It can be useful for randomising values for testing such as IDs and strings if we want to add more robustness to our testing.

@tegefaulkes
Contributor

tegefaulkes commented Aug 16, 2022

As for generating random values for testing, it can be done like so:

  test('random values', async () => {
    await fc.assert(
      fc.asyncProperty(
        // We can generate arrays
        fc.array(fc.integer(), { maxLength: 5 }),
        // We can filter for conditions on values
        fc.float().filter((num) => num > 10),
        fc.boolean(),
        async (integer, float, bool) => {
          // We can set pre-conditions to adhere to.
          // If this condition is false then the test is tried again with new values.
          // The test will fail if this fails too often.
          fc.pre(integer[0] < float);
          console.log(integer, float, bool);
          // The fc.assert will fail in the case of a thrown error or if we return any falsy value.
          return bool;
        })
    );
  });

We provide a set of arbitraries whose generated values are then passed to the test. One very useful feature is that when a test fails, fast-check shrinks the values down to the simplest set that still causes the failure.

Example failure:

Error: Property failed after 1 tests
{ seed: -728782760, path: "2:1:0:0:1:1:1:1:1:1:1:1:1:1:1:1:1:1:1:1:1:1:1:1:1", endOnFailure: true }
Counterexample: [[0],10.000000953674316,false]
Shrunk 24 time(s)
Got error: Property failed by returning false

Hint: Enable verbose mode in order to have the list of all failing values encountered during the run

    at asyncThrowIfFailed (/home/faulkes/matrixcode/polykey/js-polykey/node_modules/fast-check/lib/check/runner/utils/RunDetailsFormatter.js:133:11)

So this can be useful for finding any intermittent errors in a test. Using the failure information here we can deterministically re-run the same conditions that caused the failure over and over again. Very useful for finding some of the harder-to-find bugs.

@CMCDragonkai
Member Author

Detecting a transaction conflict wouldn't be useful unless the idea is that, given a random selection of async operations, some will conflict and some will not under certain conditions.

It can also be applied to the counter racing to ensure that we don't have conflicts in that case and that the end state is consistent.

Also many of our domains have an expected end state after concurrent operations. We should use this.

@tegefaulkes
Contributor

Creating custom arbitraries seems simple enough. You can derive them from an existing arbitrary, and by doing this we also benefit from shrinking.

  test('custom arbitrary', async () => {
    const nodeIdArb: fc.Arbitrary<NodeId> = fc.uint8Array({ maxLength: 32, minLength: 32 })
      .map((array) => IdInternal.fromBuffer<NodeId>(Buffer.from(array)));

    await fc.assert(
      fc.asyncProperty(
        nodeIdArb,
        async (nodeId) => {
          console.log(nodeId);
          return false;
        })
    );
  });

This will throw and shrink to:

Error: Property failed after 1 tests
{ seed: -1277985303, path: "0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0", endOnFailure: true }
Counterexample: [IdInternal.from([0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0])]
Shrunk 29 time(s)
Got error: Property failed by returning false

Hint: Enable verbose mode in order to have the list of all failing values encountered during the run

    at asyncThrowIfFailed (/home/faulkes/matrixcode/polykey/js-polykey/node_modules/fast-check/lib/check/runner/utils/RunDetailsFormatter.js:133:11)

@tegefaulkes
Contributor

So in summary, this will be very useful for the following test cases.

  • Concurrency tests where the order affects the outcome of the test. EFS does a lot of this.
  • Using randomly generated test inputs for a variety of conditions and recreating failures deterministically. For example, setting permissions with a random set of permissions and nodes, or testing node graph operations on random sets of nodes.
  • Testing against state machines.

It's not so useful for:

  • Concurrency tests where the concurrent call results in an error regardless of order.

@tegefaulkes
Contributor

For longer-running tests we will need to adjust the amount of testing done, using the numRuns and interruptAfterTimeLimit parameters of fc.assert.
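For example, both can be passed as the second argument to fc.assert. A sketch; the numbers here are illustrative, not settled defaults:

  await fc.assert(
    fc.asyncProperty(fc.scheduler(), async (s) => {
      // ... schedule the concurrent operations and await s.waitAll() ...
    }),
    {
      // Fewer runs for an expensive test (fast-check defaults to 100)
      numRuns: 25,
      // Stop starting new runs once this many milliseconds have elapsed;
      // keep it below the jest test timeout
      interruptAfterTimeLimit: 20000,
    },
  );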

@tegefaulkes
Contributor

Domains that can make use of the deterministic random value generation:

  • ACL
  • claims
  • gestalts
  • nodes
  • node graph
  • sigchain
  • validation?

Domains that can make use of the scheduler for concurrency:

  • discovery - concurrently adding to and consuming the queue; we can schedule mocked results as well.
  • network - may be useful for combining tests where different events happen, such as who initiates a connection end.
  • notifications - concurrent adding, reading and removing of notifications.
  • sessions
  • vaults - for checking that locking prevents transaction conflicts and enforces serialisation.

@CMCDragonkai
Member Author

Can we start integrating it into our DB changes so we can test for transaction conflicts, starting from the RPC handlers? I reckon some tests can be replaced with this too.

@CMCDragonkai
Member Author

New PR required.

@CMCDragonkai
Member Author

Here's an alternative using void to avoid linting issues. We also want to know whether scheduleFunction should be used instead, because we want to avoid executing the asynchronous work before we start the scheduler.

  test('concurrent trusting gestalts', async () => {
    const acl = await ACL.createACL({ db, logger });
    await fc.assert(
      fc.asyncProperty(fc.scheduler(), async (s) => {
        void s.schedule(
          acl.setNodesPerm(
            [nodeIdG1First, nodeIdG1Second] as Array<NodeId>,
            {
              gestalt: {
                notify: null,
              },
              vaults: {
                [vaultId1]: { pull: null },
              },
            }
          )
        );
        void s.schedule(
          acl.setNodesPerm(
            [nodeIdG1First, nodeIdG1Second] as Array<NodeId>,
            {
              gestalt: {
                notify: null,
              },
              vaults: {
                [vaultId1]: { pull: null },
              },
            }
          )
        );
        await s.waitAll();
      })
    );
  });

The Promise.all seemed unnecessary because we can just await s.waitAll(); it is the scheduler that runs all the tasks anyway.

The expectation is that the callback passed at the end of fc.asyncProperty will neither return false nor throw an exception. If it does throw, the fast-check report should tell us how to minimally reproduce it.
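To answer the scheduleFunction question above, a possible shape would be to wrap the method so the call itself is deferred until the scheduler releases it. This is only a sketch, reusing the acl, nodeIdG1First, nodeIdG1Second and vaultId1 fixtures assumed by the example above:

  test('concurrent trusting gestalts with scheduleFunction', async () => {
    const acl = await ACL.createACL({ db, logger });
    await fc.assert(
      fc.asyncProperty(fc.scheduler(), async (s) => {
        // The wrapped function does not start the underlying call until the
        // scheduler decides to release it
        const setNodesPerm = s.scheduleFunction(acl.setNodesPerm.bind(acl));
        const perm = {
          gestalt: {
            notify: null,
          },
          vaults: {
            [vaultId1]: { pull: null },
          },
        };
        const p1 = setNodesPerm([nodeIdG1First, nodeIdG1Second] as Array<NodeId>, perm);
        const p2 = setNodesPerm([nodeIdG1First, nodeIdG1Second] as Array<NodeId>, perm);
        await s.waitAll();
        // Surface any rejection (e.g. ErrorDBTransactionConflict) as a property failure
        await Promise.all([p1, p2]);
      })
    );
  });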

@CMCDragonkai
Member Author

It looks like the fast check report tells us how to reproduce the failure. @tegefaulkes should try this using the given code:

{ seed: -1604596825, path: "0", endOnFailure: true }
Counterexample: [schedulerFor()`
-> [task${2}] promise rejected with value new Error("")
-> [task${1}] promise resolved`]

@CMCDragonkai
Member Author

CMCDragonkai commented Aug 17, 2022

We need to configure the number of times it runs, and to have its max timeout in line with jest's test timeout too. This only applies to asynchronous tests.

We can default it to the global.defaultTimeout for all usages of fc. This can be partially applied in tests/utils.ts or tests/utils/fc.ts.

When a test has an overridden timeout, it should probably apply to the fc as well.

If it runs a lot, it may blow up our overall test times. We might want to reduce the number of runs for very simple tests.
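A sketch of what that partial application might look like; the helper name assertWithDefaults and the globalThis.defaultTimeout lookup are assumptions about our test setup, not existing code:

  // tests/utils/fc.ts
  import fc from 'fast-check';

  const assertWithDefaults = <T>(
    property: fc.IAsyncProperty<T>,
    params: fc.Parameters<T> = {},
  ): Promise<void> =>
    fc.assert(property, {
      numRuns: 25,
      // Assumes a defaultTimeout global is declared in the jest setup files
      interruptAfterTimeLimit: (globalThis as any).defaultTimeout as number,
      markInterruptAsFailure: true,
      // Individual tests can still override any of the defaults
      ...params,
    });

  export { assertWithDefaults };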

@CMCDragonkai
Member Author

I think the random arbitraries can be useful for any data that we are processing. This includes NodeId, as you've demonstrated for NodeGraph, but also all the other IDs like VaultId, as well as values like VaultName.

For things like IdSortable, it would be more difficult to generate an arbitrary because it is monotonic. But it is possible to generate a deterministic IdSortable by plugging in the timeSource and randomSource.

I think a lot of vault domain operations can be applied to randomly generated files. Arbitrary buffers, basically.

Any time we are reading something where we have to parse the data would be suitable. So imagine reading a file, or reading off the network: any case where we are parsing that data into a useful structure can make use of fuzzing the input data. In many cases we may just be doing JSON.parse along with JSON schema validation. So we could ask fast-check to randomly generate JSON to ensure that our parser is working correctly. If it could be schema-driven, that would be nice.
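For example, something along these lines, where parseConfig and ErrorParse are hypothetical stand-ins for one of our parse-and-validate steps:

  test('parsing never fails with an unexpected error', async () => {
    await fc.assert(
      fc.property(fc.string(), (input) => {
        try {
          // parseConfig: hypothetically JSON.parse plus JSON schema validation
          parseConfig(input);
        } catch (e) {
          // Only our own validation error is acceptable; anything else is a bug
          expect(e).toBeInstanceOf(ErrorParse);
        }
      })
    );
  });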

So I think for any of our functions in PK, we have a first line of defense in the type signature, like number, SomeClassType, etc. We don't need to test alternative types for these functions; we rely on the type system to guarantee this. We would only use fast-check at the value level. So for example, if a function takes a number, then we could fast-check the numeric values themselves, like 0, Infinity, NaN, 1.5. Fast-check then acts as a second line of defense. However, I don't think this has to apply to every function we use, because that is likely to take a lot of test time.
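For example, at the value level (someCalculation is just a hypothetical numeric function under test):

  test('handles edge-case numbers', async () => {
    await fc.assert(
      fc.property(
        // Mix generated doubles with known nasty values
        fc.oneof(fc.double(), fc.constantFrom(0, -0, Infinity, -Infinity, NaN, 1.5)),
        (n) => {
          // The property here is simply that the function does not throw
          someCalculation(n);
        },
      )
    );
  });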

@CMCDragonkai
Member Author

As for the asynchronous scheduling, I think that would work quite nicely when we have more than 2 operations running concurrently.

But I also think it's a good idea to use it for any place where we want to assert how a race condition may or may not be possible.

I think the first place would be the counter racing areas: any place where we have multiple creation operations that may step on each other, let's use fast-check to ensure that they don't. Test that any concurrent operations aren't clobbering the data at the end.
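A sketch of that kind of end-state check, using an in-memory map as a stand-in for a real creation path:

  test('concurrent creations do not clobber each other', async () => {
    await fc.assert(
      fc.asyncProperty(
        fc.scheduler(),
        fc.array(fc.string(), { minLength: 1, maxLength: 5 }),
        async (s, names) => {
          const store = new Map<string, number>();
          // create is a stand-in for one of our real creation operations
          const create = s.scheduleFunction(async (name: string) => {
            store.set(name, (store.get(name) ?? 0) + 1);
          });
          for (const name of names) void create(name);
          await s.waitAll();
          // Regardless of execution order, every requested entry must exist at the end
          for (const name of names) expect(store.has(name)).toBe(true);
        },
      )
    );
  });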

@CMCDragonkai
Member Author

One example is EFS, where none of the operations should return a transaction conflict, because it's supposed to lock everything up. We could use this tool to fuzz the EFS, sending in lots of reads/writes in different ways, and at the end ensure that we don't throw a transaction conflict.

Other postconditions should also be considered. Not sure what they might be right now.

@CMCDragonkai
Member Author

@tegefaulkes should find out how to reproduce test failures and report it here; it can be useful in the future.

@tegefaulkes
Contributor

This may be useful to streamline fast-check with jest.
https://www.npmjs.com/package/@fast-check/jest

@CMCDragonkai
Member Author

That can be useful since it will support only, skip and todo.

@CMCDragonkai
Member Author

Does it integrate timeouts though from jest?

@tegefaulkes
Contributor

It would reduce the noise of the fc.assert and fc.property blocks. The details are light, so I don't know if it integrates with the jest timeouts. Let me have a quick look at the source.

@tegefaulkes
Contributor

It doesn't do anything fancy with the parameters. So no, it won't propagate the jest timeouts to the fc.assert.

@tegefaulkes
Contributor

Given a failed attempt:

Error: Property failed after 1 tests
{ seed: 489541824, path: "0", endOnFailure: true }
Counterexample: [6]
Shrunk 0 time(s)
Got error: Error: expect(received).toBeLessThanOrEqual(expected)

Expected: <= 5
Received:    6

Stack trace: Error: expect(received).toBeLessThanOrEqual(expected)

Expected: <= 5
Received:    6
    at /home/faulkes/matrixCode/polykey/js-encryptedfs/tests/EncryptedFS.concurrent.test.ts:64:26
    at AsyncProperty.predicate (/home/faulkes/matrixCode/polykey/js-encryptedfs/node_modules/fast-check/lib/check/property/AsyncProperty.js:16:96)
    at AsyncProperty.run (/home/faulkes/matrixCode/polykey/js-encryptedfs/node_modules/fast-check/lib/check/property/AsyncProperty.generic.js:40:39)
    at asyncRunIt (/home/faulkes/matrixCode/polykey/js-encryptedfs/node_modules/fast-check/lib/check/runner/Runner.js:24:21)
    at Object.<anonymous> (/home/faulkes/matrixCode/polykey/js-encryptedfs/tests/EncryptedFS.concurrent.test.ts:61:7)

Hint: Enable verbose mode in order to have the list of all failing values encountered during the run

    at asyncThrowIfFailed (/home/faulkes/matrixCode/polykey/js-encryptedfs/node_modules/fast-check/lib/check/runner/utils/RunDetailsFormatter.js:133:11)

There are two ways we can recreate the result.

The 1st is by using the seed information provided: { seed: 489541824, path: "0", endOnFailure: true }. This will force the test to run deterministically in the same way, reproducing the outcome first time. This would be good for debugging.

The 2nd is to add the counterexample Counterexample: [6] to the examples list: { examples: [[6]] }. The examples provided this way are tested first, before using randomised data. Note that adding examples doesn't make the test run more times, so if your examples exceed the number of runs then the test stops on the examples before running on generated data.

For the scheduler the seed method will work fine, and the example method can be used as well.

// The counter example will look like...
Counterexample: [6,schedulerFor()`
-> [task${2}] promise resolved with value "b"
-> [task${1}] promise resolved with value "a"`]

// And can be used with...
{
  examples: [
    [6,schedulerFor()`
-> [task${2}] promise resolved with value "b"
-> [task${1}] promise resolved with value "a"`]
  ]
}

@CMCDragonkai
Member Author

Where is the example property passed into?

@tegefaulkes
Contributor

For the first it would've been:

await fc.assert(
  fc.asyncProperty(fc.integer(), async (number) => {...}),
);

The second:

await fc.assert(
  fc.asyncProperty(fc.integer(), fc.scheduler(), async (number, s) => {...}),
);
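The seed and examples themselves then go into the second argument of fc.assert, with each example matching the tuple shape of the property's arbitraries, e.g. (a sketch using the counterexample above):

await fc.assert(
  fc.asyncProperty(fc.integer(), async (number) => {...}),
  { seed: 489541824, path: "0", endOnFailure: true, examples: [[6]] },
);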

@CMCDragonkai
Member Author

CMCDragonkai commented Aug 19, 2022

When dealing with EFS, the result check can use regular expressions. This is because the result is just a byte string.

When dealing with PK, the result check may have to use asymmetric matchers to work against the object instead. Jest-extended provides additional asymmetric matchers, in particular toSatisfy. This is because we are dealing with rich objects and not just byte strings.
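For example (assuming jest-extended is registered in the jest setup; the result shape here is made up):

  expect(result).toSatisfy((r: { count: number }) => r.count >= 0);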

In a way, the more ideal test case would indicate that a given way of doing things should result in a particular output deterministically. But it's difficult to enumerate all the ways that could be the case, so instead it's easier to say that all these possible matches are correct, and then try to test as many concurrent variants as possible. This is all due to observable non-determinism.

A model check, on the other hand, would argue that each output state is a state that needs to be checked against a deterministic path through the state machine. But this works best at a higher level; otherwise it's overkill for smaller dataflow functions. So front-end components, but also databases and other stateful components. This is where xstate could be used with fast-check... but there's no documentation on this atm.

@CMCDragonkai
Member Author

Example of command generation: #438 (comment)

@CMCDragonkai
Member Author

Some notes about fast-check:

  1. Default number of runs is 100.
  2. The timeouts are still to be integrated into jest: Integrate jest timeouts to @fast-check/jest dubzzz/fast-check#3084 (interruptAfterTimeLimit is useful, but the number of runs is actually more useful; we would generally prefer tests to be fast and exhaustive, and the balance depends on the programmer)
  3. There's a bug where zero arbitraries don't work with the jest integration: Jest Fast Check doesn't allow zero arbitraries dubzzz/fast-check#3227
  4. You can't always generate arbitraries for things that are really random. For example, if you use webcrypto's random values, you aren't able to seed getRandomValues (as there is no facility to do this), so you aren't able to deterministically generate those values. In that case, the only way to do it is to just call it inside the predicate. But it's preferable to use arbitraries where possible, as you get shrinking and can reproduce the bug.
  5. Arbitraries are preferred over preconditions. Generally any precondition can be redone as a composed arbitrary.
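For instance, rather than rejecting generated values with fc.pre, compose an arbitrary that only produces valid inputs in the first place (a sketch):

  // With a precondition: invalid pairs are generated and then rejected at run time
  const withPre = fc.asyncProperty(fc.integer(), fc.integer(), async (a, b) => {
    fc.pre(a < b);
    // ...
  });

  // With a composed arbitrary: only ordered pairs are ever generated
  const orderedPair = fc
    .tuple(fc.integer(), fc.integer())
    .map(([a, b]): [number, number] => (a <= b ? [a, b] : [b, a]));
  const withArbitrary = fc.asyncProperty(orderedPair, async ([a, b]) => {
    // a <= b holds by construction
  });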

@CMCDragonkai
Member Author

Actually it turns out you cannot use fc without at least 1 arbitrary. And if we are doing this, we might as well do it with just a normal loop or test.each.

@CMCDragonkai
Member Author

This is pretty useful https://github.com/dubzzz/fast-check/blob/296cde419382eac6fe257ac4105a893dbdf5cfcc/packages/fast-check/documentation/Tips.md#combine-with-other-faker-or-random-generator-libraries.

So it's easy enough to integrate other faker libraries such as this https://www.npmjs.com/package/@faker-js/faker.

This can be quite useful for fuzz testing our UI too.

@CMCDragonkai
Member Author

The fast-check jest integration will come in from #270.

However we should finish this issue in #450.

@tegefaulkes
Contributor

This one is pretty large. It can be converted into an epic, with each domain as a separate issue.

@CMCDragonkai
Member Author

My recent work with CertManager actually illustrates how to do model-checking tests. It's just a matter of deriving the minimal model that can be used to check constraints. At the same time, if we expect a bunch of commands to run without resulting in a conflict, this can also be encoded when running each individual command.

However, one thing to note is that with the async model runs, I believe each command is run serialised. So to properly test asynchronous concurrency, we would have to make fast-check's model checking run the commands concurrently. I'm not entirely sure how to achieve this atm, beyond making composed commands that run multiple commands together.

The question is whether model-checking tests work well with the asynchronous concurrency testing discussed above, and whether there's a combination of the two that can be done.

@CMCDragonkai
Member Author

Subsequently I've made use of fc.scheduler to run commands asynchronously and in random order in the Sigchain tests, to test for atomic serialisability of the Sigchain.addClaim method.

The work was done based on this dubzzz/fast-check#564.

So it works pretty well with the fast-check jest integration; the scheduler is just another arbitrary. The only thing lacking is having to manually trigger the function call later: unfortunately FC doesn't have a built-in scheduleDelayedFunction atm, so we have to do this manually.

We did have a utility function called testsUtils.scheduleCall, which does this, but I'm not sure about that implementation, so I've not used it yet.


I noticed that in some cases, model checking isn't actually required if there's only a single mutation method. In the case of Sigchain, that's addClaim. In that case, it's sufficient just to create a set of property tests that use addClaim with random data, and then run expectations on the resulting data.

That means model checking is mostly useful when there are multiple mutation methods.

At the same time, the current model checking I'm doing in CertManager uses an async model, but it isn't exactly concurrent. I believe it's synchronously calling asynchronous commands and awaiting them.

If we really want to combine model checking (best used with multiple mutation methods) and concurrent scheduling, we have to use fc.scheduledModelRun, and not fc.asyncModelRun.

But if we have only a single mutation method, then model checking is not necessary; fc.scheduler can still be used to test concurrency of that single mutation method.

More details here:

https://github.com/dubzzz/fast-check/blob/e7b361ffe13ff22a458d25431788e3caa359646a/packages/fast-check/documentation/RaceConditions.md#model-based-testing-and-race-conditions

In particular this is important:

One important fact to know when mixing model based testing with schedulers is that neither check nor run should rely on the completion of other scheduled tasks to fulfill themselves but they can - and most of the time have to - trigger new scheduled tasks. No other scheduled task will be resolved during the execution of check or run.
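With that constraint in mind, a rough shape for combining commands with the scheduler might look like the following. This is only a toy sketch against a made-up in-memory counter, not a Polykey domain; fc.commands, fc.scheduledModelRun and fc.AsyncCommand are the relevant fast-check pieces:

  type CounterModel = { count: number };

  class Counter {
    protected count = 0;
    public async increment(): Promise<void> {
      this.count += 1;
    }
    public async get(): Promise<number> {
      return this.count;
    }
  }

  class IncrementCommand implements fc.AsyncCommand<CounterModel, Counter> {
    check = () => true;
    async run(model: CounterModel, real: Counter) {
      await real.increment();
      model.count += 1;
    }
    toString = () => 'increment';
  }

  class GetCommand implements fc.AsyncCommand<CounterModel, Counter> {
    check = () => true;
    async run(model: CounterModel, real: Counter) {
      // The real system must agree with the model
      expect(await real.get()).toBe(model.count);
    }
    toString = () => 'get';
  }

  test('counter model with scheduled commands', async () => {
    await fc.assert(
      fc.asyncProperty(
        fc.scheduler(),
        fc.commands([fc.constant(new IncrementCommand()), fc.constant(new GetCommand())]),
        async (s, cmds) => {
          await fc.scheduledModelRun(
            s,
            () => ({ model: { count: 0 }, real: new Counter() }),
            cmds,
          );
        },
      )
    );
  });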

@CMCDragonkai
Member Author

CMCDragonkai commented Oct 28, 2022

So the last thing to explore is the usage of fc.scheduledModelRun for a domain that has multiple concurrent mutation methods.

We could upgrade the CertManager.test.ts from asyncModelRun to scheduledModelRun instead.

And then this issue can be closed, as we now have examples of how to do this in new domains upon the merge of #446.

Subsequent smaller issues can be created to upgrade particular tests to use fast-check. But from now on, most tests should use fast-check.

Domains that have fast check applied:

  • keys
  • sigchain

@CMCDragonkai
Member Author

I think we have a good handle on how this all works. And @tegefaulkes you've been applying it to other places now too. I think we can close this. Please add your own notes about how you've been using fast check here too.

Then later we can add a guide to our wiki for development.
