Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proptest based differential contract testing #578

Merged
merged 2 commits into from
Oct 28, 2021

Conversation

cburgdorf
Copy link
Collaborator

@cburgdorf cburgdorf commented Oct 25, 2021

What was wrong?

We could do more differential contract testing. The idea is to write certain contracts in both Fe and Solidity and then fuzz them with random data and compare if we find cases where they return different results.

How was it fixed?

  1. I added a new category of tests / fixtures called "differential".
  2. These tests use proptest a library for property testing
  3. Each test needs a Soldity and Fe contract with the same name e.g. math_u8.fe + math_u8.sol
  4. Inputs are generated by proptest based on strategies and we can then assert that they performed the same

Open questions

  1. The tests take a considerable amount of execution time. Should we run them based on a cronjob (GitHub Actions supports that) instead of running them on every PR. Alternatively we could also run them on every second, third, .., tenth PR or something like that.

  2. I guess that with some extra work we could generate the whole test setup if we inspect the ABI of the test contracts and generate strategies suitable for each public method. But I'm not yet sure if it's really that helpful in practice or if a bit more manual moderation is actually better.

  3. I've started with math_u8 and math_i8 tests because the range of that datatype is quite tiny but the bugs that are caught for u8/i8 are often transferable to the bigger numerics. The tests already caught a bug where negations (-x) where not properly checked and the fix is included in this PR.

@cburgdorf cburgdorf force-pushed the christoph/tests/property-testing branch 3 times, most recently from 1e25bbb to 98679d5 Compare October 27, 2021 10:13
@cburgdorf cburgdorf force-pushed the christoph/tests/property-testing branch from 98679d5 to ff6540e Compare October 27, 2021 12:33
@codecov-commenter
Copy link

Codecov Report

Merging #578 (277d63a) into master (e9c1ee1) will increase coverage by 1.10%.
The diff coverage is 88.46%.

❗ Current head 277d63a differs from pull request most recent head ff6540e. Consider uploading reports for the commit ff6540e to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master     #578      +/-   ##
==========================================
+ Coverage   87.03%   88.13%   +1.10%     
==========================================
  Files          87       87              
  Lines        6432     6998     +566     
==========================================
+ Hits         5598     6168     +570     
+ Misses        834      830       -4     
Impacted Files Coverage Δ
crates/yulgen/src/mappers/expressions.rs 90.80% <71.42%> (-2.24%) ⬇️
crates/yulgen/src/runtime/functions/math.rs 95.39% <94.11%> (-0.10%) ⬇️
crates/yulgen/src/names/mod.rs 100.00% <100.00%> (ø)
crates/analyzer/src/traversal/declarations.rs 94.59% <0.00%> (-5.41%) ⬇️
crates/analyzer/src/traversal/expressions.rs 89.15% <0.00%> (-1.57%) ⬇️
crates/yulgen/src/mappers/assignments.rs 74.19% <0.00%> (-0.81%) ⬇️
crates/yulgen/src/mappers/functions.rs 92.30% <0.00%> (-0.23%) ⬇️
crates/analyzer/src/db.rs 100.00% <0.00%> (ø)
crates/lowering/src/mappers/expressions.rs 100.00% <0.00%> (ø)
... and 14 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e9c1ee1...ff6540e. Read the comment docs.

@cburgdorf cburgdorf marked this pull request as ready for review October 27, 2021 14:03
@cburgdorf cburgdorf force-pushed the christoph/tests/property-testing branch from ff6540e to a6b2a5b Compare October 27, 2021 14:05
@cburgdorf
Copy link
Collaborator Author

This adds about ~3 minutes to the CI run. I can explore how to run that on a Github Actions cronjob instead of with each PR if you like?

Copy link
Member

@g-r-a-n-t g-r-a-n-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we run them based on a cronjob (GitHub Actions supports that) instead of running them on every PR.

The ideal setup imo would be to have a suite of these proposition tests running at all times against master. It would also be nice to have some sort of notification or issue creation action when invalidating inputs are found. In response, we would add the input to a set of regression checks and fix the bug.

fn math_u8(val in 0u8..=255, val2 in 0u8..=255) {
with_executor(&|mut executor| {

let harness = DualHarness::from_fixture(&mut executor, "math_u8", "Foo", &[]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To quote the great @agroce, "throughput is king in fuzzing".

It looks like we are creating an entirely new executor for each test, which (I imagine) is significantly reducing the throughput.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly then the executor is holding on to all the chain state so reusing the same executor across multiple tests could make the tests become non deterministic where equal inputs to a test could mean different results depending on which tests and inputs ran before.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this would only be feasible with contracts that don't mutate state. So if we end up running alot of tests that meet this criteria, then it might be worth looking into.

It would be nice to know how long it takes to initialize the executor and harness vs how long it takes to run the tests..

@cburgdorf
Copy link
Collaborator Author

In response, we would add the input to a set of regression checks and fix the bug.

Yeah, that's kind of what's happening with that differential.txt file that proptest maintains for you. When it finds a bug it writes the inputs for that test into that file so that these cases continue to be checked in the future.

The ideal setup imo would be to have a suite of these proposition tests running at all times against master.

So, I understand that as: We would not have it run against each PR (if it adds too much time to that) but we would invoke the run each time that master changes and ideally automate the filing of bugs.

@cburgdorf
Copy link
Collaborator Author

Ok, I'm going to merge this as is but as I keep playing with these tests a bit more I will also look into how to run these tests independently from PRs and change that with a follow up PR.

@cburgdorf cburgdorf merged commit 96bc51b into ethereum:master Oct 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants