proptest based differential contract testing #578
Conversation
Force-pushed from 1e25bbb to 98679d5
Force-pushed from 98679d5 to ff6540e
Codecov Report
@@ Coverage Diff @@
## master #578 +/- ##
==========================================
+ Coverage 87.03% 88.13% +1.10%
==========================================
Files 87 87
Lines 6432 6998 +566
==========================================
+ Hits 5598 6168 +570
+ Misses 834 830 -4
Force-pushed from ff6540e to a6b2a5b
This adds about three minutes to the CI run. I can explore how to run it on a GitHub Actions cron job instead of with each PR, if you like?
Should we run them based on a cron job (GitHub Actions supports that) instead of running them on every PR?
The ideal setup imo would be to have a suite of these property tests running at all times against master. It would also be nice to have some sort of notification or issue-creation action for when invalidating inputs are found. In response, we would add the input to a set of regression checks and fix the bug.
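For illustration, a minimal sketch of that regression-set idea (the property, function names, and pinned input here are hypothetical stand-ins, not code from this PR): keeping the property in a plain function lets a counterexample found by the fuzzer be pinned as an ordinary #[test]. Note that proptest also persists failing cases to proptest-regressions/ files by default, which can be committed for a similar effect.

use proptest::prelude::*;

// Hypothetical stand-in property; the real one would compare the Fe and
// Solidity contract outputs.
fn no_divergence(x: u8) {
    assert_eq!(x.wrapping_add(1).wrapping_sub(1), x);
}

proptest! {
    // The fuzzed version keeps exploring new inputs (could move to a cron job).
    #[test]
    fn no_divergence_fuzzed(x in any::<u8>()) {
        no_divergence(x);
    }
}

// A previously failing input, pinned so it runs on every PR regardless of
// where the fuzz suite itself runs.
#[test]
fn no_divergence_regression() {
    no_divergence(255);
}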
fn math_u8(val in 0u8..=255, val2 in 0u8..=255) {
    with_executor(&|mut executor| {
        let harness = DualHarness::from_fixture(&mut executor, "math_u8", "Foo", &[]);
If I understand correctly, the executor holds all of the chain state, so reusing the same executor across multiple tests could make the tests non-deterministic: equal inputs to a test could produce different results depending on which tests and inputs ran before.
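A self-contained toy model of that concern (Executor here is an invented struct, not the real EVM executor): when state is shared, results depend on call history; constructing fresh state per case restores determinism.

use proptest::prelude::*;

// Toy model: the "executor" carries mutable state that every call touches.
struct Executor {
    nonce: u64,
}

impl Executor {
    fn new() -> Self {
        Executor { nonce: 0 }
    }
    fn call(&mut self, val: u8) -> u64 {
        self.nonce += 1; // state mutated by every call
        self.nonce + u64::from(val) // result depends on prior calls
    }
}

proptest! {
    #[test]
    fn fresh_state_is_deterministic(val in any::<u8>()) {
        // With a fresh Executor per invocation, equal inputs give equal
        // results; one Executor shared across cases would not guarantee this.
        prop_assert_eq!(Executor::new().call(val), Executor::new().call(val));
    }
}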
Yeah, this would only be feasible with contracts that don't mutate state. So if we end up running a lot of tests that meet this criterion, it might be worth looking into.
It would be nice to know how long it takes to initialize the executor and harness versus how long it takes to run the tests.
Yeah, that's kind of what's happening with that …
So I understand that as: we would not have it run against each PR (if it adds too much time to that), but we would invoke the run each time that …
Ok, I'm going to merge this as-is, but as I keep playing with these tests a bit more I will also look into how to run them independently from PRs and change that in a follow-up PR.
What was wrong?
We could do more differential contract testing. The idea is to write certain contracts in both Fe and Solidity, fuzz them with random data, and compare the results to find cases where the two implementations diverge.
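As a minimal self-contained sketch of the approach (plain Rust functions stand in for the compiled Fe and Solidity contracts): both implementations receive identical proptest-generated inputs and must agree on every case.

use proptest::prelude::*;

// Two independent implementations of the same arithmetic, standing in for
// the Fe and Solidity contracts under test.
fn add_fe(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}
fn add_sol(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

proptest! {
    #[test]
    fn add_is_equivalent(a in 0u8..=255, b in 0u8..=255) {
        // Any input where the two implementations diverge fails the test
        // and is shrunk to a minimal counterexample by proptest.
        prop_assert_eq!(add_fe(a, b), add_sol(a, b));
    }
}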
How was it fixed?
Added math_u8.fe and math_u8.sol test contracts. Inputs are generated with proptest based on strategies, fed to both contracts, and we can then assert that they performed the same.
Open questions
The tests take a considerable amount of execution time. Should we run them based on a cron job (GitHub Actions supports that) instead of running them on every PR? Alternatively, we could run them on every second, third, …, or tenth PR.
I guess that with some extra work we could generate the whole test setup by inspecting the ABI of the test contracts and generating strategies suitable for each public method. But I'm not yet sure if that's really helpful in practice, or if a bit more manual curation is actually better.
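To make that idea concrete, a hypothetical sketch (AbiType and Value are invented stand-ins, not types from this PR) of mapping ABI parameter types to proptest strategies:

use proptest::prelude::*;

// Invented stand-ins for ABI metadata and runtime argument values.
#[derive(Clone, Debug, PartialEq)]
enum Value {
    U8(u8),
    I8(i8),
    Bool(bool),
}

enum AbiType {
    U8,
    I8,
    Bool,
}

// Each ABI type maps to a matching proptest strategy.
fn strategy_for(ty: &AbiType) -> BoxedStrategy<Value> {
    match ty {
        AbiType::U8 => any::<u8>().prop_map(Value::U8).boxed(),
        AbiType::I8 => any::<i8>().prop_map(Value::I8).boxed(),
        AbiType::Bool => any::<bool>().prop_map(Value::Bool).boxed(),
    }
}

// A public method's parameter list becomes a strategy over whole argument
// vectors, which could drive the dual harness generically.
fn args_strategy(params: &[AbiType]) -> BoxedStrategy<Vec<Value>> {
    params.iter().map(strategy_for).collect::<Vec<_>>().boxed()
}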
I've started with math_u8 and math_i8 tests because the range of those datatypes is quite tiny, but the bugs caught for u8/i8 are often transferable to the bigger numeric types. The tests already caught a bug where negations (-x) were not properly checked; the fix is included in this PR.
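For concreteness, here is the class of bug described above shown on Rust's own i8 (an illustration, not the PR's actual fix): negation has exactly one unrepresentable case in a two's-complement type, the kind of corner a 256-value input domain surfaces almost immediately.

// -(-128) does not fit in i8, so an unchecked negation is wrong for
// exactly one input out of 256.
#[test]
fn negation_needs_a_check() {
    assert_eq!(i8::MIN.checked_neg(), None);
    assert_eq!((-127i8).checked_neg(), Some(127));
}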