Skip to content

Fuzzing with Zest

Rohan Padhye edited this page Sep 1, 2021 · 24 revisions

Zest 101

This tutorial walks through the process of writing a standalone test driver using JQF, a structured input generator using junit-quickcheck, and fuzzing with the Zest algorithm using command-line scripts. If your test program is built using Maven, then it is much easier to use the JQF Maven Plugin, which does not require you to install JQF locally or to run bash scripts.

This is a basic tutorial that demonstrates how to fuzz a toy program with tens of lines of code. After going through this document, you can check out the advanced tutorial on Fuzzing a Compiler with hundreds of thousands of lines of code using a non-trivial generator.

Requirements

For this tutorial, you will need: Java 9+, Apache Maven (to build JQF), Bash (to launch fuzzing scripts). This tutorial has been tested on MacOS and Ubuntu Linux. It may also work on Windows under Cygwin or Git bash.

Step 0: Build JQF

git clone https://github.com/rohanpadhye/jqf
cd jqf
mvn package

From here on, the tutorial will refer to the directory in which JQF has been cloned as /path/to/jqf.

Step 1: Write some application code

In this tutorial, we are going to test some logic that deals with JDK GregorianCalendar objects, which represent dates in the widely used Gregorian calendar system.

Let's create a directory (/path/to/tutorial) and then put the application logic in a file called CalendarLogic.java:

import java.util.GregorianCalendar;
import static java.util.GregorianCalendar.*;

/* Application logic */
public class CalendarLogic {
    // Returns true iff cal is in a leap year
    public static boolean isLeapYear(GregorianCalendar cal) {
        int year = cal.get(YEAR);
        if (year % 4 == 0) {
            if (year % 100 == 0) {
                return false;
            }
            return true;
        }
        return false;
    }

    // Returns either of -1, 0, 1 depending on whether c1 is <, =, > than c2
    public static int compare(GregorianCalendar c1, GregorianCalendar c2) {
        int cmp;
        cmp = Integer.compare(c1.get(YEAR), c2.get(YEAR));
        if (cmp == 0) {
            cmp = Integer.compare(c1.get(MONTH), c2.get(MONTH));
            if (cmp == 0) {
                cmp = Integer.compare(c1.get(DAY_OF_MONTH), c2.get(DAY_OF_MONTH));
                if (cmp == 0) {
                    cmp = Integer.compare(c1.get(HOUR), c2.get(HOUR));
                    if (cmp == 0) {
                        cmp = Integer.compare(c1.get(MINUTE), c2.get(MINUTE));
                        if (cmp == 0) {
                            cmp = Integer.compare(c1.get(SECOND), c2.get(SECOND));
                            if (cmp == 0) {
                                cmp = Integer.compare(c1.get(MILLISECOND), c2.get(MILLISECOND));
                            }
                        }
                    }
                }
            }
        }
        return cmp;
    }
}

Step 2: Write a test driver

Here is a first draft of a JUnit-style test driver that verifies the leap-year calculation logic. Save the following in a file called CalendarTest.java.

import java.util.*;
import static java.util.GregorianCalendar.*;
import static org.junit.Assert.*;
import static org.junit.Assume.*;

public class CalendarTest {

    public void testLeapYear(GregorianCalendar cal) {
        // Assume that the date is Feb 29
        assumeTrue(cal.get(MONTH) == FEBRUARY);
        assumeTrue(cal.get(DAY_OF_MONTH) == 29);

        // Under this assumption, validate leap year rules
        assertTrue(cal.get(YEAR) + " should be a leap year", CalendarLogic.isLeapYear(cal));
    }

    public void testCompare(List<GregorianCalendar> cals) {
        // Sort list of calendar objects using our custom comparator function
        Collections.sort(cals, CalendarLogic::compare);

        // If they have an ordering, then the sort should succeed
        for (int i = 1; i < cals.size(); i++) {
            Calendar c1 = cals.get(i-1);
            Calendar c2 = cals.get(i);
            assumeFalse(c1.equals(c2)); // Assume that we have distinct dates
            assertTrue(c1 + " should be before " + c2, c1.before(c2));  // Then c1 < c2
        }
    }
}

The method testLeapYear() checks the following property: assuming that an input GregorianCalendar represents the date February 29, then it must belong to a year that is divisible by 4, but not divisible by 100. Astute readers among you would notice that this logic is wrong. We expect an AssertionError to be thrown when provided with a valid counter-example. However, we need something that can generate random instances of GregorianCalendar.

Similarly, the test method testCompare() sorts an input list of calendar objects using the comparator defined in CalendarLogic. It then asserts that every consecutive pair of items in the sorted list is indeed ordered by the before() relation. Again, there is a bug lurking in the CalendarLogic.compare method, which is only revealed by corner case inputs.

Step 2: Write an input generator

JQF leverages the junit-quickcheck framework to produce structured inputs. In order to generate inputs for type T, we need a class that extends Generator<T>. Such a subclass need only provide a method that can produce random instances of T using a provided source of randomness. The following is our generator for GregorianCalendar objects, in a file called CalendarGenerator.java:

import java.util.GregorianCalendar;
import java.util.TimeZone;

import com.pholser.junit.quickcheck.generator.GenerationStatus;
import com.pholser.junit.quickcheck.generator.Generator;
import com.pholser.junit.quickcheck.random.SourceOfRandomness;

import static java.util.GregorianCalendar.*;

public class CalendarGenerator extends Generator<GregorianCalendar> {

    public CalendarGenerator() {
        super(GregorianCalendar.class); // Register the type of objects that we can create
    }

    // This method is invoked to generate a single test case
    @Override
    public GregorianCalendar generate(SourceOfRandomness random, GenerationStatus __ignore__) {
        // Initialize a calendar object
        GregorianCalendar cal = new GregorianCalendar();
        cal.setLenient(true); // This allows invalid dates to silently wrap (e.g. Apr 31 ==> May 1).

        // Randomly pick a day, month, and year
        cal.set(DAY_OF_MONTH, random.nextInt(31) + 1); // a number between 1 and 31 inclusive
        cal.set(MONTH, random.nextInt(12) + 1); // a number between 1 and 12 inclusive
        cal.set(YEAR, random.nextInt(cal.getMinimum(YEAR), cal.getMaximum(YEAR)));

        // Optionally also pick a time
        if (random.nextBoolean()) {
            cal.set(HOUR, random.nextInt(24));
            cal.set(MINUTE, random.nextInt(60));
            cal.set(SECOND, random.nextInt(60));
        }

        // Let's set a timezone
        // First, get supported timezone IDs (e.g. "America/Los_Angeles")
        String[] allTzIds = TimeZone.getAvailableIDs();

        // Next, choose one randomly from the array
        String tzId = random.choose(allTzIds);
        TimeZone tz = TimeZone.getTimeZone(tzId);

        // Assign it to the calendar
        cal.setTimeZone(tz);

	// Return the randomly generated calendar object
        return cal;
    }
}

The main entry point to the generator is the generate() method. This method takes two parameters: a pseudo-random-number generator and a status object. For now, let's ignore the latter as we do not generally need to use it. The SourceOfRandomness is a high-level API for generating random values and making random choices. In the generate() method, we make various random choices to construct and return a randomly generated instance of GregorianCalendar.

Step 3: Annotate test driver

Now that we have a class that can generate random instances of GregorianCalendar, we can specify in our test driver that we want to use this particular generator to create our inputs. To do this, we use the @From(CalendarGenerator.class) annotation on the cal parameter to our test method testLeapYear(). Check out the junit-quickcheck documentation for advanced ways of composing generators (e.g. automatically synthesizing generators from constructors or public fields).

We also need a couple of other annotations to make this a test driver that JQF can fuzz. First, we need to annotate the test class with @RunWith(JQF.class). This tells JUnit that we are using the JQF engine to invoke test methods. Second, we must annotate the test method with @Fuzz. This helps JQF find the methods in the test class for which it can generate inputs. Here is our updated test driver CalendarTest.java with the @RunWith, @Fuzz, and @From annotations applied:

import java.util.*;
import static java.util.GregorianCalendar.*;
import static org.junit.Assert.*;
import static org.junit.Assume.*;

import org.junit.runner.RunWith;
import com.pholser.junit.quickcheck.*;
import com.pholser.junit.quickcheck.generator.*;
import edu.berkeley.cs.jqf.fuzz.*;

@RunWith(JQF.class)
public class CalendarTest {

    @Fuzz
    public void testLeapYear(@From(CalendarGenerator.class) GregorianCalendar cal) {
        // Assume that the date is Feb 29
        assumeTrue(cal.get(MONTH) == FEBRUARY);
        assumeTrue(cal.get(DAY_OF_MONTH) == 29);

        // Under this assumption, validate leap year rules
        assertTrue(cal.get(YEAR) + " should be a leap year", CalendarLogic.isLeapYear(cal));
    }

    @Fuzz
    public void testCompare(@Size(max=100) List<@From(CalendarGenerator.class) GregorianCalendar> cals) {
        // Sort list of calendar objects using our custom comparator function
        Collections.sort(cals, CalendarLogic::compare);

        // If they have an ordering, then the sort should succeed
        for (int i = 1; i < cals.size(); i++) {
            Calendar c1 = cals.get(i-1);
            Calendar c2 = cals.get(i);
            assumeFalse(c1.equals(c2)); // Assume that we have distinct dates
            assertTrue(c1 + " should be before " + c2, c1.before(c2));  // Then c1 < c2
        }
    }
}

For testCompare, notice that we annotated only the type parameter of the generic list with @From, but not the List itself. That is because junit-quickcheck ships with default generators for collections that simply create a data structure of n random elements, where n itself is randomly chosen. We use the optional @Size annotation to bound the size of n. For more information on how to configure the default generators or how to add custom configuration annotations for your own generators, check out the junit-quickcheck tutorial on configuring generators.

Step 4: Compile classes

Let's compile CalendarLogic.java, CalendarTest.java and CalendarGenerator.java using javac. Since we are using classes from JQF and its dependencies (such as JUnit and junit-quickcheck), we also need to list all of those JARs on the classpath. Fortunately, JQF provides a handy script called classpath.sh which simply expands to the long list of JARs that contain JQF and everything else it depends on. So, the final class path is the current directory (.) which contains the test driver and generator, appended to the output of the classpath.sh script, using the Java class-path separator :.

javac -cp .:$(/path/to/jqf/scripts/classpath.sh) CalendarLogic.java CalendarGenerator.java CalendarTest.java

This step is not needed if you use Maven/Gradle in your application. Simply list jqf-fuzz as a test-dependency. The build system will take care of resolving any required transitive dependencies.

Optional: Test with JUnit and QuickCheck

Every JQF test is also a junit-quickcheck test, which means that it can simply run like any other JUnit test. For example, you can run the test in your IDE (green button in IntelliJ) or include it in your Maven test suite to be run as part of mvn test. In the absence of an IDE or build system, you can also invoke JUnit on the command-line:

java -cp .:$(/path/to/jqf/scripts/classpath.sh) org.junit.runner.JUnitCore CalendarTest

You might see an error message like Assumption is too strong; too many inputs discarded. This is because by default, junit-quickcheck randomly samples GregorianCalendar objects by making calls to CalendarGenerator.generate() with a standard pseudo-random source. Of course, most dates that are randomly generated by this generator will NOT fall on February 29 and therefore most of the generated inputs lead to a violation of the assumptions at the beginning of the testLeapYear() method. Therefore, QuickCheck by itself cannot perform a meaningful test.

Step 5: Fuzz with Zest

To run the Zest algorithm, launch the script jqf-zest from the bin directory in the JQF repository, and provide the name of the class and method that needs to be tested on the command-line. Note that this script configures the classpath using the -c option; use the classpath.sh script as before to include the JQF classes and their dependencies.

Fuzzing testLeapYear()

/path/to/jqf/bin/jqf-zest -c .:$(/path/to/jqf/scripts/classpath.sh) CalendarTest testLeapYear

These scripts are not needed if you are using the JQF Maven Plugin, which allows you to launch Zest via: mvn jqf:fuzz -Dclass=CalendarTest -Dmethod=testLeapYear.

Zest status screen

Upon launching the above command, you should see a status screen that looks like this:

Zest: Validity Fuzzing with Parametric Generators
-------------------------------------------------
Test name:            CalendarTest#testLeapYear
Results directory:    /path/to/tutorial/fuzz-results
Elapsed time:         5s (no time limit)
Number of executions: 5,856
Valid inputs:         271 (4.63%)
Cycles completed:     4
Unique failures:      1
Queue size:           3 (3 favored last cycle)
Current parent input: 0 (favored) {240/240 mutations}
Execution speed:      1,300/sec now | 1,098/sec overall
Total coverage:       8 (0.01% of map)
Valid coverage:       6 (0.01% of map)

This screen indicates the method that is being fuzzed, the directory where the results of fuzzing will be stored (this can be overriden using an extra command-line argument to jqf-zest), and various statistics about the fuzzing process. For example, in 5 seconds, Zest has produced over 5,800 inputs at an average of 1,000+ inputs per second. Of these, 271 are valid, which means that they passed all the assumeTrue() statements in our test driver -- in our case, it means that 4.63% of the generated inputs fell on February 29. This is great because the probability of a randomly generated date falling on February 29 is less than 0.3%! Zest's validity fuzzing algorithm allows it to generate a higher proportion of valid inputs than would be possible using uniform random sampling.

The status screen also shows how many branches have been exercised in our application by all inputs ("Total coverage") and by valid inputs ("Valid coverage". These numbers include the implicit branches introduced by assume/assert. Since our test was tiny, the total number of branches is quite small. If you are fuzzing a complex application like a compiler or web server, this number is usually in the thousands.

Interestingly, there is one "Unique failure". A failure is an input that triggers an undocumented run-time exception or assertion violation while executing the test. Failing inputs are saved in fuzz-results/failures. Apart from failures, Zest saves a corpus of successful inputs in the directory fuzz-results/corpus. These inputs cover different control-flow paths in the test program. Therefore, these inputs could be used for regression testing. The size of this corpus is given by "Queue size" in the status screen, since these inputs are stored in a cyclic queue for continuous mutational fuzzing.

Fuzzing can be stopped at any time by pressing CTRL+C. We can repro the failure to see what went wrong using the jqf-repro script:

/path/to/jqf/bin/jqf-repro -c .:$(/path/to/jqf/scripts/classpath.sh) CalendarTest testLeapYear fuzz-results/failures/id_000000 

This will print something like:

java.lang.AssertionError: 3600 should be a leap year
	at org.junit.Assert.fail(Assert.java:88)
	at org.junit.Assert.assertTrue(Assert.java:41)
	at CalendarTest.testLeapYear(CalendarTest.java:21)

Showing that the second assertion was violated when the year was 3600. But of course! Years that are multiples of 100 can be leap years if they are also multiples of 400. This is clearly a bug in our logic, which we can go and fix in CalendarLogic.java:

...
public static boolean isLeapYear(GregorianCalendar cal) {
        int year = cal.get(YEAR);
        if (year % 4 == 0) {
            if (year % 100 == 0) {
                /* New branch: fixes bug on years like '3600' */
            	if (year % 400 == 0) {
            		return true;
            	}
                return false;
            }
            return true;
        }
        return false;
    }
...

Fuzzing testCompare()

/path/to/jqf/bin/jqf-zest -c .:$(/path/to/jqf/scripts/classpath.sh) CalendarTest testCompare

You should see a status screen like this:

Zest: Validity Fuzzing with Parametric Generators
-------------------------------------------------
Test name:            CalendarTest#testCompare
Results directory:    /path/to/tutorial/fuzz-results
Elapsed time:         15s (no time limit)
Number of executions: 6,449
Valid inputs:         6,368 (98.74%)
Cycles completed:     1
Unique failures:      1
Queue size:           26 (5 favored last cycle)
Current parent input: 16 (favored) {272/920 mutations}
Execution speed:      480/sec now | 421/sec overall
Total coverage:       14 (0.02% of map)
Valid coverage:       14 (0.02% of map)

In this test, the percentage of valid inputs is very high (98%)! That is because the assumptions are not very strong -- Zest only has to make sure that the inputs it generates remain distinct. However, the test execution speed is slightly slower, because a list of calendar objects must be generated and then sorted in each test.

Again, we have one unique failure! We can repro the failing test case to observe the assertion violation:

/path/to/jqf/bin/jqf-repro -c .:$(/path/to/jqf/scripts/classpath.sh) CalendarTest testCompare fuzz-results/failures/id_000000 

The bug lies in the fact that the CalendarLogic::compare method ignores time zones. If dates from two time zones that are H hours apart differ by less than H hours, then our custom compare() method may disagree with the Calendar.before() method. The Zest algorithm can effortlessly find such a bug using coverage-guided validity fuzzing.

Next Steps

Check out the advanced tutorial on Fuzzing a Compiler.