Target is a library for testing Haskell functions based on refinement type specifications.
First things first, get Target from Hackage
$ cabal install target
You'll also need a recent version of Z3 or CVC4.
To get acquainted with refinement types and Target, let's examine
a small grading library called Scores
.
We'll need to import two modules from Target.
Test.Target
exports the main testing API, and
Test.Target.Targetable
exports the
Targetable
type-class, which represents types for which we can
generate constrained values. We'll need the latter module for one of
the later examples when we define our own datatype.
module Scores where
import Test.Target
import Test.Target.Targetable
A refinement type decorates the basic Haskell types with logical predicates drawn from an efficiently decidable theory. For example,
{-@ type Nat = {v:Int | 0 <= v} @-}
{-@ type Pos = {v:Int | 0 < v} @-}
{-@ type Rng N = {v:Int | 0 <= v && v < N} @-}
are refinement types describing the set of integers that are
non-negative, strictly positive, and in the interval [0, N)
respectively. We will also build up function and collection
types over base refinement types like the above.
Let's write a function rescale
that takes a source range [0,r1)
,
a target range [0,r2)
, and a score n
from the source range,
and returns the linearly scaled score in the target range.
For example, rescale 5 100 2
should return 40
.
Here's a first attempt at rescale
{-@ rescale :: r1:Nat -> r2:Nat -> s:Rng r1 -> Rng r2 @-}
rescale r1 r2 s = s * (r2 `div` r1)
Let's load our code into GHCi and test it!
ghci> :set -XTemplateHaskell
ghci> :l Scores.hs
ghci> target rescale 'rescale "Scores.hs"
The main function Target exports is
target
target :: Testable f => f -> TH.Name -> FilePath -> IO ()
which drives the entire testing process. It also provides
targetWith
to customize some of the options, and
targetResult
which returns the test outcome instead of printing it.
(Since the refinement type specifications are given in special comments, we use Template Haskell to give target the exact name of the function we want to test. Unfortunately we can't extract the path to the module from the Template Haskell name, so we have to provide it separately..)
Unfortunately, target quickly responds with
Found counter-example: (1, 0, 0)
Indeed, rescale 1 0 0
results in 0
which is not in the target
Rng 0
, as the latter is empty! We could fix this in various ways,
e.g. by requiring the ranges are non-empty:
{-@ rescale :: r1:Pos -> r2:Pos -> s:Rng r1 -> Rng r2 @-}
Now, Target accepts the function and reports
ghci> target rescale 'rescale "Scores.hs"
OK. Passed all tests.
Using the refinement type specification for rescale
,
Target systematically tests the implementation by generating
all valid inputs up to a given depth bound (defaulting to 5)
that respect the pre-conditions, running the function, and
checking that the output satisfies the post-condition.
Suppose we have normalized all scores to be out of 100
{-@ type Score = Rng 100 @-}
Now we'll write a function to compute a weighted average of a list of scores.
{-@ average :: [(Int, Score)] -> Score @-}
average [] = 0
average wxs = total `div` n
where
total = sum [w * x | (w, x) <- wxs ]
n = sum [w | (w, _) <- wxs ]
It can be tricky to verify this function as it requires non-linear reasoning about an unbounded collection. However, we can gain a great degree of confidence by systematically testing it using the type specification; indeed, Target responds:
ghci> target average 'average "Scores.hs"
Found counter-example: [(0,0)]
Clearly, an unfortunate choice of weights can trigger a divide-by-zero; we can fix this by requiring the weights be non-zero:
{-@ average :: [({v:Int | v /= 0}, Score)] -> Score @-}
but now Target responds with
ghci> target average 'average "Scores.hs"
Found counter-example: [(-3,3),(3,0)]
which also triggers the divide-by-zero! Let's play it safe and require positive weights,
{-@ average :: [(Pos, Score)] -> Score @-}
at which point Target reports that all tests pass.
The very nature of our business requires that at the end of the day,
we order students by their scores. We can represent ordered lists by
requiring the elements of the tail t
to be greater than the head h
:
{-@ data OrdList a
= ONil
| OCons {h :: a, t :: OrdList {v:a | h <= v}}
@-}
Target can only test functions that operate on Targetable
types, so we'll
also need to make OrdList
Targetable
. We could write our own instance, but
they turn out to be quite mechanical, so instead we'll use GHC.Generics
to
derive an instance automatically.
data OrdList a = ONil | OCons a (OrdList a)
deriving Generic
instance Targetable a => Targetable (OrdList a)
We can now write a function to insert a score into an ordered list:
{-@ insert :: (Ord a) => a -> OrdList a -> OrdList a @-}
Target automatically generates all ordered lists (up to a given depth)
and executes insert
to check for any errors.
Everyone has a few bad days. Let us write a function that takes the
best k
scores for a particular student. That is, the output
must satisfy a structural constraint -- that its size
equals k
. We can encode the size of a list with a logical
measure function:
{-@ measure len :: [a] -> Int
len [] = 0
len (x:xs) = 1 + len xs
@-}
Now, we can stipulate that the output indeed has k
scores:
{-@ best :: k:Nat -> [Score] -> {v:[Score] | k = len v} @-}
best k xs = take k $ reverse $ sort xs
Now, Target quickly finds a counterexample:
ghci> target best 'best "Scores.hs"
Found counter-example: (2,[])
Of course -- we need to have at least k
scores to start with!
{-@ best :: k:Nat -> {v:[Score] | k <= len v}
-> {v:[Score] | k = len v}
@-}
and now, Target is assuaged and reports no counterexamples.
Note that we don't have to write a custom Targetable
instance
to generate lists of a specific length, the standard one suffices!
Perhaps instead of taking the k
best grades, we would like
to pad each individual grade, and, furthermore, we want to
be able to experiment with different padding functions. Let's
rewrite average
to take a functional argument, and
stipulate that it can only increase a Score
.
{-@ padAverage :: (s:Score -> {v:Score | s <= v}) -> [(Pos, Score)] -> Score @-}
padAverage f [] = f 0
padAverage f wxs = total `div` n
where
total = sum [w * f x | (w, x) <- wxs ]
n = sum [w | (w, _) <- wxs ]
Target automatically checks that padAverage
is
a safe generalization of average
. Randomized
testing tools can also generate functions, but those
functions are unlikely to satisfy non-trivial constraints,
thereby burdening the user with custom generators.
You can specify a lot of interesting properties with refinement types and we've only scratched the surface here. Check out the LiquidHaskell blog for more examples; Target uses the same specification language as LiquidHaskell so the examples should all be testable.