Skip to content
This repository has been archived by the owner on Jun 2, 2023. It is now read-only.

Recursion schemes training examples and exercises

Notifications You must be signed in to change notification settings

softwaremill/recursion-training

Repository files navigation

Recursion schemes

This repository contains documentation, samples and exercises for the Recursion Schemes workshop.

Tip
"these things are easier to use than understand"

Slides from my presentation Recursion Schemes by example

Running

Load this project in sbt and just launch run, then select the exercise you’d like to execute.

Introduction

Recursive structures are a common pattern and most developers have worked with such data at least a few times.

Trees

Nodes and leafs

           root-node
           /   \
  child-node   leaf
  /     \
leaf    leaf

List

Either a Cons(a, List) or Nil Cons(a, Cons(b, Cons(c, Nil))) // List(a, b, c)

Json

sealed trait Json[+F]
object Json {
  case object      Null                              extends Json
  final case class Bool  (value: Boolean)            extends Json[Nothing]
  final case class Str   (value: String)             extends Json[Nothing]
  final case class Num   (value: Double)             extends Json[Nothing]
  final case class Arr[F](values: List[F])           extends Json[F]
  final case class Obj[F](fields: List[(String, F)]) extends Json[F]
}

ASTs

SELECT * FROM users u WHERE u.age > 25 AND UPPER(u.name) LIKE "J%"
FILTER:
- source:
    SELECT
        -- selection: *
        -- from: users
- criteria:
    AND
        - GT(u.age, 25)
        - LIKE(
            UPPER(
                u.name)
            J%)

What you can do with recursive data

  • Print

  • Translate into another structure

  • Enrich

  • Optimize

Manual recursion

sealed trait Expr

case class IntValue(v: Int)           extends Expr
case class DecValue(v: Double)        extends Expr
case class Sum(a: Expr, b: Expr)      extends Expr
case class Multiply(a: Expr, b: Expr) extends Expr
case class Divide(a: Expr, b: Expr)   extends Expr

def eval(e: Expr): Double =
e match {
  case IntValue(v)      => v.toDouble
  case DecValue(v)      => v
  case Sum(e1, e2)      => eval(e1) + eval(e2)
  case Multiply(e1, e2) => eval(e1) * eval(e2)
  case Divide(e1, e2)   => eval(e1) / eval(e2)
}

Fixed point datatypes

Making an ADT polymorphic

Ideally we’d love to have something more elegant. We are looking for a tool which takes:

  • A recursive expression of type Expr,

  • a function which evaluates Expr ⇒ Double For example case Sum(d1, d2) ⇒ d1 + d2

Such tool evaluates whole expression to a Double

Types like Sum(a: Expr, b: Expr) force us to deal only with Exprs. Ideally we’d like to have our eval definition to look like:

// does not compile, but it's only an illustration of a direction
def eval(e: Expr): Double =
e match {
  case Sum(dbl1: Double, dbl2: Double) => dbl1 + dbl2 // etc
}

Let’s make our expression polymorphic.

sealed trait Expr[A]

case class IntValue[A](v: Int)           extends Expr[A]
case class DecValue[A](v: Double)        extends Expr[A]
case class Sum[A](a: A, b: A)            extends Expr[A]
case class Multiply[A](a: A, b: A)       extends Expr[A]
case class Divide[A](a: A, b: A)         extends Expr[A]

That’s much better, because this allows us express our evaluations as:

def evalToDouble(exp: Expr[Double]): Double = exp match {
  case IntValue(v) => v.toDouble
  case DecValue(v) => v
  case Sum(d1, d2) => d1 + d2
  case Multiply(d1, d2) => d1 * d2
  case Divide(d1, d2) => d1 / d2
}

Such evaluation is what we aim for, because it doesn’t look like recursion. It looks more like a set of rules, which we can apply to a recursive structure with some blackbox tool which will recursively build the result.

But let’s stop here for a while, because polymorphic expressions come with a cost…​ First, consider a trivial expression:

val intVal = IntValue[Unit](10) // Expr[Unit]

What about more complex expressions?

  val sumExp: Expr[Expr[Unit]] =
    Sum(
      IntValue[Unit](10), // Expr[Unit]
      IntValue[Unit](5)
    )

Fixing nested Exprs

how to deal with types like Expr[Expr[Expr[A]]]? Let’s wrap in:

case class Fix[F[_]](unFix: F[Fix[F]])

val fixedIntExpr: Fix[Expr] = Fix(IntValue[Fix[Expr]](10))

The Fix type allows us to represent any Expr[Expr[Expr…​.[A]]] as Fix[Expr]

Wait, why did we need this`Fix` thing?

A step back

We wanted to use evaluation definition which doesn’t look like recursion.

We are looking for a tool which takes:

  • A recursive expression of type Expr,

  • a function which evaluates a single simple Expr ⇒ Double For example case Sum(d1, d2) ⇒ d1 + d2

To be able to express such rules, we needed to go from Expr to Expr[A]. To avoid issues with nested types, we introduced Fix[Expr]

Putting it all together

Once we have:

  • A polymorphic recursive structure based on Expr[A]

  • An evaluation recipe expressed as a set of rules for each sub-type (Expr[B] ⇒ B)

  • A Fix[F[_]] wrapper

We can now use a tool to put this all together. Such tool is called…​

Catamorphism

Scheme

A generic foldRight for data stuctures. In case of recursive data, this means folding bottom-up:

  val division =
    Divide(DecValue(5.2), Sum(IntValue(10), IntValue(5)))
            Divide                             Divide
           /    \                              /    \
DecValue(5.2)   Sum            -->   DecValue(5.2)  Sum
                / \                                 / \
     IntValue(10)  IntValue(5)                   10.0 5.0
            Divide                             Divide
           /    \                              /    \
DecValue(5.2)   Sum            -->            5.2  15.0
                / \
             10.0  5.0
            Divide             -->            5.2 / 15.0
           /    \
         5.2   15.0

Going bottom-up, we use our set of rules on leafs, then we build higher nodes basing on lower nodes. Catamorphism is a generic tool, so you don’t have to implement it!

Matryoshka and cata

The Matryoshka library does catamorphism for you:

val recursiveExpr: Fix[Expr] = ??? // your tree

def evalToDouble(expr: Expr[Double]): Double

// the magic call
recursiveExpression.cata(evalToDouble) // returns Double

The .cata() call runs the whole folding process and constructs the final Double value for you, provided just a set of rules for indiviual node types.

Expression functor

Matryoshka’s .cata() is a blackbox, but it has one more requirement. It’s mechanism assumes that a Functor instance is available for your datatype.

This means that you must provide a recipe for how to map Expr[A] to Expr[B] having a function f: A ⇒ B. For example to transform Sum[A](a1: A, a2: A) into Sum[B](b1: B, b2: B) you need to do Sum(f(a1), f(a2)). Such recipe has to be provided for all possible cases of Expr.

import scalaz.Functor

implicit val exprFunctor: Functor[Expr] = new Functor[Expr] {
  override def map[A, B](expr: Expr[A])(f: A => B): Expr[B] = expr match {
    case IntVal(v) => IntVal[B](v)
    case Sum(a1, a2) => Sum(f(a1), f(a2))
    case ... // etc.
  }
}

This is finally all what we need! Here’s a summary of our ingredients:

  1. A recursive structure Expr[A]

  2. A Functor for this type

  3. A set or evaulation rules for individual cases

  4. Fix

  5. catamorphism

4 & 5 are provided by Matryoshka.

Algebra

Our evaluation function (point 3.) is called an Algebra. From Matryoshka:

type Algebra[F[_], A] = F[A] => A

def evalToDouble(expr: Expr[Double]): Double

val evalToDouble: Algebra[Expr, Double]

Some syntax sugar to work with Fix

Remember this?

Fix(Sum(Fix(IntValue[Fix[Expr]](10)), Fix(IntValue[Fix[Expr]](5))))

There`s some syntax sugar to help:

Sum(IntValue[Fix[Expr]](10).embed, IntValue[Fix[Expr]](5).embed).embed

Handy especially for larger expressions (see exercise 03). To unpack from Fix, you can use unFix:

val fixedSum: Fix[Expr] =
  Sum(
    IntValue[Fix[Expr]](10).embed,
    IntValue[Fix[Expr]](5).embed
  ).embed

fixedSum.unFix match {
  case Sum(...) =>
}

unFix is fine, but instead use .project which does the same, but is a more general function which work on other recursive wrappers. Sorry for the spoiler, but Fix is not the only one around, and you don’t want to get tied directly to it!

Transforming recursive expressions

Let’s say we have an Expr and we want to optimize it to express Multiply(x, x) as Square(x). We’d like to have a tool which walks our tree and maps a given Expr to another Expr

           Divide                                     Divide
           /    \                                     /    \
DecValue(5.2)   Multiply         -->       DecValue(5.2) Square(10)
                  / \
       IntValue(10)  IntValue(10)

We are looking for a function like:

def mapNode(t: Fix[Expr])(f: Fix[Expr] => Fix[Expr])

def optimize(expr: Fix[Expr]): Fix[Expr] = expr.project match {
    case Multiply(a, b) if (a == b) => Square(a)
    case other => other
}

val optimizedExpr = mapNode(exprTree)(optimize)

Matryoshka offers such a tool, and it’s called transCataT:

  val optimizedExpr: Fix[Expr] = exprTee.transCataT(optimize)

cataM

Sometimes our evaluation produces a wrapped value. Suppose we want to evaluate the expression to a Double, but handle division by zero by wrapping the result in an Either. Here’s our new evaluation:

def evalToDouble(exp: Expr[Double]): \/[String, Double] = exp match {
  case IntValue(v) => v.toDouble.right
  case _ => ??? // etc
}

We can’t just use cata, because cata works with Algebra.

type Algebra[F[_], A] = F[A] => A

// F[A] => M[A], where M[A] means Either[String, A]
def evalToDouble(exp: Expr[Double]): Either[String, Double]

Algebra is a function type Algebra[F[_], A] = F[A] ⇒ A, while our new evaluation is of type F[A] ⇒ M[A]. If our evaluation has such signature, we can use cataM:

val correctExpr: Fix[Expr] =
  Sum(
    DecValue[Fix[Expr]](5.2).embed,
    Divide(
      DecValue[Fix[Expr]](3.0).embed,
      DecValue[Fix[Expr]](3.0).embed
    ).embed
  ).embed

val incorrectExpr: Fix[Expr] =
  Sum(
    DecValue[Fix[Expr]](5.2).embed,
    Divide(
      DecValue[Fix[Expr]](3.0).embed,
      DecValue[Fix[Expr]](0.0).embed // !!!!!!!!
    ).embed
  ).embed

correctExpr.cataM(evalToDouble) // Right(6.2)
incorrectExpr.cataM(evalToDouble) // Left("Division by zero!")

However, there’s one more requirement. Our Functor[Expr] is not enough, Matryoshka needs a Traverse[Expr].

Tip
Instead of writing a Functor for your recursive data type, write a Traverse. It is also a Functor, it’s pretty much the same amount of work, and it may become useful in case you need cataM.

Anamorphism

As we learned, cata is a bottom-up folding of a structure. Anamorphism works in the opposite direction and allows to unfold a structure. For this we need the dual of Algebra - Coalgebra:

type Coalgebra[F[_], A] = A => F[A]

Such morphism is called an unfold, because it takes an object and recursively builds up a structure basing on it.

// Int => Expr[Int]
val toBinary: Coalgebra[Expr, Int] = (n: Int) =>
n match {
  case 0 => IntValue(0)
  case 1 => IntValue(1)
  case 2 => IntValue(2)
  case _ if n % 2 == 0 => Multiply(2, n / 2)
  case _ => Sum(1, n - 1)
}

val toText: Algebra[Expr, String] = {
case IntValue(v)    => v.toString
case Sum(a, b)      => s"($a + $b)"
case Multiply(a, b) => s"($a * $b)"
}

// unfold with anamorphism
val expr = 31.ana.apply[Fix[Expr]](toBinary)
// and now fold with catamorphism
val binAsStr = expr.cata(toText) // (1 + (2 * (1 + (2 * (1 + (2 * (1 + 2)))))))

A composition of ana+cata is called hylomorphism

val binAsStr = 31.hylo(toText, toBinary)

Paramorphism

Also a fold, very similar to catamorphism. However, it adds one more extra feature - with paramorphism, you not only build current state basing on evaluation of previous states, but you also have access to these states. Confusing? Here’s an example: Let’s say we want to use a special algorithm for a specific case:

We want to print (3 + -5) as (5 - 3) for better readability.

With cata, we don’t have enough information except Strings:

case Sum(left: String, right: String) => ??? // what was left and right before their evaluation?

Parsing Strings here doesn’t seem like a good idea. With para, we can use a richer kind of algebra:

case Sum((leftSrc, leftStr), (rightSrc, rightStr)) =>
  leftSrc.project match {
    case IntValue(a) =>
      rightSrc.project match {
        case IntValue(b) if a > 0 && b < 0 => s"($a - ${-b})"
        case IntValue(b) if b > 0 && a < 0 => s"($b - ${-a})"
        case _                             => s"$leftStr + $rightStr"
      }
    case _ => s"$leftStr + $rightStr"
  }

This is a different kind of Algebra: Expr[(Fix[Expr], String)] ⇒ String, so it’s F[(T, A)] ⇒ A, where T means Fix[Expr] for this particular case. To be more precise, it’s

type GAlgebra[W[_], F[_], A] = F[W[A]] => A

Where in our case F[W[A]] is Expr[Tuple2[T, A]]. Paramorphism is useful when we need to know more about node’s childrens' structure in order to fully evaulate that node.

Paramorphisms are generalized folds with access to the input argument corresponding to the most recent state of the computation.

However, para is limited, because we only know the source structure. If you need more, see histomorphism. But let’s take a break from morphisms and check out some other interesting concepts.

Cofree

Cofree is a very broad concept. It’s actually a "comonad" with quite a few interesting attributes and applications. For the sake of this course, let’s consider Cofree[S, A] to be a pair of:

  1. a value of type A

  2. A recursive expression S which can consist of …​ deeper elements of type Cofree.

Cofree is often used to construct labelled expression. Each node in an expression gets an extra label (tag). For a simple DSL:

sealed trait Expr[A]

case class IntValue[A](v: Int)     extends Expr[A]
case class DecValue[A](v: Double)  extends Expr[A]
case class Sum[A](a: A, b: A)      extends Expr[A]
case class Square[A](a: A)         extends Expr[A]

we can label any Expr with an ExprType:

sealed trait ExprType
case object IntExpr extends ExprType
case object DecExpr extends ExprType

Our rules can be simple: a sum of integers is an integer. A sum of (int + dec) is a decimal. A square of a type has the same type. A tagged expression is of type Cofree[Expr, ExprType]. Such object cosists of:

  1. A head: ExprType which is the tag (label) placed on the node.

  2. A tail: Expr[Cofree[Expr, ExprType]] which is the expression itself (Expr[A]) where A is a deeper level of Cofree.

We can recursively apply such labelling with catamorphism, using an Algebra[Expr, Cofree[Expr, ExprType]]

val inferType: Algebra[Expr, Cofree[Expr, ExprType]] = {
  case IntValue(v) => Cofree.apply(IntExpr, IntValue(v))
  case DecValue(v) => Cofree.apply(DecExpr, DecValue(v))
  case Sum(a, b) => ??? // a: Cofree[Expr, ExprType]]
}

Cofree is much more than a tuple with extras. It’s a fixed-point operator. This means that it has the same power as Fix. Having an expression wrapped with Cofree, we can apply morphisms directly to it!

val typedExpr: Cofree[Expr, ExprType] = expr.cata(inferType)

val toTypedStr: Algebra[EnvT[ExprType, Expr, ?], String] = { case EnvTexprType, Sum(a, b) ⇒ s"($a + $b): $exprType" // (3 + 5.5): DecValue case _ ⇒ "…​" }

This reveals one more mystery - there are more fixed point operators than just Fix and Cofree.

Histomorphism

Histomporphism is a fold which also provides something akin to a "pair". While in paramorphism for each node we could access pairs of (child_evaluated + child_sourceExpression), in histomorphism we work with (child_evaluated + child_tree). This child_tree can be traversed further down. To represent this whole pair, we use Cofree.

val smartPrint: GAlgebra[Cofree[Expr, ?], Expr, String] = {
    case Square(Cofree(a: String, history: Expr[Cofree[Expr, String]])) =>
        history match {
          case Sum(Cofree(a, aHistory), Cofree(b, bHistory)) => ...
          case _ => ...
        }
    case _ => ...
}

We can unpack each node to desired depth, because history elements carry everything that has been computed up to this point.

Fixed point operators

So far we considered Fix[] to be a kind of type-level trick to deal with recursive expressions. For most applications such approach should be enough. Here’s a bit more format explanation.

In math, a fixed point of a function f is a value a such that

f(a) == a

We can now say that fix is a function such that

fix(f) == f(fix(f))

Does this look familiar? For types, we had:

val fix = Fix[Expr]
val expr: Expr[Fix[Expr]] = fix.unfix

Mu

Mu is a more restricted version of Fix. It can be used to represent inductive finite data. We can replace our previous usages of Fix with Mu.

Nu

Nu can be used to represent infinite coinductive data (like streams).

List

sealed abstract class ListF[A, B]
final case class NilF[A, B]() extends ListF[A, B]
final case class ConsF[A, B](head: A, tail: B) extends ListF[A, B]

type List[A]   = Mu[ListF[A, ?]]

Cofree

Cofree is a vast subject in itself. It’s a "comonad" with many interesting properties, but for the sake of this course we are interested in its recursive character. In a simplicit view, Cofree is:

  • some value of type A

plus

  • an expression with nested Cofree(s)

In other words, Cofree can be viewed as a recursive structure similar to Fix[Expr] but with additional labels (tags) placed on every node of our tree.

One example may be adding a type system to our DSL. This allows putting a "type" tag on each node of an expression tree. For example, let’s decompose a Sum(IntVal(3), Sum(IntVal(5), DecVal(5.5))):

If we encode a rule that (Int + Dec) has type Dec, then our expression becomes:

       Sum: Dec
     /     \
    /      Sum: Dec
   /        /     \
3: Int   5: Int    5.5: Dec

With recursion schemes, we can construct a Cofree representing our expression with labels using catamorphism. Then we can run further morphisms on a Cofree without referring to Fix or any other wrapper, since Cofree is such a wrapper in itself. See exercises for examples.

Recursive / Corecursive

Finally we can say that all mentioned types: Fix, Mu, Nu, List, Cofree and others share common properties. These properties are grouped under the Recursive[T] and Corecursive[T] typeclasses. These types are generalizations which allow us to write even more abstract code and stay independent of particular wrapper type if it’s possible. That’s why we learned to use .project and .embed - functions from these type classes, instead of unfix which was specific to Fix.

Free

Another interesting Birecursive[T] (which means both Recursive and Corecursive) is Free. Yes, that Free. A free monad also can be run through morphisms! After all, free monads are often used to work with DSLs describing some programs. Such program description is similar to our Expr[A].

If you’re familiar with free monads, you probably recognize foldMap. It’s nothing more but a catamorphism!

About

Recursion schemes training examples and exercises

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published