Example interpreters are located in interpreters
directory.
- The subdirectory
src
contains source interpreters - The subdirectory
out
contains abstract machines generated from source interpreters - The subdirectory
lib
contains syntax definitionsreuire
d by interpreters
Both source interpreters and generated abstract machines are executable by Racket. To run the test suite of an interpreter (or a machine) use raco
, e.g for call-by-value lambda calculus:
cd interpreters
raco test src/lc-call-by-value.rkt
raco test out/lc-call-by-value.rkt
The semantic transformer - semt
is built from source using cabal
- a Haskell package manager.
cabal build
to buildcabal install
to installsemt
executablecabal run semt --
to run the development build without installingmake test
to test the tool on all example interpretersmake clean
to delete all generated interpreters
The basic mode of usage is to transform a file.rkt
containing an interpreter into out/file.rkt
which is a source file containing the transformed interpreter (i.e., an abstract machine) using the command semt file.rkt
.
The options modifying the behavior of the tool can be displayed with the command semt --help
:
Usage: semt FILE [-o|--output DIR] [-i|--intermediate]
[-d|--debug] [-t|--self-test]
Transform an interpreter into an abstract machine.
Available options:
FILE Source file with the interpreter.
-o,--output DIR Output directory for generated
files, defaults to ./out/
-i,--intermediate Emit executable source files for
each stage.
-d,--debug Emit labeled source files for
each stage.
-t,--self-test Run raco test on each
intermediate file; implies
--intermediate
-h,--help Show this help text
The tool assumes that the source file with an interpreter is a Racket program. An example source listing is shown below.
#lang racket
(require "../lib/idl.rkt")
; begin interpreter
(def-data Term
String ;; variable
{Abs String Term} ;; lambda abstraction
{App Term Term} ;; application
{Unit}) ;; unit value (for testing)
(def eval (env [Term term])
(match term
([String x] (env x))
({Abs x body}
(fun (v) (eval (extend env x v) body)))
({App fn arg}
(let fn (eval env fn))
(let arg (eval env arg))
(fn arg))
({Unit} {Unit})))
(def extend #:atomic (env k v)
(fun #:atomic #:name Extend #:apply lookup (x)
(match (eq? x k)
(#t v)
(#f (env x)))))
(def init #:atomic (x) (error "empty environment"))
(def main ([Term term])
(eval init term))
; end interpreter
(module+ test
(require rackunit)
(define pgm
{App {Abs "x" "x"} {Unit}})
(check-equal? (main pgm) {Unit})
)
The preamble (everything up to the ; begin interpreter
marker in line 4) is copied verbatim by the tool and should be used to specify Racket's dialect (line 1) and import syntax definitions (line 2).
Afterwards an interpreter is specified (lines 4 -- 31) and it ends with another marker ; end interpreter
in line 32.
Everything following the marker is again copied to the output file.
In the example this space is used to define some tests for the interpreter.
The syntax of IDL is given in the table below.
data-def | ::= | (def-data tp-name tp-elem ... ) |
tp-elem | ::= | tp | record |
record | ::= | { tp-name record-field ...} |
record-field | ::= | tp | var | [ tp var ] |
record-def | ::= | (def-struct record ) |
base-tp | ::= | String | Integer | Boolean |
tp | ::= | Any | base-tp | tp-name |
fun-def | ::= | (def var annot ... ( arg ...) statements ) |
annot | ::= | #:no-defun | #:atomic | #:name tp-name | #:apply var |
arg | ::= | var | [ tp var ] |
const | ::= | integer | string | #t | #f |
statements | ::= | (let var term ) statements | term |
term | ::= | var | (fun annot ... ( arg ...) statements) | ( term term ... ) |
| | { tp-name term ...} | (match term branch ... ) | (error string ) |
|
branch | ::= | ( pattern statements ) |
pattern | ::= | var | const | _ | [ base-tp var] | { tp-name pattern ...} |
The interpreter consists of a sequence of datatype (data-def), record (struct-def) and function (fun-def) definitions.
One of the functions must be named main
and will serve as an entry point of the interpreter.
It is required for the main
function to have parameters annotated with types, e.g., (def main ([Term term]) body)
.
Names (tp-name) of datatypes and records must be unique and distinct from base types (base-tp and Any
).
All definitions may be mutually recursive.
The term syntax is split into terms (term) and statements (statements) where the latter appear as the bodies of function definitions (both top-level and anonymous) and branches.
Statements are sequences of let-bindings terminated by a regular term (see lines 16-19 of the example program).
Variables (var) and type names (tp-name) may consist of ASCII letters, decimal digits and the following symbols: -+/*_?<
.
Variables begin either with a lower-case letter or symbol and type names begin with an upper-case letter.
In order to run an interpreter embedded in a source file, the library with syntax definitions (idl.rkt
) provided with this thesis must be imported.
The file is located in the source tree of the project in the interpreters/lib/
directory.
The run-time behavior of the program follows the call-by-value behavior of Racket, e.g.:
let
evaluates the bound term to a value before binding. The binding is visible only in following statements.- Application evaluates terms left-to-right and then applies the value in function position.
match
evaluates the term under scrutiny and tests patterns against the value in order of their definition continuing with the first match.
The interpreters may use the following builtin operations: +
, -
, *
, /
, neg
, not
, and
, or
, eq?
, <
with the usual semantics.
#:apply
specifies the name for apply function generated for the defunctionalized function space whose member is the annotated function.#:name
specifies the name for the record which will represent the annotated function after defunctionalization.#:no-defun
skips defunctionalization of the annotated function (either all or none of the functions in a space must be annotated).#:atomic
means that the annotated function (and calls to it) will be left in direct style during translation to CPS.
semt/Main.hs
is theMain
module, which contains themain
function, parses command line arguments and callsPipeline.run
which performs the transformation- module
Pipeline
contains the transformation pipeline and its components - module
Syntax
contains all syntax definitions used in the transformer. TheTermF
is a functor over its subtrees to accomodate different representations of syntax trees (e.g.Term
defined insrc/Syntax.hs
orLabel
used in the abstract interpretersrc/AbsInt/Types.hs
) - module
Prelude
contains the prelude imported implicitly into every module insrc/
- module
AbsInt
contains the abstract abstract machine used to compute control-flow analysis
The project uses Polysemy
to manage computational effects.