A Javascript implementation of (some of) the monadic parser combinators defined by G. Hutton and E. Meijer (cf. [1], see below).
You can see the combinators in action here (the combinators are used to define the parsing function): Lisp interpreter
[1] G. Hutton and E. Meijer, Monadic parsing in Haskell, Journal of Functional Programming 8, Cambridge University Press (1998), 437–444.
A (monadic) parser combinator is a higher-order function that takes simple building blocks and produces a more specific and complex parsing unit. In our context a parser can be understood as a funcion that takes a string, x
say, and returns a list of tuples [(a_1, y_1), ...,(a_n, y_n)]
, where the a_i
are of some previously fixed data type A
and the y_i
are strings as well, i.e.
type Parser = String ---> [(A,String)].
The first component of the tupel can be understood as the result of the parser (e.g. another string, a single character, or a abstract synthax tree), whereas the second component is the remaining string (i.e. the part of the string that hasn't yet been "consumend" by the parser). The empty list []
indicates a failed approach of parsing the given string. The term monadic refers to the fact that we can endow the the set of parsers with an additional structure -- we'll dive into that later.
Example. Here is a simple parsing function that consumes the letter "z":
var z = function (string) {
var first = string.charAt(0),
rest = string.slice(1);
if (first === "z") return [["z", rest]];
else return [];
}
z("zoo"); // [["z", "oo"]]
z("shoe"); // []
Example. Here is another simple parsing function that consumes the first letter of a string, or fails if the string is empty:
var item = function (string) {
return string.length === 0 ? [] : [ [string.charAt(0), string.slice(1)] ];
}
item("zoo"); // [["z", "oo"]]
item("shoe"); // [["s", "hoe"]]
item(""); // []
Example. Lets conclude the section with a trivial example. The function that always fails:
var zero = function (string) {
return [];
}
zero("whatever"); // []
Bind. Let's forget for a moment that a parsing function actually returns a list of results [(a_1, y_1), ...,(a_n, y_n)]
and assume that it just returns single tupel (a, y)
. Furthermore suppose we are given two parsing functions P
and Q
. One could easily define a new parsing function P*Q
by
P*Q: x |---> P(x)=(a, y) |---> Q(y)=(b, z).
There is nothing wrong with this approach, however the final result (b,z)
does not depend on the intermediate result a
of the first function P
at all. The natural evolution of the above approach would be the following
P `bind` f: x |---> P(x) = (a, y) |---> (f(a))(y)=(b, z).
Note that we replaced Q
by a function f: A ---> Parser
, i.e. given some a
the received function value f(a)
is a parsing function itself.
Result. In the previous section we saw how to combine a parser with a function f: A ---> Parser
. Among these functions there is a neutral candidate that stands out. It is called result
and is defined as follows
result x: input |---> [(x,input)].
It behaves neutral in the sense that we have
P `bind` result = P.