-
Notifications
You must be signed in to change notification settings - Fork 0
Calculator example (ambiguous grammar)
Import SwiLex
to define terminal tokens needed by the SwiLex
lexer. :
import SwiLex
Import SwiParse
to define a parser and its grammar:
import SwiParse
In the last example, the grammar used to define the calculator had no ambiguities. In this second example the goal is to showcase how to use the rules' precedence and operator associativity to resolve Shift/Reduce conflicts. Let's consider the following (ambiguous) grammar:
<expr> → <expr> + <expr> // rule 0
| <expr> - <expr> // rule 1
| <expr> * <expr> // rule 2
| <expr> / <expr> // rule 3
| ( <expr> ) // rule 4
| number // rule 5
If we consider the rule 0
, when the token +
is read as lookahead token, and the content of the stack is <expr> + <expr>
the parser has two options :
-
Reduce: the content of the stack is reduced following the
rule 0
into<expr>
before shifting the new tokens. -
Shift: the parser shifts the next tokens before reducing them following the
rule 0
(last in first out).
This situation, where either a shift or a reduction would be valid, is called a shift / reduce conflict
.
In the last example, choosing to reduce
is equivalent to give priority to a left associativity
for the +
token while a shift
is giving priority to a right associativity
.
if we consider the following input string 1 + 2 + 3
(e1
):
- With a
reduce
(left associativity
) the result will be computed as the following:
1
=> 1 +
=> 1 + 2
=> 3
=> 3 +
=> 3 + 3
=> 6
- With a
shift
(right associativity
) the result will be computed as the following:
1
=> 1 +
=> 1 + 2
=> 1 + 2
=> 1 + 2 +
=> 1 + 2 + 3
=> 1 + 5
=> 6
In other words:
Areduce
will consider the expression e1
as (1 + 2) + 3
while a shift
will consider it as 1 + (2 + 3)
In this example the grammar has another kind
of ambiguity. No precedence is defined between the different rules.
If we consider the rules rule 0
and rule 2
, for example, when the token *
is read as lookahead token, and the content of the stack is <expr> + <expr>
the parser has two options :
-
Reduce: the content of the stack is reduced following the
rule 0
before shifting the new tokens and reducing following therule 2
. -
Shift: the parser shifts the next tokens, reduces following the
rule 2
before reducing the result following therule 0
.
Again it's a shift / reduce conflict
.
In the last example, choosing to reduce
is equivalent to favoring the +
token over the *
one, while a shift
favors a *
over +
(operators' precedence
).
if we consider the following input string 1 + 2 * 3
(e1
):
- With a
reduce
(+ < *
) the result will be computed as the following:
1
=> 1 +
=> 1 + 2
=> 3
=> 3 *
=> 3 * 3
=> 9
- With a
shift
(+ > *
) the result will be computed as the following:
1
=> 1 +
=> 1 + 2
=> 1 + 2
=> 1 + 2 *
=> 1 + 2 * 3
=> 1 + 6
=> 7
In other words:
A reduce
will consider the expression e1
as (1 + 2) * 3
while a shift
will consider it as 1 + (2 * 3)
Like before, define the terminal and non-terminal tokens of the grammar:
enum Terminal: String, SwiLexable {
static var separators: Set<Character> = [" "]
case number = "[0-9]+"
case plus = #"\+"#
case minus = #"-"#
case times = #"\*"#
case divide = #"/"#
case lParenthesis = #"\("#
case rParenthesis = #"\)"#
case eof
case none // empty token
}
enum NonTerminal: SwiParsable {
case expr
case start // starting token
}
Define the reduction actions for each rule of the grammar :
func sum(a: Int, op _: Substring, b: Int) -> Int { a + b }
func minus(a: Int, op _: Substring, b: Int) -> Int { a - b }
func times(a: Int, op _: Substring, b: Int) -> Int { a * b }
func divide(a: Int, op _: Substring, b: Int) -> Int { a / b }
func parenthesis(_: Substring, a: Int, _: Substring) -> Int { a }
func toInt(s: Substring) -> Int { Int(s) ?? 0 }
Next, define a parser with the grammar by providing its rules and their reduction actions:
let parser = try SwiParse<Terminal, NonTerminal>(rules: [
.start => [Word(.expr)], // accepting rule
.expr => [Word(.expr), Word(.plus), Word(.expr)] >>>> sum,
.expr => [Word(.expr), Word(.minus), Word(.expr)] >>>> minus,
.expr => [Word(.expr), Word(.times), Word(.expr)] >>>> times,
.expr => [Word(.expr), Word(.divide), Word(.expr)] >>>> divide,
.expr => [Word(.lParenthesis), Word(.expr), Word(.rParenthesis)] >>>> parenthesis,
.expr => [Word(.number)] >>>> toInt,
])
In order to specify the associativity
and the precedence
of the grammar's operators, the parser's definition takes as argument an optional list: priorities
.
let parser = try SwiParse<Terminal, NonTerminal>(rules: [
// grammar rules
], priorities [
// priorities (associativity, precedence)
])
- A priority is either
.left
or.right
to define theassociativity
. - The index of the priority in the priority list defines the
precedence
(from low to high).
In the following example: [.left(.tok1), .right(.tok2)]
-
.tok1
isleft associative
and.tok2
isright associative
. -
.tok2
has higher precedence than.tok1
.
Let's specify the priority list for the example grammar to resolve the shift / reduce
conflicts.
let parser = try SwiParse<Terminal, NonTerminal>(rules: [
// grammar rules
], priorities [
.left(.plus),
.left(.minus),
.left(.times),
.left(.divide),
])
Finally parse an input string using the defined parser:
let result = try parser.parse(input: "1 + 2 * 3")
// 7
A mistake? A bug? Please report it by opening an issue or via twitter.