Skip to content

Calculator example (ambiguous grammar)

yassir RAMDANI edited this page Nov 27, 2020 · 1 revision

Import SwiLex to define terminal tokens needed by the SwiLex lexer. :

import SwiLex

Import SwiParse to define a parser and its grammar:

import SwiParse

In the last example, the grammar used to define the calculator had no ambiguities. In this second example the goal is to showcase how to use the rules' precedence and operator associativity to resolve Shift/Reduce conflicts. Let's consider the following (ambiguous) grammar:

<expr> →   <expr> + <expr>      // rule 0
         | <expr> - <expr>      // rule 1
         | <expr> * <expr>      // rule 2
         | <expr> / <expr>      // rule 3
         | ( <expr> )           // rule 4
         | number               // rule 5

Associativity

If we consider the rule 0, when the token + is read as lookahead token, and the content of the stack is <expr> + <expr> the parser has two options :

  1. Reduce: the content of the stack is reduced following the rule 0 into <expr> before shifting the new tokens.
  2. Shift: the parser shifts the next tokens before reducing them following the rule 0 (last in first out).

This situation, where either a shift or a reduction would be valid, is called a shift / reduce conflict.

In the last example, choosing to reduce is equivalent to give priority to a left associativity for the + token while a shift is giving priority to a right associativity.

if we consider the following input string 1 + 2 + 3 (e1):

  • With a reduce (left associativity) the result will be computed as the following:

1 => 1 + => 1 + 2 => 3 => 3 + => 3 + 3 => 6

  • With a shift (right associativity) the result will be computed as the following:

1 => 1 + => 1 + 2 => 1 + 2 => 1 + 2 + => 1 + 2 + 3 => 1 + 5 => 6

In other words:

Areduce will consider the expression e1 as (1 + 2) + 3 while a shift will consider it as 1 + (2 + 3)

Precedence

In this example the grammar has another kind of ambiguity. No precedence is defined between the different rules.

If we consider the rules rule 0 and rule 2, for example, when the token * is read as lookahead token, and the content of the stack is <expr> + <expr> the parser has two options :

  1. Reduce: the content of the stack is reduced following the rule 0 before shifting the new tokens and reducing following the rule 2.
  2. Shift: the parser shifts the next tokens, reduces following the rule 2 before reducing the result following the rule 0.

Again it's a shift / reduce conflict.

In the last example, choosing to reduce is equivalent to favoring the + token over the * one, while a shift favors a * over + (operators' precedence).

if we consider the following input string 1 + 2 * 3 (e1):

  • With a reduce (+ < *) the result will be computed as the following:

1 => 1 + => 1 + 2 => 3 => 3 * => 3 * 3 => 9

  • With a shift (+ > *) the result will be computed as the following:

1 => 1 + => 1 + 2 => 1 + 2 => 1 + 2 * => 1 + 2 * 3 => 1 + 6 => 7

In other words:

A reduce will consider the expression e1 as (1 + 2) * 3 while a shift will consider it as 1 + (2 * 3)

Resolving shift / reduce conflicts with SwiParse

Like before, define the terminal and non-terminal tokens of the grammar:

enum Terminal: String, SwiLexable {
    static var separators: Set<Character> = [" "]

    case number = "[0-9]+"
    case plus = #"\+"#
    case minus = #"-"#
    case times = #"\*"#
    case divide = #"/"#

    case lParenthesis = #"\("#
    case rParenthesis = #"\)"#

    case eof
    case none   // empty token
}

enum NonTerminal: SwiParsable {
    case expr

    case start  // starting token
}

Define the reduction actions for each rule of the grammar :

func sum(a: Int, op _: Substring, b: Int) -> Int { a + b }
func minus(a: Int, op _: Substring, b: Int) -> Int { a - b }

func times(a: Int, op _: Substring, b: Int) -> Int { a * b }
func divide(a: Int, op _: Substring, b: Int) -> Int { a / b }

func parenthesis(_: Substring, a: Int, _: Substring) -> Int { a }
func toInt(s: Substring) -> Int { Int(s) ?? 0 }

Next, define a parser with the grammar by providing its rules and their reduction actions:

let parser = try SwiParse<Terminal, NonTerminal>(rules: [
    .start => [Word(.expr)], // accepting rule

    .expr => [Word(.expr), Word(.plus), Word(.expr)] >>>> sum,
    .expr => [Word(.expr), Word(.minus), Word(.expr)] >>>> minus,
    .expr => [Word(.expr), Word(.times), Word(.expr)] >>>> times,
    .expr => [Word(.expr), Word(.divide), Word(.expr)] >>>> divide,
    .expr => [Word(.lParenthesis), Word(.expr), Word(.rParenthesis)] >>>> parenthesis,
    .expr => [Word(.number)] >>>> toInt,
])

In order to specify the associativity and the precedence of the grammar's operators, the parser's definition takes as argument an optional list: priorities.

let parser = try SwiParse<Terminal, NonTerminal>(rules: [
    // grammar rules
], priorities [
    // priorities (associativity, precedence)
])
  • A priority is either .left or .right to define the associativity.
  • The index of the priority in the priority list defines the precedence (from low to high).

In the following example: [.left(.tok1), .right(.tok2)]

  • .tok1 is left associative and .tok2 is right associative.
  • .tok2 has higher precedence than .tok1.

Let's specify the priority list for the example grammar to resolve the shift / reduce conflicts.

let parser = try SwiParse<Terminal, NonTerminal>(rules: [
    // grammar rules
], priorities [
    .left(.plus),
    .left(.minus),
    .left(.times),
    .left(.divide),
])

Finally parse an input string using the defined parser:

let result = try parser.parse(input: "1 + 2 * 3")
// 7