Lexer Rules

Randall O'Reilly edited this page Dec 31, 2018 · 1 revision

Lexing is all about ordering: follow the principle of preemptive specificity described in the README, so that more specific rules are tried before more general ones.

Another common situation is resolving ambiguity between multi-character constructs, such as := versus a plain : or =. In this case, an outer rule matches :, and its child rules do a one-step lookahead for = at Off=1. If that matches, the token is :=, so that rule must come first; the default is then just the plain :.

Here's how that looks in the lexer rules for Go:

Colon:            None                if String == ":" {
    Define:       OpAsgnDefine        if +1:String == "="   do: Next;
    Colon:        PunctSepColon       if String == ":"      do: Next;
}
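To make the ordering concrete, here is a minimal Go sketch of what the Colon rule expresses. It is not the lexer's actual implementation: the `Token` type, the `lexColon` function, and the returned "characters consumed" count are all assumptions made for illustration. Only the token names come from the rules above.

```go
package main

import "fmt"

// Token is a hypothetical stand-in for the lexer's token type.
type Token string

const (
	None          Token = "None"
	OpAsgnDefine  Token = "OpAsgnDefine"
	PunctSepColon Token = "PunctSepColon"
)

// lexColon mimics the Colon rule at byte position pos in src.
// It returns the matched token and how many bytes that token spans
// (the span is an assumption of this sketch, not taken from the rules).
func lexColon(src string, pos int) (Token, int) {
	// Outer rule: the current character must be ':'.
	if pos < 0 || pos >= len(src) || src[pos] != ':' {
		return None, 0
	}
	// First child rule (most specific): look ahead one character for '='.
	if pos+1 < len(src) && src[pos+1] == '=' {
		return OpAsgnDefine, 2
	}
	// Last child rule (default): a plain ':'.
	return PunctSepColon, 1
}

func main() {
	tok, n := lexColon("x := 1", 2)
	fmt.Println(tok, n)
	tok, n = lexColon("case a:", 6)
	fmt.Println(tok, n)
}
```

If the two child checks were swapped, the plain `:` case would always win and `:=` could never be recognized, which is why the more specific rule has to come first.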

And here is similar logic for +:

Plus:              None                if String == "+" {
    AsgnAdd:       OpMathAsgnAdd       if +1:String == "="   do: Next;
    AsgnInc:       OpAsgnInc           if +1:String == "+"   do: Next;
    Add:           OpMathAdd           if String == "+"      do: Next;
}

Note that you can refer directly to the next character using the Offset of the rule (shown as the +1 in the examples above). There is no limit to how far ahead you can look, because the whole source is already in RAM.
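The offset idea can be sketched as a small table-driven matcher in Go. This is an illustration under stated assumptions, not the lexer's real data structures: the `Rule` struct, `matchAt`, `firstMatch`, and the `plusRules` table are hypothetical names, and the rule with offset 2 uses a made-up token name purely to show a deeper lookahead.

```go
package main

import (
	"fmt"
	"strings"
)

// Rule is a hypothetical, simplified lexer rule: it matches when Str
// appears in the source at the current position plus Offset.
type Rule struct {
	Token  string
	Offset int
	Str    string
}

// matchAt reports whether src contains r.Str starting at pos+r.Offset.
// Offsets that fall outside the source simply fail to match.
func (r Rule) matchAt(src string, pos int) bool {
	at := pos + r.Offset
	if at < 0 || at > len(src) {
		return false
	}
	return strings.HasPrefix(src[at:], r.Str)
}

// firstMatch tries the rules in order and returns the token of the first
// one that matches, or "None" if no rule matches.
func firstMatch(rules []Rule, src string, pos int) string {
	for _, r := range rules {
		if r.matchAt(src, pos) {
			return r.Token
		}
	}
	return "None"
}

// plusRules mirrors the child rules of the Plus rule, most specific first.
var plusRules = []Rule{
	{Token: "OpMathAsgnAdd", Offset: 1, Str: "="},
	{Token: "OpAsgnInc", Offset: 1, Str: "+"},
	{Token: "OpMathAdd", Offset: 0, Str: "+"},
}

func main() {
	fmt.Println(firstMatch(plusRules, "a += 1", 2))
	fmt.Println(firstMatch(plusRules, "i++", 1))
	fmt.Println(firstMatch(plusRules, "a + b", 2))

	// A hypothetical rule that looks two characters ahead.
	deep := []Rule{{Token: "HypotheticalEllipsis", Offset: 2, Str: "."}}
	fmt.Println(firstMatch(deep, "...", 0))
}
```

Because the whole source string is in memory, a larger `Offset` costs nothing more than a bounds check and a comparison.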
