Alex Rakov edited this page Feb 22, 2019 · 15 revisions

ELENA uses an LL(1) grammar. It is an analytical grammar: the role of a terminal token is determined by its position in the statement. As a result, the grammar has no keywords (user-defined attributes are used instead) and no reserved words. For example, it is possible to write code without any attributes at all:

class
{
   field;

   method(param)
   {
   }
} 

Here a class named class is declared. It has one field named field and one method, method[1].

In most cases, however, additional information has to be provided. The role of each token is still determined by its position:

class class;

singleton Tester
{
    do(var class class)
    {        
    }
}

public program()
{
    var class class := new class();
        
    Tester.do(class);
}

Here class is used as an attribute that declares a new class, as the type of an identifier, and as an identifier.

Lexical grammar

input ::=
   { input-element* new-line }*

input-element ::=
    whitespace | comment | token

Line terminators

new-line ::=
   Carriage return character (U+000D) followed by line feed character (U+000A) 
   | Line feed character (U+000A)

White space

whitespace ::=
   whitespace-character+  

whitespace-character ::=
   Horizontal tab character (U+0009)
   | Space (U+0020)
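
The line terminator and white space rules above can be tried out with two regular expressions. This is a minimal Python sketch for illustration only, not code from the ELENA compiler; the names NEW_LINE, WHITESPACE, split_lines and split_on_whitespace are hypothetical.

```python
import re

# new-line ::= CR LF | LF  (a lone CR is not a line terminator in this rule)
NEW_LINE = re.compile(r"\r\n|\n")

# whitespace ::= whitespace-character+  (horizontal tab or space only)
WHITESPACE = re.compile(r"[\t ]+")


def split_lines(source: str) -> list[str]:
    """Split source text at every new-line as defined above."""
    return NEW_LINE.split(source)


def split_on_whitespace(line: str) -> list[str]:
    """Split one line at runs of whitespace, dropping empty pieces."""
    return [piece for piece in WHITESPACE.split(line) if piece]
```

Note that the sketch only separates pieces at white space; it does not recognise comments or tokens.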

Comments

comment ::=
   single-line-comment
   | delimited-comment

single-line-comment ::=
   "//" input-character*

input-character ::=
   Any Unicode character except a new-line-character

new-line-character ::=
   Carriage return character (U+000D)
   | Line feed character (U+000A)       

delimited-comment ::=
   "/*" delimited-comment-section* "*"+ "/"

delimited-comment-section ::=
   not-asterisk
   | "*"+ not-slash

not-asterisk ::=
   Any Unicode character except *

not-slash ::=
   Any Unicode character except /
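
The two comment forms above map fairly directly onto regular expressions. The following Python sketch is an illustration under assumptions, not the scanner used by ELENA: the names SINGLE_LINE_COMMENT, DELIMITED_COMMENT and is_comment are hypothetical, and after a run of asterisks the sketch accepts any character that is neither "/" nor "*", which is the usual unambiguous reading of the not-slash rule.

```python
import re

# single-line-comment ::= "//" input-character*
SINGLE_LINE_COMMENT = re.compile(r"//[^\r\n]*")

# delimited-comment ::= "/*" delimited-comment-section* "*"+ "/"
# Assumption: after "*"+ the next character is neither "/" nor "*",
# so the comment ends at the first "*/".
DELIMITED_COMMENT = re.compile(r"/\*(?:[^*]|\*+[^*/])*\*+/")


def is_comment(text: str) -> bool:
    """Return True if the whole text is exactly one comment."""
    return bool(
        SINGLE_LINE_COMMENT.fullmatch(text)
        or DELIMITED_COMMENT.fullmatch(text)
    )
```

With this reading, is_comment("/* a * b */") holds, while an unterminated "/* open" is rejected.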

Tokens

token ::=
   identifier
   | reference
   | integer-literal
   | real-literal
   | character-literal
   | string-literal
   | operator-literal
   | operator-or-punctuator

Identifiers and References

identifier ::=
   identifier-start-character identifier-part-character* 

reference ::= 
   identifier { "'" identifier }+

identifier-start-character ::=
   letter-character
   | "_"

identifier-part-character ::=
   letter-character
   | decimal-digit-character
   | "_"      

letter-character ::=
   A Unicode character of classes Lu, Ll, Lt, Lm, Lo, or Nl

decimal-digit-character ::=
   A Unicode character of the class Nd
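
The identifier and reference rules above can be checked character by character using Unicode general categories. The sketch below is a hedged Python illustration, not part of the ELENA toolchain; all function names are hypothetical, and it relies on the standard unicodedata module to look up the classes named in the rules.

```python
import unicodedata

# letter-character: Unicode classes Lu, Ll, Lt, Lm, Lo, or Nl
LETTER_CLASSES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}


def is_identifier_start(ch: str) -> bool:
    """identifier-start-character ::= letter-character | "_" """
    return ch == "_" or unicodedata.category(ch) in LETTER_CLASSES


def is_identifier_part(ch: str) -> bool:
    """identifier-part-character adds decimal digits (class Nd)."""
    return is_identifier_start(ch) or unicodedata.category(ch) == "Nd"


def is_identifier(text: str) -> bool:
    """identifier ::= identifier-start-character identifier-part-character*"""
    return (
        len(text) > 0
        and is_identifier_start(text[0])
        and all(is_identifier_part(ch) for ch in text[1:])
    )


def is_reference(text: str) -> bool:
    """reference ::= identifier { "'" identifier }+  (at least two parts)"""
    parts = text.split("'")
    return len(parts) >= 2 and all(is_identifier(part) for part in parts)
```

For example, is_identifier("_x1") and is_reference("a'b'c") hold, while "1x" is not an identifier and a single identifier is not a reference.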

Literals

integer-literal ::=
   decimal-digit+ integer-type-suffix?

real-literal ::=
   decimal-digit+ "." decimal-digit+ exponent-part? real-type-suffix

character-literal ::=
   "$" decimal-digit+

string-literal ::= 
   '"' string-character* '"'
   | quote-escape-sequence

decimal-digit ::=
   "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"

exponent-part ::=
   "e" sign? decimal-digit+

sign ::=
   "+"
   | "-"

string-character ::=
   Any character except "
   | quote-escape-sequence

quote-escape-sequence ::=
   '""'

integer-type-suffix ::=
   "h" | "l"

real-type-suffix ::=
   "r"

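The literal rules above can be expressed as one regular expression per kind. This Python sketch is illustrative only; LITERAL_PATTERNS and classify_literal are hypothetical names. It follows the rules exactly as written, so a real literal must carry the "r" suffix and an integer literal consists of decimal digits with an optional "h" or "l" suffix.

```python
import re

# Each pattern mirrors one rule; decimal-digit is ASCII 0-9 only.
LITERAL_PATTERNS = [
    # real-literal ::= decimal-digit+ "." decimal-digit+ exponent-part? real-type-suffix
    ("real-literal", re.compile(r"[0-9]+\.[0-9]+(?:e[+-]?[0-9]+)?r")),
    # integer-literal ::= decimal-digit+ integer-type-suffix?
    ("integer-literal", re.compile(r"[0-9]+[hl]?")),
    # character-literal ::= "$" decimal-digit+
    ("character-literal", re.compile(r"\$[0-9]+")),
    # string-literal: a doubled quote inside the string stands for one quote
    ("string-literal", re.compile(r'"(?:[^"]|"")*"')),
]


def classify_literal(text: str):
    """Return the rule name matching the whole text, or None."""
    for name, pattern in LITERAL_PATTERNS:
        if pattern.fullmatch(text):
            return name
    return None
```

Under these rules "3.14r" is classified as a real literal, whereas "3.14" without the suffix matches none of the patterns.
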
Operators

operator-literal ::=
   "$shr"
   | "$shl"
   | "$fnl"

operator-or-punctuator ::=
   "{"
   | "}"
   | "["
   | "]"
   | "("
   | ")"
   | "."
   | ","
   | ":"
   | ";"
   | "+"
   | "-"
   | "*"
   | "/"
   | "|"
   | "\"
   | "^"
   | "!"
   | "="
   | "<"
   | ">"
   | "?"
   | "&&"
   | "||"
   | "=="
   | "!="
   | "<="
   | ">="
   | "+="
   | "-="
   | "*="
   | "/="
   | "=>"
   | ":="
   | "[]"
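
The list above does not say how a scanner should choose between overlapping entries such as ":" and ":=". A common convention is to take the longest operator that matches at the current position. The Python sketch below assumes that convention; it is not taken from the ELENA sources, and the names OPERATORS and match_operator are hypothetical.

```python
# The operator-or-punctuator alternatives, copied from the rule above.
OPERATORS = [
    "{", "}", "[", "]", "(", ")", ".", ",", ":", ";",
    "+", "-", "*", "/", "|", "\\", "^", "!", "=", "<", ">", "?",
    "&&", "||", "==", "!=", "<=", ">=",
    "+=", "-=", "*=", "/=", "=>", ":=", "[]",
]

# Assumption: longest match wins, so two-character operators are tried first.
_BY_LENGTH = sorted(OPERATORS, key=len, reverse=True)


def match_operator(text: str, pos: int = 0):
    """Return the longest operator starting at pos, or None if there is none."""
    for op in _BY_LENGTH:
        if text.startswith(op, pos):
            return op
    return None
```

With this choice, match_operator(":=") yields ":=" rather than ":", and a single "&" yields None because only "&&" is listed.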