From 8d803d23a6c36fc9cce01f4d558b096d4f02c26d Mon Sep 17 00:00:00 2001
From: Mark Hollomon <mhhollomon@gmail.com>
Date: Tue, 15 Jan 2019 19:26:01 -0500
Subject: [PATCH] Update README for latest changes.

---
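Note on the lexer behaviour that the README's TODO mentions (longest match rule, first match as tie-breaker): below is a minimal sketch of that rule using `std::regex`, to make the intent concrete. It is not the code yalr generates. `TokenSpec`, `Token`, `demo_specs`, `next_token` and the `IDENT` terminal are hypothetical names introduced for illustration; only `MYTERM "my[0-9]"` and `WS "\\s+"` come from the README text.

```cpp
#include <cstddef>
#include <optional>
#include <regex>
#include <string>
#include <vector>

// Hypothetical token description; NOT taken from yalr's generated code.
struct TokenSpec {
    std::string name;    // terminal ID, e.g. "MYTERM" or "WS"
    std::regex pattern;  // ECMAScript syntax, the std::regex default
    bool skip;           // true for `skip` terminals, which are never returned
};

struct Token {
    std::string name;
    std::string text;
};

// Patterns are written with the same escaping as in the grammar file:
// the C++ literal "\\s+" is the regex \s+.
std::vector<TokenSpec> demo_specs() {
    return {
        {"WS", std::regex("\\s+"), true},
        {"MYTERM", std::regex("my[0-9]"), false},
        {"IDENT", std::regex("[a-z][a-z0-9]*"), false},  // assumed extra terminal
    };
}

// Return the next non-skipped token starting at `pos`, or std::nullopt at end
// of input or when no pattern matches. The longest match wins; on a tie the
// spec declared first wins.
std::optional<Token> next_token(const std::vector<TokenSpec>& specs,
                                const std::string& input, std::size_t& pos) {
    while (pos < input.size()) {
        const TokenSpec* best = nullptr;
        std::size_t best_len = 0;
        const auto start = input.cbegin() + static_cast<std::ptrdiff_t>(pos);
        for (const auto& spec : specs) {
            std::smatch m;
            // match_continuous anchors the match at the current position.
            if (std::regex_search(start, input.cend(), m, spec.pattern,
                                  std::regex_constants::match_continuous)) {
                const auto len = static_cast<std::size_t>(m.length(0));
                // Strictly greater keeps the earlier spec on ties.
                if (len > best_len) {
                    best = &spec;
                    best_len = len;
                }
            }
        }
        if (best == nullptr) {
            return std::nullopt;  // nothing matched (or only empty matches)
        }
        std::string text = input.substr(pos, best_len);
        pos += best_len;
        if (!best->skip) {
            return Token{best->name, text};
        }
        // Skipped terminal: consume it and keep scanning.
    }
    return std::nullopt;
}
```

With the input `my1 my12 x`, `my1` is matched equally by `MYTERM` and `IDENT`, so the earlier declaration (`MYTERM`) wins, while `my12` goes to `IDENT` because its match is longer. The whitespace is consumed by `WS` and never returned.
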
 README.md        | 121 +++++++++++++++++++++++++++++++++++++----------
 scripts/build.sh |   2 +-
 2 files changed, 97 insertions(+), 26 deletions(-)
diff --git a/README.md b/README.md
index 218ebba..674ae20 100644
--- a/README.md
+++ b/README.md
@@ -8,21 +8,14 @@ The generated parser will use [recursive ascent](https://en.wikipedia.org/wiki/R
 
 ## Status
 
-Current sub-goal is to have yalr generate a language recognizer - that is, the
-generated code will simply give a yes/no answer to the question "does the input
-string match the grammar?"
-
-- Specification parser - complete for the limited features set.
-- Syntax analyzer - complete for the limited feature set.
-- LR Parser Table generator - complete. - SLR(1) for the moment. Hard codes
-  reduce as priority over shift in shift/reduce conflicts. Fails on
-  reduce/reduce conflicts.
-- Code generator - complete.
-- lexer - not started. Haven't finalized an approach. May use regex for the
-  short term.
-
-After the sub-goal is complete, I will probably stop to tidy up the place - add
-unit tests, make the parser a bit better about error reporting, etc.
+Yalr currently generates a *language recognizer* - that is, the generated code
+will simply give a yes/no answer to the question "Does the input string match
+the grammar?"
+
+Both a lexer and a parser are generated.
+
+The next goal is to "tidy up" - add more unit tests, make the parser a
+bit better about error reporting, etc.
 
 ## Building
 
@@ -66,13 +59,15 @@ yalr -t grammophone my_grammar.yalr
 
 ## Grammar Spec
 
-The follow types of statements may appear in any order.
-
-whitespace is generally not significant. `C` style `/* ... */` comments 
+Whitespace is generally not significant. `C` style `/* ... */` comments 
 as well as C++ `//` comments are supported.
 
 Keywords are reserved and may not be used as the name of a terminal or rule.
 
+The [example
+directory](https://github.com/mhhollomon/yalr/tree/master/examples) contains
+some example grammars including the grammar for the yalr grammar itself.
+
 ### Parser Class Name
 
 The parser is normally put in `class YalrParser`. This can be changed by using the statement:
@@ -84,25 +79,62 @@ This can be overriden by the `--output-file` option on the command line.
 
 This statement may only appear once in the file.
 
+It must be the first statement in the grammar.
+
 ### Terminals
 
-All terminals must be explicitly declared:
+All terminals must be explicitly declared.
+
+There are two types of terminal - "parser" terminals and "lexer" terminals.
+
+#### Parser Terminals
+
+Parser Terminals are the terminals used to write the rules of the
+grammar. These are the terminals that are returned by the lexer.
+
+Parser Terminals are defined by the `term` keyword.
 
 ```
-term MYTERM;
+// term <ID> <"pattern"> ;
+term MYTERM "my[0-9]" ;
 ```
 
-A "pattern" can be associated with the terminal. Nothing is currently done with
-this, but it will help form the lexer at some point. The pattern must be
-surronded by double quotes.
+The `ID` is the name for the terminal that will be used in the grammar and will
+be returned in error messages. It will also be a part of the enumeration
+constant for the token type in the generated code.
+
+The pattern must be a [C++ std::regex
+pattern](https://en.cppreference.com/w/cpp/regex/ecmascript) and must be
+enclosed in double quotes. The pattern (and the quotes) are copied verbatim
+into the generated lexer, so it must use the same formatting (with the same
+escaping) as you would use if writing the code by hand.
+
+*TODO:* Add support for recognizing raw string literals.
+
+#### Lexer Terminals
+
+Lexer terminals are recognized by the lexer but are not returned. They are
+a means to skip over input that you do not want the grammar to consider.
+
+Lexer terminals may not appear in rules.
+
+Lexer terminals are defined by the `skip` keyword.
 
 ```
-term MYTERM "fo+" ;
+// skip <ID> <"pattern"> ;
+skip WS "\\s+" ;
+
+// recognize line oriented comments
+skip LINEC "//.*\\n" ;
+
+// This is an ERROR!
+rule Foo { => WS ; }
 ```
 
+
 ### Non-terminals
 Rules are declared with the `rule` keyword.
-Each alternate is intrduced with `=>` and terminated with a semicolon.
+Each alternate is introduced with `=>` and terminated with a semicolon.
 
 One rule must be marked as the starting or "goal" rule, by preceeding it with the `goal` keyword.
 
@@ -120,6 +152,37 @@ goal rule Program {
   => Program Statement ;
 }
 ```
+
+## Generated Code
+
+Pre-pre-alpha. Subject to change.
+
+*TODO:* Add info about the generated code: the longest match rule, with first
+match as tie-breaker.
+
+### Sample Driver
+
+Here is all you need. Season to taste.
+
+```cpp
+#include "./YalrParser.hpp"
+
+int main() {
+    std::string input = "My Input";
+
+    YalrParser::Lexer l(input.cbegin(), input.cend());
+    auto parser = YalrParser::YalrParser(l);
+
+    if (parser.doparse()) {
+        std::cout << "It Worked!\n";
+    } else {
+        std::cout << "too bad!\n";
+    }
+
+    return 0;
+}
+```
+
 ## References
 - [Elkhound](http://scottmcpeak.com/elkhound/sources/elkhound/index.html)
 - [Lemon](http://www.hwaci.com/sw/lemon/)
@@ -134,3 +197,11 @@ goal rule Program {
     Parsers](https://link.springer.com/content/pdf/10.1007/3-540-53669-8_70.pdf)
   - [Recursive ascent-descent
     parsing](https://webhome.cs.uvic.ca/~nigelh/Publications/rad.pdf)
+
+## Technologies
+- [Meson](https://mesonbuild.com/) for build configuration.
+- [Ninja](https://ninja-build.org/) for building.
+- [Catch2](https://github.com/catchorg/Catch2) for unit testing.
+- [Boost::Spirit::X3](https://www.boost.org/doc/libs/develop/libs/spirit/doc/x3/html/index.html)
+  is currently used to build the grammar spec parser.
+- [cxxopts](https://github.com/jarro2783/cxxopts) for command line handling.
diff --git a/scripts/build.sh b/scripts/build.sh
index 3070b91..dcf73d6 100755
--- a/scripts/build.sh
+++ b/scripts/build.sh
@@ -1,4 +1,4 @@
 #!/usr/bin/bash
 
 mkdir build
-CC=clang CXX=clag++ meson build .
+CC=clang CXX=clang++ meson build .