-
Notifications
You must be signed in to change notification settings - Fork 0
Format Math Grammar
Storage formats for mathematical grammars, automata and others grammar related data in XBUP protocol.
This group is used to define blocks of mathematical grammar. Grammar is a quadruple (N, Σ, P, S) where:
N is a nonempty finite set of non-terminal symbols
Σ is a finite set of terminal symbols disjoint with N
The union called N and Σ
P is a finite set of rules, a subset of V * NV * x *, we write in the form α → β
With the initial non-terminal symbol
From this follows the basic structure of the grammar:
Regular grammar
- The set of terminals
- The set of nonterminals
- The set of rules
The value of S could be an appropriate candidate for an attribute, or it could be fixed to determine the first nonterminal. It is also possible sets of terminals and nonterminals degrade to a single attribute, ie the number of elements. The assigned address in the catalog: XBUPProject / Math / Grammar
Based on constraints on the shape of the rules of grammar are usually divided into following four groups:
- Type 0 - General grammar: There are no restrictions
- Type 1 - Context-sensitive grammar: Each rule must have following schema: (α → β) ⇒ |α| < |β|.
- Type 2 - Context-free grammar: Rules must have form A → α where |α| > 1, possibly with the exception of S → ε, where S is not on the left side of any rule.
- Type 3 - Regular Grammar: Rules are of the form A → aB, possibly with the exception of S → ε, where S is not on the left side of any rule.
Since the rules must have specific shape, we can define the rules of grammar as follows:
Block RegularRule grammar rule:
UBNatural LeftNonterminal - Index on Nonterminal rule
UBNatural RightTerminal - Index on Terminal
UBNatural RightNonterminal - Index on Nonterminal (0 = null index, others might be shifted)
Regular grammar block RegularGrammar:
UBNatural NonterminalCount - Count of nonterminals
UBNatural TerminalCount - Count of terminals (alphabetical characters)
UBBoolean IncludeEmptyString - Indication of whether the grammar includes the empty word
Context-free grammar defines the rule and its items as sub-blocks. Sub-blocks are of two types, terminal and nonterminal, and both are referring to the corresponding sets.
Context-sensitive grammar is an extension of context-free grammar where there is sequence on the input side.
.
Other grammar can include for example, deterministic context-free grammar or grammar of the recursive languages.
Since it is not possible to guarantee the validity of some rules, it is necessary to introduce an algorithm to validate the grammar. Particularly important is the check for the following conditions:
- Indices of terminals and nonterminals are within the valid range
- If the S → ε rule is included, then in the context-free grammars and regular grammars, there must not be included rule with the S on the left side of the rule
- List of the rules is a set (grammar does not contain the same rule twice)
In addition to basic information about grammar, there can be included ancillary data related to particular views of grammar structures, such as text terminals and nonterminals expressions.
The grammars can perform many operations and transformations. You can verify the fit of strings according to grammar, generate the set of all words or transform grammar into an equivalent mathematical expression of another type of computational machine. These operations can then be called also with a remote function calls.
Grammars can be compared, processed into union, intersection, difference, complement, and more. Some of these operations are not feasible for some grammars, or have a large time complexity.
List of resources, and relevant literature references.
Formal Grammars - https://en.wikipedia.org/wiki/Formal_grammar