Carl Davidson edited this page Feb 2, 2016 · 8 revisions

# String Theory

String Theory is a string processing library for OpenSCAD. It is the underlying library behind Relativity, which uses it to implement its CSS-like selector engine. Every copy of relativity.scad comes bundled with String Theory, so chances are you already have the library. There is also a non-bundled version of String Theory available, which is handy if you don't intend to use the other features offered by relativity.scad.

String Theory aims to be a fully featured string processing library, offering the out-of-the-box functionality you'd expect from a modern general-purpose programming language: basic string manipulation, string search, and a fully featured regular expression engine. It also includes some less common functionality, notably an engine for Parsing Expression Grammars.

The library includes the following functions:

  • before(string, end)
  • after(string, start)
  • between(string, start, end)
  • substring(string, start, length)
  • upper(string)
  • lower(string)
  • is_empty(string)
  • is_null_or_empty(string)
  • is_null_or_whitespace(string)
  • equals(this, that, ignore_case=false)
  • starts_with(string, start, ignore_case=false)
  • ends_with(string, end, ignore_case=false)
  • reverse(string)
  • trim(string)
  • parse_int(string, base)
  • tokenize(string, index)
  • join(strings, delimiter)
  • split(string, seperator=" ", ignore_case = false)
  • index_of(string, goal, ignore_case=false, regex=false)
  • contains(string, substring, ignore_case=false, regex=false)
  • replace(string, replaced, replacement, ignore_case=false, regex=false)
  • grep(string, pattern)
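
As a rough illustration of how a few of these functions might be called, here is a minimal sketch. It assumes the standalone file is saved as strings.scad somewhere on OpenSCAD's library path (with the bundled copy you would include relativity.scad instead). The regular expression patterns at the end are illustrative assumptions about the supported syntax rather than documented examples, so no results are shown for them.

```openscad
// Assumption: strings.scad is on the library path; adjust the path as needed.
include <strings.scad>

padded = "  Hello World  ";

// Basic manipulation
echo(upper("foo"));
echo(reverse("bar"));
echo(trim(padded));

// Comparison and search
echo(starts_with("foobar", "foo"));
echo(equals("foo", "FOO", ignore_case=true));
echo(contains("foo bar", "bar"));

// Splitting and joining (the parameter really is spelled "seperator")
echo(split("foo bar baz", seperator=" "));
echo(join(["foo", "bar", "baz"], ", "));

// Regex-enabled variants; the patterns below are assumed to be supported
echo(contains("foo bar", "ba[rz]", regex=true));
echo(replace("foo bar", "o+", "0", regex=true));
```

Results appear in the OpenSCAD console as ECHO lines once the file is previewed.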

## Parsing Expression Grammars

Support for parsing expression grammars is exposed through the peg() function:

peg(string, grammar)

string is the plain text string you wish to parse.

grammar is an object representing the grammar you wish to parse with. The engine is currently too slow to describe complex languages using Backus-Naur form, so grammar must instead use an intermediate format based on nested lists. Below you will find an example of this format. This example grammar is able to parse several kinds of numbers, including positive numbers (e.g. "16807"), negative numbers ("-16807"), floating point numbers ("1.6807"), and numbers written in scientific notation ("1.68e4").

["grammar",
	["rule", "real",
		["sequence",
			["choice",
				["ref", "integer"],
				["sequence",
					["zero_to_one", ["literal", "-"]],
					["zero_to_many", ["character_set_shorthand", "d"]],
					["zero_to_one", 
						["sequence",
							["literal", "."],
							["one_to_many", ["character_set_shorthand", "d"]]
						]
					]
				]
			],
			["zero_to_one", 
				["sequence",
					["choice",
						["literal", "e"],
						["literal", "E"]
					],
					["ref", "integer"]
				]
			]
		]
	],
	["rule", "integer",
		["sequence", 
			["zero_to_one", ["literal", "-"]],
			["one_to_many", ["character_set_shorthand", "d"]]
		]
	]
]

The above example could be written in Backus-Naur form as follows:

real    = (integer / "-"? \d* ("." \d+)? ) (("e"/"E") integer )?
integer = "-"? \d+

Every nested list within grammar represents an operation. The first element of each list is the name of the operation. The remaining elements are its parameters, which can be either strings or further nested operations. As shown above, the grammar object must always start with an operation of type "grammar". A "grammar" operation accepts one or more parameters, each of which must be an operation of type "rule". Given a valid grammar object, peg() attempts to match the string against the first "rule" operation in the list. In the example above, the parser will attempt to process string using the rule titled "real".
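
To make the structure concrete, here is a minimal sketch of a single-rule grammar passed to peg(). The rule name "digits" and the variable name digits_grammar are hypothetical, and the file is assumed to be available as strings.scad. It uses only operations that appear in the example above. The shape of the value peg() returns is not described on this page, so the sketch only echoes it.

```openscad
// Assumption: strings.scad is on the library path; adjust the path as needed.
include <strings.scad>

// Hypothetical grammar with one rule, equivalent to: digits = \d+
digits_grammar =
["grammar",
	["rule", "digits",
		["one_to_many", ["character_set_shorthand", "d"]]
	]
];

// peg() starts from the first rule in the grammar, here "digits".
// The result is echoed as-is because its format is not documented here.
echo(peg("16807", digits_grammar));
```

Because parsing always begins at the first rule, the top-level rule of a larger grammar should be listed first, with any rules it references through "ref" following it.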

The following operations are supported:

  • grammar
  • rule
  • private_rule
  • ref
  • choice
  • sequence
  • positive_lookahead
  • negative_lookahead
  • one_to_many
  • zero_to_many
  • many_to_many
  • zero_to_one
  • literal
  • positive_character_set
  • negative_character_set
  • character_range
  • character_literal
  • character_set_shorthand
  • wildcard
  • start
  • end
  • private
  • empty_string

Another example of this format can be found in the _rx_peg object within strings.scad, which is used to describe the regular expression language within String Theory.
