This repository contains two different languages which are compiled to sed. Both of them are turing complete and support arbitrary-precision non-negative integer arithmetics.
The state of each program (no matter the input language) must be initialised
completely. It is given by binary numbers separated by hashes. Thus, an
input of 0#10000000000#0#0#0
initialises all but the second variable to zero;
the second variable is initialised to 1024 (binary 10000000000
).
The output format is the same as the input format.
The heading of the generated program contains the names of the variables in the order they are expected in the input.
The asmsed.py
program converts a custom assembly-style language to a sed
program.
$ python3 asmsed.py -o logn.sed logn.ass
$ chmod +x logn.sed
$ echo '1000000000#1010' | ./logn.sed
10
This example program calculates the integer logartihm to an arbitrary base. In
this case, it is invoked to calculate the logarithm 512 (1000000000
) to the
base 10 (1010
). The output is rounded down, so it prints 2 (10
).
logn.ass
Calculate the floored integer logartihm.div.ass
Divide two integer numbers, returning the floored result.
The machine corresponding to the instruction set is a stack machine with an arbitrarily high stack of non-negative integer numbers.
On the sed level, the numbers are encoded as ASCII-binary (that is, sequences
of ASCII 0
and 1
) and the stack slots are separated by hashes (#
).
The input is the initial stack of the program as ASCII binary numbers (see
above) separated by hashes. The usage example above initialises the stack with
two elements, 1000000000
(512) and 1010
(10) on the stack, with 10 being
the topmost element.
The output format is the same as the input format. The entire stack at program exit is printed.
Anything following a semicolon (;
) in a line is treated as a comment.
Non-newline is used to separate different tokens in a single line. A newline ends a mnemonic.
load STACKPOS
Loads the value from the given STACKPOS (counting from the bottom starting at 0) and pushes it onto the stack. The element at STACKPOS is left in place.
store STACKPOS
Stores the topmost stack value into the given stack position (counting from the bottom starting at 0) and removes the topmost value from the stack.
pushc NUMBER
Pushes the given constant NUMBER onto the stack.
dup
Loads the topmost stack element and pushes it onto the stack, effectively duplicating it.
pop
Discard the topmost element from the stack.
inc
Increment the topmost stack element in-place.
dec
Decrement the topmost stack element in-place.
add
Add the two topmost stack elements, remove them from the stack and push the result of the addition onto the stack.
sub
Subtract the topmost element from the second-topmost element of the stack, remove both from the stack and push the result.
If the subtraction underflows, it saturates to zero.
.LABELNAME
Not an instructino per-se, but a label to which a jump can be executed.
The labelname must be a valid sed label name and must not start with an underscore.
jmp LABELNAME
Jump to the given label unconditionally.
jz LABELNAME
If the topmost stack value is zero, pop it from the stack and jump to the given label.
The whilesed.py
program converts a WHILE program (see below) to a sed program.
$ python3 whilesed.py -o outfile.sed log10.while
$ chmod +x outfile.sed
$ echo '0#10000000000#0#0#0' | ./outfile.sed
11#0#0#0#1010#0
The state of the program must be initialised completely. It is given by binary
numbers separated by hashes. The above example initialises all but the second
variable to zero; the second variable is initialised to 1024 (binary
10000000000
).
The output format is the same as the input format. In the above example, the
first output is 3 (binary 11
), which is the correct (rounded down) for
log10(1024).
The heading of the generated program contains the names of the variables in the order they are expected in the input.
The WHILE language has been described in academia, but I don’t know the original source to quote. If you know it, let me know. It is an extension of the LOOP language.
Identifiers in this WHILE implementation match the following regular expression (since you are looking at this, I assume that you know at least the very basics of regular expressions): x_?([0-9]+)
. The prefix (x_?
) is ignored. The capture group is converted to an integer. This is the actual identifier of the variable. So x_2
, x2
, x00002
and x_002
refer all to the same variable.
You do not need to declare variables.
Undeclared variables are initialised to zero.
All variables are non-negative integers. There is no upper bound on the value.
The following statements exist in the language:
Assigns a constant integer to a variable. Example (assigns 2 to x3
):
x3 := 2
Copy the value of one variable to another:
x2 := x3
Add/subtract a value from a variable and assign the result to a (possibly different) variable:
x1 := x2 - 3
Note: since the values can never be negative, negative results are clamped to 0.
Execute a block of statements a semi-fixed amount of times. The value of the loop variable is evaluated when the loop is first entered. The loop executes as many times as the value of the variable at enter time; modifying the variable during loop execution has no effect on the number of iterations.
loop x1
... more statements ...
end
Execute a block of statements until a variable becomes zero.
while x1
... more statements ...
end
Execute a block of statements once if a variable is non-zero.
if x1
... more statements ...
end
Control structures can be arbitrarily nested.