Skip to content

Examples

jrte edited this page Nov 9, 2023 · 1 revision

More Examples

Here are some more examples. These are abridged from the ginr source for the patterns in the ribose test suite (patterns/test/*.inr).

Hello World

Ribose specializes ginr to accept only regular patterns comprised of ternary terms (input, effector[parameters] ...), using the tape shifting operator [] to associate parameters with effectors (so (X,Y[Z]) is equivalent to (X,Y,Z)). Such terms are combined and composed using regular operators to specify ribose transducer patterns. Below, the (X,Y,Z)$(0,1 2) → (X,Y Z) operation uses the projection operator $ to flatten effectors and parameters onto a single tape in canonical order. The :prsseq operator is then applied to verify that the pattern is a single-valued (subsequential) function X → Y Z sequentially mapping input to target effectors and parameters and to provide a readable listing of the compiled FST transitions.

Hello = (nil, out[`Hello World`] stop);

(Hello$(0,1 2)):prsseq;
(START)  nil  [ out Hello World stop ]  (FINAL)

etc/sh/ribose --nil build/Test.model Hello -
Hello World

Fibonacci

This next example computes unary Fibonacci numbers from unary inputs, using transductor compositing effectors to manipulate a collection of user-defined fields (~X) that accumulate data mapped from the input. This is interesting because, formally, FSTs effect linear transforms and can only produce regular outputs while the Fibonacci sequence is not regular (it is strictly context sensitive). This is possible because the Fibonacci FST generates a regular sequence of effectors that retain intermediate results in field registers in the host machine's RAM. Fields and literal byte sequences can also be injected into the transduction input stream, using the in[@field] effector, for immediate transduction.

Fibonacci = (
  (
    ('0', select[`~q`] paste['1'])
    ('0', select[`~r`] cut[`~p`] select[`~p`] copy[`~q`] select[`~q`] cut[`~r`])*
  )?
  (nl, paste out stop)
);

(Fibonacci$(0,1 2)):prsseq;
(START)  0  [ select ~q paste 1 ]                                   1
(START)  nl [ paste out stop ]                                      (FINAL)
1        0  [ select ~r cut ~p select ~p copy ~q select ~q cut ~r ] 1
1        nl [ paste out stop ]                                      (FINAL)

$ for i in '' 0 00 000 0000 00000 000000 0000000 00000000 000000000 0000000000; do \
> echo $i | etc/sh/ribose build/Test.model Fibonacci -; \
> done

1
1
11
111
11111
11111111
1111111111111
111111111111111111111
1111111111111111111111111111111111
1111111111111111111111111111111111111111111111111111111

Tuple Extraction

This example was previewed above. It is taken from the ribose model compiler, which transduces serialized ginr automata to construct FSTs for inclusion in a ribose runtime model. Ginr serializes compiled pattern automata to tab-delimited text files beginning with a header line listing ginr version number and the number of tapes, transitions, states, and symbols. Below, the Header pattern determines how ribose deserializes the header line, transducing it into an immutable value object in the target model. This is presented in more detail in the ribose wiki (see Ginr as a Service Provider). Here it is refined a bit to show how ginr's composition operator (@) can be used to replace formal parameters in a template pattern (Field).

# INR210	3	565	127	282

Field = select[X] clear;
Number = (digit, paste)+;
Header = (
  ('INR', Field @ (X,`~version`)*) Number
  (tab, Field @ (X,`~tapes`)*) Number
  (tab, Field @ (X,`~transitions`)*) Number
  (tab, Field @ (X,`~states`)*) Number
  (tab, Field @ (X,`~symbols`)*) Number
  (nl, header)
);

The model compiler class, ModelCompiler, implements the ITarget interface and expresses a HeaderEffector<ModelCompiler> effector class. The compiled Header transducer maps the header token in the pattern to this effector's invoke() method and calls it when the final nl token is read. The effector uses its IOutput view to decode fields extracted as raw bytes from input to integer-valued fields within a Header object bound to the target instance.

class Header {
  final int version;
  final int tapes;
  final int transitions;
  final int states;
  final int symbols;
  Header(int version, int tapes, int transitions, int states, int symbols)
  throws EffectorException {
  // Apply value range and sanity checks (omitted here)
    this.version = version;
    this.tapes = tapes;
    this.transitions = transitions;
    this.states = states;
    this.symbols = symbols;
  }
}

class HeaderEffector extends BaseEffector<ModelCompiler> {
  IField fields[];

  HeaderEffector(ModelCompiler automaton) {
    super(automaton, Bytes.encode("header"));
  }

  @Override // Called once, when the containing model is loaded into the ribose runtime
  public void setOutput(IOutput output)
  throws TargetBindingException {
    super.setOutput(output);
    fields = new IField[] {
      super.output.getField(Bytes.encode("version")),
      super.output.getField(Bytes.encode("tapes")),
      super.output.getField(Bytes.encode("transitions")),
      super.output.getField(Bytes.encode("states")),
      super.output.getField(Bytes.encode("symbols"))
    };
  }

  @Override // Called once for each input automaton, when the header line has been consumed
  public int invoke()
  throws EffectorException {
    Header h = new Header(
      (int)fields[0].asInteger(),
      (int)fields[1].asInteger(),
      (int)fields[2].asInteger(),
      (int)fields[3].asInteger(),
      (int)fields[4].asInteger()
    );
    super.target.header = h;
    super.target.transitions = new Transition[h.transitions];
    super.target.stateTransitionMap = new HashMap<Integer, ArrayList<Transition>>((h.states * 5) >> 2);
    return IEffector.RTE_EFFECT_NONE;
  }
}

This demonstrates the clear separation of syntactic (input) and semantic (target) concerns obtained with pattern oriented design. All of the fine-grained character-level code that would otherwise be involved to navigate the input stream is relegated to a transduction pattern expressed in a symbolic, algebraic framework supported by a robust compiler for multidimensional regular patterns. Target and effector implementations are thereby greatly reduced. In this example, as with most ribose solutions, the effector implementations and supporting value classes involve no external libraries outside the ribose runtime and are completely decoupled from the syntactic structure of the input. That is not to say that the target and effector implementations, composed in a native programming language, cannot interact with more complex components in the application or service domain. But these interactions are orthogonal and opaque to transduction patterns since they are either encapsulated by effectors or occur downstream from transduction processes.

Navigating Noisy Inputs (Nullification)

It is often necessary to extract features that are embedded in otherwise irrelevant or noisy inputs. The ribose runtime supports this by injecting a nul signal into the input stream whenever no transition is defined for a received input. Any ribose pattern can be extended to handle a nul signal at any point by simply skipping over input with nil effect until a synchronization pattern marking the beginning of the innermost enclosing loop is recognized and then continuing as before. This extends the range of recognizable (but not necessarily acceptable) inputs to cover byte* while accepting only selected embedded features. In the example below, the interval pattern extracts interval-ms attribute values from specific cycle-start stanzas (with type="scavenge") embedded in an XML document.

#<af-start id="3" threadId="0000000000329C80" totalBytesRequested="16" timestamp="2021-04-05T23:12:58.597" intervalms="5243.429" type="nursery" />
#<cycle-start id="4" type="scavenge" contextid="0" timestamp="2021-04-05T23:12:58.597" intervalms="5243.468" />
#<gc-start id="5" type="scavenge" contextid="4" timestamp="2021-04-05T23:12:58.597">

interval = (
  '<cycle-start id="' digit+ '" type="scavenge" contextid="' digit+ '" timestamp="' (digit+ ('.:-T':alph))+ digit+ '" '
  ('intervalms="', select[`~interval`] clear)
  (digit, paste)+ ('.', paste) (digit, paste)+
  ('"', out[`~interval` nl])
  space* '/>' (utf8 - '<')*
);

# tape alphabets
a0 = (interval$0):alph;
a1 = (interval$1):alph;
a2 = (interval$2):alph;

# map nl to NL in a2 to make tape symbols disjoint with a0
nlNL = (((a2 - nl)$(0,0))* (nl, NL)*)*;
# map NL to nl on parameter tape
NLnl = (((a2 - nl)$(0,,,0))* (NL, [[nl]])*)*;

# nullify to construct a synchronizing pattern for everything that is not an interval
null = (
  (
    # map all input symbols x->x|nul in (interval$0) and flatten result to one tape
    ((AnyOrNul* @ (interval @ nlNL))$(0 1 2))
    # copy everything but nul, projecting back onto 3 tapes
  @ ((a0$(0,0))* (a1$(0,,0))* NLnl)*
    # copy first nul back onto input tape, discard remainder of nullified pattern
    (nul$(0,0)) (nul* a0* a1* a2*)*
  )
  # after nul absorb input up to next '<'
  (utf8 - '<')*
);

# transduce recognized intervals, ignore null input
Tintervals = (interval* null*)*;

Nullification can be applied similarly to subpatterns to effect fine-grained context-sensitive error handling. The general technique of flattening a transducer pattern and transforming it with an editor pattern while projecting the result back onto the original tapes can be applied in many other ways. Like CRISPR in genetics, it can be used to inject new behaviors or alter existing behaviors in transducer patterns. The null expression above preserves the interleaving of input and effectors in the nullified interval pattern up to the first nul signal and replaces the remainder with a pattern that synchronizes at the opening of the next XML stanza. If no effectors are involved the same result can be expressed more succinctly as shown in the next example.

Resolving Ambiguous Inputs (Classification)

It is sometimes necessary to look ahead in the input, without effect, to syntactically validate a prospective feature or resolve an ambiguous pattern before selecting a course of action. The snippet below, from the LinuxKernelStrict transducer in the ribose test suite, demonstrates this using the mark and reset effectors. The header and capture subpatterns, referenced but not shown here, effect field extraction from iptables messages in Linux kernel logs.

#May 15 07:58:52 kb-ubuntu kernel: [ 1794.599801] DROPPED IN=eth0 OUT= MAC=01:00:5e:00:00:fb:00:13:20:c0:36:32:08:00 SRC=192.168.144.101 DST=224.0.0.251 LEN=32 TOS=0x00 PREC=0x00 TTL=1 ID=8596 OPT (94040000) PROTO=2
#May 16 07:59:13 kb-ubuntu kernel: [  285.950056] __ratelimit: 225 callbacks suppressed

# distribute paste effector over any input sequence of bytes
PasteAny = (byte, paste)*;

# reorder and write fields extracted from the header and capture subpatterns to stdout
store = out[
  `~timestamp` tab `~hostname` tab `~macaddress` tab `~tag` tab `~in` tab `~out` tab
  `~protocol` tab `~srcip` tab `~dstip` tab `~srcport` tab `~dstport` nl
];

LinuxKernelDropped = (
  header (space, select[`~tag`]) ('DROPPED' @@ PasteAny) capture (nl, store signal[`!nil`] stop)
):dfamin;

LinuxKernelLimited = (
  header (space, select[`~tag`]) ('LIMITED' @@ PasteAny) capture (nl, store signal[`!nil`] stop)
):dfamin;

LinuxKernelAborted = (
  header (space, select[`~tag`]) ('ABORTED' @@ PasteAny) capture (nl, store signal[`!nil`] stop)
):dfamin;

dropped = LinuxKernelDropped$0;
limited = LinuxKernelLimited$0;
aborted = LinuxKernelAborted$0;

line = (dropped | limited | aborted) / nl;
null = ((line:pref) - line) nul (byte - nl)*;
next = reset clear[`~*`] select[`~timestamp`];

LinuxKernelStrict = (
  (
    (nil, mark)
    (
      (dropped, next start[`@LinuxKernelDropped`])
    | (limited, next start[`@LinuxKernelLimited`])
    | (aborted, next start[`@LinuxKernelAborted`])
    | (null nl, signal[`!nil`])
    )
  )*
):dfamin;

This also demonstrates nesting of ribose transducers. The top-level LinuxKernelStrict transducer marks an input anchor and looks ahead to verify syntax before selecting one of three transducers. It then resets to the input anchor and pushes the selected transducer onto the transductor stack to reduce the marked input and inject a nil signal into the input before popping back into the top-level transducer. This is a contrived example; the LinuxKernel transducer in the ribose test suite effects the same transduction without lookahead.

Clone this wiki locally