Writing Built-In Macros for Jamal

In this document we will describe how you can write built-in macros for your specific application.

Built-in macros in Jamal are macros implemented in Java code. This document assumes that you are familiar with the Jamal macro structure, and that you know

  • the difference between built-in and user defined macros in the Jamal file,

  • how to define user defined macros

  • what it means when a built-in macro uses the @ character or the # in front of the macro name

  • what is verbatim

  • the macro evaluation order and how it can be modified using the ident and eval macros, and also the ! and the backtick characters in front of the macros

  • what are the different macro scopes

  • how to export user defined macros to a higher scope

  • how user defined macro arguments are separated and parsed

  • generally, how to use Jamal.

Creating a "Hello World" Macro

A "Hello, World!" example built-in macro is extremely simple. You need the following dependency in your project:

<dependency>
    <groupId>com.javax0.jamal</groupId>
    <artifactId>jamal-api</artifactId>
    <version>2.8.1</version>
</dependency>

Later, when our macros do a bit more, we will also need the following dependency:

<dependency>
    <groupId>com.javax0.jamal</groupId>
    <artifactId>jamal-tools</artifactId>
    <version>2.8.1</version>
</dependency>

This library has a transitive dependency on the jamal-api library, so there is no need to specify both. If you plan to use the Java Platform Module System (JPMS), you also have to add the

    requires jamal.api;
    requires jamal.tools;

directives to your module-info.java file. Having the project set up that way we can create our first built-in macro. The whole class is included in this document:

HelloWorld from HelloWorld.java
package javax0.jamal.test.examples;

import javax0.jamal.api.Input;
import javax0.jamal.api.Macro;
import javax0.jamal.api.Processor;

public class HelloWorld implements Macro {
    @Override
    public String evaluate(Input in, Processor processor) {
        return "Hello, World!";
    }
}

The class has to implement the interface javax0.jamal.api.Macro. This interface, as well as the other interfaces and classes imported are defined in the package javax0.jamal.api. This interface requires us to create one method evaluate(). This method gets the input text of the macro, a reference to the processor executing Jamal and it has to return a String that will be the content of the macro after evaluation. In the example case we do not read the input and we do not use any service the processor can provide. We will see in later chapters how to read and parse the input and also how to use the processor to access the executing environment.

The last step is to create a unit test.

The simplest way is to use the classes from the jamal-testsupport library. This library contains test utilities that mock a Jamal environment for the macro and also provide useful test mocks. It is recommended to use this package instead of trying to invoke the macro method directly and creating mocks using some general library. Note that technically these tests are not unit tests, because they use the actual Jamal implementation and not a mock. However, they test against a certain release, which is supposed to be bug free, so any bug discovered should signal an error in the macro implementation. These tests are more readable than clean unit tests; they are fast and reproducible, and they exhibit all the important aspects of regular unit tests.

To use the test support library, you have to add the dependency to your pom.xml file:

<dependency>
    <groupId>com.javax0.jamal</groupId>
    <artifactId>jamal-testsupport</artifactId>
    <version>2.8.1</version>
    <scope>test</scope>
</dependency>

Note that the scope of this dependency is test because these tools are needed only while running the unit tests. You may also need to add a JVM command line parameter via the surefire plugin configuration, like the following:

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-surefire-plugin</artifactId>
    <version> ... actual version of the surefire plugin ... </version>
    <configuration>
        <argLine>
            --add-opens yourModuleName/packageName=ALL-UNNAMED
        </argLine>
    </configuration>
</plugin>

This is needed if your library uses JPMS. When the tests are running, they need access to your code using reflection, and therefore you need to open the package. When you run the tests in the IDE, it may ignore the JPMS settings and run your code as a plain library. In that case, the fact that the package is not open for reflective access is not important.

Note

JPMS is complex and needs much more explanation than can fit here. I give only a few hints here; this document does not intend to teach you JPMS. If you do not understand how JPMS works and how to use it, you can get into trouble using it. In that case you have two choices.

  1. Ignore the possibility to use JPMS, and create your macros as plain old classes in a plain old library, which will be available for Jamal from the classpath.

  2. Learn JPMS from the excellent book of Nicolai: https://www.manning.com/books/the-java-module-system

I recommend the second.

Our test will be the following:

TestHelloWorld_1 from TestHelloWorld.java
@Test
@DisplayName("Test that the HelloWorld built-in macro works")
void macroWorks() throws Exception {
    TestThat.theInput(
        "{@use javax0.jamal.test.examples.HelloWorld}" +
            "{@helloworld}"
    ).results("Hello, World!");
}

With this we are essentially ready with the hello world macro application. There is one more topic, though, which is worth discussing here.

In the test code we had to declare the class in the Jamal file as a macro to be used. This is one of the three possibilities to make a Java class available for the Jamal code. The second is to register the class for the standard Java service loader.

When a Jamal processor object is created, it calls the Java service loader to find all the classes that implement the javax0.jamal.api.Macro interface. The returned instances are registered in the Jamal global macro registry and are available for Jamal processing.

The Java service loader can find a class if

  • it is declared in the module-info.java module descriptor file as one providing the javax0.jamal.api.Macro interface, and/or

  • its full class name is listed in the file /src/main/resources/META-INF/services/javax0.jamal.api.Macro

I recommend that you do both in case you use JPMS, because it will help test running inside the IDE, which may not use JPMS. Having the class names listed in the /src/main/resources/META-INF/services/javax0.jamal.api.Macro file may also help applications that use your library as a normal JAR file and not as a module.
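The registration step described above can be pictured with a small standalone sketch. It does not use the Jamal API: ToyMacro, HelloWorldToy, and RegistrySketch are hypothetical names invented for this illustration, and the real processor may register macros differently. The sketch only shows the general pattern of iterating over instances returned by java.util.ServiceLoader and storing them in a registry by name.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.ServiceLoader;

// Hypothetical stand-in for javax0.jamal.api.Macro; NOT the real interface.
interface ToyMacro {
    String getId();

    String evaluate(String input);
}

// A toy macro that ignores its input, like the HelloWorld example.
final class HelloWorldToy implements ToyMacro {
    @Override
    public String getId() {
        return "helloworld";
    }

    @Override
    public String evaluate(String input) {
        return "Hello, World!";
    }
}

final class RegistrySketch {
    // Stores every discovered instance in the registry under its id.
    static Map<String, ToyMacro> register(Iterable<? extends ToyMacro> discovered) {
        final Map<String, ToyMacro> registry = new LinkedHashMap<>();
        for (final ToyMacro macro : discovered) {
            registry.put(macro.getId(), macro);
        }
        return registry;
    }

    // With a real library the instances come from the service loader, which
    // consults module-info.java and the META-INF/services provider files.
    static Map<String, ToyMacro> registerFromServiceLoader() {
        return register(ServiceLoader.load(ToyMacro.class));
    }

    // Demonstration with an explicit list instead of the service loader.
    static String demo() {
        return register(List.of(new HelloWorldToy())).get("helloworld").evaluate("");
    }
}
```

Passing an explicit list in demo() keeps the sketch runnable without any provider configuration; registerFromServiceLoader() shows where the service loader would plug in.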

The module file will look something like this:

module_declaration from module-info.java
module jamal.test {
    requires jamal.api;
    requires jamal.tools;
    requires jamal.engine;
    provides javax0.jamal.api.Macro with
        HelloWorld,
        Hello,
        Spacer,
        Array
        ;
}

Our module needs the jamal.api module, so we require it, and we provide the javax0.jamal.api.Macro implementation. After this, our unit test will be the following:

TestHelloWorld_2 from TestHelloWorld.java
@Test
@DisplayName("Test that the HelloWorld built-in macro is registered")
void macroRegisteredGLobal() throws Exception {
    TestThat.theInput(
        "{@helloworld}"
    ).results("Hello, World!");
}

Now we do not need to declare the class in the Jamal file, it is available in the global scope.

There is a third option to register a macro in the Jamal processor. The processor has an API and it is possible to register a user defined or a built-in macro programmatically.

Name of a Built-In Macro

There are four different ways to define one or more names for a macro.

The recommended way is to name the macro class aptly as described in the following chapter. If that is not possible, use the annotation @Name as described below. In special cases implement one of the getIds() or getId() methods. As a last resort, the macro can be loaded using the macro use and the name can be defined there.

Default Macro Name

The simplest way, which we have already seen, is to name the macro class aptly. The class name, converted to lower-case, is used as the name of the macro; for example, the class HelloWorld is available as helloworld. We followed this approach in the example above in the Creating a "Hello World" Macro chapter.

Macro Name Defined in the Macro use

The name of the macro can also be defined in the macro use when a macro class is explicitly declared for use. The syntax of the use macro is

use [global] fully_class_name [ as macroname]

The parts between [ and ] are optional.

Defining getId(s) in the Macro Class

When the macro is registered via the service loader, the name is determined by calling the method getIds() of the macro class. You can override this method and return an array of names. As a convenience, you can also define the method getId() and return a single name. The default implementation of the method getIds() calls the method getId() and returns an array with a single element.

Here is the implementation of the method getIds(). Looking at it already foreshadows the fourth possibility.

getIds from Macro.java
default String[] getIds() {
    final var ann = this.getClass().getAnnotation(Name.class);
    if (ann != null && ann.value().length > 0) {
        return ann.value();
    }
    return new String[]{getId()};
}

Name by Annotation @Name

Since version 2.6.0, it is possible and recommended to specify the name of the macro using the annotation @Name. The annotation can be used on the macro class. It can have one or more strings as arguments specifying the names of the macro. For example, the core macro JShell specifies the name this way:

JShell from JShell.java
@Macro.Name("JShell")
public class JShell implements Macro {

In this case, the annotation defines one name. It is needed because the casing of the macro name is different from the default, following the naming of JShell. The macro name is JShell and not jshell.
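The naming rules can be tried out in isolation with the following sketch. It mirrors the getIds() listing shown earlier, but everything in it is a stand-in: the nested Name annotation and the Named interface are hypothetical copies made for the illustration, and the default getId() assumes that the whole simple class name is lower-cased, which is what the helloworld example suggests. The Greeting class and its two names are invented.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Locale;

final class NameResolutionSketch {
    // Hypothetical stand-in for Jamal's @Name annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Name {
        String[] value();
    }

    // Hypothetical stand-in for the naming part of the Macro interface.
    interface Named {
        // Assumed default: the simple class name in lower-case.
        default String getId() {
            return getClass().getSimpleName().toLowerCase(Locale.ROOT);
        }

        // The annotation wins; otherwise fall back to the single id.
        default String[] getIds() {
            final Name ann = getClass().getAnnotation(Name.class);
            if (ann != null && ann.value().length > 0) {
                return ann.value();
            }
            return new String[]{getId()};
        }
    }

    // No annotation: the name is derived from the class name.
    static final class HelloWorld implements Named {
    }

    // The annotation keeps the casing that the default rule would lose.
    @Name("JShell")
    static final class JShell implements Named {
    }

    // Invented example of one class registered under two names.
    @Name({"greet", "hi"})
    static final class Greeting implements Named {
    }

    static String idsOf(final Named macro) {
        return String.join(",", macro.getIds());
    }
}
```

The annotation must be retained at run time, otherwise getAnnotation() returns null and the class name based default is used silently.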

Fetching the Used Name

It is possible to query the name of the macro that was used to invoke the macro. This is useful when the macro has to behave differently depending on the name it was invoked with. For example, the macros string:before and string:after are implemented by the same class. The method evaluate() starts with the following code:

GETID_AFTER from StringMacros.java
@Override
public String evaluate(final Input in, final Processor processor) throws BadSyntax {
    final var action = processor.getId();

The variable action will hold the name that was used to invoke the macro.

Note
It is important to make this call at the start of the macro. The evaluation may invoke macro processing, and the macro name registered in the processor may change. The getId() method on the processor always returns the last invoked macro name. This feature is available since release 2.6.0.
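Why the call order matters can be shown with a toy model. The ToyProcessor below is hypothetical and far simpler than the real processor: it only remembers the name of the most recently invoked macro, which is the behavior the note describes. The macro names are used purely as sample strings.

```java
import java.util.function.Function;

final class UsedNameSketch {
    // Toy processor: remembers only the most recently invoked macro name.
    static final class ToyProcessor {
        private String lastId;

        String getId() {
            return lastId;
        }

        String invoke(final String id, final Function<ToyProcessor, String> macro) {
            lastId = id;
            return macro.apply(this);
        }
    }

    // Reads the name first, then triggers nested macro processing.
    static String readEarly(final ToyProcessor processor) {
        final String action = processor.getId();
        processor.invoke("inner", p -> "");
        return action;
    }

    // Triggers nested macro processing first and reads the name too late.
    static String readLate(final ToyProcessor processor) {
        processor.invoke("inner", p -> "");
        return processor.getId();
    }
}
```

Invoked as string:after, readEarly() still sees string:after, while readLate() sees the name of the nested macro instead.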

Handling the Input of the Macro

In the HelloWorld macro, we completely disregarded the macro’s input. Some built-in macros, such as comment or block, intentionally behave in this manner. Typically, this is not our usual practice. Macros generally require their input to function properly. It’s advisable that even macros designed to ignore input should verify that no additional characters are present following the macro name.

The following test demonstrates the behavior of the HelloWorld macro when it receives and ignores the input:

TestHelloWorld_3 from TestHelloWorld.java
@Test
@DisplayName("Test that the HelloWorld built-in macro works")
void macroIgnoresInput() throws Exception {
    TestThat.theInput(
        "{@helloworld the input is totally ignored}"
    ).results("Hello, World!");
}

In the next section, we will create a macro that uses its input.

Hello, Me Macro

The subsequent macro we’re going to create is designed to not merely extend greetings to the entire world, but specifically to an individual whom we specify. The code for the Hello macro will be as follows:

Hello from Hello.java
public class Hello implements Macro {
    @Override
    public String evaluate(Input in, Processor processor) {
        return "Hello, " + in.toString().trim() + "!";
    }
}

It will utilize the input, transform it into a string, and trim the spaces from the beginning and end of the string, using it as the name in the greeting. The test is equally straightforward, demonstrating the direct application of the macro:

TestHello_1 from TestHello.java
@Test
@DisplayName("Test that the Hello built-in macro works")
void macroWorks() throws Exception {
    TestThat.theInput(
        "{@hello Peter }\n" +
            "{@hello Paul}\n"
    ).results("Hello, Peter!\nHello, Paul!\n");
}

In this example, we are addressing the scenario in the simplest possible manner. We are utilizing the input as it is, as an entire string, merely trimming the spaces from the beginning and the end. In the subsequent chapter, we will explore an example that processes the input in a more intricate way.

Working with the Input: Example: Spacer Macro

Most macros interact with their input in a complex manner. They may parse the input or divide it into smaller segments for subsequent processing. There are numerous methods to accomplish this.

Firstly, the interface javax0.jamal.api.Input extends the CharSequence interface from the Java JDK. You can employ any methods defined there. The characters are stored in a StringBuilder, and you can directly access this by invoking getSB().

However, built-in macros seldom utilize these methods directly. Instead, they typically use the static methods provided by InputHandler.

The Input object is fundamentally a sequence of characters, but it also records the file name and the source location of the characters. Directly modifying the underlying StringBuilder may result in losing track of the line number and column position.

The class InputHandler contains methods specifically designed for safely parsing the input. The authoritative reference is the current JavaDoc. In the upcoming examples, we’ll explore how to utilize some of these methods.

The next macro we will examine takes the macro’s input and intersperses spaces between the characters. That way it will convert

{@spacer this is
some text
}

to

t h i s   i s
s o m e   t e x t

The implementation of the macro is the following:

Spacer from Spacer.java
public class Spacer implements Macro {
    @Override
    public String evaluate(Input in, Processor processor) {
        InputHandler.skipWhiteSpaces(in);
        if (in.length() > 0) {
            final var result = javax0.jamal.tools.Input.makeInput("", in.getPosition());
            boolean lineStart = true;
            while (in.length() > 0) {
                if (!lineStart)
                    result.append(' ');
                lineStart = in.charAt(0) == '\n';
                InputHandler.move(in, 1, result);
            }
            return result.toString();
        } else {
            return "";
        }
    }
}

The first action of the macro is to skip any white space. This is standard practice because spaces are typically present after the macro’s identifier, serving mainly to separate the macro name from its content. Some macros skip spaces only up to the end of the line and treat spaces on the following line as part of the content. However, in this instance, all white spaces, including new lines, are omitted at the beginning of the input. It’s crucial to note that this skipping process also adjusts the line number and the column position to reflect the actual character.

The input maintains information about the file name, the line number, and the column position at the start of the character sequence. Together, these three pieces of information constitute a Position object. The current position within an Input can be obtained by calling the getPosition() method.

If the input consists solely of spaces, they are all skipped, and in such cases, the macro simply returns an empty string. If there are other characters in the input, the macro processes them individually, inserting a space before each character, except when the character is at the beginning of a line. This is achieved by creating a new, initially empty Input object, which inherits the position of the original input. Since Input also implements CharSequence, retrieving any character at a specific position is straightforward using charAt(). Characters can also be 'moved' from one input to another. This action removes the character from the Input in and simultaneously updates the current Position of the input.

Eventually, the result is transformed into a String and returned.

This macro treats the input as a sequence of characters. Often, macros are designed to work with individual parameters. The next chapter will introduce an approach for handling such scenarios.
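To experiment with the spacing logic without a Jamal environment, the loop can be restated over a plain String. This is only a sketch: SpacerSketch is a hypothetical name, it uses String.stripLeading() in place of InputHandler.skipWhiteSpaces(), and it does no position tracking at all.

```java
final class SpacerSketch {
    // Same loop as in the macro, applied to a plain String.
    static String space(final String input) {
        final String text = input.stripLeading();
        final StringBuilder result = new StringBuilder();
        boolean lineStart = true;
        for (int i = 0; i < text.length(); i++) {
            final char c = text.charAt(i);
            if (!lineStart) {
                result.append(' ');
            }
            // The character following a newline starts a new line.
            lineStart = c == '\n';
            result.append(c);
        }
        return result.toString();
    }
}
```

Stepping through the loop shows a detail that is easy to miss: the newline character itself is not at the start of a line, so it also gets a space in front of it, and every line except the last one ends with a trailing space.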

Splitting the Input

If you examine the core built-in macro if, you’ll notice it doesn’t adopt a special syntax. It calls the method getParts() to split the input into parts:

if_parts from If.java
final var parts = InputHandler.getParts(input, processor, 3);

It simply operates with three parameters: if the first parameter evaluates to true, it returns the second parameter; otherwise, it returns the third. In scenarios with only two parameters, it yields an empty string if the first parameter is false. The syntax of the macro is presented as follows:

{@if 'sep' condition 'sep' then result [ 'sep'else result] }

In this syntax, 'sep' represents a separator, which could vary. It might be a space, a non-alphanumeric character, or a more complex separator. The handling of these three cases is facilitated by the method getParts(), located in the class InputHandler.

This method begins by skipping any white spaces at the start of the input and then examines the first character. If it encounters a back-tick, it continues to retrieve characters until it identifies a corresponding back-tick. The fetched string is then utilized as a regular expression to segment the remainder of the input.

If the initial non-space character in the input isn’t a back-tick but is still a non-alphanumeric character, this character serves as the separator to split the input.

Lastly, if the first non-space character is alphanumeric, the input is divided based on spaces.
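The three rules can be summarized in a few lines of plain Java. The sketch below is not the InputHandler implementation; GetPartsSketch is a hypothetical name, and details such as error handling or the treatment of trailing spaces are assumptions made for the illustration.

```java
import java.util.regex.Pattern;

final class GetPartsSketch {
    // Splits the input following the three separator rules described above.
    static String[] split(final String input) {
        final String text = input.stripLeading();
        if (text.isEmpty()) {
            return new String[0];
        }
        final char first = text.charAt(0);
        if (first == '`') {
            // Rule 1: the text between the back-ticks is a regular expression.
            final int end = text.indexOf('`', 1);
            if (end < 0) {
                throw new IllegalArgumentException("separator regex is not terminated");
            }
            return text.substring(end + 1).split(text.substring(1, end), -1);
        }
        if (!Character.isLetterOrDigit(first)) {
            // Rule 2: a non-alphanumeric first character is the separator itself.
            return text.substring(1).split(Pattern.quote(String.valueOf(first)), -1);
        }
        // Rule 3: an alphanumeric first character means splitting on spaces.
        return text.stripTrailing().split("\\s+");
    }
}
```

With these rules, the inputs /1/x/aaa/z and 1 x aaa z both yield the same four parts, and a back-tick delimited expression such as ,\s* splits a, b,c into three.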

The upcoming example employs this method to craft a macro capable of retrieving a specific string from a collection, based on an index. For example

        {@array /1/x/aaa/z}

will select the second element, that is aaa from the array of [ "x", "aaa", "z"]. The code of the macro is the following:

Array from Array.java
public class Array implements Macro {
    @Override
    public String evaluate(Input in, Processor processor) throws BadSyntax {
        final var pos = in.getPosition();
        final String[] parts = InputHandler.getParts(in);
        BadSyntaxAt.when(parts.length < 2, "Macro Array needs an index and at least one element", pos);
        final int size = parts.length - 1;
        final int index;
        try {
            index = Integer.parseInt(parts[0]);
        } catch (NumberFormatException nfe) {
            throw new BadSyntaxAt("The index in Macro array '"
                    + parts[0]
                    + "' cannot be interpreted as an integer.", pos, nfe);
        }
        BadSyntaxAt.when(index < 0 || index >= parts.length - 1, "The index in Macro array is '"
                + parts[0]
                + "' but it should be between 0 and " + (size - 1) + ".", pos);
        return parts[index + 1];
    }
}

The macro calls the method getParts(), passing the input as its only argument. Another version of the method limits the number of parts; when it is used, the last element of the returned array contains the rest of the string, even if it could be split further. The macro implementation checks that there are enough parts and then converts the first part to an integer. This is the index; the remaining elements of the parts array are the values to choose from. The code also checks the array bounds and throws an exception if the index is out of range.
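The effect of limiting the number of parts is the same as that of the limit argument of Java's own String.split(), which makes it easy to demonstrate. The sketch below uses only the JDK; LimitedPartsSketch is a hypothetical name, it assumes a non-empty input that starts with a single non-alphanumeric separator character, and it throws IllegalArgumentException where the macro would throw BadSyntaxAt.

```java
import java.util.regex.Pattern;

final class LimitedPartsSketch {
    // Splits on the first character of the input, producing at most 'limit'
    // parts; a negative limit means no restriction.
    static String[] parts(final String input, final int limit) {
        final String text = input.stripLeading();
        final String separator = Pattern.quote(String.valueOf(text.charAt(0)));
        return text.substring(1).split(separator, limit);
    }

    // The selection logic of the array example: the first part is the index.
    static String select(final String input) {
        final String[] parts = parts(input, -1);
        final int size = parts.length - 1;
        final int index = Integer.parseInt(parts[0]);
        if (index < 0 || index >= size) {
            throw new IllegalArgumentException(
                    "index " + index + " should be between 0 and " + (size - 1));
        }
        return parts[index + 1];
    }
}
```

Called with a limit of 3 on /1/x/aaa/z, the last part is aaa/z: the remainder is kept together although it still contains a separator.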

When a macro implementation detects an error, it can throw a BadSyntax exception, which is declared in the interface. The exception BadSyntaxAt is an extension of BadSyntax. This second exception also contains the reference to the input location.

If the location of the error is not interesting inside the macro then it is good enough to throw a simple BadSyntax exception. The processor catches that exception and converts it to a BadSyntaxAt exception that will reference the character at the very start of the macro.

General Structure of the evaluate() Method

Macros that are InnerScopeDependent

The macro evaluation order is detailed in the README of Jamal. When Jamal sees a built-in macro whose name is preceded by the # character, it evaluates the content of the macro before invoking the macro itself. For example

{#trimLines {@define margin=1}
{@snip sampleText}
}

will first evaluate the define macro, making margin a user defined macro with the value 1. After that, the snip macro is evaluated and replaced with the snippet named sampleText. Only then does the execution of the macro trimLines start, which shifts the lines left or right so that the leftmost line has exactly one space in front of it.

The macro margin is defined in a local scope. The scope starts with the opening { character of the macro trimLines and ends with the closing }. If the implementation of the macro snip queried the macro register, it would see the value of the macro margin as 1.

The question is whether the executing macro trimLines sees margin as defined inside itself or not. Is the scope already closed when the execution of trimLines starts?

It depends.

If the Macro implementing class also implements the InnerScopeDependent interface then the scope is open. If it does not then Jamal closes the scope before starting the execution of the macro.
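The difference between the two cases comes down to the order of two steps: closing the scope and running the macro. The following toy model illustrates that ordering only. InnerScopeSketch is a hypothetical class with a simple stack of maps; it is not how the Jamal macro register is implemented.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

final class InnerScopeSketch {
    // Stack of scopes; the head of the deque is the innermost scope.
    private final Deque<Map<String, String>> scopes = new ArrayDeque<>();

    InnerScopeSketch() {
        scopes.push(new HashMap<>()); // top-level scope
    }

    // Searches from the innermost scope outwards.
    private String lookup(final String name) {
        for (final Map<String, String> scope : scopes) {
            final String value = scope.get(name);
            if (value != null) {
                return value;
            }
        }
        return "undefined";
    }

    // Models the evaluation of a macro whose body defines margin as 1.
    String marginSeenByMacro(final boolean innerScopeDependent) {
        scopes.push(new HashMap<>());     // scope opened by the macro
        scopes.peek().put("margin", "1"); // the body defines margin
        if (!innerScopeDependent) {
            scopes.pop();                 // scope closed BEFORE the macro runs
        }
        final String seen = lookup("margin"); // what the macro code sees
        if (innerScopeDependent) {
            scopes.pop();                 // scope closed AFTER the macro ran
        }
        return seen;
    }
}
```

In this model a macro marked as inner scope dependent sees margin as 1, while any other macro finds it undefined, because the scope holding the definition is already gone.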

The macro trimLines implements this interface because it uses parameters. Implementing this interface simply means adding the name of the interface after the implements keyword. There are no abstract methods in this interface to implement in the class. The first few lines of the method evaluate() are the following:

trimLinesStart from TrimLines.java
@Override
public String evaluate(Input in, Processor processor) throws BadSyntax {
    final var scanner = newScanner(in, processor);
    final var margin = scanner.number("margin").defaultValue(0);
    final var trimVertical = scanner.bool("trimVertical");
    final var verticalTrimOnly = scanner.bool("verticalTrimOnly", "vtrimOnly");
    scanner.done();

The macro class implements the interface javax0.jamal.tools.Scanner.FirstLine. The method newScanner() is inherited from this interface and it creates a scanner object that is able to parse the first line of the macro.

In this case it parses only the first line and scans for the parameters margin, trimVertical and so on. If the macro fetches the input from between () characters then the class has to implement the javax0.jamal.tools.Scanner interface. There are interfaces for the core classes using the [] characters and for the macros that use the whole input.

When a parameter is not defined in the macro, then the class tries to use the value of the macro with the same name. Thus, the value of the variable margin will be a configuration parameter holding the integer value 1.

Note

In earlier versions of Jamal there was no utility class to support the parsing of the parameters. The first approach to configuring a macro was to define a parameterless user defined macro with a given name. Later the Params class was developed, and it kept the functionality of falling back to macro definitions when the parameter was not defined.

This backward compatibility can also be useful when it makes sense to define the parameter globally and not only for a single macro invocation.

The macros created before the class Params had no other choice but to use macros for configuration. These macros supported the local scope of the macro by implementing the signal interface InnerScopeDependent. With the availability of parameter parsing there is no need to define a configuration user defined macro inside the built-in macro body. Instead, you can simply use the configuration parameters in the macro body. Newer macros developed after parameter parsing do not implement the interface InnerScopeDependent.

There is still a use defining a parameter as a macro though. It is the case when the parameter should be defined for a larger scope, and you do not want to copy the parameter key=value to each use of the macro. In that case you can write {@define key=value} before the first use of the built-in macro.

The parameter parsing allows the use of aliases. The example macro above uses both verticalTrimOnly, and vtrimOnly. Any of them can be used to define that the trimming is vertical only. They are aliases. However, only the first one, verticalTrimOnly, is considered as a macro name when the parameter is not defined.

Some built-in macros list the names of the parameters starting with null. It means that the parameter has no name, only aliases. Such parameters cannot be defined using a user defined macro.
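The lookup order described in the last paragraphs can be condensed into one small function. This is a sketch of the rules as stated here, not the code of the scanner: ParamFallbackSketch and its maps are hypothetical, the parameter m is an invented alias, and the special handling of boolean parameters is left out.

```java
import java.util.Map;
import java.util.Optional;

final class ParamFallbackSketch {
    // names[0] is the parameter name (possibly null), the rest are aliases.
    static Optional<String> resolve(final Map<String, String> parsedParameters,
                                    final Map<String, String> userDefinedMacros,
                                    final String... names) {
        // 1. Any name or alias given in the macro invocation wins.
        for (final String name : names) {
            if (name != null && parsedParameters.containsKey(name)) {
                return Optional.of(parsedParameters.get(name));
            }
        }
        // 2. Only the first name falls back to a user defined macro;
        //    a null first name disables the fallback altogether.
        final String primary = names.length > 0 ? names[0] : null;
        if (primary != null && userDefinedMacros.containsKey(primary)) {
            return Optional.of(userDefinedMacros.get(primary));
        }
        return Optional.empty();
    }
}
```

For instance, with the names margin and m, a user defined macro called margin supplies the value when the invocation gives none, whereas a user defined macro called m is never consulted.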

Note

Boolean parameters cannot be defined using user defined macros. They always have a default value of false if not defined in the macro body. The default value can be altered if they are defined in an options macro. If you say {@options trimVertical} then the default value of trimVertical is changed to true.

Technically the options are stored in the same (identifier,value) store as the user defined macros. The consequence is that you cannot use the same name for an option and for a user defined macro. The options, however, are not user defined macros.

Macros that rely on user defined macros or options defined inside their body as parameters should implement the interface InnerScopeDependent. For new macros, however, it is recommended to use parameter parsing instead and not to implement this interface.

Creating User Defined Macros

You can easily create user defined macros using the define macro. However, user defined macros can also be created programmatically. This chapter will describe the latter.

Creating Your Own User Defined Macro Implementation

Programmatically created user defined macros can define their own evaluation.

Strategies to Register Built-In Macros

In this chapter I will explain the advantages, and the disadvantages of the two strategies that you can follow to register your built-in macros. It is a more theoretical chapter with less example code. You can skip this section and return to it later.