Skip to content
Mike Schumacher edited this page May 23, 2018 · 23 revisions

Text Merger Plug-in

The Text Merger Plug-in enables merging result free text documents to existing free text documents. Therefore, the algorithms are also very rudimentary.

Merger extensions

There are currently three main merge strategies that apply for the whole document:

  • merge strategy textmerge_append (appends the text directly to the end of the existing document) _Remark_: If no anchors are defined, this will simply append the patch.

  • merge strategy textmerge_appendWithNewLine (appends the text after adding a new line break to the existing document) _Remark_: empty patches will not result in appending a new line any more since v1.0.1 Remark: Only suitable if no anchors are defined, otherwise it will simply act as textmerge_append

  • merge strategy textmerge_override (replaces the contents of the existing file with the patch) _Remark_: If anchors are defined, override is set as the default mergestrategy for every text block if not redefined in an anchor specification.

Anchor functionality

If a template contains text that fits the definition of anchor:${documentpart}:${mergestrategy}:anchorend or more specifically the regular expression (.*)anchor:():(newline_)?([^:])(_newline)?:anchorend\\s*(\\r\\n|\\r|\\n), some additional functionality becomes available about specific parts of the incoming text and the way it will be merged with the existing text. These anchors always change things about the text to come up until the next anchor, text before it is ignored.

If no anchors are defined, the complete patch will be appended depending on your choice for the template in the file templates.xml.

Anchor Definition

Anchors should always be defined as a comment of the language the template results in, as you do not want them to appear in your readable version, but cannot define them as freemarker comments in the template, or the merger will not know about them. Anchors will also be read when they are not comments due to the merger being able to merge multiple types of text-based languages, thus making it practically impossible to filter for the correct comment declaration. That is why anchors have to always be followed by line breaks. That way there is a universal way to filter anchors that should have anchor functionality and ones that should appear in the text. Remark: If the resulting language has closing tags for comments, they have to appear in the next line. Remark: If you do not put the anchor into a new line, all the text that appears before it will be added to the anchor.

Documentparts

In general, ${documentpart} is an id to mark a part of the document, that way the merger knows what parts of the text to merge with which parts of the patch (e.g. if the existing text contains anchor:table:${}:anchorend that part will be merged with the part tagged anchor:table:${}:anchorend of the patch).

If the same documentpart is defined multiple times, it can lead to errors, so instead of defining table multiple times, use table1, table2, table3 etc.

If a ${documentpart} is defined in the document but not in the patch and they are in the same position, it is porcessed int the following way: If only the documentparts header, test and footer are defined in the document in that order, and the patch contains header, order and footer, the resulting order will be header, test, order then footer.

The following documentparts have default functionality

  1. anchor:header:${mergestrategy}:anchorend marks the beginning of a header, that will be added once when the document is created, but not again. Remark: This is only done once, if you have header in another anchor, it will be ignored

  2. anchor:footer:${mergestrategy}:anchorend marks the beginning of a footer, that will be added once when the document is created, but not again. Once this is invoked, all following text will be included in the footer, including other anchors.

Mergestrategies

Mergestrategies are only relevant in the patch, as the merger is only interested in how text in the patch should be managed, not how it was managed in the past.

  1. anchor:${documentpart}::anchorend will use the merge strategy from templates.xml, see Merger-Extensions.

  2. anchor:${}:${mergestrategy}_newline:anchorend or anchor:${}:newline_${mergestrategy}:anchorend states that a new line should be appended before or after this anchors text, depending on where the newline is (before or after the mergestrategy). anchor:${documentpart}:newline:anchorend puts a new line after the anchors text. Remark: Only works with appending strategies, not merging/replacing ones. These strategies currently include: appendbefore, append/appendafter

  3. achor:${documentpart}:override:anchorend means that the new text of this documentpart will replace the existing one completely

  4. anchor:${documentpart}:appendbefore:anchorend or anchor:${documentpart}:appendafter:anchorend/anchor:${documentpart}:append:anchorend specifies whether the text of the patch should come before the existing text or after.

Usage Examples

General

Below you can see how a file with anchors might look like (using Asciidoc comment tags), with examples of what you might want to use the different functions for.

// anchor:header:append:anchorend

Table of contents
Introduction/Header

// anchor:part1:appendafter:anchorend

Lists
Table entries

// anchor:part2:nomerge:anchorend

Document Separators
Asciidoc table definitions

// anchor:part3:override:anchorend

Anything that you only want once but changes from time to time

// anchor:footer:append:anchorend

Copyright Info
Imprint

Merging

In this section you will see a comparision on what files look like before and after merging

override

Before
// anchor:part:override:anchorend
Lorem Ipsum
Patch
// anchor:part:override:anchorend
Dolor Sit
After
// anchor:part:override:anchorend
Dolor Sit

Appending

Before
// anchor:part:append:anchorend
Lorem Ipsum
// anchor:part2:appendafter:anchorend
Lorem Ipsum
// anchor:part3:appendbefore:anchorend
Lorem Ipsum
Patch
// anchor:part:append:anchorend
Dolor Sit
// anchor:part2:appendafter:anchorend
Dolor Sit
// anchor:part3:appendbefore:anchorend
Dolor Sit
After
// anchor:part:append:anchorend
Lorem Ipsum
Dolor Sit
// anchor:part2:appendafter:anchorend
Lorem Ipsum
Dolor Sit
// anchor:part3:appendbefore:anchorend
Dolor Sit
Lorem Ipsum

Newline

Before
// anchor:part:newline_append:anchorend
Lorem Ipsum
// anchor:part:append_newline:anchorend
Lorem Ipsum
(end of file)
Patch
// anchor:part:newline_append:anchorend
Dolor Sit
// anchor:part:append_newline:anchorend
Dolor Sit
(end of file)
After
// anchor:part:newline_append:anchorend
Lorem Ipsum

Dolor Sit
// anchor:part:append_newline:anchorend
Lorem Ipsum
Dolor Sit

(end of file)

Error List

  • If there are anchors in the text, but either base or patch do not start with one, the merging process wil be aborted, as text might go missing this way.

  • Using _newline or newline_ with mergestrategies that don’t support it , like override, will abort the merging process. See Merge Strategies →2 for details.

  • Using undefined mergestrategies will abort the merging process.

  • Wrong anchor definitions, for example anchor:${}:anchorend will abort the merging process, see Anchor Definition for details.

Clone this wiki locally