Vladimir Panteleev edited this page Nov 23, 2020 · 5 revisions

DustMite can load pre-split data instead of splitting the input itself. This lets it reduce file formats that it cannot parse on its own, with the parsing delegated to other programs.

The pre-split data is provided as a JSON file instead of the regular dataset. The --json switch instructs DustMite to read the dataset as a JSON archive.

Example:

$ dustmite --json dataset.json tester

The archive can also be provided on standard input:

$ dustmite --json - tester < dataset.json

DustMite can be asked to create a JSON archive of its input data with the --dump-json switch:

$ dustmite --dump-json program.d
# use program.d.json

(Of course, when the input is itself loaded with --json, the dumped archive will be identical to the input archive.)

Format

The dataset document has one root object, and one object per node (entity).

The root object is as follows:

{
    "version" : 1,
    "root" : { /* ... node object ... */ }
}

Node objects have the following structure:

{
	// If present, indicates that this node represents a file,
	// and its contents represent that file's contents.
	// On any path from the root node to a leaf node,
	// exactly one node should have a non-empty filename.
	"filename" : "program.c",

	// Represents data before this node's children.
	// Must not be set outside a file.
	"head" : "int main() {\n",

	// Array of child nodes.
	"children" : [
		{ /* ... node object ... */ },
		// ...
	],

	// Represents data after this node's children.
	// Must not be set outside a file.
	"tail" : "}\n",

	// If true, DustMite will not remove this node.
	"noRemove" : false /* or true */,

	// An arbitrary string which can be used to refer to this node
	// from another node's dependents.
	// Labels should be unique across the dataset (JSON file).
	"label" : "123",

	// An array of labels of nodes that should also be removed 
	// if this node is removed.
	"dependents" : [ "456" ]
}

All node object fields are optional.
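
To illustrate the structure above, here is a minimal sketch of an external splitter that turns a text file into an archive with one node per line. The names split_lines and render are hypothetical helpers, not part of DustMite, and render only reflects the reading of head, children and tail given in the comments above (head, then children, then tail).

```python
import json


def split_lines(filename, text):
    """Build an archive dict with one child node per line of text.

    Hypothetical helper: each line becomes a leaf node whose "head"
    holds the line, under a single node that carries the filename.
    """
    children = [{"head": line} for line in text.splitlines(keepends=True)]
    return {
        "version": 1,
        "root": {"filename": filename, "children": children},
    }


def render(node):
    """Concatenate head, children and tail, per the field descriptions."""
    body = "".join(render(child) for child in node.get("children", []))
    return node.get("head", "") + body + node.get("tail", "")


source = "int main() {\n\treturn 0;\n}\n"
archive = split_lines("program.c", source)

# Serialize the archive; the result can be saved as dataset.json.
dataset = json.dumps(archive, indent=4)
```

Saving the serialized text as dataset.json would let it be passed to DustMite with the --json switch shown earlier. A real splitter would normally produce a deeper tree that follows the syntax of the format being reduced.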

Binary data can be represented as-is in the head and tail fields, as long as the special JSON characters (" and \) are escaped. Escaping control characters as well is recommended for conformance and portability.
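
As a small sketch of the escaping described above, a standard JSON encoder can do the work. The example assumes the data is available as a Python string and uses only the standard json module; the sample head value is made up for illustration.

```python
import json

# Made-up head value containing a quote, a backslash, a tab and a NUL.
head = 'say "hi"\\\t\x00'
node = {"head": head}

# json.dumps escapes the quote and backslash, and also writes the
# control characters as escape sequences rather than raw bytes.
encoded = json.dumps(node)

# Decoding gives back the original string unchanged.
decoded = json.loads(encoded)
```

Relying on an encoder in this way avoids hand-written escaping in the program that generates the archive.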

For practical examples, see the src.json files in DustMite's tests directory.
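
Before handing a generated archive to DustMite, it may help to check it against the constraints stated in the node comments. The sketch below is a hypothetical validate function, not part of DustMite, and it applies a strict reading of those rules; DustMite itself may be more lenient. Treating a dependents entry that matches no label as an error is an assumption of this sketch.

```python
def validate(archive):
    """Return a list of problems found under a strict reading of the rules."""
    problems = []
    labels = set()
    dependents = []

    def walk(node, path, in_file):
        filename = node.get("filename")
        if filename:
            if in_file:
                problems.append(f"{path}: nested filename {filename!r}")
            in_file = True
        # head and tail must not be set outside a file.
        for field in ("head", "tail"):
            if node.get(field) and not in_file:
                problems.append(f"{path}: {field} set outside a file")
        # Labels should be unique across the dataset.
        label = node.get("label")
        if label is not None:
            if label in labels:
                problems.append(f"{path}: duplicate label {label!r}")
            labels.add(label)
        for dep in node.get("dependents", []):
            dependents.append((path, dep))
        children = node.get("children", [])
        # A leaf reached without passing any filename breaks the path rule.
        if not children and not in_file:
            problems.append(f"{path}: leaf is not inside any file")
        for i, child in enumerate(children):
            walk(child, f"{path}.children[{i}]", in_file)

    walk(archive.get("root", {}), "root", False)
    # Assumption: every dependents entry should name an existing label.
    for path, dep in dependents:
        if dep not in labels:
            problems.append(f"{path}: dependent {dep!r} matches no label")
    return problems
```

An empty list means no problems were found. Each message starts with the location of the offending node, such as root.children[0].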