Command line tool to patch your YAML and JSON files
# input.yaml
foo:
bar: 1
baz: 2
list:
- a
- b
level: 11
# patch.yaml
foo:
bar: 100
list {insert}: c
level {delete}: null
# command: cels patch input.yaml patch.yaml
foo:
bar: 100
baz: 2
list:
- a
- b
- c
Cels is a command-line tool and Python library that enables you to make multiple
modifications to YAML, JSON, or TOML documents. These modifications are based on
changes specified in a patch file. The patch file is written in the same
format as the original data and mirrors its structure. For instance, if you want
to change the bar
key value from hello
to bye
in this example:
foo:
bar: hello
You simply need to create a patch like this:
foo:
bar: bye
To run the tool and get the result, you just pass both files as arguments to
the cels patch
command:
$ cels patch input.yaml output.yaml
For more complex modifications, you can annotate the keys of the patch
document with the operation to perform in the format: {operation[@index1, index2, …]}
. For example, to insert an element in the middle of a list, you
only need to specify the new location and the value to insert:
# input
list:
- a
- c
# patch
list {insert@1}: b
# output
list:
- a
- b
- c
Working with JSON files in Cels is quite similar to working with YAML files. To illustrate this, let's take a look at how the previous example would be transformed:
Here's the input file:
{
"list": ["a", "c"]
}
And here's the patch file:
{
"list {insert@1}": "b"
}
After applying the patch, the result would be:
{
"list": ["a", "b", "c"]
}
Cels supports a variety of operations (set
, delete
, delete_value
,
rename
, insert
, extend
, use
, link
, render
), and most of them can
optionally take indices to work with any level of nested lists.
Cells can be beneficial when you possess a base configuration file or manifest
for a system and need to tailor it to various environments (eg. 'development'
and 'production'). Additionally, you can utilize Cells as a Python library and
integrate it into your application. This allows your users to modify their own
configuration in diverse setups (similar to docker-compose
overrides, for
example).
Refer to Usage for a comprehensive description of all available operations.
See Similar Projects for a comparison between Cels and other tools and specifications with similar objectives.
The project name
The name 'Cels' is inspired by the conventional, analog world of animation. In that context, 'cels' are transparent sheets featuring drawings that are layered atop one another to create the final image. Cels works in a similar way, creating the final result by layering the patch files over the input documents.
If you're considering using Cels, please take into account the following limitations:
- Cels is only compatible with documents where the top-level element is a dictionary or map. That is, won't work with documents where the top-level element is a list.
- Annotations can only be added to keys that are of the string type. This isn't a problem with JSON documents, as all keys are inherently strings. However, in YAML documents, keys aren't necessarily strings. While Cels can handle YAML documents with non-string keys—and these keys can also appear in patch files—they cannot be annotated. This means that advanced operations cannot be applied to them.
To install cells, simply use pip
:
$ pip install cells
Cels' main command is patch
. It takes an input file (the file that you want to
modify) and a patch file (the file that describes the changes to perform). Eg:
$ cels patch input.yaml patch.yaml
The result or running the command is sent to stdout
by default. To save the
result to a file, just redirect the output or use the -O
option:
$ cels patch input.yaml patch.yaml > output.yaml
$ cels patch input.yaml patch.yaml -O output.yaml
You can patch YAML, JSON and TOML files. Cels determines the correct format to
use from the file extensions, but you can explicitly set the format for each one
of the files with the -i
, -p
, and -o
parameters (run cels patch --help
for more information).
Note: It's possible to utilize various formats within the same command execution. For instance, you can use a JSON patch and a YAML input file simultaneously if you wish. Although there aren't many reasons to typically do this, Cels doesn't place any limitations on the combination of formats used.
To override values of the input document, you don't typically need to specify any operation in the patch file:
# input
foo:
bar:
a: 1
b: 2
c: 3
# patch
foo:
bar:
b: 200
# output
foo:
bar:
a: 1
b: 200
c: 3
As shown in this example, dictionaries in the patch file get merged with
the ones in the input file. However, if you want to override an entire
dictionary, instead of merging it, you can use the operation set
:
# input
foo:
bar:
a: 1
b: 2
c: 3
# patch
foo:
bar {set}:
b: 200
# output
foo:
bar:
b: 200
Operations—in their most basic form—are specified by appending the operation
name between curly braces ({<operation>}
) to the key string. The full
annotation format is:
key {<operation>[@<index1>,<index2>,...]}
where the indices are provided to modify lists (see below Working with lists).
The other basic operations include delete
, which removes a key, and rename
,
which alters the key's name while preserving its value.
# input
foo:
bar: "A Fair Field Full of Folk"
baz:
one: 1
two: 2
# patch
foo:
bar {rename}: newbar
baz {delete}: null
# output
foo:
newbar: "A Fair Field Full of Folk"
For the delete
operation, you can pass any value (in this case null
) as it
is just ignored by Cels.
The majority of operations can utilize an index following an @
symbol. This
is to indicate that the action should be executed at a specific position within
a list (To know if a given operation supports indices, run patch describe operation OPERATION_NAME
, which will provide full information about it).
For example, to change the second element of a list, you can use the set
operation:
# input
foo:
- a
- b
# patch
foo {set@1}: B
# output
foo:
- a
- B
Note that when working with lists:
- Indices start at 0 (meaning the first element is located at index 0).
- Negative indices can be used. For instance, -1 refers to the last element in the list, -2 to the penultimate one, and so on.
- To specify elements within nested lists, you can provide multiple indices
separated by commas. For example,
1,0
refers toc
in a nested list like[[a, b], [c, d]]
.
The operations insert
and extend
are very useful when working with lists.
insert
adds one element to a list:
# input
foo:
- a
- b
# patch
foo {insert}: c
# output
foo:
- a
- b
- c
Whereas, in the case of extend
, multiple elements are added simultaneously:
# input
foo:
- a
- b
# patch
foo {extend}:
- c
- d
# output
foo:
- a
- b
- c
- d
In both scenarios, indices can be used to designate the position where you wish to add elements:
{insert@NUM}
and{extend@NUM}
place the elements before theNUM
position. In other words, to insert them at the start of the list, before any other element, use{insert@0}
and{extend@0}
.- The unique index
_
can be utilized to signify that the elements should be appended to the list's end (i.e.,{insert@_}
). The_
can be omitted when operating on the top-level list, meaning{insert}
and{extend}
are synonymous with{insert@_}
and{extend@_}
. However, if you need to append elements to a nested list's end, it's necessary to use it.
For example, here's how you append an element to the end of a nested list:
# input
foo:
- - a
- b
- - c
- d
# patch
foo {insert@1,_}: e
# output
- - a
- b
- - c
- d
- e
The operation delete_value
removes all occurrences of a given value from a list:
# input
foo:
- a
- b
- a
# patch
foo {delete_value}: a
# output
foo:
- b
It is also possible to delete values from nested lists using indices and delete entire objects (not only scalars):
# input
foo:
- a
- b
- - c
- d
- x: 1
y: 2
# patch
foo {delete_value@2}:
x: 1
y: 2
# output
foo:
- a
- b
- - c
- d
It is possible to define variables and then reuse them at different places of the patch file.
This is a quick example:
# input
foo: 1
# patch
my_var {var}: World
foo {use}: my_var
bar {render}: "Hello {{ my_var }}!"
# output
foo: World
bar: "Hello Wold!"
As you can see, variables are defined with:
key {var}: value
where key
is the name of the variable and value
is, well, its value.
Variable definitions, while not visible in the output document themselves, can
be utilized through the use
and render
operations to insert values in
various locations:
- The
use
operation simply inserts the variable as is. If the variable is a list or a dictionary,use
will incorporate it without any modifications. - The
render
operation allows you to define a Jinja template string that can reference one or more variables. If the variable is a list or a dictionary, you can use the.
or[]
notation to pinpoint the exact value you wish to use. Additionally, you can utilize any of the features offered by the Jinja template language, such as filters or conditional structures.
The following is a more elaborated example of using variables with the render
operation:
# patch
my_var {var}:
one: a
two:
- b
- c
foo {render}: "{{ my_var.one }} lowercase, {{ my_var.two[0]|upper }} uppercase"
# output
foo: a lowercase, B uppercase
Please note that a variable comes with an associated scope:
- The
use
andrender
commands can only reference variables from the same dictionary in which they are used, or from any parent or ancestor dictionary. In other words, if you want a variable to be accessible throughout the entire document, you should define it at its root dictionary. - If a variable is redefined in a child dictionary, the value in the child dictionary will take precedence over the one in the parent dictionary.
If you want to perform multiple changes to the same key, you can use the
change
operation, that takes a list of the modifications to perform:
# input
foo: 1
# patch
foo {change}:
- operation: set
value: 100
- operation: rename
value: bar
# output
bar: 100
The operations will be executed in the order they are presented. Each item in the operations list should contain no more than three fields:
operation
: This refers to one of the possible operations (set
,delete
,rename
,insert
, etc.).value
: The value depends on the type of operation. It may be omitted for keys that do not require a value, such asrename
.indices
: This is a list of integers (and_
forinsert
andextend
operations) used to manipulate nested lists. This field can be omitted for operations that do not intend to modify lists.
In summary, these three fields correspond to those that can be defined in a
standard annotation: key {operation@indices}: value
.
Note: the notation:
foo {change}: - operation: set value: 100 - operation: rename value: bar
is different from listing the operations one after another:
foo {set}: 100 foo {rename}: bar
In the latter scenario, the second operation would supersede the initial one. Therefore, there are minimal instances, if any, where you might want to do that.
The link
operation allows you to reference parts of the input document and
reuse them in other parts. For instance, here we use the link
operation to
remove the bar
level from the input document:
# input
foo:
bar:
one: 1
two: 2
# patch
foo {delete}: null
new-foo {link}: .foo.bar
# result
new-foo:
one: 1
two: 2
The link
operation always takes a path that employs the .
and []
notation
to traverse through the dictionaries and lists within the input document. The
initial .
symbol signifies the root dictionary of the document. It's important
to note that the path always refers to the unaltered input document, regardless
of any other operations being performed. This means that the value indicated by
the path will always be the original one.
One limitation of the link
operation is that it just copies a part of the
input document as it is. If you need to reference a value from the input
document but you also require to modify it in some form, you can use a combination
of the render
operation and _get
template function. Here's an example:
# input
foo:
- one
- two
# patch
foo {render@1}: "{{ _get('.foo[1]') | upper }}"
# output
foo:
- one
- TWO
_get()
takes as parameter the path of the value in the input document that
you want to use (using the same dot notation than with link
) and returns such
a value.
If you need to modify a dictionary within a list, you can utilize the patch
operation. In most scenarios, explicitly using patch
isn't required, as
dictionaries in the patch file are automatically merged with those in the input
document. However, when dealing with dictionaries within lists, it becomes
essential to use patch
because you need to specify the index that indicates
the dictionary's position within the list. Here's an example:
# input
foo:
- a: 1
b: 2
# patch
foo {patch@0}:
a: 100
# result
foo:
- a: 100
b: 2
By default, annotations in Cels appear as <space>{operation@indices}
. However,
you can customize all the symbols used to better suit your needs by using the
following command parameters:
$ cels patch input.yaml patch.yaml \
--separator "_" \
--left-marker "(" \
--index-marker "%" \
--right-marker ")"
This example changes the format of annotations to: _(operation%indices)
.
To list all available operations, you can use:
$ cels list operations
To show help for a given operation, including its description and usage examples, you can use:
$ cels describe operation OPERATION_NAME
Additionally, with the -v
flag, you can activate the verbose output.
This will display the operation that was used to generate each node in the
output document:
You can use Cels programmatically from your Python code. It provides two basic
functions: patch_document
and patch_dictionary
.
patch_document
allows you to pass JSON, YAML or TOML text for the input and
patch documents, in the same way that they are passed to the cels
command:
from cels import patch_document
output = patch_document(input_format, input_text, patch_format, patch_text, output_format)
input_format
and patch_format
are string arguments and
can take any of the following values: json
, yaml
or toml
.
input_text
and patch_text
are the raw texts to be used (of course, their
format should match with the parameters above).
Finally, it is possible to specify the format of the output text with
output_format
, which doesn't necessarily have to match the input formats.
patch_dictionary
works exactly the same as patch_document
but the data is
passed already in the format of python dictionaries:
from cels import patch_dictionary
result = patch_dictionary(input_dict, patch_dict)
In both cases (patch_document
and patch_dictionary
), you can pass
separator
, left_marker
, index_marker
and right_marker
parameters to
define the format of the key annotations (see Changing the annotation
format for more information).
Deepcopy
For performance reasons, Cels' Python functions do not generate a complete copy of the output dictionary in memory. Instead, Cels interlaces the input and patch nodes, merging them to produce the final result.
This approach works well if you only need to read the output. However, if you alter it in-place, you may inadvertently modify nodes of the input dictionary at the same time.
If you need to change the output dictionary without affecting the input and patch dictionaries, simply create a deep copy of the dictionary returned by the Cels functions:
from copy import deepcopy from cels import patch_dictionary result = deepcopy(patch_dictionary(input_dict, patch_dict))
jq and yq
are widely used command-line tools for processing JSON and YAML. The key
distinction between these tools and Cels lies in their operational approach.
jq
and yq
operate using paths, whereas Cels utilizes patch files.
For example, if you want to alter the value of a key in yq
, you would specify
it as follows:
.foo.bar.baz = "value"
In contrast, with Cels, you would write the actual YAML:
foo:
bar:
baz: "value"
jq
and yq
shine when it comes to making specific modifications to a
document. They allow you to pinpoint and alter a deeply nested key with a
single command line. However, their intuitiveness diminishes when multiple
changes are required within a document. In such scenarios, using a patch file
as in Cels may provide a better overall view of the modifications and their
interrelations.
In this discussion, we're primarily focusing on the patching capabilities of these tools. However, they also offer additional features (like formatting and data extraction) that Cels does not. For more details, please refer to their respective documentation.
It's worth noting that jq
is exclusively for JSON, while yq
can handle both
JSON and YAML.
Jsonnet, CUE, and YTT are fully-fledged languages equipped with import mechanisms, loops, conditionals, functions, and other programming language constructs. They are designed as supersets of JSON, as seen in Jsonnet and CUE, and a superset of YAML in the case of YTT (however, in this case, language constructs are embedded within comments, ensuring compatibility as a YAML document). These languages can be an excellent choice if you're tasked with making complex modifications to JSON or YAML files.
Cels stands out from these solutions due to its simplicity. All you need to do is annotate each key you wish to modify with the desired operation. While it may not have the capabilities of Jsonnet, CUE, YTT, or any other configuration programming language, it still covers a wide range of common use cases while still being extremely simple to use.
Starlark and Dhall share similarities with Jsonnet, CUE, and YTT as they are all fully-fledged configuration programming languages. However, they don't extend JSON or YAML. Starlark is a subset of Python, while Dhall has its own unique syntax (though it does allow exporting to any other format).
As in the case of, Jsonnet, CUE, and YTT, Cels sets itself apart from Starlark and Dhall by being not a comprehensive language, but rather a simple collection of annotations. This makes it considerably less complex in comparison.
RFC 6902 and RFC 7396 are both proposed standards for patching JSON files.
RFC 6902
defines a JSON structure for defining a list of operations to
be applied sequentially to the original document in order to patch it.
This is an example:
[
{ "op": "remove", "path": "/a/b/c" },
{ "op": "add", "path": "/a/b/c", "value": [ "foo", "bar" ] },
{ "op": "replace", "path": "/a/b/c", "value": 42 },
]
RFC 6902
provides a broad spectrum of operations. However, it doesn't
replicate the original document, but merely outlines the operations in a list
format. On the other hand, the patch file in Cels can result in a more compact
document that is easier to read and potentially easier to maintain. Moreover,
Cels isn't limited to JSON support; it also accommodates YAML and TOML.
On the other hand, RFC 7396
is very similar to Cels. Like Cels, it defines a
patch format that mirrors the original document.
For example, given the following example file:
{
"a": "b",
"c": {
"d": "e",
"f": "g"
}
}
A RFC 7396 patch may look like this:
{
"a":"z",
"c": {
"f": null
}
}
Which results in:
{
"a": "z",
"c": {
"d": "e",
}
}
(Setting a key to null
deletes it).
For comparison, the equivalent Cels patch would look like:
{
"a":"z",
"c": {
"f {remove}": null
}
}
These examples illustrate that RFC 7396
and Cels share many similarities, but
there are key differences:
RFC 7396
is limited to leaving a key as is, overwriting it, or deleting it. In contrast, Cels provides a broader set of options.- Cels has the capability to manipulate list elements, a feature that
RFC 7396
lacks. RFC 7396
employsnull
to remove keys from the original dictionary. However,null
is a perfectly valid value in a JSON file, which rendersRFC 7396
incapable of representing certain valid JSON documents. Specifically, it's impossible to assign anull
value to a key as it would be deleted instead.- While
RFC 7396
only supports JSON, Cels can handle YAML and TOML as well.
The following is a (non-comprehensive) list of RFC 6902
and RFC7396
implementations:
- json-patch: Go implementation of RFC 6902 and RFC 7396.
- python-json-patch: Python implementation of RFC 6902.
- yaml-diff-patch: Command line and npm package that allows to apply RFC 6902 JSON patches to a YAML document.
- chbrown/rfc6902: TypeScript implementation of RFC 6902.
Cels is available under the MIT license.