WIP: Experiment with binary strings
nicowilliams committed Jul 22, 2023
1 parent 99a77f7 commit 6923da2
Showing 16 changed files with 646 additions and 162 deletions.
71 changes: 65 additions & 6 deletions docs/content/manual/manual.yml
@@ -772,6 +772,9 @@ sections:
`null` can be added to any value, and returns the other
value unchanged.
A numeric byte value between 0 and 255, inclusive, can be
added to a binary string value.
examples:
- program: '.a + 1'
input: '{"a": 7}'
@@ -1405,26 +1408,80 @@ sections:
- title: "`tostring`"
body: |
The `tostring` function prints its input as a
string. Strings are left unchanged, and all other values are
The `tostring` function prints its input as a string.
Binary strings are encoded as base64 or else converted to
UTF-8 with bad-character replacement, according to the kind of
binary value (see `tobinary`, `tobinary_utf8`). Text
strings are left unchanged, and all other values are
JSON-encoded.
examples:
- program: '.[] | tostring'
input: '[1, "1", [1]]'
output: ['"1"', '"1"', '"[1]"']

- title: "`tobinary`"
body: |
The `tobinary` function is like `tostring`, but its output
is a binary string: it is base64-encoded when written to
jq's output stream, and adding it to other strings produces
a binary string value.
Internally the binary string may be represented efficiently,
and may not be encoded until it is output or until it is
passed to `tostring`. Adding a byte value (integer value
between 0 and 255, inclusive) to a binary string is allowed,
and will append that byte to it.
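
The `tobinary` behaviour described above can be modelled outside jq. The following is a minimal Python sketch, not jq's implementation: the helper names `append_byte` and `render_base64` are hypothetical, and it assumes the binary string is held as raw bytes until it is written out.

```python
import base64


def append_byte(binary: bytes, value: int) -> bytes:
    """Model of `binary + number`: append one byte in the range 0..255."""
    if not 0 <= value <= 255:
        raise ValueError("byte value must be between 0 and 255, inclusive")
    return binary + bytes([value])


def render_base64(binary: bytes) -> str:
    """Model of output time: the raw bytes are encoded only when emitted."""
    return base64.b64encode(binary).decode("ascii")


if __name__ == "__main__":
    value = append_byte(b"foo", 33)  # 33 is the byte for "!"
    print(render_base64(value))
```

Keeping the raw bytes and deferring the encoding mirrors the note that the value may not be encoded until it is output or passed to `tostring`.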
- title: "`tobinary_bytearray`"
body: |
The `tobinary_bytearray` function is like `tobinary`, but
when output by jq the value is represented as an array of
byte values (integers between 0 and 255, inclusive).
- title: "`tobinary_utf8`"
body: |
The `tobinary_utf8` function is like `tobinary`, but when
output by jq it will be converted to UTF-8 with bad
character replacements.
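
The two alternative output forms, byte array and UTF-8 with replacement, differ only in how the same bytes are rendered. A minimal Python sketch under that assumption follows. The function names are hypothetical, and Python's `errors="replace"` policy stands in for jq's bad-character replacement, which may substitute a different number of replacement characters for some invalid sequences.

```python
from typing import List


def render_bytearray(binary: bytes) -> List[int]:
    """Model of `tobinary_bytearray` output: one integer per byte."""
    return list(binary)


def render_utf8(binary: bytes) -> str:
    """Model of `tobinary_utf8` output: invalid bytes become U+FFFD."""
    return binary.decode("utf-8", errors="replace")


if __name__ == "__main__":
    data = b"fo\xff"  # 0xFF can never appear in valid UTF-8
    print(render_bytearray(data))
    print(render_utf8(data))
```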
- title: "`tobinary(bytes)`"
body: |
This function constructs a binary string value like
`tobinary` but consisting of the byte values output by
`bytes`.
- title: "`type`"
body: |
The `type` function returns the type of its argument as a
string, which is one of null, boolean, number, string, array
or object.
- title: "`stringtype`"
body: |
Strings can be UTF-8 strings or binary strings. The
`stringtype` builtin outputs `"UTF-8"` or `"binary"` when
given a string as input.
- title: "`outputencoding`"
body: |
Outputs `"UTF-8"`, `"base64"`, or `"bytearray"`: `"UTF-8"`
for a plain text string or a binary string produced by
`tobinary_utf8`, `"base64"` for a binary string produced by
`tobinary`, and `"bytearray"` for a binary string produced by
`tobinary_bytearray`.
examples:
- program: 'map(type)'
input: '[0, false, [], {}, null, "hello"]'
output: ['["number", "boolean", "array", "object", "null", "string"]']
- program: '[(tostring,tobinary,tobinary_bytearray,tobinary_utf8)|[type,stringtype,outputencoding]]'
input: '"foo"'
output: ['[["string","UTF-8","UTF-8"],["string","binary","base64"],["string","binary","bytearray"],["string","binary","UTF-8"]]']

- title: "`infinite`, `nan`, `isinfinite`, `isnan`, `isfinite`, `isnormal`"
body: |
@@ -2038,7 +2095,9 @@ sections:
* `@base64d`:
The inverse of `@base64`, input is decoded as specified by RFC 4648.
Note\: If the decoded string is not UTF-8, the results are undefined.
The result will be a binary string as if `tobinary_utf8`
was used, meaning that on output bad characters will be
replaced.
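
As an illustration of the new `@base64d` wording, the following Python sketch decodes base64 and then renders the result with replacement of invalid bytes. It is an assumption-laden model, not jq's code: the name `base64d_for_output` is hypothetical, and Python's replacement policy may not match jq's exactly.

```python
import base64


def base64d_for_output(encoded: str) -> str:
    """Decode RFC 4648 base64, then render the bytes as UTF-8.

    Bytes that are not valid UTF-8 are replaced with U+FFFD on output,
    modelling a binary string produced as if by `tobinary_utf8`.
    """
    raw = base64.b64decode(encoded)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(base64d_for_output("Zm9v"))        # decodes to valid UTF-8
    print(repr(base64d_for_output("/w==")))  # decodes to the single byte 0xFF
```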
This syntax can be combined with string interpolation in a
useful way. You can follow a `@foo` token with a string
25 changes: 23 additions & 2 deletions jq.1.prebuilt
