How to: Avoid Pitfalls

pkoppstein edited this page Mar 25, 2018 · 29 revisions

nan, NaN, inf, Inf, infinite and null

nan is a jq value representing IEEE NaN, but it prints as null.

NaN is recognized in JSON text and is also understood to represent IEEE NaN.

Use isnan to test whether a jq value is identical to IEEE NaN.

Here are some illustrative examples:

$ echo NaN | jq .
null

$ echo nan | jq .
parse error: Invalid literal at line 2, column 0

$ echo NaN | jq isnan
true

$ jq -n 'nan | isnan'
true

Similar comments apply to the jq value infinite and to the literals inf and Inf, which are admissible in JSON text:

$ echo Inf | jq isinfinite
true

$ echo inf | jq isinfinite
true

$ jq -n 'infinite | isinfinite'
true
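
Because IEEE NaN is not equal to anything, including itself, comparing a value with nan using == never succeeds. The following sketch contrasts that with isnan and shows the null rendering inside a larger structure. The two-element array is made up for illustration.

```shell
# == never matches NaN, not even against nan itself
jq -nc '[nan, 1] | map(. == nan)'    # prints [false,false]

# isnan picks out the NaN entry
jq -nc '[nan, 1] | map(isnan)'       # prints [true,false]

# On output, NaN is rendered as null
jq -nc '[nan, 1]'                    # prints [null,1]
```

So a null in jq's output does not by itself tell you whether the underlying value was null or NaN; test with isnan before printing if the distinction matters.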

foo.bar vs .foo.bar

foo.bar is short for foo | .bar and means: call foo and then get the value at the "bar" key of the output(s) of foo.

.foo.bar is short for .foo | .bar and means: get the value at the "foo" key of . and then get the value at the "bar" key of that.

One character, big difference.
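
A minimal sketch of the difference, assuming a user-defined function named foo (the definition and the sample input are invented for this example):

```shell
# foo is a function call that ignores the input here;
# .foo is a key lookup on the input object.
jq -nc 'def foo: {"bar": "from the function"};
        {"foo": {"bar": "from the input"}}
        | [foo.bar, .foo.bar]'
# prints ["from the function","from the input"]
```

If no function named foo is defined, jq rejects the program at compile time because foo/0 is not defined, which is usually the first hint that the leading dot was dropped.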

Cartesian Products

jq is geared to produce Cartesian products at the drop of a hat. For example, the expression (1,2) | (3,4) produces four results:

3
4
3
4

To see why, label each output with the left-hand value that produced it:

$ jq -n '(1,2) as $i | (3,4) | "\($i),\(.)"'
"1,3"
"1,4"
"2,3"
"2,4"

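The same multiplication happens wherever several generators meet in one expression, for instance inside an object constructor. The sketch below only counts the results rather than listing them; the keys a and b are arbitrary.

```shell
# Collect the four results of the pipe into a single array
jq -nc '[(1,2) | (3,4)]'                   # prints [3,4,3,4]

# Two generators in one object constructor: 2 x 2 = 4 objects
jq -n '[{a: (1,2), b: (3,4)}] | length'    # prints 4
```

Wrapping an expression in [...] and taking its length is a quick way to check whether it emits more results than intended.
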
Generator Expressions in Assignment Right-Hand Sides

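Generator expressions on the right-hand side (RHS) of an assignment are likely to surprise users, because = and |= treat them differently. Compare (.a,.b) = (1,2), which emits one result for each value produced by the RHS, to (.a,.b) |= (.+1,.*2), which emits a single result.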
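
The two expressions can be tried directly. In this sketch the |= case is only counted, not printed, because which of the RHS values ends up stored has differed between jq releases; the input object {"a":1,"b":2} is an arbitrary choice.

```shell
# "=": each value produced by the RHS yields a separate result,
# with every LHS path set to that value.
jq -nc '(.a,.b) = (1,2)'
# prints:
# {"a":1,"b":1}
# {"a":2,"b":2}

# "|=": a single result; inspect it on your own jq to see which value was kept.
jq -nc '{"a":1,"b":2} | [(.a,.b) |= (.+1,.*2)] | length'   # prints 1
```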

Backtracking (empty) in Assignment RHS Expressions and Reductions

The same caution applies when the right-hand side backtracks: .a=empty and .a|=empty behave differently. Reductions are affected too. In most versions of jq (including jq 1.5 and earlier, and the current “master” version as of 2018), 1 | reduce 2 as $stuff (3; empty) produces null. This might be surprising, as one might expect the result to be 3, as it was for a time.
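
A quick way to probe these cases on a particular jq build is to wrap each expression in [...] so that the number of results is visible. Only the first command below has a stated output; the other two are deliberately left without one, since their results vary by version. The input {"a":1} is arbitrary.

```shell
# .a = empty: the RHS produces no value, so the assignment produces no result
jq -nc '{"a":1} | [.a = empty] | length'        # prints 0

# Version-dependent cases: run them and compare against your expectations
jq --version
jq -nc '{"a":1} | [.a |= empty]'
jq -nc '[1 | reduce 2 as $stuff (3; empty)]'
```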

Multi-arity Functions and Comma/Semi-colon Confusability

foo(a,b) is NOT the same as foo(a;b). If both foo/1 and foo/2 are defined, using the wrong separator silently calls the wrong function. For example, foo(1,2) is a call to foo/1 with a single argument consisting of the expression 1,2, while foo(1;2) is a call to foo/2 with two arguments: the expressions 1 and 2.

One character, big difference.
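
The following sketch makes the dispatch visible by defining both arities so that each reports what it received. The definitions of foo are invented for this illustration.

```shell
# foo/1 and foo/2 share a name; the separator decides which one is called.
jq -nr 'def foo(f): "foo/1 got \([f])";
        def foo(f; g): "foo/2 got \([f]) and \([g])";
        foo(1,2), foo(1;2)'
# prints:
# foo/1 got [1,2]
# foo/2 got [1] and [2]
```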

index/1 is byte-oriented but match/1 is codepoint-oriented

Given strings as input, the index family of filters (index, rindex, indices) returns byte-oriented offsets. For codepoint-oriented offsets, either use the array-oriented versions of these filters, or use match/1 or match/2.

For example:

$ jq -cn '"aéb" | [., index("b")]'
["aéb",3]
$ jq -cn '"aéb" | [., (explode|index("b"|explode))]'
["aéb",2]
$ jq -cn '"a\u00e9b" | [., index("b")]'
["aéb",3]
$ jq -cn '"a\u00e9b" | match("b").offset'
2 
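
If codepoint offsets are needed in several places, the explode-based approach can be wrapped in a small helper. This is only a sketch: cpindex is a hypothetical name, not a jq builtin, and it treats its argument as a literal string rather than a regular expression.

```shell
# cpindex (hypothetical helper): codepoint offset of the first occurrence of $s,
# computed on arrays of codepoints instead of on the string itself.
jq -n 'def cpindex($s): explode | index($s | explode);
       "a\u00e9b" | cpindex("b")'
# prints 2
```

Unlike match/1, this avoids having to escape regex metacharacters in the search string.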