
Fails to parse big integer #1652

Closed
jbernardes opened this issue Apr 18, 2018 · 15 comments

Comments

@jbernardes

With the following command

echo "{\"a\": 13911860366432393}" | jq "."
{
  "a": 13911860366432392
}

you can see a difference between the input value 13911860366432393 and the output value 13911860366432392.

@JesseTG

JesseTG commented Apr 18, 2018

JSON doesn't actually support integers; there's just one Number type, which in practice is a 64-bit float (a.k.a. double if you know C or Java). As a consequence of the underlying representation, only integers between -2^53 and 2^53 can be stored exactly in a Number. Go ahead and enter 13911860366432393 into your favorite browser's console; it'll give you the same result.

That means this isn't a bug; it's a feature, namely conformance to the JSON spec.
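
If you prefer to check it outside a browser, here's a minimal Python sketch (using the value from the report above) showing the same rounding:

# Converting the reported value to an IEEE 754 double rounds it to the
# nearest representable integer, which is exactly what jq prints back.
n = 13911860366432393
print(int(float(n)))   # 13911860366432392
print(float(n) == n)   # False: the double cannot hold the original value
print(2 ** 53)         # 9007199254740992, the last point where every integer is exact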

@jbernardes
Author

jbernardes commented Apr 18, 2018

Thanks for the fast reply.
I understand the point of view, even if it seems to be more of a JavaScript limitation (http://www.ecma-international.org/ecma-262/#sec-terms-and-definitions-number-value) than a JSON (http://json.org/) limitation.
I was surprised, since Python handles it:

echo "{\"a\": 13911860366432393}" | python -m json.tool
{
    "a": 13911860366432393
}

jq was used to indent a JSON file produced by a database, and we ran into an id mismatch issue as a result.

@JesseTG

JesseTG commented Apr 22, 2018

Python's JSON implementation probably doesn't strive for strict compliance the same way jq does, then. That has its own set of pros and cons. I'm wrong; see below.

@pkoppstein
Contributor

Please note that the JSON spec says:

This specification allows implementations to set limits on the range and precision of numbers accepted [https://tools.ietf.org/html/rfc7159]

In a sense, then, jq conforms with the spec, though one might argue that conformance with the spec would require that jq not quietly “accept” numbers outside the limits it sets.

@JesseTG

JesseTG commented Apr 22, 2018

...oops. My bad. @pkoppstein is right; I'm not. Python is known to use the usual IEEE 754 floating-point standard, so more than likely its JSON parser handles large integers specially (which appears to be allowed within the standard).
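
Concretely, Python's json module parses integer literals with its arbitrary-precision int type, which is why the round trip shown above is lossless. A quick sketch:

import json

# Python maps JSON integer literals to int (arbitrary precision) and only
# literals containing '.', 'e' or 'E' to float, so the reported value
# survives a round trip unchanged.
doc = json.loads('{"a": 13911860366432393}')
print(type(doc["a"]).__name__, doc["a"])   # int 13911860366432393
print(json.dumps(doc))                     # {"a": 13911860366432393}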

If you're dealing with some kind of RESTful API, do they provide an id_str field, or something else that stores the id field as a string? I know Twitter does. If that's the case, use that instead.

@jbernardes
Author

jbernardes commented Apr 23, 2018

Thanks @pkoppstein, that is clear now (https://tools.ietf.org/html/rfc7159):

Note that when such software is used, numbers that are integers and
are in the range [-(2**53)+1, (2**53)-1] are interoperable in the
sense that implementations will agree exactly on their numeric
values.

And since

>>> 13911860366432393 > (2**53-1)
True

jq raising a warning in that case would be great.

@JesseTG

JesseTG commented Apr 24, 2018

That's a good idea, actually.

@MostAwesomeDude

This issue bit me today while processing some JSON which involves data structures packed as bigints. In the Python and Monte languages, this is no problem, so I'm not blocked by this issue, but I figured I'd share some in-the-wild examples: https://gist.github.com/MostAwesomeDude/060eccb197c85274c4af9d5c011090f5

$ jq . <nu.poset.json
[
  [
    "nu",
    "jei",
    "su'u",
    "zu'o",
    "du'u",
    "jei",
    "za'i",
    "mu'e",
    "si'o",
    "li'i",
    "pu'u",
    "nu",
    "ka",
    "ni",
    "su'u"
  ],
  33018956916505006000000000000000
]

@pkoppstein
Contributor

This is a well-known limitation of jq, as described (for example) under the "Caveats" heading in the jq FAQ: https://github.com/stedolan/jq/wiki/FAQ#caveats
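
If you only need such values to survive a pass through jq for pretty-printing, one possible pre-processing workaround (a rough sketch, not taken from the FAQ; the regex is naive and assumes long digit runs never appear inside string values) is to quote the big integers before jq ever parses them:

import re
import subprocess

# Hypothetical pre-processing step: quote very long integer literals so jq
# sees them as strings and never converts them to doubles. The pattern is
# deliberately simple and will misfire on digit runs inside string values.
BIG_INT = re.compile(r'(?<=[:\[,\s])(-?\d{16,})(?=[\s,\]\}])')

raw = '{"a": 13911860366432393}'
protected = BIG_INT.sub(r'"\1"', raw)

result = subprocess.run(["jq", "."], input=protected,
                        capture_output=True, text=True)
print(result.stdout)   # the value survives, but as a JSON string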

@jbernardes
Author

@pkoppstein so we can close the issue?

@pkoppstein
Contributor

#218 and #369 seem sufficient.

@earlchew

It's surprising that large integers do not survive a trip through jq:

python -c 'import json; print(json.dumps(1077402552378111143477811657476615308118065527161074128004476424147850760160430021818807185443271105686257832038547166418416685404181421562822126741783213010384216486564580151864858278704345758513411871413642731147162065221107022533731170512113003187014166088811234101538482428385631041638568817110886747078827516684477475704501273547626348371587180153362381012410551022408073271388634757821101731471708851846510370421451853411480818753171820111408123111151464742144611878017034812885114135112176608658164516045617281241322223481003326511818881482714176668315016784158371463058301334580014112182016161857138076416808))' | jq .
1.7976931348623157e+308
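
The value printed is not arbitrary: it is exactly the largest finite IEEE 754 double, as a quick Python check shows:

import sys

# The number jq printed above is the largest finite IEEE 754 double.
print(sys.float_info.max)                             # 1.7976931348623157e+308
print(sys.float_info.max == 1.7976931348623157e+308)  # True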

@mihidux

mihidux commented Jun 2, 2019

In the FAQ, it says "JSON numbers in the input are converted to IEEE 754 64-bit values", i.e. floating-point doubles. Unfortunately, for large integers, this results in a loss of precision. Ideally, integer values up to the maximum value for signed 64-bit integers (2^63 - 1) would be stored internally as variables of type int64, and only integers greater than this value and floating-point values would be converted to doubles. Is there a way of differentiating between integers in the JSON and floats (e.g. by the presence of a decimal point)?
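
For comparison, Python's json module already makes exactly that distinction by syntax (presence of a decimal point or exponent) and lets the caller choose the target type via parse hooks; a sketch of the behaviour being asked for, not something jq currently offers:

import json
from decimal import Decimal

# json.loads distinguishes integer literals from float literals purely by
# syntax and lets the caller pick the target type for each, so large
# integers and exact decimals can both be preserved.
doc = json.loads('{"id": 13911860366432393, "ratio": 0.1}',
                 parse_int=int, parse_float=Decimal)
print(type(doc["id"]).__name__, doc["id"])        # int 13911860366432393
print(type(doc["ratio"]).__name__, doc["ratio"])  # Decimal 0.1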

@leonid-s-usov
Contributor

Please see #1752 for a solution.

@emanuele6
Member

jq 1.7 has been released with the fix. Closing.
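
For anyone landing here later: the original example should now round-trip cleanly, assuming a jq 1.7+ binary is on the PATH (jq 1.7 preserves number literals that a filter does not modify). A quick check:

import subprocess

# With jq 1.7+, the integer literal from the original report is expected to
# be preserved on a plain pass-through instead of being rounded to a double.
result = subprocess.run(["jq", "."], input='{"a": 13911860366432393}',
                        capture_output=True, text=True)
print(result.stdout)   # expected: "a": 13911860366432393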
