Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added jv_is_integer_large() to the library. #1862

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

mfeit-internet2
Copy link
Contributor

This change adds a new function to the library called jv_is_integer_large(), which returns true for any JV_KIND_NUMBER that fits into a double and has a fractional part smaller than the system's epsilon.

It was written to let PyJQ make decisions about using the arbitrarily-large integer types available in Python. (A companion pull request for that project is forthcoming.)

Gory details about why: perfsonar/pscheduler#717

@coveralls
Copy link

coveralls commented Mar 14, 2019

Coverage Status

Coverage decreased (-0.2%) to 84.691% when pulling afc7780 on mfeit-internet2:master into 3ea0199 on stedolan:master.

@nicowilliams
Copy link
Contributor

Do we need a jq function for this? It can be coded using the builtin modf and fabs functions...

@mfeit-internet2
Copy link
Contributor Author

Probably not for this change, although I'd imagine there could be some uses for it in scripts.

Jq mostly leaves numbers as they came in on the way out, so there's no need for scripts to make corrections unless they're trying do explicitly convert ints to floats.
There might be a different discussion to be had about how jq deals with real numbers, because it does chop non-fractional floats into ints. That means someone specifying JSON numbers with n significant figures will have them mangled into numbers with none (e.g., 123.000 on the way in becomes 123 on the way out).

I haven't looked at the sources for any of the other language bindings, but I assumed that jv_is_integer() was put there as a utility for them to use. It is used in one spot internally.

@nicowilliams
Copy link
Contributor

nicowilliams commented Mar 14, 2019

Not every jv function has a jq function binding. Some might only be used internally, some might be used by "internal libjq applications" (src/main.c and src/jq_test.c), some might be used by external libjq applications. Who would use this one?

As to jq throwing away zero-valued fractional parts... well, it's just about how it formats numbers. On input jq immediately parses JSON numeric values into IEEE754 doubles. On output it formats doubles back into JSON numeric values. As a result the original form of a numeric value is lost. Now, if in C you have double r = 1.0; that's the same as double n = 1; in that the two values will be bit-by-bit equal (i.e., memcmp((void*)&r, (void*)n, sizeof(double)) will return 0). So how should jq know that jv_number(r) should be printed as 1.0 and jv_number(n) as 1? It cannot.

There is a pending contribution to add a big number facility to jq.

And we could delay parsing of numeric values (but not validation) so that on output we could print unmodified numbers exactly as they were on input. Even so there would be edge cases. Should 1 + 0.0 output 1 or 1.0? We'd have to have a distinction between "integer" and "real" in the code, and I'm not sure it's worth it. At some point one has to accept that JSON doesn't have a canonical representation (yes, some are trying to add one) and that there are trade-offs in any number representation system (even a bignum system has issues, namely being slower and using more memory).

@leonid-s-usov
Copy link
Contributor

leonid-s-usov commented Mar 15, 2019 via email

@nicowilliams
Copy link
Contributor

@leonid-s-usov Oh, I see, that is nice. @wtlangford is rebasing your PR and undoing the big file split.

@leonid-s-usov
Copy link
Contributor

leonid-s-usov commented Mar 15, 2019 via email

@nicowilliams
Copy link
Contributor

I understand. The idea is not to not have that split, but to first merge the contribution, and then do the split in a commit that only does the split.

@mfeit-internet2
Copy link
Contributor Author

Side discussions about bindings and how integers will be handled in the future, does this look like something that will be in the next release?

@nicowilliams
Copy link
Contributor

@mfeit-internet2 sure, a few questions:

  • how would this work once we integrate Decimal literal number #1752?
  • do you have libjq-using code that needs this?
  • should we expose this as a jq function so jq programs can use it?

@mfeit-internet2
Copy link
Contributor Author

1752: If the plan is to merge that change into the next release and there's still going to be a jv_is_integer() in the library without the same larger-than-31-bits bug as the existing version, the _large version can go away. In retrospect, I should probably have proposed this change as a wholesale replacement for jv_is_integer() instead of adding a new function but didn't want to risk breaking existing code that might depend on the 31-bit behavior. I wasn't aware this effort was underway, but as long as whatever function ends up in the new version works correctly, I'm good with it.

Code needing it: I have a patched version of jq with the new function and a patched PyJQ that uses it. Both are privately built and privately distributed, so that code is totally contained and under my control. I'm not the only PyJQ user with this problem, and my plan is to submit a patch to PyJQ patch once jq either pulls this, fixes jv_is_integer() so it works or does something else that still lets me reason about what I'm about to be handed.

Exposure as a function: No need. Scripts shouldn't care since jq takes care of all of this internally and seems to work fine. (See the perfSONAR ticket I referred to in the first comment.) Nothing internal uses jv_is_integer(), so it's a library-only problem.

@nicowilliams
Copy link
Contributor

@mfeit-internet2 thanks.

OK, so, I think this should be folded into jv_is_integer(), and we should have an isinteger function for jq. I'm sure that when #1752 gets merged we can make jv_is_integer() do the right thing (though I suppose we can then have a fun debate about whether 1.0 is an integer or not).

If you can force push a version that fixes jv_is_integer(), that'd be great, and I'll merge it. If you can't before this weekend, i'll probably do it (while retaining you as author).

@nicowilliams
Copy link
Contributor

Thanks! I'll squash and rebase as soon as I get a chance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants