multipleOf and floating point rounding errors #312
Comments
What is the spec unclear about? How the library decides to do maths is up to it. I don't know if we have any specific tests for floating point values for multipleOf. @Julian, @epoberezkin? |
Since the spec doesn't explicitly say that multipleOf is expected to work even for numbers that cannot be represented exactly as floating point numbers, some implementors just give up and say, in essence, "you can't compare floating point numbers, because of rounding errors". This makes multipleOf an interop nightmare. I see a few ways of handling this issue:
|
@cederlys what do you mean by "deal with" rounding errors? Something like your last option is the current state. But it's not JSON Schema specifically that made it, JSON does not mandate that languages parse into arbitrary precision, and many languages don't have easy access to such a thing. It's true that that makes things less portable, but I'm not sure what motivation JSON Schema would have to be more strict there -- in cases where you control all the pieces, you have a choice on whether to use arbitrary precision, as I mentioned in that ticket, and when you don't, yeah you need to deal with the fact that your schema means different things depending on how someone deals with the resulting JSON. |
Maths is a fundamental issue between some languages. If you have an issue with a specific implementation of JSON Schema, the issue is with the implementation, I feel. |
One way to deal with this issue is something like this (in pseudocode):
The value of I think it would be helpful if JSON Schema explicitly states if implementations are supposed to go to this trouble, or if using floating point numbers is expected to be non-portable. |
That kind of thing can never work -- see the response in the bug ticket you linked, although I was quite terse there unfortunately. How are you going to distinguish what you call "rounding errors" from the actual literal float that is not the "rounded" one you're talking about? Are you proposing that JSON Schema mandate some level of imprecision that is different from the float specification's own? If so, can you elaborate on why that'd be a thing that's in JSON Schema's purview to want to do? |
I'm not saying that JSON Schema should require implementations to do it like that. I'd be just as happy if the spec had a footnote that said something like this:
Perhaps this should be mentioned in the JSON specification, but the issue isn't as important there, as the JSON format itself doesn't do any math. It says nothing about how a number should be stored by an application. In the JSON specification, a number is just a sequence of characters that adheres to a particular grammar. But in JSON Schema validators have to actually do math with the numbers when multipleOf is used. Because of that, I think it is up to JSON Schema to either define, or explicitly leave it undefined, how that math is performed. I may be wrong, but I have not found anything that requires an implementation to use binary floating point internally. If an implementation were to use floating point operations on decimal numbers it wouldn't have this issue. But that is probably not something that should be required. |
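To illustrate the point about decimal arithmetic, here is a minimal sketch in Python, assuming the validator is free to parse JSON numbers into `decimal.Decimal` instead of binary floats. The helper name `is_multiple_of_decimal` is made up for this example and is not part of any JSON Schema library.

```python
import json
from decimal import Decimal


def is_multiple_of_decimal(instance: str, divisor: str) -> bool:
    """Hypothetical helper: check multipleOf using decimal arithmetic.

    Both arguments are the number literals as they appear in the JSON text,
    so no binary floating point conversion takes place.
    """
    return Decimal(instance) % Decimal(divisor) == 0


# The pairs discussed in this thread are exact multiples in decimal arithmetic.
print(is_multiple_of_decimal("-15.9", "5.3"))
print(is_multiple_of_decimal("0.49", "0.01"))
print(is_multiple_of_decimal("0.5", "0.2"))

# The standard json module can hand number literals straight to Decimal.
document = json.loads('{"price": 0.49}', parse_float=Decimal)
print(repr(document["price"]))
```

This only helps when the implementation controls parsing, and `Decimal` works with a finite context precision (28 significant digits by default), so it moves the limit rather than removing it.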
Ah, yeah, a note certainly makes sense to me. Reminding people next to multipleOf that its use with non-integer numbers may not be portable and will often involve floating point error depending on the host language's parsing behavior sounds like a reasonable idea. The upcoming (sidebar: @handrews is this upcoming or released, I can't tell, the website claims draft 5 is current) draft 6 doesn't appear to have much difference in explaining multipleOf from how I remember it, but it seems reasonable to me to add something like that note going forward if someone can come up with a decent terse wording. |
Maybe something like this? I've borrowed heavily from RFC 7159, chapter 6, but tried to adapt it for the current context:
Unless it is already present, the IEEE754 reference must also be added as an informative reference:
(I have not checked if that standard has been updated after its inclusion in RFC 7159.) |
Since JSON already talks about how to parse its arbitrary-precision numbers as IEEE floats, and since JSON is normatively referenced (making it a part of the spec in a sense), I don't think any additional text is actually warranted. If implementations want to use IEEE floats, they're very much allowed to, and IEEE already treats how to do number comparisons using an acceptable-margin-of-error technique. Do we need to describe that again? Also note that the precision of an IEEE float is proportional to its magnitude, so even if multipleOf had to be a float, that would only work up to some (very large, but finite) number. |
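The remark about precision scaling with magnitude can be seen with a short Python sketch. It assumes Python 3.9 or later for `math.ulp`; the helper `step_is_resolvable` is a hypothetical name used only for illustration.

```python
import math

# Spacing between adjacent doubles grows with the magnitude of the value.
for magnitude in (1.0, 1e6, 1e12, 1e16):
    print(f"{magnitude:g}: gap to the next double = {math.ulp(magnitude):g}")


def step_is_resolvable(value: float, step: float) -> bool:
    """Hypothetical helper: True if doubles near `value` are spaced more
    finely than `step`, so a multipleOf check against `step` can still
    distinguish neighbouring multiples."""
    return math.ulp(value) < step


print(step_is_resolvable(1.0, 0.01))
print(step_is_resolvable(1e16, 0.01))
```

Once the gap between neighbouring doubles exceeds the multipleOf value, the stored instance can no longer tell one multiple from the next.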
I don't have access to the IEEE standard. But if IEEE treats how to compare numbers using an acceptable-margin-of-error technique -- does that not imply that a JSON Schema implementation that uses IEEE should consider 0.49 to be a multipleOf 0.01? Is that what you meant, @awwright? And yet, @Julian seems to be of the opposite view: when floating point is used, you should expect unexpected results, and 0.49 may not be a multiple of 0.01. I think either view is valid. But they cannot both be valid at once. I think JSON Schema needs to explicitly state what we (as schema writers and users) can expect of a validator. I found a very good article about comparing floating point numbers: https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/ If the method suggested in that article is used to compare round(x/y) to x/y, it should produce sensible results. But is that something that JSON Schema should require of validators? |
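A minimal sketch of the round(x/y) comparison, in Python. It uses the standard library's `math.isclose` as a stand-in for the relative-error comparison; it is not the exact method from the linked article, and the function name and tolerance value are assumptions chosen for illustration.

```python
import math


def is_multiple_of_approx(instance: float, divisor: float,
                          tolerance: float = 1e-9) -> bool:
    """Hypothetical helper: treat `instance` as a multiple of `divisor` when
    the quotient is within `tolerance` of the nearest integer.

    `tolerance` is an arbitrary illustrative value, not taken from any spec.
    """
    quotient = instance / divisor
    nearest = round(quotient)
    # abs_tol covers quotients near zero, where a relative tolerance is useless.
    return math.isclose(quotient, nearest, rel_tol=tolerance, abs_tol=tolerance)


print(is_multiple_of_approx(0.49, 0.01))
print(is_multiple_of_approx(-15.9, 5.3))
print(is_multiple_of_approx(0.495, 0.01))
```

Any such check has to pick a tolerance, so an instance that is genuinely off by less than the tolerance is accepted as well. That is the trade-off the thread is debating.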
I have to read both your comment and Austin's again to make sure I
understand the nuance, but to be clear, my position was not "all bets are
off", much more "expect the behavior defined by the float spec", so if I've
erred in what that is, yeah, that'd be what I was going for. Will read this
a bit more carefully in a bit.
(Sent by email in reply to Per Cederqvist's comment above, posted May 11, 2017 at 2:59 PM.)
|
I have two issues with "expect the behavior defined by the float spec":
|
I think that any attempt to control the interpretation of numbers beyond what is specified in the JSON RFC (and standards that it references such as IEEE floats) should be done by defining values for format. A format could be applied to numbers if the desire is simply to convey semantics (use decimal floating point vs use IEEE floating point). If the intention is to preserve some aspect of the numeric representation, because of the data model this is better done by defining a string format that indicates how the string should be interpreted as a number. This is because strings map fairly directly into the data model (particularly for things like basic numeric notation that do not require escaped characters), while numbers intentionally lose representation details during parsing. See PR #455 (numeric representation and the data model), and issues json-schema-org/json-schema-vocabularies#45 (encoding decimals as strings), #152 (specifying precision), and #116 (format maximum/minimum, also discusses multipleOf for format) for related discussions. Is there anything to be done for this issue that is not addressed by the other issues and PRs? If there are no comments indicating a course of action here after a couple of weeks I will close this in favor of the other issues. I do not think that the JSON Schema core specification should mandate specific floating point behavior any more than JSON does. |
It's been more than two years since I asked if there was anything not covered by the linked issues/prs, so I'm closing this. |
Is -15.9 a multiple of 5.3? The current specification of JSON schema is a bit terse:
A numeric instance is only valid if division by this keyword's value results in an integer.
What does this mean? In some programming languages, dividing a floating point number by another floating point number always results in a floating point number. In this case, it would be -3.0, which isn't an integer, so the validation would always fail.
python-jsonschema/jsonschema#185 is a bug report about this issue in a schema validator implementation. The conclusion is that "this is just floating points. Those numbers aren't exactly representable as floats, so you're going to get False, there's nothing jsonschema can do about it, the numbers you get are not in fact multiples of each other."
I think the specification needs to be clearer. Is this supposed to be useful for numbers like 5.3 and -15.9 which often cannot be represented exactly in floating point form? If so, the specification needs to be clear that implementations that use floating point need to deal with rounding errors. In the current state, we get interoperability issues.
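To make the interoperability concern concrete, here is a small Python sketch, assuming IEEE 754 doubles (which CPython floats use on common platforms). It applies two plausible readings of "division results in an integer" to the example pair. As far as I can tell, the two readings disagree here: the correctly rounded quotient is a whole number, while the remainder is not zero.

```python
from decimal import Decimal

instance, divisor = -15.9, 5.3

# Decimal(float) reveals the exact binary value stored for each literal.
print("stored divisor: ", Decimal(divisor))
print("stored instance:", Decimal(instance))

# Reading 1: divide, then ask whether the float quotient has no fractional part.
quotient = instance / divisor
print("quotient:", quotient, "whole number:", quotient.is_integer())

# Reading 2: ask whether the floating point remainder is exactly zero.
remainder = instance % divisor
print("remainder:", remainder, "is zero:", remainder == 0)
```

Two validators that each follow the specification text literally could therefore give different answers for the same schema and instance.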