
Describe number encoding in JSON #1574

Closed

lexaknyazev wants to merge 2 commits.

Conversation

lexaknyazev
Member

No description provided.

@@ -178,6 +178,10 @@ To simplify client-side implementation, glTF has additional restrictions on JSON

> **Implementation Note:** This allows generic glTF client implementations to not have full Unicode support. Application-specific strings (e.g., values of `"name"` properties or content of `extras` fields) may use any symbols.
3. Names (keys) within JSON objects must be unique, i.e., duplicate keys aren't allowed.
4. Numbers defined as integers in the schema must be written without fractional part (i.e., `.0`). Exporters should not produce integer values greater than 2<sup>53</sup> because some client implementations will not be able to read them correctly.
Contributor

What is the purpose of prohibiting a fractional part? It may be parsed differently in a typed language, I suppose? If any mainstream JSON serializer doesn't do this, it won't be within a developer's ability to change it. If it's already universal in common libraries, as it is for JS, that's fine.

Note that JSON also allows an exponent part: https://stackoverflow.com/a/19554986/1314762. Should that be mentioned?

Member Author

From the JSON-Schema spec:

Some programming languages and parsers use different internal representations for floating point numbers than they do for integers.

For consistency, integer JSON numbers SHOULD NOT be encoded with a fractional part.

We had this issue with files made by the Blender exporter at one point. JSON.stringify is fine. Note that this affects only properties marked as "integer" in the schema, such as:

"buffers": [
  {
    "byteLength": 25.0
  }
]

I'm fine with changing "must" to "should" to align better with JSON-Schema language. Exponent notation is usually associated with floats, so it shouldn't be used for integers as well.
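
For reference, a quick check in JS (Node or a browser console) shows that JSON.stringify already satisfies the proposed rule:

// JSON.stringify never emits a fractional part for integral numbers,
// so it cannot produce the invalid example above:
JSON.stringify({ byteLength: 25.0 }); // '{"byteLength":25}'
JSON.stringify({ byteLength: 25.5 }); // '{"byteLength":25.5}'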

Contributor

@javagl Mar 6, 2019

It might be only remotely related, but related nonetheless: there has been some discussion about this at KhronosGroup/glTF-Validator#8

The point that people might not be able to modify the behavior of JSON serializers may be an issue.

Member Author

I think that loaders written in typed languages that rely on the "integer" schema type are a bigger concern here than half-baked serializers. For example, "componentType":5123.0 crashes some loaders; see the sketch below.
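
Since JSON.parse erases the distinction between 5123 and 5123.0, detecting such values requires looking at the raw text. A minimal, hypothetical pre-parse scan in JS (the property list is only an example, not the full set of integer-typed glTF properties):

const raw = '{"componentType": 5123.0}';
// Flag integer-schema properties written with a fractional part:
const badIntegers = raw.match(/"(componentType|byteOffset|byteLength|bufferView)"\s*:\s*-?\d+\.\d+/g);
console.log(badIntegers); // [ '"componentType": 5123.0' ]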

Contributor

OK, I'm fine with either "must" or "should", whichever you prefer. Let's also mention exponents here.

@@ -178,6 +178,10 @@ To simplify client-side implementation, glTF has additional restrictions on JSON

> **Implementation Note:** This allows generic glTF client implementations to not have full Unicode support. Application-specific strings (e.g., values of `"name"` properties or content of `extras` fields) may use any symbols.
3. Names (keys) within JSON objects must be unique, i.e., duplicate keys aren't allowed.
4. Numbers defined as integers in the schema must be written without fractional part (i.e., `.0`). Exporters should not produce integer values greater than 2<sup>53</sup> because some client implementations will not be able to read them correctly.
5. Floating-point numbers must be written in a way that preserves original values when these numbers are read back.
Contributor

As an implementer I don't know what to do with this as a normative requirement... no one should be writing a JSON serializer from scratch, and I would not know how to evaluate this requirement on a JSON library, other than (as you suggest below) looking for something common and well-tested. Some developers (e.g. on the web) will not have a choice in their JSON implementation.

Perhaps this should just be an implementation note – that implementers should be aware of this when choosing a serialization method, and that common JSON libraries handle it automatically.

Is this something we can detect in validation?

Member Author

All "reasonable" approaches (such as JSON.stringify on the Web, platform-provided JSON implementations in languages like Python, commonly-used C++ libraries) are already aligned with this requirement.

Validating it would require comparing glTF JSON with some "source" data.
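
As a quick sanity check of the "preserves original values" requirement with a common implementation (illustration only):

// ECMAScript's JSON.stringify emits the shortest decimal string that
// round-trips, so parsing it back restores the exact double:
const original = 0.1 + 0.2;                     // 0.30000000000000004
const restored = JSON.parse(JSON.stringify(original));
console.log(restored === original);             // true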

Contributor

Nevertheless, I think this is too vague to be a normative requirement, and should probably be an implementation note.

It is probably also worth saying at the top of this section that all of these "additional restrictions on JSON" are already implemented by common JSON libraries, even if they are not required by the JSON spec.

@lexaknyazev
Member Author

Follow-up from KhronosGroup/glTF-Blender-IO#843.

There's no integer data type in JSON. All numbers are just number values with no defined limits (i.e., they are arbitrary decimals). When JSON-Schema says integer, it should be seen as a strict subset of number.

This inevitably leads to several edge cases (the PR should be updated to cover all of these; see also the check after the list):

  • real values written without the fractional part ("metallicFactor": 0)
    • should be universally supported;
  • integer values written with a zero fractional part ("bufferView": 1.0) or in exponential notation ("bufferView": 1e0)
    • works with JSON.parse on the web (there's nothing we can do about it);
    • should fail in strongly-typed implementations and also fails command-line validation;
    • disallowed by this PR;
  • integer values greater than 2<sup>53</sup>-1 (the max unambiguous integer value that can be stored in a double-precision float)
    • wrong values with JSON.parse on the web;
    • likely correct values with strongly-typed implementations, assuming they use 64-bit ints internally;
    • disallow?
  • integer values greater than 2<sup>63</sup>-1 (the max integer value that can be stored in a signed int64)
    • wrong values with JSON.parse on the web;
    • potentially incorrect values with strongly-typed implementations that use signed 64-bit ints internally;
    • disallow?
  • integer values greater than 2<sup>64</sup>-1 (the max integer value that can be stored in an unsigned int64)
    • wrong values with JSON.parse on the web;
    • potential overflow in strongly-typed implementations;
    • disallow?
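
The 2<sup>53</sup> limit above is easy to demonstrate with JSON.parse (a quick illustration, not part of the PR text):

// 2^53 + 1 is not representable as a double, so JSON.parse silently
// returns the nearest representable value:
const parsed = JSON.parse('{"byteOffset": 9007199254740993}'); // 2^53 + 1
console.log(parsed.byteOffset);         // 9007199254740992 (off by one)
console.log(Number.MAX_SAFE_INTEGER);   // 9007199254740991, i.e. 2^53 - 1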

@@ -178,6 +178,10 @@ To simplify client-side implementation, glTF has additional restrictions on JSON

> **Implementation Note:** This allows generic glTF client implementations to not have full Unicode support. Application-specific strings (e.g., values of `"name"` properties or content of `extras` fields) may use any symbols.
3. Names (keys) within JSON objects must be unique, i.e., duplicate keys aren't allowed.
4. Numbers defined as integers in the schema must be written without a fractional part (e.g., `.0`). Exporters should not produce integer values greater than 2<sup>53</sup> because some client implementations will not be able to read them correctly.
Contributor

For the "without a fractional part" requirement, could you include the motivation for that? A value of 1.0 would pass JSON validation for { "type": "number", "multipleOf": 1.0 }, so this requirement is really because some (many?) JSON libraries for typed languages depend on it?


Contributor

Yes, but I think the rationale should be included explicitly in the specification – otherwise it sounds like we're inventing our own subset of JSON, which would be a bad look. 😄

Contributor

Some of the issues related to the original integer type have already been linked in other comments. Some of the related hiccups can probably be explained by the fact that earlier versions of the ("official") JSON-Schema included the integer type, but it was removed in the meantime (cf. json-schema-org/json-schema-spec#146).

One specific issue is caused by the combination of a typed language and automatic code generation from the schema. If in doubt, the code generation has to be tweaked and configured so that a { "type": "number", "multipleOf": 1.0 } is translated into a long, and so that the resulting parser does a parseLong instead of a parseDouble ...

Member Author

"type": "integer" was added back at some point. Latest definitions are:

  • concept of "integer" is from a vocabulary, not the data model

  • For consistency, integer JSON numbers SHOULD NOT be encoded with a fractional part.

  • String values MUST be one of the six primitive types ("null", "boolean", "object", "array", "number", or "string"), or "integer" which matches any number with a zero fractional part.


Member

It seems like there's an important distinction between "SHOULD NOT" (the JSON spec wording) and "must be written without a fractional part" (this PR). The JSON wording doesn't outright prevent encodings like 1.1e1 for integers, although they are strongly discouraged.

It may still make sense for glTF to have stronger enforcement, for the sake of strongly-typed glTF readers that don't wish to take extra steps of parsing floats and rounding. But I agree with Don's concern that this needs to avoid becoming a subset of JSON.
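
For what it's worth, a web parser happily accepts such an encoding (a quick check):

console.log(JSON.parse('{"bufferView": 1.1e1}').bufferView); // 11, indistinguishable from plain 11 after parsing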

Contributor

OK, they update the JSON schema draft regularly, and I wonder about their roadmap for a ratification - without that, #929 remains a moving target... :-/

Member

I think we should copy the JSON wording here, changing "must be written without a fractional part" to "should not be written with a fractional part."

@lexaknyazev
Member Author

Couple more edge cases from today's call.

  • Most decimal values found in glTF JSON (e.g., material factors) usually end up being used as GPU uniforms, so clients may parse them into single-precision (32-bit) floats for subsequent GPU usage. However, assuming that numbers labeled as integers in the schema but written with a fractional part are single-precision floats would lead to incorrect behavior. For example, "byteOffset": 20485763.0 would be rounded to 20485764.0. On the other hand, assuming double-precision (64-bit) floats may be suboptimal, as they'd need to be rounded to 32 bits before GPU upload.

  • What should clients do with e.g. "byteOffset": 20485764.000000001? Double-precision is not enough to disambiguate this value from an integer 20485764.
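
Both effects are easy to reproduce in JS (an illustration of the examples above):

// float32 rounding of the first byteOffset example:
console.log(Math.fround(20485763));             // 20485764
// double precision cannot disambiguate the second example either:
console.log(20485764.000000001 === 20485764);   // true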

@lexaknyazev
Member Author

The latter example would also work with JS array indices (think all glTF internal references):

const arr = [];
arr[8192] = 41;
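// 8192.0000000000001 is within half a ULP of 8192 as a double,
// so it collapses to the same number and hence the same index: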
arr[8192.0000000000001]; // == 41

@javagl
Contributor

javagl commented Nov 12, 2020

Specifying "JSON itself" cannot be the responsibility of Khronos or the working group.

For the particular example of numbers: even the JSON RFC remains irritatingly vague here, and focuses on "interoperability", saying that ~"most tools will use 64-bit IEEE 754"...

This issue was only about the numbers, but the spec currently also disallows duplicate keys. The JSON RFC just says:

When the names within an object are not unique, the behavior of software that receives such an object is unpredictable.

This is not something that I'd like to see in a real specification. This is exactly the vagueness that has led to JSON parsers/writers that are not as "interoperable" as one would like them to be.

But generally: when adding constraints for the JSON part of glTF, we could try to distinguish between the things that are in the control of the user and the parts that are not. The constraint to disallow duplicate keys is reasonable; there are probably few writers that could produce them anyhow. But trying to iron out the kinks of the JSON spec's valid number formats might mean that (in the extreme case) there are no JSON writers/parsers available that can write truly compliant glTF JSON. (glTF JSON should be a "subset" of JSON, and not a "dialect" of JSON, so to speak.)

In the best case, we could/should be able to say: "The JSON part of glTF is JSON (whatever that means)", but ... here we are now.

Making these constraints implicit, by pointing to the JSON spec and saying that ~"the general rules for 'interoperability' that are mentioned there SHOULD be followed" might be a reasonable solution for a problem that, strictly speaking, is beyond our control.

@lexaknyazev
Member Author

The constraint to disallow duplicate keys is reasonable.

That also depends. Although it's very unlikely that any commonly-used JSON library would export such an asset, readers usually have no control over the parsed result. FWIW, the ECMAScript specification of JSON.parse defines that duplicated keys simply overwrite each other in order of appearance.
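
For example (any modern JS engine):

// Later occurrences win silently; no error is raised:
console.log(JSON.parse('{"byteLength": 25, "byteLength": 100}')); // { byteLength: 100 }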

I'm inclined to rephrase most restrictions from this section as interoperability suggestions and/or best practices while explicitly listing the corner cases that implementations should assess themselves.

WDYT?

@javagl
Contributor

javagl commented Nov 13, 2020

That sounds reasonable to me. This would come close to the desired solution of just pointing to the JSON spec and leaving the burden of interpreting it correctly not with the implementors of glTF libs, but with the implementors of JSON libs.

(One reason why I'm hesitant to add constraints is related to a very old issue, where you suggested that "writers / converters MUST write 1" (for integer values), which is in line with this PR ... but when just using an arbitrary JSON lib, that's hard to enforce from the glTF perspective...)

If nobody else has objections, I'd agree that this could be written as recommendations and interoperability hints. Just to get an idea about the corner cases that you mentioned: Would this be roughly something like

Writers MIGHT write integer values with a fractional part, and JSON parsers MIGHT treat them as (floating point) number, and the (glTF!) client has to take care to cast values that are supposed to be integer values to int where necessary

(with better wording, but along that line)?
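
A client following that recommendation might do something like this (a rough sketch; the property access is just an example):

// Coerce schema-integer values after parsing, as the proposed wording suggests:
const gltf = JSON.parse('{"bufferViews": [{"byteOffset": 20485763.0}]}');
const byteOffset = Math.trunc(gltf.bufferViews[0].byteOffset); // cast to int where necessary
console.log(byteOffset); // 20485763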

@emackey
Member

emackey commented Nov 30, 2020

I'm inclined to rephrase most restrictions from this section as interoperability suggestions and/or best practices while explicitly listing the corner cases that implementations should assess themselves.

Sounds good to me.

@lexaknyazev
Member Author

Resolved in #1997.
