
Describe number encoding in JSON #1574

Closed

lexaknyazev wants to merge 2 commits.

Conversation

lexaknyazev
Member

No description provided.

@@ -178,6 +178,10 @@ To simplify client-side implementation, glTF has additional restrictions on JSON

> **Implementation Note:** This allows generic glTF client implementations to not have full Unicode support. Application-specific strings (e.g., values of `"name"` properties or content of `extras` fields) may use any symbols.
3. Names (keys) within JSON objects must be unique, i.e., duplicate keys aren't allowed.
4. Numbers defined as integers in the schema must be written without fractional part (i.e., `.0`). Exporters should not produce integer values greater than 2<sup>53</sup> because some client implementations will not be able to read them correctly.
Contributor

What is the purpose of prohibiting a fractional part? It may be parsed differently in a typed language, I suppose? If any mainstream JSON serializer doesn't do this, it won't be within a developer's ability to change it. If it's already universal in common libraries, as it is for JS, that's fine.

Note that JSON also allows an exponent part: https://stackoverflow.com/a/19554986/1314762. Should that be mentioned?

Member Author

From the JSON-Schema spec:

Some programming languages and parsers use different internal representations for floating point numbers than they do for integers.

For consistency, integer JSON numbers SHOULD NOT be encoded with a fractional part.

We had this issue with files made by the Blender exporter at one point. JSON.stringify is fine. Note that this affects only properties marked as "integer" in the schema, such as:

"buffers": [
  {
    "byteLength": 25.0
  }
]

I'm fine with changing "must" to "should" to align better with JSON-Schema language. Exponent notation is usually associated with floats, so it shouldn't be used for integers as well.
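
For reference, a quick check in JS (Node or a browser console) shows that JSON.stringify already satisfies the proposed rule:

// JSON.stringify never emits a fractional part for integral numbers,
// so it cannot produce the invalid example above:
JSON.stringify({ byteLength: 25.0 }); // '{"byteLength":25}'
JSON.stringify({ byteLength: 25.5 }); // '{"byteLength":25.5}'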

Contributor

@javagl Mar 6, 2019

It might be only remotely related, but related nonetheless: there has been some discussion about this at KhronosGroup/glTF-Validator#8

The point that people might not be able to modify the behavior of JSON serializers may be an issue.

Member Author

I think that loaders written in typed languages that rely on the "integer" schema type are a bigger concern here than half-baked serializers. For example, "componentType":5123.0 crashes some loaders; see the sketch below.
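
Since JSON.parse erases the distinction between 5123 and 5123.0, detecting such values requires looking at the raw text. A minimal, hypothetical pre-parse scan in JS (the property list is only an example, not the full set of integer-typed glTF properties):

const raw = '{"componentType": 5123.0}';
// Flag integer-schema properties written with a fractional part:
const badIntegers = raw.match(/"(componentType|byteOffset|byteLength|bufferView)"\s*:\s*-?\d+\.\d+/g);
console.log(badIntegers); // [ '"componentType": 5123.0' ]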

Contributor

OK, I'm fine with either "must" or "should", whichever you prefer. Let's also mention exponents here.

@@ -178,6 +178,10 @@ To simplify client-side implementation, glTF has additional restrictions on JSON

> **Implementation Note:** This allows generic glTF client implementations to not have full Unicode support. Application-specific strings (e.g., values of `"name"` properties or content of `extras` fields) may use any symbols.
3. Names (keys) within JSON objects must be unique, i.e., duplicate keys aren't allowed.
4. Numbers defined as integers in the schema must be written without fractional part (i.e., `.0`). Exporters should not produce integer values greater than 2<sup>53</sup> because some client implementations will not be able to read them correctly.
5. Floating-point numbers must be written in a way that preserves original values when these numbers are read back.
Contributor

As an implementer I don't know what to do with this as a normative requirement... no one should be writing a JSON serializer from scratch, and I would not know how to evaluate this requirement on a JSON library, other than (as you suggest below) looking for something common and well-tested. Some developers (e.g. on the web) will not have a choice in their JSON implementation.

Perhaps this should just be an implementation note – that implementers should be aware of this when choosing a serialization method, and that common JSON libraries handle it automatically.

Is this something we can detect in validation?

Member Author

All "reasonable" approaches (such as JSON.stringify on the Web, platform-provided JSON implementations in languages like Python, commonly-used C++ libraries) are already aligned with this requirement.

Validating it would require comparing glTF JSON with some "source" data.
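
As a quick sanity check of the "preserves original values" requirement with a common implementation (illustration only):

// ECMAScript's JSON.stringify emits the shortest decimal string that
// round-trips, so parsing it back restores the exact double:
const original = 0.1 + 0.2;                     // 0.30000000000000004
const restored = JSON.parse(JSON.stringify(original));
console.log(restored === original);             // true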

Contributor

Nevertheless, I think this is too vague to be a normative requirement, and should probably be an implementation note.

It is probably also worth saying at the top of this section that all of these "additional restrictions on JSON" are already implemented by common JSON libraries, even if they are not required by the JSON spec.

@lexaknyazev
Member Author

Follow-up from KhronosGroup/glTF-Blender-IO#843.

There's no integer data type in JSON. All numbers are just number values with no defined limits (i.e., they are arbitrary decimals). When JSON-Schema says integer, it should be seen as a strict subset of number.

This inevitably leads to several edge cases (the PR should be updated to cover all of these; see also the check after the list):

  • real values written without the fractional part ("metallicFactor": 0)
    • should be universally supported;
  • integer values written with a zero fractional part ("bufferView": 1.0) or in exponential notation ("bufferView": 1e0)
    • works with JSON.parse on the web (there's nothing we can do about it);
    • should fail in strongly-typed implementations and also fails command-line validation;
    • disallowed by this PR;
  • integer values greater than 2<sup>53</sup>-1 (the max unambiguous integer value that can be stored in a double-precision float)
    • wrong values with JSON.parse on the web;
    • likely correct values with strongly-typed implementations, assuming they use 64-bit ints internally;
    • disallow?
  • integer values greater than 2<sup>63</sup>-1 (the max integer value that can be stored in a signed int64)
    • wrong values with JSON.parse on the web;
    • potentially incorrect values with strongly-typed implementations that use signed 64-bit ints internally;
    • disallow?
  • integer values greater than 2<sup>64</sup>-1 (the max integer value that can be stored in an unsigned int64)
    • wrong values with JSON.parse on the web;
    • potential overflow in strongly-typed implementations;
    • disallow?
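
The 2<sup>53</sup> limit above is easy to demonstrate with JSON.parse (a quick illustration, not part of the PR text):

// 2^53 + 1 is not representable as a double, so JSON.parse silently
// returns the nearest representable value:
const parsed = JSON.parse('{"byteOffset": 9007199254740993}'); // 2^53 + 1
console.log(parsed.byteOffset);         // 9007199254740992 (off by one)
console.log(Number.MAX_SAFE_INTEGER);   // 9007199254740991, i.e. 2^53 - 1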

@@ -178,6 +178,10 @@ To simplify client-side implementation, glTF has additional restrictions on JSON

> **Implementation Note:** This allows generic glTF client implementations to not have full Unicode support. Application-specific strings (e.g., values of `"name"` properties or content of `extras` fields) may use any symbols.
3. Names (keys) within JSON objects must be unique, i.e., duplicate keys aren't allowed.
4. Numbers defined as integers in the schema must be written without a fractional part (e.g., `.0`). Exporters should not produce integer values greater than 2<sup>53</sup> because some client implementations will not be able to read them correctly.
Contributor

For the "without a fractional part" requirement, could you include the motivation for that? A value of 1.0 would pass JSON validation for { "type": "number", "multipleOf": 1.0 }, so this requirement is really because some (many?) JSON libraries for typed languages depend on it?


Contributor

Yes, but I think the rationale should be included explicitly in the specification – otherwise it sounds like we're inventing our own subset of JSON, which would be a bad look. 😄

Contributor

Some of the issues related to the original integer type have already been linked in other comments. Some of the related hiccups can probably be explained by the fact that earlier versions of the ("official") JSON-Schema included the integer type, but it was removed in the meantime (cf. json-schema-org/json-schema-spec#146).

One specific issue is caused by the combination of a typed language and automatic code generation from the schema. If in doubt, the code generation has to be tweaked and configured so that a { "type": "number", "multipleOf": 1.0 } is translated into a long, and so that the resulting parser does a parseLong instead of a parseDouble ...

Member Author

"type": "integer" was added back at some point. Latest definitions are:

  • concept of "integer" is from a vocabulary, not the data model

  • For consistency, integer JSON numbers SHOULD NOT be encoded with a fractional part.

  • String values MUST be one of the six primitive types ("null", "boolean", "object", "array", "number", or "string"), or "integer" which matches any number with a zero fractional part.


Member

It seems like there's an important distinction between "SHOULD NOT" (the JSON spec wording) and "must be written without a fractional part" (this PR). The JSON wording doesn't outright prevent encodings like 1.1e1 for integers, although they are strongly discouraged.

It may still make sense for glTF to have stronger enforcement, for the sake of strongly-typed glTF readers that don't wish to take extra steps of parsing floats and rounding. But I agree with Don's concern that this needs to avoid becoming a subset of JSON.
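
For what it's worth, a web parser happily accepts such an encoding (a quick check):

console.log(JSON.parse('{"bufferView": 1.1e1}').bufferView); // 11, indistinguishable from plain 11 after parsing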

Contributor

OK, they update the JSON schema draft regularly, and I wonder about their roadmap for a ratification - without that, #929 remains a moving target... :-/

Member

I think we should copy the JSON wording here, changing "must be written without a fractional part" to "should not be written with a fractional part."

@lexaknyazev
Member Author

Couple more edge cases from today's call.

  • Most decimal values found in glTF JSON (e.g., material factors) usually end up being used as GPU uniforms, so clients may parse them into single-precision (32-bit) floats for subsequent GPU usage. However, assuming that numbers labeled as integers in the schema but written with a fractional part are single-precision floats would lead to incorrect behavior. For example, "byteOffset": 20485763.0 would be rounded to 20485764.0. On the other hand, assuming double-precision (64-bit) floats may be suboptimal, as they'd need to be rounded to 32 bits before GPU upload.

  • What should clients do with e.g. "byteOffset": 20485764.000000001? Double-precision is not enough to disambiguate this value from an integer 20485764.
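
Both effects are easy to reproduce in JS (an illustration of the examples above):

// float32 rounding of the first byteOffset example:
console.log(Math.fround(20485763));             // 20485764
// double precision cannot disambiguate the second example either:
console.log(20485764.000000001 === 20485764);   // true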

@lexaknyazev
Member Author

The latter example would also work with JS array indices (think all glTF internal references):

const arr = [];
arr[8192] = 41;
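// 8192.0000000000001 is within half a ULP of 8192 as a double,
// so it collapses to the same number and hence the same index: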
arr[8192.0000000000001]; // == 41

@javagl
Contributor

javagl commented Nov 12, 2020

Specifying "JSON itself" cannot be the responsibility of Khronos or the working group.

For the particular example of numbers: even the JSON RFC remains irritatingly vague here, and focuses on "interoperability", saying that ~"most tools will use 64-bit IEEE 754"...

This issue was only about the numbers, but the spec currently also disallows duplicate keys. The JSON RFC just says:

When the names within an object are not unique, the behavior of software that receives such an object is unpredictable.

This is not something that I'd like to see in a real specification. This is exactly the vagueness that has led to JSON parsers/writers that are not as "interoperable" as one would like them to be.

But generally: when adding constraints for the JSON part of glTF, we could try to distinguish between the things that are in the control of the user and the parts that are not. The constraint to disallow duplicate keys is reasonable; there are probably few writers that could produce them anyhow. But trying to iron out the kinks of the JSON spec's valid number formats might mean that (in the extreme case) there are no JSON writers/parsers available that can write truly compliant glTF JSON. (glTF JSON should be a "subset" of JSON, and not a "dialect" of JSON, so to speak.)

In the best case, we could/should be able to say: "The JSON part of glTF is JSON (whatever that means)", but ... here we are now.

Making these constraints implicit, by pointing to the JSON spec and saying that ~"the general rules for 'interoperability' that are mentioned there SHOULD be followed" might be a reasonable solution for a problem that, strictly speaking, is beyond our control.

@lexaknyazev
Member Author

The constraint to disallow duplicate keys is reasonable.

That also depends. Although it's very unlikely that any commonly-used JSON library would export such an asset, readers usually have no control over the parsed result. FWIW, the ECMAScript specification of JSON.parse defines that duplicated keys simply overwrite each other in order of appearance.
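
For example (any modern JS engine):

// Later occurrences win silently; no error is raised:
console.log(JSON.parse('{"byteLength": 25, "byteLength": 100}')); // { byteLength: 100 }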

I'm inclined to rephrase most restrictions from this section as interoperability suggestions and/or best practices while explicitly listing the corner cases that implementations should assess themselves.

WDYT?

@javagl
Contributor

javagl commented Nov 13, 2020

That sounds reasonable to me. This would come close to the desired solution of just pointing to the JSON spec and leaving the burden of interpreting it correctly not with the implementors of glTF libs, but with the implementors of JSON libs.

(One reason why I'm hesitant to add constraints is related to a very old issue, where you suggested that "writers / converters MUST write 1" (for integer values), which is in line with this PR ... but when just using an arbitrary JSON lib, that's hard to enforce from the glTF perspective...)

If nobody else has objections, I'd agree that this could be written as recommendations and interoperability hints. Just to get an idea about the corner cases that you mentioned: Would this be roughly something like

Writers MIGHT write integer values with a fractional part, and JSON parsers MIGHT treat them as (floating point) number, and the (glTF!) client has to take care to cast values that are supposed to be integer values to int where necessary

(with better wording, but along that line)?
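
A client following that recommendation might do something like this (a rough sketch; the property access is just an example):

// Coerce schema-integer values after parsing, as the proposed wording suggests:
const gltf = JSON.parse('{"bufferViews": [{"byteOffset": 20485763.0}]}');
const byteOffset = Math.trunc(gltf.bufferViews[0].byteOffset); // cast to int where necessary
console.log(byteOffset); // 20485763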

@emackey
Member

emackey commented Nov 30, 2020

I'm inclined to rephrase most restrictions from this section as interoperability suggestions and/or best practices while explicitly listing the corner cases that implementations should assess themselves.

Sounds good to me.

@lexaknyazev
Member Author

Resolved in #1997.
