Clarify data model regarding binary data #942
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -198,13 +198,17 @@ | |
|
||
<section title="Instance"> | ||
<t> | ||
A JSON document to which a schema is applied is known as an "instance". | ||
A document to which a schema is applied is known as an "instance". | ||
</t> | ||
|
||
<section title="Instance Data Model"> | ||
<t> | ||
JSON Schema interprets documents according to a data model. A JSON value | ||
interpreted according to this data model is called an "instance". | ||
JSON Schema interprets documents according to a data model. A value | ||
interpreted according to this data model is called an "instance". JSON | ||
documents map trivially into the data model, with a few exceptions as | ||
noted below. Documents of other media types MAY be treated as instances | ||
if a suitable application-defined mapping of the media type into the | ||
data model can be determined. | ||
</t> | ||
<t> | ||
An instance has one of six primitive types, and a range of possible values | ||
|
@@ -219,6 +223,12 @@ | |
<t hangText="string:">A string of Unicode code points, from the JSON "string" value</t> | ||
</list> | ||
</t> | ||
<t> | ||
Binary data MAY be treated as an instance, however no data type in the data model | ||
is suitable. Therefore, only schemas such as the empty schema that do not | ||
constrain the type can be considered to pass. Rationales for and behavior of | ||
Review comment: I think commas may be required: "Rationales for, and behavior of,"
Reply: Not in this instance. |
||
binary data as an instance SHOULD be defined by the consuming application. | ||
</t> | ||
<t> | ||
Whitespace and formatting concerns, including different lexical | ||
representations of numbers that are equal within the data model, are thus | ||
|
@@ -3796,7 +3806,7 @@ https://example.com/schemas/common#/$defs/count/minimum | |
<t></t> | ||
<t></t> | ||
<t></t> | ||
<t></t> | ||
<t>Clarify applicability to non-JSON media types</t> | ||
<t></t> | ||
<t></t> | ||
<t></t> | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -913,7 +913,9 @@ | |
<section title="Foreword"> | ||
<t> | ||
Annotations defined in this section indicate that an instance contains | ||
non-JSON data encoded in a JSON string. | ||
non-JSON data encoded in a JSON string. Additionally, they can be used in | ||
the context of resources of various media types to indicate binary resources | ||
not otherwise describable by JSON Schema. | ||
</t> | ||
<t> | ||
These properties provide additional information required to interpret JSON data | ||
|
@@ -945,14 +947,21 @@ | |
consumer than that which processed the containing document. | ||
</t> | ||
<t> | ||
All keywords in this section apply only to strings, and have no | ||
effect on other data types. | ||
All keywords in this section generally apply to strings, and have no | ||
effect on other JSON data types. Additionally, they MAY be used without | ||
type information when describing resources of other media types, subject | ||
to certain restrictions. | ||
Review comment: Could one or two examples of restrictions be given here? |
||
</t> | ||
<t> | ||
Implementations MAY offer the ability to decode, parse, and/or validate | ||
the string contents automatically. However, it MUST NOT perform these | ||
the string contents automatically. However, they MUST NOT perform these | ||
operations by default, and MUST provide the validation result of each | ||
string-encoded document separately from the enclosing document. This | ||
string-encoded document separately from the enclosing document. In particular, | ||
these keywords, including "contentSchema", MUST NOT cause the containing schema | ||
to fail validation. | ||
</t> | ||
<t> | ||
The optional automatic decoding, parsing, and validating | ||
process SHOULD be equivalent to fully evaluating the instance against | ||
the original schema, followed by using the annotations to decode, parse, | ||
Review comment: Should this say instead, "The optional automatic decoding, parsing, and validating process SHOULD be equivalent to fully evaluating the instance against process SHOULD be equivalent to fully evaluating an instance against a schema, followed by using the annotations to decode, parse the original schema". The thing is, this section is about applying schemas or rules to encoded "content". If we use "evaluating the instance against the original schema", then it feels like there's confusion between the schema containing all this stuff vs. the stuff inside "content". Does this make sense? |
||
and/or validate each string-encoded document. | ||
|
@@ -1005,7 +1014,14 @@ | |
<t> | ||
If the instance is a string, this property indicates the media type | ||
of the contents of the string. If "contentEncoding" is present, | ||
this property describes the decoded string. | ||
this property describes the decoded string. If the "type" keyword is | ||
absent, this keyword MAY be interpreted as describing an unencoded binary | ||
resource. The exact meaning and behavior of this untyped usage is | ||
application-defined. | ||
Comment on lines +1017 to +1020
Review comment: This is effectively permitting values outside the data model, which sort of defeats the point of having a data model, doesn't it?
Reply: @awwright I like your idea of simply extending the data model. I'd probably recommend that we just add What's going on here is that I'm recognizing a thing that a significant number of people (OpenAPI users) were already doing in the wild, in a way that was more problematic. OAS 3.0 treats binary data as Since OAS 3.1 is picking up the
Reply: I would argue, let's just call binary for now, and if other use cases present, then open it up later.
Reply: @Relequestual @awwright just to clarify, we're now proposing to add I feel like this should get an issue for visibility so that others can comment on it. It's a major change from what I worked out with OAS. I'm reasonably OK with making the change, but it's kind of a big deal. There are also alternatives such as a I will file an issue for this. |
||
<cref> | ||
For an example of application-defined untyped usage, | ||
see the forthcoming OpenAPI Specification v3.1 | ||
handrews marked this conversation as resolved.
|
||
</cref> | ||
</t> | ||
<t> | ||
The value of this property MUST be a string, which MUST be a media type, | ||
|
@@ -1015,8 +1031,9 @@ | |
|
||
<section title="contentSchema"> | ||
<t> | ||
If the instance is a string, and if "contentMediaType" is present, this | ||
property contains a schema which describes the structure of the string. | ||
If the instance is a string or an untyped binary resource, | ||
Review comment: What about: "If the instance is a string or an untyped binary resource having an application-defined mapping to the data model" |
||
and if "contentMediaType" is present, this property contains a schema | ||
which describes the structure of the string. | ||
</t> | ||
<t> | ||
This keyword MAY be used with any media type that can be mapped into | ||
|
@@ -1433,8 +1450,8 @@ | |
<list style="symbols"> | ||
<t>Correct email format RFC reference to 5321 instead of 5322</t> | ||
<t>Clarified the set and meaning of "contentEncoding" values</t> | ||
<t></t> | ||
<t></t> | ||
<t>Clarified value requirements for "contentSchema"</t> | ||
<t>Expanded "contentMediaType" to unencoded binary media</t> | ||
<t></t> | ||
<t></t> | ||
<t></t> | ||
|
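
For readers following the content-keyword changes in the diff above, here is a minimal Python sketch of the behavior the new wording asks for: decoding is opt-in, and the result for the string-encoded document is reported separately from the enclosing instance. The function name `evaluate_content`, the `decode_content` flag, and the shape of the returned result are all assumptions made for illustration; they do not come from the specification or from any existing implementation, and only the "base64" encoding and the "application/json" media type are handled.

```python
import base64
import binascii
import json


def evaluate_content(schema, instance, decode_content=False):
    """Return (valid, content_result) for a toy schema with content keywords.

    `valid` describes the enclosing instance only. `content_result` is
    reported separately and never changes `valid` (hypothetical shape).
    """
    # This toy evaluator has no assertion keywords, so the enclosing
    # instance always passes.
    valid = True

    # Decoding is opt-in: nothing happens by default.
    if not decode_content or not isinstance(instance, str):
        return valid, None

    data = instance.encode("utf-8")
    if schema.get("contentEncoding") == "base64":
        try:
            data = base64.b64decode(instance, validate=True)
        except (binascii.Error, ValueError) as exc:
            # A decoding failure is recorded, but `valid` is untouched.
            return valid, {"ok": False, "error": str(exc)}

    if schema.get("contentMediaType") == "application/json":
        try:
            document = json.loads(data)
        except ValueError as exc:
            return valid, {"ok": False, "error": str(exc)}
        return valid, {"ok": True, "document": document}

    # Unknown or absent media type: hand back the decoded bytes as-is.
    return valid, {"ok": True, "document": data}
```

A caller that wants to apply "contentSchema" would run a second, independent evaluation on `content_result["document"]` and keep that outcome separate as well.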
Review comment: The added paragraph above this one says that it's possible there's some application-defined mapping from binary data to the JSON data model. The wording of this paragraph seems to waffle between "MAY be treated as an instance" and "can't do anything unless it's an empty schema". Maybe add some language that suggests that a schema can be applied if that data model is applied. For example (changes in bold), "Binary data MAY be treated as an instance, however no data type in the data model is **directly** suitable. Therefore, only schemas such as the empty schema that do not constrain the type can be considered to pass **outright**."
I'm sure there's some better language, but this paragraph feels like it's fighting with itself if there's no acknowledgement that there exists that "application-defined mapping" in the previous paragraph.
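
To make the tension in the binary-data paragraph concrete, the following Python sketch models binary data as a value with no primitive type in the data model. It is a toy under stated assumptions: `primitive_type` and `validate_type` are hypothetical names, only the "type" keyword is evaluated, and "integer" is not handled. It is not how any particular validator works.

```python
def primitive_type(value):
    """Map a Python value to one of the six primitive types, or None."""
    if isinstance(value, bytes):
        return None  # binary data: no type in the data model fits
    if value is None:
        return "null"
    if isinstance(value, bool):  # checked before numbers: bool is an int in Python
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"unsupported value: {value!r}")


def validate_type(schema, instance):
    """Toy evaluator that understands boolean schemas and "type" only."""
    if isinstance(schema, bool):
        return schema
    allowed = schema.get("type")
    if allowed is None:
        return True  # the schema does not constrain the type
    if isinstance(allowed, str):
        allowed = [allowed]
    return primitive_type(instance) in allowed


png = b"\x89PNG\r\n\x1a\n"
print(validate_type({}, png))                                # True
print(validate_type({"contentMediaType": "image/png"}, png))  # True
print(validate_type({"type": "string"}, png))                 # False
```

With an application-defined mapping, the caller would first convert the bytes into a value inside the data model and then evaluate that value instead, which is the case the comment above asks the paragraph to acknowledge.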