From 02c7e741532c34f9b55967cdf0a6d654a1bb909d Mon Sep 17 00:00:00 2001
From: Henry Andrews
Date: Sun, 31 May 2020 15:18:28 -0700
Subject: [PATCH 1/2] Generalize data model language

It is advantageous in hypermedia environments to apply untyped
schemas to binary data, despite the lack of a truly suitable
mapping into the data model.
---
 jsonschema-core.xml | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/jsonschema-core.xml b/jsonschema-core.xml
index d25b23ee..4b33df67 100644
--- a/jsonschema-core.xml
+++ b/jsonschema-core.xml
@@ -198,13 +198,17 @@
- A JSON document to which a schema is applied is known as an "instance".
+ A document to which a schema is applied is known as an "instance".
- JSON Schema interprets documents according to a data model. A JSON value
- interpreted according to this data model is called an "instance".
+ JSON Schema interprets documents according to a data model. A value
+ interpreted according to this data model is called an "instance". JSON
+ documents map trivially into the data model, with a few exceptions as
+ noted below. Documents of other media types MAY be treated as instances
+ if a suitable application-defined mapping of the media type into the
+ data model can be determined.
  An instance has one of six primitive types, and a range of possible values
@@ -219,6 +223,12 @@
  A string of Unicode code points, from the JSON "string" value
+
+ Binary data MAY be treated as an instance, however no data type in the data model
+ is suitable. Therefore, only schemas such as the empty schema that do not
+ constrain the type can be considered to pass. Rationales for and behavior of
+ binary data as an instance SHOULD be defined by the consuming application.
+
  Whitespace and formatting concerns, including different lexical
  representations of numbers that are equal within the data model, are thus
@@ -3796,7 +3806,7 @@
  https://example.com/schemas/common#/$defs/count/minimum
-
+ Clarify applicability to non-JSON media types

From 5c5fcac2718e146d204808804b9fffdc8bc508ce Mon Sep 17 00:00:00 2001
From: Henry Andrews
Date: Sun, 31 May 2020 14:47:01 -0700
Subject: [PATCH 2/2] "contentMediaType" can describe binary resources

This expansion of contentMediaType is motivated by the need to
indicate media types for binary resources in hypermedia environments.

Existing usage (e.g. OpenAPI 3.0) considered unencoded binary data
to be strings, but such data violates the expectations of JSON strings.
A better approach is to purely indicate the media type and avoid
constraining the instance by JSON type as no JSON type is suitable.
---
 jsonschema-validation.xml | 37 +++++++++++++++++++++++++++----------
 1 file changed, 27 insertions(+), 10 deletions(-)

diff --git a/jsonschema-validation.xml b/jsonschema-validation.xml
index 06966a8d..001bd67b 100644
--- a/jsonschema-validation.xml
+++ b/jsonschema-validation.xml
@@ -913,7 +913,9 @@
  Annotations defined in this section indicate that an instance contains
- non-JSON data encoded in a JSON string.
+ non-JSON data encoded in a JSON string. Additionally, they can be used in
+ the context of resources of various media types to indicate binary resources
+ not otherwise describable by JSON Schema.
  These properties provide additional information required to interpret JSON data
@@ -945,14 +947,21 @@
  consumer than that which processed the containing document.
- All keywords in this section apply only to strings, and have no
- effect on other data types.
+ All keywords in this section generally apply to strings, and have no
+ effect on other JSON data types. Additionally, they MAY be used without
+ type information when describing resources of other media types, subject
+ to certain restrictions.
  Implementations MAY offer the ability to decode, parse, and/or validate
- the string contents automatically. However, it MUST NOT perform these
+ the string contents automatically. However, they MUST NOT perform these
  operations by default, and MUST provide the validation result of each
- string-encoded document separately from the enclosing document. This
+ string-encoded document separately from the enclosing document. In particular,
+ these keywords, including "contentSchema", MUST NOT cause the containing schema
+ to fail validation.
+
+
+ The optional automatic decoding, parsing, and validating
  process SHOULD be equivalent to fully evaluating the instance against the
  original schema, followed by using the annotations to decode, parse,
  and/or validate each string-encoded document.
@@ -1005,7 +1014,14 @@
  If the instance is a string, this property indicates the media type
  of the contents of the string. If "contentEncoding" is present,
- this property describes the decoded string.
+ this property describes the decoded string. If the "type" keyword is
+ absent, this keyword MAY be interpreted as describing an unencoded binary
+ resource. The exact meaning and behavior of this untyped usage is
+ application-defined.
+
+ For an example of application-defined untyped usage,
+ see the forthcoming OpenAPI Specification v3.1
+
  The value of this property MUST be a string, which MUST be a media type,
@@ -1015,8 +1031,9 @@
- If the instance is a string, and if "contentMediaType" is present, this
- property contains a schema which describes the structure of the string.
+ If the instance is a string or an untyped binary resource,
+ and if "contentMediaType" is present, this property contains a schema
+ which describes the structure of the string.
  This keyword MAY be used with any media type that can be mapped into
@@ -1433,8 +1450,8 @@
  Correct email format RFC reference to 5321 instead of 5322
  Clarified the set and meaning of "contentEncoding" values
-
-
+ Clarified value requirements for "contentSchema"
+ Expanded "contentMediaType" to unencoded binary media