Add general validation principles and examples.
This addresses issue json-schema-org#55 plus concerns raised in the comments of
issue json-schema-org#101.

I replaced "linearity" with "independence" as I think it is
more general and intuitive.

The general considerations section has been reorganized
to start with the behavior of the empty schema, then explain
keyword independence, and finally cover container vs child
and type applicability, both of which flow directly from
keyword independence.

In draft 04, the wording obscured the connection between
keyword independence and container/child independence.
When we rewrote the array and object keywords to explicitly
classify each keyword as either validating the container
or the child, keyword independence became sufficient to
explain container/child independence.

The list of non-independent keywords has been updated, and
exceptions to the independence of parent and child schemas
have been documented.  Finally, I added a comprehensive example
of the frequently-confusing lack of connection between
type and other keywords.
handrews committed Nov 16, 2016
1 parent b5afae7 commit 61aadd5
Showing 1 changed file with 138 additions and 31 deletions.
169 changes: 138 additions & 31 deletions jsonschema-validation.xml
@@ -156,56 +156,163 @@

<section title="General validation considerations">

<section title="Keywords and instance primitive types">
<section title="Constraints and missing keywords">
<t>
Most validation keywords only limit the range of values within a certain primitive type.
When the primitive type of the instance is not of the type targeted by the keyword, the
validation succeeds.
Each JSON Schema validation keyword adds constraints that
an instance must satisfy in order to successfully validate.
</t>
<t>
For example, the "maxLength" keyword will only restrict certain strings (that are too long) from being valid.
If the instance is a number, boolean, null, array, or object, the keyword passes validation.
</t>
Validation keywords that are missing never restrict validation.
In some cases, this no-op behavior is identical to a keyword that
exists with certain values, and these values are noted where relevant.
</t>
<figure>
<preamble>
From this principle, it follows that all JSON values
successfully validate against the empty schema:
</preamble>
<artwork>
<![CDATA[
{}
]]>
</artwork>
</figure>
<figure>
<preamble>
Similarly, it follows that no JSON value successfully
validates against the empty schema's negation:
</preamble>
<artwork>
<![CDATA[
{
"not": {}
}
]]>
</artwork>
</figure>
</section>
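The two boundary cases above can be checked with a minimal Python sketch. The `validate` function is a hypothetical toy written for this illustration, not part of any JSON Schema library, and it understands only the "not" keyword. Every keyword that is absent adds no constraint.

```python
def validate(schema, instance):
    """Toy validator (illustrative assumption): knows only the "not" keyword."""
    if "not" in schema:
        # "not" succeeds exactly when its subschema fails.
        return not validate(schema["not"], instance)
    # No keywords present means no constraints, so validation succeeds.
    return True


# One sample value for each JSON primitive type (two for number).
samples = [None, True, 0, 1.5, "text", [], {}]

accepted_by_empty = [validate({}, value) for value in samples]
accepted_by_negation = [validate({"not": {}}, value) for value in samples]
```

Under these assumptions, every entry of `accepted_by_empty` is true and every entry of `accepted_by_negation` is false.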

<section title="Validation of primitive types and child values">
<t>
Two of the primitive types, array and object, allow for child values. The validation of
the primitive type is considered separately from the validation of child instances.
</t>
<section title="Keyword independence">
<t>
For arrays, primitive type validation consists of validating restrictions on length.
Validation keywords typically operate independently, without
affecting each other's outcomes.
</t>
<t>
For objects, primitive type validation consists of validating restrictions on the presence
or absence of property names.
For schema author convenience, there are some exceptions:
<list>
<t>"additionalProperties", whose behavior is defined in terms of "properties" and "patternProperties"</t>
<t>"additionalItems", whose behavior is defined in terms of "items"</t>
<t>"minimum" and "maximum", whose behaviors are modified by "exclusiveMinimum" and "exclusiveMaximum", respectively</t>
</list>
</t>
</section>

<section title="Missing keywords">
<section title="Validation of primitive types and child values">
<t>
Validation keywords that are missing never restrict validation.
In some cases, this no-op behavior is identical to a keyword that exists with certain values,
and these values are noted where known.
Two of the primitive types, array and object, allow for child values.
</t>
</section>

<section title="Linearity">
<!-- I call this "linear" in the same manner e.g. waves are linear, they don't interact with each other -->
<t>
Validation keywords typically operate independent of each other, without affecting each other.
Nearly all keywords are defined to operate on either the primitive
type of the container instance, or on the child instance(s), but
not both. Those that operate on child instances are applied to
each appropriate child instance separately.
</t>
<t>
For author convienence, there are some exceptions:
It follows from keyword independence that validation of the primitive
type of the container instance is considered separately from the
values of the child instances or their validation outcomes.
</t>
<t>
Two keywords are exceptions, as they validate properties of arrays as a whole:
<list>
<t>"additionalProperties", whose behavior is defined in terms of "properties" and "patternProperties"; and</t>
<t>"additionalItems", whose behavior is defined in terms of "items"</t>
<t>"uniqueItems", which validates a relationship among the child instances</t>
<t>"contains", which provides a schema for child validation, but need only successfully validate any one child instance rather than applying to all children or to a specific subset of children.</t>
</list>
</t>
</section>
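The difference between keywords applied to each child separately and the two exceptions listed above can be sketched in Python. This is a rough illustration under stated assumptions. The `validate` function is hypothetical, handles only "maximum", the single-schema form of "items", "contains", and "uniqueItems", and compares items by their serialized JSON text, which is a simplification.

```python
import json


def validate(schema, instance):
    """Toy validator (illustrative assumption) for four keywords only."""
    is_number = isinstance(instance, (int, float)) and not isinstance(instance, bool)
    if "maximum" in schema and is_number and instance > schema["maximum"]:
        return False
    if isinstance(instance, list):
        # "items" (single-schema form) is applied to each child separately.
        if "items" in schema:
            if not all(validate(schema["items"], child) for child in instance):
                return False
        # "contains" needs only one child to validate successfully.
        if "contains" in schema:
            if not any(validate(schema["contains"], child) for child in instance):
                return False
        # "uniqueItems" checks a relationship among the children.
        # Simplification: equality is approximated by serialized JSON text.
        if schema.get("uniqueItems"):
            serialized = [json.dumps(child, sort_keys=True) for child in instance]
            if len(serialized) != len(set(serialized)):
                return False
    return True


every_child = validate({"items": {"maximum": 5}}, [1, 2, 9])
any_child = validate({"contains": {"maximum": 5}}, [1, 2, 9])
no_duplicates = validate({"uniqueItems": True}, [1, 2, 1])
```

Here `every_child` is false because one child exceeds the maximum, `any_child` is true because at least one child satisfies it, and `no_duplicates` is false because the value 1 appears twice.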

</section>

<section title="Keyword applicability to instance primitive types">
<t>
An important implication of keyword independence is
that most validation keywords only limit the range of values
within a certain primitive type. When the primitive type of
the instance is not of the type targeted by the keyword, the
validation succeeds.
</t>
<t>
For example, the "multipleOf" keyword will only restrict
certain numbers from being valid.
If the instance is a string, boolean, null, array, or object
the keyword passes validation.
</t>
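The "multipleOf" behavior described above can be sketched in Python. The `validate` and `is_json_number` helpers are hypothetical names introduced for this illustration. They cover only "multipleOf" and "type": "number", and use plain modulo arithmetic, which is an assumption that ignores floating-point precision for fractional divisors.

```python
def is_json_number(value):
    # Python treats bool as a subclass of int, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate(schema, instance):
    """Toy validator (illustrative assumption): "type": "number" and "multipleOf" only."""
    if schema.get("type") == "number" and not is_json_number(instance):
        return False
    if "multipleOf" in schema and is_json_number(instance):
        # The keyword targets numbers only; any other type skips this check.
        if instance % schema["multipleOf"] != 0:
            return False
    return True


non_numbers = ["3", True, None, [3], {"n": 3}]
untyped = {"multipleOf": 2}
typed = {"type": "number", "multipleOf": 2}

untyped_results = [validate(untyped, value) for value in non_numbers]
typed_results = [validate(typed, value) for value in non_numbers]
```

In this sketch `untyped` rejects 3 but accepts every non-number, while `typed` rejects the non-numbers because the separate "type" keyword constrains the instance type.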
<figure>
<preamble>
The utility of this is best illustrated by considering
this schema for odd numbers:
</preamble>
<artwork>
<![CDATA[
{
"multipleOf": 1,
"not": {
"multipleOf": 2
}
}
]]>
</artwork>
</figure>
<figure>
<preamble>
If "multipleOf" implicitly constrained the type of the
instance to be a number, then both the overall schema
and the negated subschema would require a numeric instance
in order to validate. It would be equivalent to:
</preamble>
<artwork>
<![CDATA[
{
"type": "number",
"multipleOf": 1,
"not": {
"type": "number",
"multipleOf": 2
}
}
]]>
</artwork>
<postamble>
Odd integers would still satisfy this schema, but the negated
subschema would fail for every non-number, leaving only the outer
type constraint to reject them. Keywords do not impose constraints
on type, so as originally written (without a type constraint)
"multipleOf" succeeds for every non-number, the "not" therefore
fails for them, and the schema validates odd integers only.
</postamble>
</figure>
<figure>
<preamble>
The following schema states the type requirement explicitly,
validating only odd integers while failing validation for non-numbers:
</preamble>
<artwork>
<![CDATA[
{
"type": "number",
"multipleOf": 1,
"not": {
"multipleOf": 2
}
}
]]>
</artwork>
<postamble>
This negates only the even-ness of numbers, without
affecting validation of the instance type within the "not".
The instance type is only constrained outside of the negation.
</postamble>
</figure>
</section>

<section title="Validation keywords">
<t>
Validation keywords in a schema impose requirements for successfully validating an instance.
@@ -505,7 +612,7 @@
</t>
<t>
For all such properties, child validation succeeds if the child instance
validates agains the "additionalProperties" schema.
validates against the "additionalProperties" schema.
</t>
</section>

@@ -663,7 +770,7 @@
<t>
Both of these keywords can be used to decorate a user interface with
information about the data produced by this user interface. A title will
preferrably be short, whereas a description will provide explanation about
preferably be short, whereas a description will provide explanation about
the purpose of the instance described by this schema.
</t>
<t>
@@ -812,11 +919,11 @@

<section title="Security considerations">
<t>
JSON Schema validation defines a vocabulary for JSON Schema core and conserns all the security considerations listed there.
JSON Schema validation defines a vocabulary for JSON Schema core and concerns all the security considerations listed there.
</t>
<t>
JSON Schema validation allows the use of Regular Expressions, which have numerous different (often incompatible) implementations.
Some implementations allow the embedding of arbritrary code, which is outside the scope of JSON Schema and MUST NOT be permitted.
Some implementations allow the embedding of arbitrary code, which is outside the scope of JSON Schema and MUST NOT be permitted.
Regular expressions can often also be crafted to be extremely expensive to compute (with so-called "catastrophic backtracking"),
resulting in a denial-of-service attack.
</t>