-
Notifications
You must be signed in to change notification settings - Fork 664
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make JSON schema available for verification under https:// URIs #739
Conversation
This only commits the result of (make schema-fs) and is otherwise unrelated to the rest of the PR. Signed-off-by: Miloslav Trmač <mitr@redhat.com>
On Wed, Jan 10, 2018 at 11:11:26AM -0800, Miloslav Trmač wrote:
Overall, it is completely unclear to me which of the URIs is the
canonical URI of the “content descriptor” schema, and other schemas
in this repo, and the owner of the URI namespace needs to decide on
the canonical schema URIs.
An alternative is to drop the internal ‘id’ entries, in which case the
canonical URI becomes “wherever you got that JSON from”. See
opencontainers/runtime-spec#945 for runtime-spec implementation of
this approach.
|
Anyone running (vndr) currently ends up with failing tests in OCI schema validation because gojsonschema has fixed its "$ref" interpretation, exposing inconsistent URI usage inside image-spec/schema. So, this runs (vndr), and uses mtrmac/image-spec:id-based-loader ( opencontainers/image-spec#739 ) to make the tests pass again. As soon as that PR is merged we should revert to using the upstream image-spec repo again.
Anyone running (vndr) currently ends up with failing tests in OCI schema validation because gojsonschema has fixed its "$ref" interpretation, exposing inconsistent URI usage inside image-spec/schema. So, this runs (vndr), and uses mtrmac/image-spec:id-based-loader ( opencontainers/image-spec#739 ) to make the tests pass again. As soon as that PR is merged we should revert to using the upstream image-spec repo again.
Anyone running (vndr) currently ends up with failing tests in OCI schema validation because gojsonschema has fixed its "$ref" interpretation, exposing inconsistent URI usage inside image-spec/schema. So, this runs (vndr), and uses mtrmac/image-spec:id-based-loader ( opencontainers/image-spec#739 ) to make the tests pass again. As soon as that PR is merged we should revert to using the upstream image-spec repo again.
} | ||
|
||
// JsonReference implements gojsonschema.JSONLoader.JsonReference. The "Json" capitalization needs to be maintained to conform to the interface. | ||
func (l *fsLoader) JsonReference() (gojsonreference.JsonReference, error) { // nolint: golint |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JSONReference
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the comment says,
The "Json" capitalization needs to be maintained to conform to the interface.
REJECT I don't want to see the further adoption of gojsonschema in this project. The complexity of the validation outweigh its usefulness and it doesn't even meet basic needs. I don't see how adding network calls to validation makes this situation any better. Even worse, if this schema is further published for use in other languages, the assumption will be made that it is sufficient to have just this and not higher-level validation that is required to be compliant with the specification. Sorry to be harsh here, but this is a huge waste of time. There are problems with the existing validator, such as printing to stdout, that would be a much better use of resources. |
I’m just a consumer of another library which calls public API of this package like *shrug*. I ultimately don’t care one iota whether it is fixed by this PR, by modifying the schema as @wking suggests, or by rewriting The one thing that does not work for me is doing nothing and keeping the self-tests in this package broken.
My only concern is that the validator does not work at all. Whether it prints unwanted output somewhere can not even begin to be a concern until that if fixed AFAICT.
This is removing network calls. The current implementation is trying to really access URLs like |
@mtrmac I am sorry if I came across as not wanting to accept a fix here. We really do want the same result: a working
I apologize if I'm confused here: this
I would love to see this package fixed, as well, but doubling down on json-schema, after it has been so problematic, is not the solution. This PR just adds more complexity that will have to be removed when we have a real solution. I am even more worried that we are creating more dependencies on, as well as the effect of removing it. Let's cut our losses here and get working validation. |
I’m sorry; apparently there is some history to this code and the discussion, about which I know nothing. I didn’t mean to take sides in any such discussion, or really to revive it at all. The PR is, from the start, a minimal hack which completely defers to the maintainers of the schema for actually cleaning up the inconsistencies. I’m happy to work on implementing any other reasonably-sized fix, whether a “real solution” or a short-term one, on which there can be consensus. Completely dropping the schema files and reimplementing the equivalent in Go is probably more than I can commit to at the moment, though.
The This PR changes nothing fundamental about the existence or purpose of Now that Regardless of the It was marginally more consistent to stop mapping |
schema/schema.go
Outdated
// specs maps OCI schema media types to schema files. | ||
// schemaNamespaces is a set of URI prefixes which contain the schema files of fs (used in JSON schema "id" and "$ref" attributes). | ||
// fs stores all of its files in the root of the filesystem tree. | ||
// (Note that this must contain subdirectories before its parent directories for fsLoaderFactory.refContents to work.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs stronger wording to indicate that these aren't actually hosted at the URIs provided and this is requirement of the fix to avoid remote IO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs stronger wording to indicate that these aren't actually hosted at the URIs provided and this is requirement of the fix to avoid remote IO.
If there is no intention of hosting the files at the URIs listed in the id
properties, why don't we just drop the id
properties from the JSON Schema (like opencontainers/runtime-spec#945 is doing for runtime-spec)? Then image-spec won't need to fake the presence of resources at those URIs with loader hoop-jumping and other JSON Schema consumers which ingest this JSON Schema without going through the gojsonschema loader will work too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the comment to hopefully explain the situation, while still being concise enough.
I have left the more long-winded explanation of what the commit does, and why, and what it doesn’t do, to the commit message and the PR discussion. If you’d rather have more of the rationale in code comments, please say so.
@mtrmac I think we go ahead with this hack, since it removes the IO and unblocks your activity. I have one request about adding a stronger comment around the URIs. Please feel free to hit me up on a more synchronous method of communication so we can work out the next steps. |
244d16b
to
1ed8b65
Compare
After updating gojsonschema to include xeipuuv/gojsonschema#171 , tests fail with > unable to validate: Could not read schema from HTTP, response status is 404 Not Found Before that gojsonschema change, "$ref" links were interpreted by taking the current schema source file's URI as a base, and treating "$ref" as relative to this. For example, starting with the [file://]/image-manifest-schema.json URI, as used by Validator.Validate (based on the "specs" map), the > "$ref": "content-descriptor.json" reference used to evaluate to file:///content-descriptor.json. gojsonschema.jsonReferenceLoader would then load these file:///*.json URIs via _escFS. After the gojsonschema change, "$ref" links are evaluated relative to a URI base specified by the "id" attribute inside the schema source, regardless of the "external" URI passed to the gojsonschema.JSONLoader. This is consistent with http://json-schema.org/latest/json-schema-core.html#rfc.section.8 and http://json-schema.org/latest/json-schema-core.html#rfc.section.9.2 (apart from the "id" vs. "$id" attribute name). In the same example, [file://]/image-manifest-schema.json URI contains > "id": "https://opencontainers.org/schema/image/manifest", so the same > "$ref": "content-descriptor.json" now evaluates to "https://opencontainers.org/schema/image/content-descriptor.json", which is not found by gojsonschema.jsonReferenceLoader (it uses _escFS only for file:/// URIs), resulting in the 404 quoted above. This is a minimal fix, making the schema files available to gojsonschema at the https:// URIs, while continuing to read them from _escFS. Because gojsonschema.jsonReferenceLoader can only use the provided fs for file:/// URIs, we are forced to implement our own gojsonschema.JSONLoaderFactory and gojsonschema.JSONLoader; something like this might be more generally useful and should therefore instead be provided by the gojsonschema library. This particular JSONLoader{Factory,} implementation, though, is image-spec specific because it locally works around various inconsistencies in the image-spec JSON schemas, and thus is not suitable for gojsonschema as is. Namely, the specs/*.json schema files use URIs with two URI path prefixes, https://opencontainers.org/schema/{,image/} in the top-level "id" attributes, and the nested "id" attributes along with "$ref" references use _several more_ URI path prefixes, e.g. > "id": "https://opencontainers.org/schema/image/manifest/annotations", > "$ref": "defs-descriptor.json#/definitions/annotations" in image-manifest-schema.json specifies the https://opencontainers.org/schema/image/manifest/defs-descriptor.json URI. In fact, defs-descriptor.json references use all of the following URIs: > https://opencontainers.org/schema/defs-descriptor.json > https://opencontainers.org/schema/image/defs-descriptor.json > https://opencontainers.org/schema/image/descriptor/defs-descriptor.json > https://opencontainers.org/schema/image/index/defs-descriptor.json > https://opencontainers.org/schema/image/manifest/defs-descriptor.json So, this commit introduces a loader which preserves the original _escFS layout by recognizing and stripping all of these prefixes, and using the same /*.json paths for _escFS lookups as before; this is clearly unsuitable for gojsonschema inclusion. Finally, the reason this commit uses such a fairly hacky loader is that merely changing the _escFS structure is still not sufficient to get consistent schema: the schema/*.json paths in this repository, and the "$ref" values, do not match the "id" values inside the schemas at all. E.g. image-manifest-schema.json refers to https://opencontainers.org/schema/image/manifest/content-descriptor.json , while content-descriptor.json identifies itself as https://opencontainers.org/schema/descriptor , matching neither the path prefix nor the file name. Overall, it is completely unclear to me which of the URIs is the canonical URI of the "content descriptor" schema, and the owner of the URI namespace needs to decide on the canonical schema URIs. Only afterwards can the code be cleanly modified to match the specification; until then, this commit at least keeps the tests passing, and the validator usable by external callers who want to use the public image-spec/schema.ValidateMediaType*.Validate() API. Signed-off-by: Miloslav Trmač <mitr@redhat.com>
1ed8b65
to
9cfed2f
Compare
I'm 👍 on this hack for unblocking the general validation issue (it's causing problems with |
Thanks! |
Anyone running (vndr) currently ends up with failing tests in OCI schema validation because gojsonschema has fixed its "$ref" interpretation, exposing inconsistent URI usage inside image-spec/schema. So, this runs (vndr), and uses mtrmac/image-spec:id-based-loader ( opencontainers/image-spec#739 ) to make the tests pass again. As soon as that PR is merged we should revert to using the upstream image-spec repo again.
Anyone running (vndr) currently ends up with failing tests in OCI schema validation because gojsonschema has fixed its "$ref" interpretation, exposing inconsistent URI usage inside image-spec/schema. So, this runs (vndr), and uses mtrmac/image-spec:id-based-loader ( opencontainers/image-spec#739 ) to make the tests pass again. As soon as that PR is merged we should revert to using the upstream image-spec repo again.
tl;dr: After updating gojsonschema to include xeipuuv/gojsonschema#171 , tests fail with
This fixes that, but the owner of the JSON schema URI namespace needs to revise the existing status.
Before that
gojsonschema
change,"$ref"
links were interpreted by taking the current schema source file's URI as a base, and treating"$ref"
as relative to this.For example, starting with the [
file://
]/image-manifest-schema.json
URI, as used byValidator.Validate
(based on thespecs
map), thereference used to evaluate to
file:///content-descriptor.json
.gojsonschema.jsonReferenceLoader
would then load thesefile:///*.json
URIs via_escFS
.After the
gojsonschema
change,"$ref"
links are evaluated relative to a URI base specified by the"id"
attribute inside the schema source, regardless of the “external” URI passed to thegojsonschema.JSONLoader
.This is consistent with http://json-schema.org/latest/json-schema-core.html#rfc.section.8 and http://json-schema.org/latest/json-schema-core.html#rfc.section.9.2 (apart from the
"id"
vs."$id"
attribute name).In the same example, the [
file://
]/image-manifest-schema.json
URI containsso the same
now evaluates to
https://opencontainers.org/schema/image/content-descriptor.json
, which is not found bygojsonschema.jsonReferenceLoader
(it uses_escFS
only forfile:///
URIs), resulting in the 404 quoted above.This is a minimal fix, making the schema files available to
gojsonschema
at thehttps://
URIs, while continuing to read them from_escFS
.Because
gojsonschema.jsonReferenceLoader
can only use the providedfs
forfile:///
URIs, we are forced to implement our owngojsonschema.JSONLoaderFactory
andgojsonschema.JSONLoader
; something like this might be more generally useful and should therefore instead be provided by thegojsonschema
library.This particular
JSONLoader{Factory,}
implementation, though, isimage-spec
specific because it locally works around various inconsistencies in theimage-spec
JSON schemas, and thus is not suitable forgojsonschema
as is.Namely, the
specs/*.json
schema files use URIs with two URI path prefixes,https://opencontainers.org/schema/{,image/}
, in the top-level"id"
attributes, and the nested"id"
attributes along with"$ref"
references use several more URI path prefixes, e.g.in
image-manifest-schema.json
specifies thehttps://opencontainers.org/schema/image/manifest/defs-descriptor.json
URI.In fact,
defs-descriptor.json
references use all of the following URIs:So, this PR introduces a loader which preserves the original
_escFS
layout by recognizing and stripping all of these prefixes, and using the same/*.json
paths for_escFS
lookups as before; this is clearly unsuitable forgojsonschema
inclusion.Finally, the reason this PR only uses such a fairly hacky loader is that merely changing the
_escFS
structure is still not sufficient to get consistent schema: theschema/*.json
paths in this repository, and the"$ref"
values, do not match the"id"
values inside the schemas at all. E.g.image-manifest-schema.json
refers tohttps://opencontainers.org/schema/image/manifest/content-descriptor.json
, whilecontent-descriptor.json
identifies itself ashttps://opencontainers.org/schema/descriptor
, matching neither the path prefix nor the file name.Overall, it is completely unclear to me which of the URIs is the canonical URI of the “content descriptor” schema, and other schemas in this repo, and the owner of the URI namespace needs to decide on the canonical schema URIs. Only afterwards can the code be cleanly modified to match the specification.
Until then, this PR at least keeps the tests passing, and the validator usable
by external callers who want to use the public
image-spec/schema.ValidateMediaType*.Validate()
API.