Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft 7 schema with an $id containing an empty fragment produces resolution errors for subresources #1064

Closed
philvarner opened this issue Mar 21, 2023 · 7 comments

Comments

@philvarner
Copy link

philvarner commented Mar 21, 2023

This is an issue with 4.18.0a1 and 4.18.0a2, but not an issue with 3.x and 4.17 (and presumably all pre 4.18 versions).

The following json file validates against the schema https://schemas.stacspec.org/v1.0.0/item-spec/json-schema/item.json prior to 4.18.0a1:

S1A_EW_GRDM_1SDH_20220101T230242_20220101T230351_041273_04E7D5_F22B.json.txt

I'm using jsonschema thorough the stac-validator package, so this is what the outputs are from.

Running:

$ stac-validator out/S1A_EW_GRDM_1SDH_20220101T230242_20220101T230351_041273_04E7D5_F22B.json.txt

With jsonschema=4.17:

Thanks for using STAC version 1.0.0!

[
    {
        "version": "1.0.0",
        "path": "out/S1A_EW_GRDM_1SDH_20220101T230242_20220101T230351_041273_04E7D5_F22B.json",
        "schema": [
            "https://stac-extensions.github.io/sar/v1.0.0/schema.json",
            "https://stac-extensions.github.io/sat/v1.0.0/schema.json",
            "https://stac-extensions.github.io/eo/v1.0.0/schema.json",
            "https://schemas.stacspec.org/v1.0.0/item-spec/json-schema/item.json"
        ],
        "valid_stac": true,
        "asset_type": "ITEM",
        "validation_method": "default"
    }
]

with 4.18.0a2:

Thanks for using STAC version 1.0.0!

[
    {
        "version": "1.0.0",
        "path": "out/S1A_EW_GRDM_1SDH_20220101T230242_20220101T230351_041273_04E7D5_F22B.json.txt",
        "schema": [
            "https://schemas.stacspec.org/v1.0.0/item-spec/json-schema/item.json"
        ],
        "valid_stac": false,
        "error_type": "Exception",
        "error_message": "#/definitions/core"
    }
]

Additionally, it looks like these alpha releases where published to PyPI without the correct pre-release flag, so they're resolving for the constraint >=3.2.0 that stac-validator has. It would be preferred for this not to be the case.

@philvarner philvarner changed the title 4.18.x causes validation errors, and alphas are not marked as pre-release versions 4.18.x causes validation errors Mar 21, 2023
@philvarner
Copy link
Author

I originally had something about pre-releases resolving, but this has either been fixed or I had the alpha versions in my local cache.

@Julian
Copy link
Member

Julian commented Mar 21, 2023

Hi there, please minimize your example to some specific example that you believe is incorrect, as certainly on its own this isn't necessarily indicative of a bug (and yeah as you modified, there's no such thing as marking something as a pre-release, so presumably what happened has to do with your setup)

@Julian Julian added Waiting for Author A PR waiting for author changes or issue waiting on reporter feedback Needs Simplification An issue which is in need of simplifying the example or issue being demonstrated for diagnosis. labels Mar 21, 2023
@philvarner
Copy link
Author

Ok, here's the output just using jsonschema. If there is an additional parameter I need to set, I would not expect changes like this across minor versions.

pip install 'jsonschema==4.17.3'
jsonschema item.json -i out/S1A_EW_GRDM_1SDH_20220101T230242_20220101T230351_041273_04E7D5_F22B.json.txt
# no output, schema passes
jsonschema item.json -i out/S1A_EW_GRDM_1SDH_20220101T230242_20220101T230351_041273_04E7D5_F22B.json.txt
/Users/philvarner/.pyenv/versions/3.9.16/bin/jsonschema:5: DeprecationWarning: The jsonschema CLI is deprecated and will be removed in a future version. Please use check-jsonschema instead, which can be installed from https://pypi.org/project/check-jsonschema/
  from jsonschema.cli import main
Traceback (most recent call last):
  File "/Users/philvarner/.pyenv/versions/3.9.16/bin/jsonschema", line 8, in <module>
    sys.exit(main())
  File "/Users/philvarner/.pyenv/versions/3.9.16/lib/python3.9/site-packages/jsonschema/cli.py", line 239, in main
    sys.exit(run(arguments=parse_args(args=args)))
  File "/Users/philvarner/.pyenv/versions/3.9.16/lib/python3.9/site-packages/jsonschema/cli.py", line 293, in run
    exit_code |= _validate_instance(
  File "/Users/philvarner/.pyenv/versions/3.9.16/lib/python3.9/site-packages/jsonschema/cli.py", line 229, in _validate_instance
    for error in validator.iter_errors(instance):
  File "/Users/philvarner/.pyenv/versions/3.9.16/lib/python3.9/site-packages/jsonschema/validators.py", line 330, in iter_errors
    for error in errors:
  File "/Users/philvarner/.pyenv/versions/3.9.16/lib/python3.9/site-packages/jsonschema/_validators.py", line 335, in allOf
    yield from validator.descend(instance, subschema, schema_path=index)
  File "/Users/philvarner/.pyenv/versions/3.9.16/lib/python3.9/site-packages/jsonschema/validators.py", line 375, in descend
    for error in errors:
  File "/Users/philvarner/.pyenv/versions/3.9.16/lib/python3.9/site-packages/jsonschema/_validators.py", line 284, in ref
    yield from validator._validate_reference(ref=ref, instance=instance)
  File "/Users/philvarner/.pyenv/versions/3.9.16/lib/python3.9/site-packages/jsonschema/validators.py", line 404, in _validate_reference
    resolved = self._resolver.lookup(ref)
  File "/Users/philvarner/.pyenv/versions/3.9.16/lib/python3.9/site-packages/referencing/_core.py", line 548, in lookup
    raise exceptions.Unresolvable(ref=ref) from None
referencing.exceptions.Unresolvable: #/definitions/core

@Julian
Copy link
Member

Julian commented Mar 21, 2023

Thanks, but it's again quite a bit more helpful for you to minimize your example rather than pointing to a big instance and big schema.

If there is an additional parameter I need to set, I would not expect changes like this across minor versions.

Hard to say without diagnosing, which is why the minimized example would help. https://stackoverflow.com/help/minimal-reproducible-example has some notes (and specifically giving the minimal possible schema+instance which produces your issue is the most helpful bit in this case).

Regardless though, that isn't a good assumption to make. If there's a bug in this library that means that it doesn't have correct behavior according to the specification, that is exactly the sort of thing that may get fixed in minor versions (or even patch releases), so there's some chance it's your schema which is invalid, worked before, and now doesn't because that behavior was fixed. I won't guess what's more likely though, we may as well just look at the minimal example and see.

@Julian
Copy link
Member

Julian commented Mar 21, 2023

For reference, here's what an essentially minimized example looks like:

from jsonschema import validate
instance = {}
schema = {
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "https://schemas.stacspec.org/v1.0.0/item-spec/json-schema/item.json#",
  "allOf": [{"$ref": "#/definitions/core"}],
  "definitions": {"core": {}}
}
validate(instance=instance, schema=schema)

where essentially it looks like your issue has to do with your $id containing an empty fragment (so removing the # fixes the issue).

I'll have to re-read the draft 7 specification, if I recall correctly, having an empty fragment is technically allowed, though discouraged (and there's notes in successive drafts saying that future drafts will probably make this an error).

You probably should consider removing it, though after reviewing the specs if it turns out this is indeed strictly speaking allowed even if discouraged, I'm happy to make a fix.

@Julian Julian changed the title 4.18.x causes validation errors Draft 7 schema with an $id containing an empty fragment produces resolution errors for subresources Mar 21, 2023
@philvarner
Copy link
Author

It is allowed in Draft7, and (as you can see in your example), the draft itself uses it in $id (which apparently why the the item schema used it).

Now that we're aware of the issue, we'll likely remove the empty fragment in a future release, but we're stuck with it for now.

Julian added a commit to python-jsonschema/referencing that referenced this issue Mar 23, 2023
This is discouraged by the spec, albeit still allowed.

Future versions may fully make it an error, which is why
it seems to make sense to make this JSON Schema-specific
(rather than e.g. doing it as part of Resource.id) as some
other spec may allow it and have different behavior to my
current understanding.

Refs: python-jsonschema/jsonschema#1064
Julian added a commit to python-jsonschema/referencing that referenced this issue Mar 23, 2023
This isn't necessarily 100% 'nice feeling', but early versions
of JSON Schema used URIs with empty fragments as their $ids, and
even though later versions don't, and future versions may even
make this an error, a downstream library (including jsonschema)
is likely to copy these $ids exactly as-is and expect the resource
to be retrievable, despite the last commit making some of this
behavior JSON Schema-specific. We may revisit this all later if/when
it turns out to matter.

Refs: python-jsonschema/jsonschema#1064
@Julian
Copy link
Member

Julian commented Mar 23, 2023

Fixed in referencing 0.25.3.

@Julian Julian closed this as completed Mar 23, 2023
@Julian Julian removed Needs Simplification An issue which is in need of simplifying the example or issue being demonstrated for diagnosis. Waiting for Author A PR waiting for author changes or issue waiting on reporter feedback labels Mar 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants