Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preserve extra sibling attributes #35

Closed

Conversation

pkarman
Copy link
Contributor

@pkarman pkarman commented Dec 16, 2021

Per #23 this change will preserve sibling attributes to the $ref attribute on JSON object instances.

I recognize that the JSON RFC language about

Any members other than "$ref" in a JSON Reference object SHALL be ignored.

is ambiguous. What does it mean to "ignore" members? Does it mean "pretend they are not there" or does it mean "do not act on or alter them"? I dunno.

What I do know is that other JSON libraries, both in Python and other languages, seem to take the "do not act on or alter them" interpretation of "ignore" so that's what this change does.

You'll note that I use deepcopy and so I may be breaking some of the lazy-ness of the proxy pattern. It also means that the test pattern of asserting with is against __subject__ no longer works because it's a copy of the values, not a pointer at the same object.

Happy to discuss a different approach that achieves the same ends of preserving the sibling attributes.

@gazpachoking
Copy link
Owner

Hmm. Not sure I'm a fan of merging the two dictionaries together, I think JSON reference is meant as just a way to refer to other areas of a document. Merging dictionaries seems beyond scope.
I'm also worried about how other cases will work when modifying the dictionary, like when multiple references point to the same document, or recursive references. All those cases handled already with this implementation?

@pkarman pkarman force-pushed the pek-ignore-sibling-attributes branch from 590c416 to 6482e3a Compare October 20, 2022 21:06
@pkarman
Copy link
Contributor Author

pkarman commented Oct 20, 2022

multiple references point to the same document

is handled in the test case in this PR.

Recursion is not handled, as evidenced by the failing test in 6d1fcac.

Honestly, though, I won't pursue this PR if philosophically you're opposed to the feature. In this comment you argue that

The JSON reference documents say that a JSON reference should be an object with only the $ref key.

I don't believe it says that. I believe it says (as you quote just below):

A JSON Reference is a JSON object, which contains a member named "$ref", which has a JSON string value.

It does not mention "only the $ref key" -- in fact, it says to ignore other object members, which implies that the object may legitimately contain them, and that statement is ambiguous enough that different libraries implement it differently. All I was trying to do here was make this excellent library act consistently with how other JSON schema libraries approach this problem.

Your library, your rules. I'm happy to close and not code to solve the failing test, if you prefer.

@gazpachoking
Copy link
Owner

gazpachoking commented Oct 20, 2022

I'm a little bit torn. I see how this could be useful, but I think it's clear that it's outside the spec. You are absolutely right that there isn't anything in the spec that says a reference object can't have extra keys, but it does say:

Resolution of a JSON Reference object SHOULD yield the referenced
JSON value. Implementations MAY choose to replace the reference with
the referenced value.

Which seems pretty clear, the whole reference object resolves into the thing it references, not merged with any extra keys the reference object may have had.

As for your new test, that one seems really extreme to me, and I definitely don't think a reference object inside a reference object should needs to be supported. What I meant when I sad recursive was something more along these lines:

{
  "a": "something",
  "child": {"$ref": "/#", "more": "somethingelse"}
}

@gazpachoking
Copy link
Owner

@pkarman Can I ask what sort of situation you are using this for?

@gazpachoking
Copy link
Owner

I don't want to lead you on, because I still have no idea if I'll come around on adding it at all, but I think if we do it should:

  • be optional, and non-default
  • be tested against the new 1.0 branch. I suspect recursive references will break there when using proxies=False

@smajda
Copy link

smajda commented Oct 21, 2022

@pkarman Can I ask what sort of situation you are using this for?

You weren't asking me here, but I've been following along because I've found this change quite useful, so I'll chime in here with an example.

We're using react-jsonschema-form to build forms and we frequently use the pattern described here when doing this. In short, defining a "field type" definition that want to include via a $ref, but then extending it to add things like a title attribute, as seen in the linked example. In our case, we obviously also want to dereference the $refs. Without this change, we lose the title attributes. We'd been resorting to doing this:

    "billing_address": {
        "allOf": [
            {
                "title": "Billing address"
            },
            "$ref": "#/definitions/address"
        ]
    },

But that's cumbersome and it's much nicer to just do what the react-jsonschema-form docs above show:

"billing_address": {
    "title": "Billing address",
    "$ref": "#/definitions/address"
},

@pkarman
Copy link
Contributor Author

pkarman commented Oct 21, 2022

We had a very similar situation to @smajda -- trying to use a schema across multiple languages (python and javascript) and wanted consistency in the toolchain. Our schema has a description on some objects to we wanted to preserve.

@gazpachoking
Copy link
Owner

I think I'm okay with the idea. My hesitation is because it might deepen the hole we are in, where we have people relying on this non-standard behavior for their schemas. (And I actually think the allOf is a pretty nice solution, and within the spec, so should work everywhere.) I don't think the implementation has to be too complicated though, so I'm okay with it going in as a non-default option. While playing around with this I found a few more things I want to clean up/refactor on the 1.0 beta, then I'll see how neatly this can fit in. I'm considering doing the dictionary merge with something like ChainMap rather than creating new dictionaries, though I haven't figured out if that really helps or simplifies anything yet.

"b": {"extra": "foobar", "$ref": "#/a"},
}
result = JsonRef.replace_refs(json)
with pytest.raises(JsonRefError) as excinfo:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a bad behavior, even if the extra attributes are opt-in. This clearly violates the spec stipulation that extra attributes should be ignored. I think when turning on a merge-attributes mode, it would only do anything if the ref points to a dictionary, and just ignore them otherwise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair enough

@gazpachoking
Copy link
Owner

gazpachoking commented Oct 22, 2022

I have a prototype which seems to be working well. Doesn't look like there is any need for ChainMap, seems to be working okay with/without the proxies, and still works recursively. I think I'm just going to eliminate the possibility of having another $ref within extra arguments of a reference object. Couple of questions:

  • If the extra properties overlap with ones from the ref, which ones should stay? I'd be inclined to say the extra properties override the $referenced properties.
  • @pkarman Was there a reason you were doing deepcopy here?
  • Currently, you can load a document with refs, edit it, then write it back out again with the original $refs intact (assuming it's a single file with only self referential refs). This would sorta break that, do we think that's a problem? I don't think that's a commonly relied on property of this library. This is the one bit we could resolve with a ChainMap style merging, but I don't think it's worth the complication.
  • What to call the option to enable this mode?

@gazpachoking
Copy link
Owner

Okay, I've added the merge_extra_properties argument to the #43 branch. That name feels a bit verbose though, any suggestions? @pkarman @smajda Does that implementation seem good to you? I also pushed a new beta to pypi with these changes, and updated the docs on readthedocs.

@pkarman
Copy link
Contributor Author

pkarman commented Oct 24, 2022

I have a prototype which seems to be working well. Doesn't look like there is any need for ChainMap, seems to be working okay with/without the proxies, and still works recursively. I think I'm just going to eliminate the possibility of having another $ref within extra arguments of a reference object. Couple of questions:

  • If the extra properties overlap with ones from the ref, which ones should stay? I'd be inclined to say the extra properties override the $referenced properties.

agreed.

  • @pkarman Was there a reason you were doing deepcopy here?

Can't remember, honestly. I think it was likely defensive, just in case there were multiple layers deep and I wasn't entirely sure how Python would handle it.

  • Currently, you can load a document with refs, edit it, then write it back out again with the original $refs intact (assuming it's a single file with only self referential refs). This would sorta break that, do we think that's a problem? I don't think that's a commonly relied on property of this library. This is the one bit we could resolve with a ChainMap style merging, but I don't think it's worth the complication.

That's not a use case I have exercised, or that I would be likely to use. I mainly rely on this library to "flatten" a JSON document that contains $ref pointers, since the otherwise excellent https://github.com/python-jsonschema/jsonschema did not consistently work for us.

  • What to call the option to enable this mode?

merge_props ?

@gazpachoking
Copy link
Owner

That's not a use case I have exercised, or that I would be likely to use. I mainly rely on this library to "flatten" a JSON document that contains $ref pointers, since the otherwise excellent https://github.com/python-jsonschema/jsonschema did not consistently work for us.

I figured that wouldn't be used much. Going to not worry about that when merge_props is used to keep things simpler.

  • What to call the option to enable this mode?

merge_props ?

Agreed, probably better. Changed.

@pkarman
Copy link
Contributor Author

pkarman commented Oct 24, 2022

superceded in #43

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants