Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make sure we serialize dictionary into valid JSON #3408

Conversation

martinRenou
Copy link
Member

Similar PR to jupyter-widgets/ipydatagrid#303

This makes sure the returned JSON is valid (dictionary keys are strings)

@github-actions
Copy link

github-actions bot commented Mar 4, 2022

Binder 👈 Launch a binder notebook on branch martinRenou/ipywidgets/valid_json_serialization

@jasongrout
Copy link
Member

Does the JSON serialization already do this? It seems that JSON-encoding a dict with non-string keys already converts the key to a string:

In [1]: import json

In [2]: json.dumps({1: True})
Out[2]: '{"1": true}'

@jasongrout
Copy link
Member

From the python json documentation: https://docs.python.org/3/library/json.html#json.dumps

Note: Keys in key/value pairs of JSON are always of the type str. When a dictionary is converted into JSON, all the keys of the dictionary are coerced to strings. As a result of this, if a dictionary is converted into JSON and then back into a dictionary, the dictionary may not equal the original one. That is, loads(dumps(x)) != x if x has non-string keys.

Is this not happening automatically for some dictionary?

@jasongrout
Copy link
Member

However, this note in the docs seems to contradict the "keys are coerced to strings": https://docs.python.org/3/library/json.html#json.dump

If skipkeys is true (default: False), then dict keys that are not of a basic type (str, int, float, bool, None) will be skipped instead of raising a TypeError.

@jasongrout
Copy link
Member

Yeah, it seems like non-primitive types in keys raise a TypeError:

In [6]: json.dumps({frozenset([1]): 4})
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [6], in <module>
----> 1 json.dumps({frozenset([1]): 4})

...

TypeError: keys must be str, int, float, bool or None, not frozenset

I'm a bit concerned that we would be going beyond python json defaults here by automatically coercing non-primitive keys.

@jasongrout
Copy link
Member

Is the ipydatagrid issue the fact that keys are tuples?

I'm sort of in favor of ipydatagrid having a custom serializer here, and keeping our base serialization in line with python defaults. Wouldn't ipydatagrid need a custom serializer on the js side anyway to interpret keys that are stringified tuples?

@ibdafna
Copy link
Member

ibdafna commented Mar 4, 2022

ipydatagrid does have a custom serializer, but seeing as this was handled implicitly previously and now it's not, it will likely affect other custom widgets which do not use custom serializers.

@jasongrout
Copy link
Member

but seeing as this was handled implicitly previously and now it's not

Oh, that is a different story. When did this work before? Is this an ipywidgets 7->8 change?

@jasongrout
Copy link
Member

It looks like this line is the same in the 7.x branch:

return {k: _widget_to_json(v, obj) for k, v in x.items()}

@martinRenou
Copy link
Member Author

but seeing as this was handled implicitly previously and now it's not, it will likely affect other custom widgets which do not use custom serializers.

Some background missing from my PR description:

This was actually handled in ipykernel, which used to "clean" the JSON prior to pass the socket message. This logic has been removed from ipykernel for jupyter_client 7 in order to improve performances. So now jupyter_client expects valid JSON, but it falls back to the old JSON cleaning logic if the message is not valid, showing a warning if the message is not valid.

Though it was not handling invalid dictionary keys, I pushed a fix for that in jupyter_client: jupyter/jupyter_client#752

I'm sort of in favor of ipydatagrid having a custom serializer here, and keeping our base serialization in line with python defaults. Wouldn't ipydatagrid need a custom serializer on the js side anyway to interpret keys that are stringified tuples?

I think this makes sense. I am fine not merging this PR and leave it to the custom widgets developers to make sure the JSON is valid.

@jasongrout
Copy link
Member

Thanks, that background context is very helpful. So are these two statements correct?

  • Right now with the current packages, an ipywidget that has non-primitive keys like a tuple will work, but will print out a warning?
  • This PR was to get rid of the warning (and make sure we didn't fail in the future when the fallback in jupyter_client is removed)?

If those statements are correct, then I'd be in favor of not merging this PR and relying on the warning for ipywidgets users. Alternatively, I'd also be +1 on merging a fix like this for ipywidgets 7 (so it continues working even after the fallback is removed) and noting in the ipywidgets 8 documentation that a widget author should take care of this with a custom serializer for ipywidgets 8.

@martinRenou
Copy link
Member Author

Right now with the current packages, an ipywidget that has non-primitive keys like a tuple will work, but will print out a warning?

As of today, it will unfortunately raise a TypeError but this should be fixed in the next jupyter_client release. Once it's released, it should raise a warning yes, but this is all temporary: in the future that code should fail with a TypeError.

This PR was to get rid of the warning (and make sure we didn't fail in the future when the fallback in jupyter_client is removed)?

That's correct

@jasongrout
Copy link
Member

jasongrout commented Mar 7, 2022

@martinRenou: which option do you think is best?

  1. Merging this PR in 8.0 and 7.x
  2. Not merging this PR and relying on the warning for ipywidgets developers to change their code to use a custom serializer
  3. Merging a fix like this for ipywidgets 7 (so it continues working even after the fallback is removed) and noting in the ipywidgets 8 documentation that a widget author should take care of this with a custom serializer for ipywidgets 8.

(or is there another option?)

@jasongrout jasongrout added this to the 8.0 milestone Mar 8, 2022
@martinRenou
Copy link
Member Author

Maybe 2? We can probably consider that widget_serialization only serializes dictionaries that are "valid JSON". If custom widgets devs need more intricate logic for serializing (like ipydatagrid) then they should get a warning and would need to provide custom serializers if they want.

I'll let you close this PR if you agree.

@jasongrout
Copy link
Member

I'll let you close this PR if you agree.

Sounds good, thanks.

@jasongrout jasongrout closed this Mar 8, 2022
@jasongrout
Copy link
Member

And thanks for fixing the error down to a warning in jupyter_client!

@martinRenou martinRenou deleted the valid_json_serialization branch March 8, 2022 10:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants