Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexpected behaviour when a notebook is not valid JSON #374

Closed
pbugnion opened this issue Jan 9, 2020 · 4 comments
Closed

Unexpected behaviour when a notebook is not valid JSON #374

pbugnion opened this issue Jan 9, 2020 · 4 comments

Comments

@pbugnion
Copy link

pbugnion commented Jan 9, 2020

Thanks for nbsphinx. It's great!

When a notebook is not valid JSON, the JSON source gets directly rendered to HTML.

As far as I can tell, this behaviour arises here: if the default converter fails for any reason, nbsphinx falls back silently to the RST Parser. The RST Parser seems to just convert the content, broken JSON and all, to HTML.

Unless I'm missing something, I'd expect nbsphinx to fail if it fails to parse the notebook?

I've created a minimal repository demonstrating this here. With this, the rendered HTML looks like:

Screenshot 2020-01-09 at 16 50 08

This tripped us up in ipywidgets when a notebook became invalid JSON as a result of a merge conflict. Reported in this issue.

@pbugnion
Copy link
Author

pbugnion commented Jan 9, 2020

I'm very happy to implement a change to nbsphinx myself, if need be, once the maintainers have decided on the right course.

@mgeier
Copy link
Member

mgeier commented Jan 10, 2020

Thanks for the report!

This is actually known behavior but I'm aware that it's a bit unfortunate.

The reason why this is not an error is that the NotebookParser.parse() method has two different tasks to do: (1) parse a whole input document, (2) parse a single translated sentence/paragraph.

See the docstring:

nbsphinx/src/nbsphinx.py

Lines 822 to 832 in aad12aa

*inputstring* is either the JSON representation of a notebook,
or a paragraph of text coming from the Sphinx translation
machinery.
Note: For now, the translation strings use reST formatting,
because the NotebookParser uses reST as intermediate
representation.
However, there are plans to remove this intermediate step
(https://github.com/spatialaudio/nbsphinx/issues/36), and after
that, the translated strings will most likely be parsed as
CommonMark.

... and this issue + PR: #154 + #156

I'm very happy to implement a change to nbsphinx myself

If you find a reliable way to detect that a given string is not intended as a reST (or in the future probably a Markdown) string, we could probably raise a combined NotebookError to get a proper error instead of a nonsensical output HTML.

There has also been a bit of a discussion on the Sphinx-dev mailing list: https://groups.google.com/d/topic/sphinx-dev/PvAQfZcDeHw/discussion

I think ideally Sphinx would create a separate API for translating paragraphs, then this problem would immediately go away. But I don't know whether that will happen anytime soon (or at all).

@mgeier
Copy link
Member

mgeier commented Jan 10, 2020

As a work-around, you could turn Sphinx warnings into errors with -W: https://www.sphinx-doc.org/en/master/man/sphinx-build.html#id6

It is quite likely that a garbled JSON file will cause quite a few warnings. It most definitely does in this case.

I'm normally using -W in CI, which helps finding problems early.

@pbugnion
Copy link
Author

Ah thanks for the explanations. That makes sense. I'd read that part of the docstring, but I hadn't quite figured how it applied to my issue.

Thanks for the suggestion of using -W. You're definitely right that there were lots of warnings in that case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants