Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add notebook directive #33

Closed
wants to merge 7 commits into from

Conversation

tbekolay
Copy link
Contributor

@tbekolay tbekolay commented Apr 1, 2016

This is the fourth and final PR. In our project, the notebooks that we include in the documentation are examples of using our library. We have the examples in the top-level of the repository so that people visiting the repo see them front-and-center and can look through them. Unfortunately, since Sphinx doesn't allow toctree entries that are not in the root folder of the documentation, we can't include these examples simply by adding them to toctree entries.

Similar to how files like README.rst can be added with the include directive, this PR adds a notebook directive that you can point at a notebook outside of the doc directory. It also allows you to have one .rst file that includes multiple notebooks, which I can imagine being useful.

The actual implementation of the directive is basically how docutils's include directive works, but then uses the NotebookParser to insert blocks into the document. 4fd8fe8 shows how this works by eliminating the duplication between doc/index.rst and doc/index.ipynb. It seems like bi-directional communication works perfectly; now reST files can have notebooks in them, while notebooks can have raw reST cells in them! It's very nice, and definitely not something we were able to do with our previous implementation.

One issue with the last commit is that when I build the Sphinx docs, I don't get links in the left sidebar, so I may be missing something critical in this PR. Similarly, the subdir/toctree.ipynb notebook doesn't show up, even though it is included in index.ipynb. Any idea why that might be?

@tbekolay tbekolay force-pushed the notebook-directive branch from 4fd8fe8 to 8c2b8bd Compare April 2, 2016 11:17
@mgeier
Copy link
Member

mgeier commented Apr 2, 2016

Thanks for this PR! I hope it isn't actually your final PR. I'm looking forward to receiving more ...

I'll have a closer look later, but first I want to comment on your suggested changes regarding index.rst and index.ipynb:

The reason why I initially added index.ipynb was to show a somewhat accidental and quite obscure feature.
It is possible to have source files with the same name but different extensions. Sphinx just uses the one whose extension comes first in source_suffix (which is typically .rst).
Since nbsphinx rewrites Markdown links to *.ipynb files to :doc:basename`` links, this can be used for a little trick that is shown in doc/orphan.ipynb with the "Back to the main page" link. This link leads to index.ipynb when viewed in the Jupyter Notebook and on nbviewer. However, after it's processed by Sphinx, it magically links to the Sphinx index page generated from `index.rst`.

However, all this became obsolete when I added the "nbsphinx-toctree" metadata which allows to use the same index.ipynb for everything. Also, in the meantime it became possible to use Markdown links to *.rst files which get turned into links to the generated Sphinx pages automatically.

Now I had the choice to use either index.ipynb or index.rst for the nbsphinx docs. I decided to use the more traditional index.rst in order to not frighten away users who are used to Sphinx.
Also, the toctree metadata is somewhat obscure and probably harder to use, so I think index.rst is easier to understand for newcomers, even if they don't know Sphinx yet. And I guess all Sphinx tutorials mention index.rst.

Anyway, I think that both are equally valid ways to use nbsphinx, but I think it's very confusing to use both at the same time.
I think the best solution here is not to link the two with your new notebook directive but instead remove index.ipynb completely. I added #34 as a reminder.

Does that sound reasonable?

@mgeier mgeier mentioned this pull request Apr 2, 2016
@tbekolay tbekolay force-pushed the notebook-directive branch 2 times, most recently from 2a5cde7 to aa681be Compare April 4, 2016 16:43
@tbekolay
Copy link
Contributor Author

tbekolay commented Apr 4, 2016

Yes, definitely reasonable! I agree that sticking with index.rst and removing index.ipynb is the best way to go.

Do you think it's necessary to show an example of the notebook directive, or is adding a notebook as documentation sufficient? I think the most compelling example would be including a notebook that is not in the doc folder (as that's why we're using it in our project), but I don't want to litter the main directory with a notebook if it doesn't need to be there. What do you think?

@tbekolay tbekolay force-pushed the notebook-directive branch 4 times, most recently from 58e1154 to 212ea73 Compare April 5, 2016 12:55
@mgeier
Copy link
Member

mgeier commented Apr 5, 2016

I think an example in the docs would be good, both as documentation and as test if it even works.

I would add this to the end of doc/rst.rst and include a short notebook there.
I'm not sure how links from the notebook to other places should be handled and if the links have to be rewritten if it's included from a different directory.
Or is this already handled?

Either way, I think it would be useful to put the example notebook in a subdirectory, e.g. doc/subdir2/.

@@ -848,6 +889,7 @@ def setup(app):

app.add_directive('nbinput', NbInput)
app.add_directive('nboutput', NbOutput)
app.add_directive('notebook', Notebook)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if it's good to call this notebook, since that's already used by notebook_sphinxext.py.

What about calling it nbinclude to show the similarity with the built-in include directive?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We previously used a modified version of notebook_sphinxext.py, hence why I went with notebook :) But nbinclude makes a lot of sense and is a natural extension of the include directive, and also the NbInput and NbOutput directives defined in nbsphinx, so I'll make that change!

@tbekolay tbekolay force-pushed the notebook-directive branch 2 times, most recently from 60a833d to d1da027 Compare April 5, 2016 16:05
@tbekolay
Copy link
Contributor Author

tbekolay commented Apr 5, 2016

I added an example in doc/rst.rst in d1da027. And it was a good thing! When I did this initially, the included document was inserted at the bottom of the document, even though the nbinclude was in the middle. So, I reworked the first commit (731e177) to use insert_lines instead of parsing directly into the document. This necessitated a bit of a refactor of NotebookParser, is that change OK?

@mgeier
Copy link
Member

mgeier commented Apr 5, 2016

Thanks! I'll have a closer look at this in the next few days ...

@mgeier
Copy link
Member

mgeier commented Apr 7, 2016

OK, I had a look ...

I don't think it's necessary to make the detour over the reST string. Also, if #36 comes through, there won't be a reST string anymore.

Do you still have the hash of your original commit?
I'd be interested to see what was the problem there.

@mgeier
Copy link
Member

mgeier commented Apr 7, 2016

I found it: 212ea73

@tbekolay
Copy link
Contributor Author

tbekolay commented Apr 7, 2016

Yep, that's it :) The difference is that I was parsing directly into the document (nbparser.parse(rawtext, self.state.document) as opposed to adding input lines to self.state_machine. I don't know enough about Spinx / docutils internals to know why parsing into the document placed the tree at the end of the document, but adding input lines to self.state_machine is what the include directive does, so I think it's likely the right thing to do.

@mgeier
Copy link
Member

mgeier commented Apr 7, 2016

You shouldn't parse into self.state.document, you should rather return the list of directives which should be inserted by NbInclude (I wonder why include doesn't do that). To do that, you'll have to parse into a temporary document and then return the content of that temporary document.

I'm not sure if that's the best way, but it should work somewhat like this:

node = docutils.utils.new_document('somename', self.state.document.settings)
nbparser.parse(rawtext, node)
return node.children

@tbekolay
Copy link
Contributor Author

tbekolay commented Apr 7, 2016

Yes, I also tried that before settling on this current solution.

The issue with making a new document, parsing into it, and returning the parsed doctree is that other parts of the document (specifically the Sphinx sidebar) gets included in node, and therefore ends up nested in the body of the doctree where it should be outside of it. For some reason, doing this also prevents the sidebar from showing up where it should show up (so, you end up with one misplaced sidebar rather than two, which I would have expected).

That said, when I tried it returned [node]; maybe returning [node.children] would fix the issue?

@mgeier
Copy link
Member

mgeier commented Apr 7, 2016

I didn't test this extensively, but the sidebar looked OK ...

node.children is already a sequence, so you shouldn't put it into another list.

If node.children turns out to be wrong, there has to be something similar that does the right thing.

@tbekolay
Copy link
Contributor Author

tbekolay commented Apr 7, 2016

I'll try it out!

@tbekolay tbekolay force-pushed the notebook-directive branch from d1da027 to 5268883 Compare April 7, 2016 13:00
@tbekolay
Copy link
Contributor Author

tbekolay commented Apr 7, 2016

node.children does indeed do the right thing! There's no sidebar issue with it. The updated PR to no longer change NotebookParser.

One issue that I have now (that isn't related to the change to node.children; this happens with the previous version of the PR) is that I get a weird "orphan:" label rendered in rst.html. See here.

I added the orphan metadata to the included notebook so that a warning isn't generated in the Sphinx build process, but for some reason when parsing, a table is added to the doctree rendering the orphan label. However, this doesn't appear in the rendered versions of those notebooks alone (see included and orphan. Any idea what's going on there?

@mgeier
Copy link
Member

mgeier commented Apr 7, 2016

This should be the explanation: http://www.sphinx-doc.org/en/stable/markup/misc.html#file-wide-metadata

I don't know if that's the proper solution, but you can try this:

if isinstance(node.children[0], docutils.nodes.field_list):
    return node.children[1:]
return node.children

... or something along those lines ...

This theoretically also removes :tocdepth: and :nocomments:, but since there is no way to set those in a notebook, this shouldn't be a problem. Also, I'm not sure if those even make sense in an included notebook.

@@ -36,6 +36,26 @@ These links were created with:
* "``../``" is not allowed, you have to specify the full path even if the
current source file is in a subdirectory!

Sphinx Directive for Including Notebooks
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer if you could move this section after the section about nbinput and nboutput.

raise self.severe(u'Problem with "%s" directive:\n%s' %
(self.name, error))

# Use the NotebookParser to convert to reST and evaluate if needed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conversion to reST is an implementation detail and will likely go away with #36.

@mgeier
Copy link
Member

mgeier commented Apr 7, 2016

Can you please add links and images to the example notebook, similar to subdir/another.ipynb?

Somehow you'll have to manipulate the paths to get this working correctly.

tbekolay added 6 commits April 8, 2016 11:56
The added notebook is included in the `rst.rst` document, but not
added to any toctrees.
This is a bit of a hack, but it means that notebooks no longer need
to have the `orphan: true` metadata specified for any notebook that
is in the doc directory and is included through the `nbinclude`
directive, but is not in any toctrees.
@tbekolay tbekolay force-pushed the notebook-directive branch from 5268883 to 6318526 Compare April 9, 2016 13:15
@tbekolay
Copy link
Contributor Author

tbekolay commented Apr 9, 2016

Sorry for the delay! I think I've incorporated all the feedback now, and where it made sense, I made FIXUP commits so that you can see what I've changed compared to what was in this PR before. Before merging, I can fixup those commits in an interactive rebase to keep the history clean.

I was able to remove the toctree warning in commit b1255d1. Now, the included notebook doesn't have the orphan metadata, though of course it could have it if desired. Your fix to not render the field list worked in my testing.

The only thing that I think is not really resolved thus far is adding links / images to the included notebook. The issue is that relative paths depend on the directory in which the notebook is executed, and there are two sets of relative paths to worry about: those written by the notebook creator, and those generated by nbconvert. There are four options that I can see, and none of them is a clear winner.

  1. Execute the notebook in the directory that it exists in. I did this in 387df6f but it does not render correctly because relative paths are wrong, including the relative path generated by the IPython.display.Image, even though the actual image resources are saved to the correct place on disk.
  2. Execute the notebook in the directory that it exists in, but rewrite any relative paths in the notebook to point to their relative path from the new directory. It seems like this should be possible through augmenting ProcessLocalLinks? Or perhaps some different transformation of the doctree is necessary? The main downside here is that it might be difficult to find all cases in which relative paths are used and rewrite them appropriately.
  3. Execute the notebook in the directory that includes it. This is what happens by default (i.e., without 387df6f). This saves resources and references them properly; however, it means that opening up the notebook and executing it with Jupyter behaves differently than when it is executed when included in a reST file, essentially meaning that the notebook doesn't really work unless it's nbincluded. If you have relative paths in the notebook, you write them as though the notebook was in the directory that it is included in, not the directory that it is actually in. If the notebook doesn't use relative paths (which is probably true of a lot of notebooks; it's true of the notebook in our project that is using nbsphinx) then this works properly. But for documents that link to other documents (which is likely the case for notebooks in a documentation tree) this is likely not true.
  4. Raise an error if the notebook is in a different directory than the RST file that includes it, and uses relative paths. These can likely be detected in non-code cells; if the relative paths are in code cells, then running those cells is likely to raise a CellExecutionError, which we can catch and reraise.

Whichever option we go with will need to be documented, in any case!

@mgeier
Copy link
Member

mgeier commented Apr 10, 2016

Thanks for the changes!

I think option 2 is the right choice.

I'll have a closer look tomorrow ...

@mgeier
Copy link
Member

mgeier commented Apr 12, 2016

OK, I had a closer look ...
In general, it seems to work nicely, but the problems are in the details.

Most code assumes that the files are in the source directory, ProcessLocalLinks even explicitly raises a warning (and ignores the link) if a local link points outside of the source directory.
This sure could be solved somehow, but it would mean creating a whole parallel implementation of this stuff "around" Sphinx.

Links to notebooks outside of the source directory would also be quite hard to get working.

The implementation would definitely become much more complicated and therefore harder to maintain in the future.

Now let me take a step back and ask you this:

Do you really need this?

I see a few alternatives for your case:

  1. move examples/ to doc/examples/. You said you don't want that, and I understand your reasons, but it would be a really simple solution
  2. create a symbolic link doc/examples that points to examples/. I'm not sure if that works on Windows, but at least theoretically, symbolic links should be supported since Windows Vista.
  3. if you don't want to move the examples into the source directory, move the source directory around the examples. Moving conf.py to the main directory should do it. All other files can stay at their old places. You'd just have to specify the correct master_doc.

I think I'd prefer alternative 2 in your case. All three of them would be hugely simpler than implementing (and probably even using) the nbinclude directive.

What do you think?

@jgosmann
Copy link

Unfortunately, option 2 is not a complete solution as this causes problems with images in output cells as detailed in #49. Also, this is problematic on Windows. Symbolic links have been supported for a while, but up to Windows 10 required elevated privileges to be created.

Option 1 is in my case currently not possible (the notebooks need to be part of the installed Python package).

@mgeier
Copy link
Member

mgeier commented Aug 13, 2017

@jgosmann Thanks for your comment. I know that each of the alternatives has its own weaknesses. I'm open to suggestions for solutions to the problems of this PR, probably we can solve them ...

Another thing you could try is to use your setup.py to copy around the notebooks to wherever you need them.
You can probably also use the package_dir option to manipulate where your notebooks are installed (since you say that they are supposed to be installed with the package). You might also be able to do some path manipulations with package_data?

You could also try https://github.com/ngoldbaum/RunNotebook, which I think lets you include notebooks from wherever, but AFAICT its integration with Sphinx is weaker (which may actually be what you want) and it has no LaTeX/PDF support.

@mgeier
Copy link
Member

mgeier commented Dec 19, 2017

I've just stumbled upon the PyPI package nbsphinx-link, see https://github.com/vidartf/nbsphinx-include.

@tbekolay
Copy link
Contributor Author

Hadn't seen nbsphinx-include, thanks @mgeier, will check it out :)

@Seanny123
Copy link

@tbekolay we used nbsphinx-link in nengo_extras, so should we close this PR?

@tbekolay
Copy link
Contributor Author

Yes, nbsphinx-link is a better solution than what's in this PR. Thanks for pointing us to it @mgeier!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants