Section headings in markdown render nodes outside of the InputCell nodes #30

choldgraf · 2020-02-23T07:24:31Z

I just noticed that we have a bug in how the rendered docutils nodes are put (or not) inside of the CellInputNode that should contain them. I think I know what the problem is.

For each markdown cell, we create a CellInputNode and set it as the current node. Then we render the markdown content inside. What we want is for whatever nodes are spit out to be put inside the CellInputNode (as children). This usually works for most markdown. However, if there are section headers, or other syntax that should trigger a new section, then the render function creates a new section and starts rendering outside of the CellInputNode. I think this is where that happens for headers:

https://github.com/ExecutableBookProject/myst_parser/blob/develop/myst_parser/docutils_renderer.py#L259

@chrisjsewell can you think of a way that we can ensure the renderer is always placing objects inside the CellinputNode?

The text was updated successfully, but these errors were encountered:

choldgraf · 2020-02-23T07:41:42Z

One option that would be fairly hacky would be to, at the end of the "render" operation, check if there are any non CellNode nodes in the doctree. And if so, assume that they should be appended to the children of the latest CellInputNote...

chrisjsewell · 2020-02-23T09:44:32Z

Hmm thats a tricky one. It feels like you have two immiscible hierarchies here; sections and markdown cells, since both can encapsulate each other. Do you actually need to wrap the markdown cells in CellInputNode, what would it be used for? Would it be easier to do what the SphinxRenderer now does and add a 'classed' comment node, to denote a break between two markdown cells, as in render_block_break

choldgraf · 2020-02-23T18:07:53Z

Well, the main trick is that jupyter-book often wants to know about a "block" of markdown to do things to it (e.g. "look for the cell tag div.cell.tag_hide_input and when found, hide the input of that cell"). So keeping things together with their metadata is important to Jupyter Book at least. The other nice part about this from a CSS perspective is that you can use a single selector (e.g. div.cell) and know that everything on the page will behave the same way at the top level.

Special-case the page title field and have some combination of "check ntbk-level metadata, and if no title field is there, find the first markdown heading and treat it as the "title" for the document". Then the title node gets added before any subsequent cell nodes are added.

We should probably define a "title" standard in the myst-parser anyway, no? I keep forgetting that Sphinx will treat multiple single # sections as pages in the sidebars etc.

If there is another way to wrap a bundle of markdown as a "cell" with a class in the HTML, then we could also say something like "only put markdown cells in the container node if they have tags in them", and then we could say "you need at least one markdown cell without tags in it and a header inside". That feels edge-casey for the user.

This brings up a third point - since tags are the lightest-weight way to add metadata to cells, what if we adopted the tag1, tag2 syntax for the short-hand markdown breaks. So we accept:

+++ tag1, tag2

as valid, and it'll parse anything after the +++ as a comma-or-space-separated list of tags?

chrisjsewell · 2020-02-24T05:15:46Z

This is possible in HTML, because headers are stored in a flat structure:

<div class="cell">
<h1>This is heading 1</h1>
<p>This is some text.</p>
<h2>This is heading 2</h2>
<p>This is some other text.</p>
</div>
<div class="cell">
<h3>This is heading 3</h3>
</div>

But in the doctree format (with heirarchical headers) this is surely impossible to represent

<cell>
    <section h1>
        This is some text
        <section h2>
            This is some other text.

How can you close the cell and start h3 here?
I think it would also not be possible to use such a structure when converting to LaTeX.

I feel if you did want this, it would be more robust to deal with this in the HTML translator, if you had signalled the cell begin/end in the docutils tree, something like:

<comment classes=["markdown_cell_start"]>
<section h1>
    This is some text
    <section h2>
        This is some other text.
        <comment classes=["markdown_cell_end"]>
        <comment classes=["markdown_cell_start"]>
        <section h3>
            <comment classes=["markdown_cell_end"]>

chrisjsewell · 2020-02-24T05:33:45Z

what if we adopted the tag1, tag2 syntax for the short-hand markdown breaks

The only thing with this is that it does not allow for strict round-tripping, as opposed to:

+++ {"tags": ["tag1", "tag2"], "other": "data"}

In general though, I feel we should be emphasising the use of in-text directives or syntax, rather than markdown cell metadata.

For example in your Hiding markdown cells, with MyST I feel it would be better to either comment out text you don't want in the document:

% commented out
% text

or using a directive to indicate toggle views, like you can do with the cloud sphinx theme:

```{rst-class} html-toggle
```

# Title

choldgraf · 2020-02-24T05:43:05Z

So just to clarify, it sounds like you're suggesting we shouldn't be using container nodes at all for this, and instead using a custom html translator along with some kind of metadata that's included w comments in the sphinx AST?

I guess the translator would need to override the comment render function, to check for the tag metadata and render the div blocks as needed?

Maybe @mmcky could advise on the best way to do this?

chrisjsewell · 2020-02-24T05:52:41Z

So just to clarify, it sounds like you're suggesting we shouldn't be using container nodes at all for this

For markdown cells yes, because IMO it's theoretically impossible, given the docutils AST, to add a container that is guaranteed to not break the section hierarchy.

For code cells it should be fine. The only possible edge-case I could think of is if a code output had a text/markdown mime type, that also contained headers. I notice actually in CellOutputsToNodes you don't deal with text/markdown.

choldgraf · 2020-02-24T16:56:38Z

OK - I'll try adding a lightweight HTMLTranslator that overrides comment behavior...I might have some questions since I haven't build a translator before :-)

choldgraf · 2020-02-24T17:05:51Z

OK quick question for @chrisjsewell:

How do we handle the case when we are leaving a codecell? We want the renderer to start adding nodes as a sibling of the CellNode that was the last codecell, not as a child of the node. But by default if we do something like renderer.current_node = latest_code_cell, then if the next markdown is not a header it will try to render it as a child of the code cell, and not a new section, right?

choldgraf · 2020-02-24T17:16:18Z

Question 2: is there any way to do this without manually setting a custom HTMLTranslator? I ask this because I know some Sphinx themes also do this, so if we want to use our own HTMLTranslator, those themes would not be able to use this functionality of MyST-NB.

chrisjsewell · 2020-02-24T18:30:12Z

How do we handle the case when we are leaving a codecell?

Well you want to revert to the last node that was the current_node before the codecell (i.e. its parent). This is exactly the use of the (now named) DocutilsRenderer.current_node_context method:

        with self.current_node_context(node, append=True):
            self.render_children(token)

which reverts the current_node back to its original state on exit.

chrisjsewell · 2020-02-24T18:32:56Z

is there any way to do this without manually setting a custom HTMLTranslator?

I have no idea about HTMLTranslators I'm afraid!? I just know it will be difficult/impossible to not break the docutils AST with the original approach.

choldgraf · 2020-02-25T00:13:33Z

OK, I thought about your point re: nestedness and the AST, and decided to stop being stubborn and re-work the parser so that it no longer forces the top-level "flat" cell structure that notebooks have.

Check out #31 for the latest state of this...I think it's an approach that more naturally fits with the AST we're working with

choldgraf mentioned this issue Feb 23, 2020

reworking cell and markdown relationship in parser #31

Merged

choldgraf mentioned this issue Feb 24, 2020

How to handle titles? #35

Open

choldgraf mentioned this issue Feb 24, 2020

Deal with markdown outputs of cells #36

Closed

choldgraf closed this as completed Feb 28, 2020

chrisjsewell mentioned this issue Mar 11, 2020

Figure out how to keep a consistent Mistletoe AST before the docutils render #58

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Section headings in markdown render nodes outside of the InputCell nodes #30

Section headings in markdown render nodes outside of the InputCell nodes #30

choldgraf commented Feb 23, 2020

choldgraf commented Feb 23, 2020

chrisjsewell commented Feb 23, 2020

choldgraf commented Feb 23, 2020 •

edited

Loading

chrisjsewell commented Feb 24, 2020

chrisjsewell commented Feb 24, 2020 •

edited

Loading

choldgraf commented Feb 24, 2020

chrisjsewell commented Feb 24, 2020 •

edited

Loading

choldgraf commented Feb 24, 2020

choldgraf commented Feb 24, 2020

choldgraf commented Feb 24, 2020

chrisjsewell commented Feb 24, 2020

chrisjsewell commented Feb 24, 2020

choldgraf commented Feb 25, 2020

Section headings in markdown render nodes outside of the InputCell nodes #30

Section headings in markdown render nodes outside of the InputCell nodes #30

Comments

choldgraf commented Feb 23, 2020

choldgraf commented Feb 23, 2020

chrisjsewell commented Feb 23, 2020

choldgraf commented Feb 23, 2020 • edited Loading

chrisjsewell commented Feb 24, 2020

chrisjsewell commented Feb 24, 2020 • edited Loading

choldgraf commented Feb 24, 2020

chrisjsewell commented Feb 24, 2020 • edited Loading

choldgraf commented Feb 24, 2020

choldgraf commented Feb 24, 2020

choldgraf commented Feb 24, 2020

chrisjsewell commented Feb 24, 2020

chrisjsewell commented Feb 24, 2020

choldgraf commented Feb 25, 2020

choldgraf commented Feb 23, 2020 •

edited

Loading

chrisjsewell commented Feb 24, 2020 •

edited

Loading

chrisjsewell commented Feb 24, 2020 •

edited

Loading