Display data as a cell type #1123
Comments
Can you clarify why a Markdown cell doesn't accomplish that? Do you want to
access display mime other types than HTML? What would edit mode look like
for those other mime types? Not opposed to this idea, just trying to think
through it....
|
@lbustelo has run into some sort of sanitisation that we do on markdown cells, so you can't write arbitrary HTML in them. I think that was prompted by the security discussion - we decided to sanitise markdown all the time, so that the signatures and trust mechanism only deal with outputs from code cells. So I think the crucial thing we need to work out is what the user interface for trusting arbitrary HTML in something like a markdown cell is, equivalent to running a code cell to trust it. Should we present untrusted cells unrendered and let the user render them to trust them? |
Ping @minrk |
If I understand this correctly, I think this can be orthogonal to #621 . Even if you could insert a special "image" cell, I don't think this would cover all the use cases for inline images. For example, you might want to do layout on your inline images (e.g. put them in a table) that you cannot do with a simple image cell type. |
@takluyver great point, yes I guess we do treat the code/markdown cells Creating a new cell type that supports arbitrary display mime-types, but On Sun, Feb 21, 2016 at 2:46 AM, Thomas Kluyver notifications@github.com
Brian E. Granger |
Here's what I imagine an HTML cell looks like for users when they're editing it: Followed by what it looks like when rendered: In this example I'm showing a pencil icon to switch back to edit mode, though we can refine that. As for inserting it, the user doesn't see the underlying representation for this cell, which I'm imagining to be:
{
"cell_type": "data",
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Name</th>\n",
" <th>Email</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Jane Doe</td>\n",
" <td>jane@doe.com</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>John Doe</td>\n",
" <td>john@doe.com</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
]
}
}
Similarly, for images, this would end up with base64-encoded images inline, while the UX for it is similar in flow to dragging and dropping images or selecting them via a menu in other UIs. |
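The representation proposed above can be sketched in Python. This is only an illustration: the "data" cell type is the proposal from this thread, not part of the notebook format, and `make_data_cell` is a hypothetical helper name. It assumes text payloads are stored as a list of lines (as in the JSON above) and binary payloads as base64 text.

```python
import base64
import json


def make_data_cell(mime_type, payload):
    """Build a hypothetical "data" cell holding one mimebundle entry."""
    if isinstance(payload, bytes):
        # Binary payloads such as images are stored as base64 text.
        value = base64.b64encode(payload).decode("ascii")
    else:
        # Text payloads are stored as a list of lines, keeping line endings.
        value = payload.splitlines(keepends=True)
    return {"cell_type": "data", "data": {mime_type: value}}


html_cell = make_data_cell("text/html", "<div>\n<b>Jane Doe</b>\n</div>")
image_cell = make_data_cell("image/png", b"\x89PNG\r\n\x1a\n")  # PNG signature only
print(json.dumps(html_cell, indent=1))
```

A UI would build such a cell when the user drops an image or picks one from a menu, so the user never types the underlying JSON.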
A summary of the security/sanitization we do and why:
If we treat markdown cells as we do code cells, that means that an untrusted notebook needs to open with all markdown cells unrendered, and they must be 'executed' manually by the user before they are allowed to render on the page. The same would be true if we had dedicated mimebundle cells. So the question to me is, really, a user-experience one on which I don't have an informed opinion. Should we:
There are technically other options, like "display but sanitize only untrusted markdown", which is attractive because it will do what people want most often, but I think it is the most confusing when sanitization of untrusted markdown produces a materially different result, and we then have to communicate to users that they must "unrender then re-render the cell to see it how it's meant to be." |
Is it possible to detect whether there's anything in a block of markdown/html which would be affected by sanitisation? I.e. could we check if each markdown cell is 'safe' and decide to display it rendered or unrendered? |
Yes, that is possible. |
If that's doable, I think that would be preferable. But it would probably also cause some confusion - why are some of these cells showing up rendered and others unrendered? Maybe we could add a little untrusted indicator by the unrendered cells, and offer a bit of explanation on mouseover/click. |
Sure, so then the plan would be:
Step 1. requires a change in nbformat to keep track of the trusted flag on markdown cells. Perhaps that trust->sign code really belongs in the notebook repo anyway (not 100% sure), since it's specifically a transform from the nbformat file format to 'live' document state that's specific to this webapp. One (lazy) version of that indicator could be to leave the cell unrendered if it's untrusted and sanitization would make some change. @rgbkrk is there anything display cells would provide that this proposal would not, or do you still think that display cells are something we should add? |
To be precise, my proposal is not to sanitise (or not to display sanitised output), but to display the markdown cell unrendered if sanitisation would make any changes and it's untrusted. |
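A minimal sketch of that rule, under stated assumptions: `toy_sanitize` is a deliberately crude stand-in (it is not a safe sanitizer and not what the notebook uses), and `should_render` is a hypothetical name. A real implementation would presumably compare the rendered HTML before and after sanitization rather than the raw source.

```python
import re

# Toy stand-in for a real sanitizer: strips <script> elements and inline
# event-handler attributes. For illustration only; not safe for real use.
_SCRIPT = re.compile(r"<script\b.*?</script\s*>", re.IGNORECASE | re.DOTALL)
_HANDLER = re.compile(r"""\son\w+\s*=\s*("[^"]*"|'[^']*')""", re.IGNORECASE)


def toy_sanitize(html):
    """Remove script elements, then inline event-handler attributes."""
    return _HANDLER.sub("", _SCRIPT.sub("", html))


def should_render(source, trusted, sanitize=toy_sanitize):
    """Render unless the cell is untrusted and sanitizing would alter it."""
    if trusted:
        return True
    # Untrusted: render only when sanitization is a no-op for this cell.
    return sanitize(source) == source
```

Under this rule plain Markdown in an untrusted notebook still opens rendered, and only cells that sanitization would change stay unrendered until the user renders them.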
Gotcha, that ought to be doable. |
I got the basics going for that at #1126. |
Trust on markdown cells is a good starting point and we should see how it works in practice, via @lbustelo. There's a larger question about what the markdown cells do and how they're specified:
If we couple ourselves to always using markdown cells for embedded HTML, these are the primary user experience and developer experience cons:
|
It would make this trickier, but I don't think it should be impossible. I expect that most markdown cells would still contain relatively simple Markdown which could be edited like that. Of course, if there are two different behaviours for markdown cells based on their contents, maybe they should be two cell types. That would be a bigger change to the notebook format, though. |
Using the Markdown cell (once sanitization issues are solved) addresses 2 of my main concerns:
Having said that... there are so many ways to author client side content that may not fit as nicely in the Markdown cell and might be better suited if we had some level of extensibility around cell types. @bollwyvl brought up https://github.com/pugjs/jade as another alternative to avoid typing HTML. There is always some new flavor of the month. Also as @rgbkrk hinted at with WYSIWYG comment, the maturation of the Notebook space is going to lead to higher level authoring experiences. Jupyter and the NB format should somehow accommodate for that to avoid being overshadowed by the countless alternatives that are popping up all over. I understand the importance of the downstream tools (i.e. nbviewer) and the hesitation of an open set of cell types, but as notebooks become platforms for solution development, I think this issue is going to become more and more important to solve. |
Quick demo for you using draft-js and KaTeX, for a rich editor that has block maths: |
Great, glad to see conversation about this topic! I've been tracking it for some time now, and am really interested in how this round turns out! I was indeed convinced then that making a cell type for everything is not the answer... but perhaps it is time to start thinking about some of the UI pieces around rich output that haven't fundamentally changed since then. Background Front End Stuff Even if adopting commonmark for all that Stuff, and getting our sanitization house in order, the authoring of said Stuff in the browser (in the notebook) probably does need some love.
User-authored text If we were to embark on this path, I think it would end up having:
Then the question is... are these then even cells? or is this another, prose-native view of your notebook you have chosen, which can include cells embedded in it. Do you show them next to each other? Does this even go in an ipynb, or is this a separate file type altogether, or a wrapper around both kinds of file, or a PDF with a local file store? Very exciting stuff, and hopefully a topic for the dev meeting! |
That seems really appealing, and points to either a display-data cell as discussed here, or just an HTML cell, whose editor can be any HTML-authoring magic. I believe with CommonMark's behavior, anything inside a |
My understanding with commonmark is that any html tags should treat the enclosed material as html. That seems to be how the reference implementation behaves too: http://spec.commonmark.org/dingus/ |
Yep, seems so.
So, today, an inline prosemirror could serialize its JSON document model to cell metadata, and treat the source as its output. Though a bit tubby on bytes, this is nice, as then you've still got a "dead pixel" version of the content if someone doesn't have nbJadeDustUnderscoreHandlebarsReactHAMLJinjaLiquidLessSCSSStylusCoffeeTypeScript. I guess you'd have a helpful message at the top that suggested thou shalt not edit this cell, but since we don't lock cells, they'd be free to go on about their business if they didn't have your editor. Some translations even have reverse engineering capabilities, such as http://html2jade.org/, though this falls apart once you actually start using template features...
This is the behavior of nbviewer, inherited from mistune, as discussed here: jupyter/nbviewer#526 (comment) As described there, the configuration of nbconvert (or nbviewer) could be changed to mimic the live browser's marked, and the spec. |
The editor posted above comes straight out of draft-js, which has been used in production at Facebook for the last couple years and open sourced yesterday. It's React centric, though you can have React target any DOM element for rendering. I'm enjoying the APIs so far and the model underneath (people are using the same model in native apps now too). That's a diversion from the real problem I'm worried about: what our specification is for the markdown cells themselves. It's not specced and is a reflection of the way the current user's notebook server implements it. When used on nbviewer, github, or other static renderings, if we wanted consistency, we'd have to match the version of marked, mistune, commonmark, etc. as well as MathJax that matches the notebook server they came from. If we keep the spec consistent with commonmark and suggest HTML somewhere else, we lessen the rendering bugs that get reported elsewhere and can build a clean model (a necessity for a WYSIWYG editor). |
I don't have an answer here yet, just starting to digest the questions... But I want to throw one more data point into this topic: the Broad Institute's GenePattern Notebook exposes special input cells via an extension and a custom cell type, this page shows some examples. This is a pattern that is also used by KBase for its computational biology apps and methods, with a slightly more complex approach b/c it was created earlier, when our notebook infrastructure was less mature (so the KBase team had to hack more). This KBase/GenePatternNB approach fits certain use cases in biology really well, and it has made me think that we really need to find a clean, generic solution for it, as biology is not the only place where it's useful. So I think we should approach this question trying to solve these slightly different, but ultimately related, use cases in a unified way... If we don't get to a solution here, this should definitely be something to brainstorm on at the dev meeting! I've put it on the agenda. |
As @bollwyvl pointed out in #1123 (comment), the important thing in commonmark for html blocks is that there are no blank lines (e.g., it doesn't do html between div tags, it just pays attention to blank lines). |
Aha, interesting note about the blank lines, thanks. That seems a bit weird, but does point to us making a specific HTML cell (mimebundle or otherwise) that wysiwyzards can sit on. I wouldn't want to be saying "Make sure you don't add any empty lines in your HTML, or it'll interpret it as another language". |
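The blank-line behaviour can be illustrated with a simplified sketch. It is not a CommonMark parser: `split_html_block` is a hypothetical helper that only mimics the rule that an HTML block opened by a tag such as `<div>` ends at the first blank line, after which the text is parsed as Markdown again.

```python
def split_html_block(text):
    """Split text at the first blank line.

    The first part stands for what is kept as raw HTML, the second for
    what would be handed back to the Markdown parser.
    """
    lines = text.split("\n")
    for i, line in enumerate(lines):
        if line.strip() == "":
            return "\n".join(lines[:i]), "\n".join(lines[i + 1:])
    # No blank line: the whole text stays inside the HTML block.
    return text, ""


source = "<div>\n*kept as raw HTML*\n\n*parsed as Markdown emphasis*\n</div>"
html_part, markdown_part = split_html_block(source)
```

The asterisks before the blank line stay literal, while those after it become emphasis, which is the surprise for anyone hand-writing HTML in a Markdown cell.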
To throw another thing into the discussion, I also worked on wysiwyg tools for generating code directly in code cells: http://bl.ocks.org/jasongrout/5378313. |
@jasongrout that's very cool! |
@jasongrout +1! |
I was trying to find out how to use the cell type to create special cells and I found this awesome thread. I have been building presentations with There needs to be a tighter way to connect variables on the kernel with HTML and Javascript. I came up with this small cell magic called This notebook showcases
I feel like there is a cool editor in this idea. |
@tonyfast check out jupyter-declarativewidgets. The main focus of that work is to connect data and functions from a kernel with visual interactive areas in the notebook authored using the html magic. |
Mega 👍 to the declarative widgets |
Neat, thanks for posting that, @tonyfast |
Very neat, @tonyfast! |
Looks like this issue has stalled... I've been pestering @minrk about this yesterday and today as it would make my life (as someone developing an extension that wants to add some fundamentally new functionality to the notebook) a whole lot easier. Currently, the only option I see (short of forking the notebook) is to hijack markdown cells with certain attributes and override the render method and create my own DOM elements outside of the codemirror/rendered output areas. Is the plan still to implement a mimebundle type cell in the notebook (outside of jupyterlab)? This would also make #1999 more-or-less trivial to implement as an extension which would be a huge plus; just set mime type to image/png and have an extension include some metadata with the cell to provide its own editor (as an html canvas editor). |
These are the steps that would need to be taken, ignoring whether or not people want this in the core document format. Many likely need to be done in parallel across the repos.
All that being said, your workaround is to establish a way to do this in a raw cell. Stick whatever you want in metadata, including the serialized version of what you want to do. |
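A rough sketch of that raw-cell workaround, using plain dictionaries in the shape of a v4 raw cell. The metadata key `my_extension`, the payload layout, and `make_raw_cell` are all hypothetical; an extension would pick its own namespace and serialization.

```python
import json


def make_raw_cell(source, extension_key, payload):
    """Build a v4-style raw cell carrying extension data in its metadata."""
    return {
        "cell_type": "raw",
        "metadata": {extension_key: payload},
        "source": source,
    }


cell = make_raw_cell(
    source="(edited with a custom editor; see cell metadata)",
    extension_key="my_extension",  # hypothetical namespace
    payload={"mime_type": "image/png", "data": "<base64 text>"},
)

# The payload survives a JSON round trip, so it is kept when the notebook is saved.
restored = json.loads(json.dumps(cell))
```

Frontends that do not know the extension would just show the raw source, while the extension reads its own metadata key to rebuild the editor state.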
v4 of the notebook format is officially designed such that a new notebook cell type is a minor revision of the notebook format. UIs that see unrecognized cell types in minor format revisions newer than the latest they support should handle it, even if they can't display them. This would be the first change to exercise this behavior, though, so it will be interesting to see if the upgrade experience goes as promised. |
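As a sketch of what tolerating an unrecognized cell type could look like in a consumer, assuming a notebook loaded as a plain dictionary: `describe_cells`, the "placeholder" status, and the minor revision number below are illustrative choices, not existing API.

```python
KNOWN_CELL_TYPES = {"code", "markdown", "raw"}


def describe_cells(notebook):
    """Return one (cell_type, status) pair per cell, never dropping a cell."""
    result = []
    for cell in notebook.get("cells", []):
        cell_type = cell.get("cell_type")
        # Unknown types are kept and shown as a placeholder rather than rejected.
        status = "rendered" if cell_type in KNOWN_CELL_TYPES else "placeholder"
        result.append((cell_type, status))
    return result


notebook = {
    "nbformat": 4,
    "nbformat_minor": 99,  # hypothetical future minor revision
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": "# Title"},
        {"cell_type": "data", "data": {"text/html": ["<b>hi</b>"]}},
    ],
}
```

The point is that the unknown cell stays in the document, so saving the notebook from an older UI does not silently discard it.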
Whoops, corrected the above. I see how unrecognized_cell is declared now. |
Overall, I like this idea. I am a bit worried about the unintended side
effects of the decision. This change would make the Jupyter notebook format
essentially a sequence of arbitrary content specified by MIME types
(standard ones and made-up ones). That is getting dangerously close to
saying "put anything you want in a notebook" and not having any ability to
reason about what you might find in a given notebook.
At the same time, this already perfectly describes output, so maybe it
isn't a big deal. I can clearly imagine *many* use cases for it.
Because of the broad impact across the entire project, I would prefer this
be proposed as a Jupyter Enhancement Proposal first.
…On Sun, Aug 27, 2017 at 12:48 PM, Kyle Kelley ***@***.***> wrote:
Whoops, corrected the above. I see how unrecognized_cell is declared now.
--
Brian E. Granger
Associate Professor of Physics and Data Science
Cal Poly State University, San Luis Obispo
@ellisonbg on Twitter and GitHub
bgranger@calpoly.edu and ellisonbg@gmail.com
|
On the heels of #621 and IRkernel/IRkernel#260, I'm wondering about a cell that matches display data semantics. We already have a way to display mimetype: data bundles.
Here's an example UX flow. A user goes to insert an image either via drag and drop or a menu:
The image then ends up embedded in the document as if it was injected using magics or running code.
Thinking on @lbustelo's use case for the declarative widgets, this would allow cross-language support for writing direct HTML cells without the use of magics. The cell then has two states - edit and view, just like the markdown cells.
This also spares the idea of having to do garbage collection in #621.
/cc @lbustelo @parente @julienr @jdfreder