Consider support for Quarto qmd format? #837
Comments
Hi @matthew-brett, thanks for pointing me to Quarto. This sounds like a great initiative! I've looked a bit at the documentation, and especially at the hello quarto example. That example reminds me a lot of the pandoc representation of Jupyter notebooks (the Also I see that Quarto allows the user to edit qmd notebooks in JupyterLab, is this something that you tried out already? Does that address your point, or not? Please keep me posted! |
Quarto is closer to the
We encode attributes within comments so that the metadata is easy to edit within existing Jupyter notebook UIs. So in many ways @mwouts you are correct that using comments means that we don't need to sync to .ipynb in all cases to be compatible w/ editing within Jupyter. However, I think there are lots of cases where editing in both a plain text format (for more exotic markdown constructs like callouts, grid tables, layout panels, etc.) and within Jupyter will be valuable, which I think makes Jupytext integration very desirable. It's on our short list to submit a PR for Jupytext |
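To illustrate the idea of attributes encoded in comments, here is a minimal Python sketch that separates leading `#| key: value` lines from the rest of a code cell. The function name `split_cell_options` and the sample cell are hypothetical, and the values are kept as plain strings rather than parsed as YAML, so this is only an approximation of what Quarto or Jupytext actually do.

```python
from typing import Dict, Tuple

# Comment marker assumed for cell options in Python cells.
OPTION_PREFIX = "#| "


def split_cell_options(source: str) -> Tuple[Dict[str, str], str]:
    """Separate leading '#| key: value' lines from the rest of a code cell.

    Hypothetical helper: values stay as raw strings (no YAML parsing).
    """
    options: Dict[str, str] = {}
    lines = source.splitlines()
    body_start = 0
    for index, line in enumerate(lines):
        if not line.startswith(OPTION_PREFIX):
            break  # first ordinary line: the options block is over
        key, sep, value = line[len(OPTION_PREFIX):].partition(":")
        if not sep:
            break  # not a 'key: value' pair, treat it as ordinary code
        options[key.strip()] = value.strip()
        body_start = index + 1
    return options, "\n".join(lines[body_start:])


# Sample cell text (made up for this sketch).
cell = "#| echo: false\n#| label: fig-population\nimport pandas as pd\ndf = pd.DataFrame()"
options, code = split_cell_options(cell)
print(options)
print(code)
```

Because the options live in ordinary comments, the cell remains valid Python and can be edited in any notebook UI without special support.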
Hello @jjallaire, thank you for joining the conversation! Well I am glad to see the RStudio team bringing its IDE expertise to the Python world (if you didn't know, I started working on this project a few years ago when I had to switch from R to Python, and realized how hard it was to collaborate on ipynb notebooks compared to Rmd...) So sure I'll be happy to help, and certainly we can add support for the Regarding the actual implementation, we have a series of options:
Which one of the above do you think is best? Please let me know |
Hi @mwouts, great to be here and cool to hear that your work w/ Rmd had some influence on Jupytext! Of those options I think that To really make this work well we'll want Hopefully this all makes sense to you. We can add the requisite flags/behavior to |
Thanks @jjallaire ! Yes definitely I can prepare a PR that adds the I've been able to test Also, should I expect a YAML header at the top of the file? When I ran Regarding the code and output visibility controls, I think we are still seeing some diversity in how this is handled in the Jupyter world. The notebook format now defines Finally, you mention that RStudio could call Jupytext when saving paired |
For CI I would just use the installers. The It seems like there should at a minimum be a YAML header indicating the Jupyter kernel, that's likely an oversight (I also think that we should convert an H1 at the top to a YAML title). I will make those changes. For brevity Quarto supports a straightforward I'm also thinking that it is likely that we could consider a In terms of the saving scenarios, one thing I found less than ideal about JupyterLab was that when an unsaved notebook is changed on disk by Jupytext, JupyterLab doesn't automatically reload it (as many text editors do for text files). Do you know if there is any prospect of this improving (as it would make the multi-editor scenario much better)? |
@mwouts This change is on If there is already jupytext metadata in the .ipynb then we will write something like this (from the World Population example):
---
jupyter:
  jupytext:
    cell_markers: 'region,endregion'
    formats: >-
      ipynb,.pct.py:percent,.lgt.py:light,.spx.py:sphinx,md,Rmd,.pandoc.md:pandoc
  kernelspec:
    display_name: Python 3 (ipykernel)
    language: python
    name: python3
---
However, if there is no jupytext metadata (not sure if this is ever the case when jupytext is being run?) then we write our standard abbreviation of
---
jupyter: python3
---
So you'd need to know that in some cases you might see this YAML rather than the fully elaborated version w/ |
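A reader of the header therefore has to accept both shapes. The sketch below assumes the YAML front matter has already been parsed into a Python dict; `kernel_name` is a hypothetical helper, not a Jupytext or Quarto function.

```python
from typing import Any, Dict, Optional


def kernel_name(header: Dict[str, Any]) -> Optional[str]:
    """Return the kernel name from either form of the 'jupyter' header key.

    Hypothetical helper: 'header' is the front matter already parsed to a dict.
    """
    jupyter = header.get("jupyter")
    if isinstance(jupyter, str):
        # Abbreviated form: 'jupyter: python3'
        return jupyter
    if isinstance(jupyter, dict):
        # Elaborated form: 'jupyter: {jupytext: ..., kernelspec: {name: ...}}'
        kernelspec = jupyter.get("kernelspec") or {}
        return kernelspec.get("name")
    return None  # no kernel information in the header


full_header = {
    "jupyter": {
        "jupytext": {"cell_markers": "region,endregion"},
        "kernelspec": {
            "display_name": "Python 3 (ipykernel)",
            "language": "python",
            "name": "python3",
        },
    }
}
short_header = {"jupyter": "python3"}

print(kernel_name(full_header))
print(kernel_name(short_header))
```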
Hi @jjallaire , thanks for the fast iterations! On my side I have prepared a new branch https://github.com/mwouts/jupytext/tree/quarto in which I have added the Quarto format. At the moment the round trip tests don't pass yet (
I completely agree! Well at the moment most of Jupytext is implemented in Python. If we wanted to reload only the text file we would need to 1. watch the file in JupyterLab (TypeScript) and 2. make the merge between the current notebook in memory (with outputs) with the text notebook again in JupyterLab (TypeScript) (the Python implementation is combine_inputs_with_outputs). At the moment I am not very skilled at TS so an external contribution would be very useful. |
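The merge step described above can be sketched roughly as follows. This is a deliberately simplified illustration on plain dicts, not Jupytext's actual `combine_inputs_with_outputs`: it assumes a text cell inherits outputs only from a stored code cell whose source is identical, and `merge_outputs` is a hypothetical name.

```python
from typing import Any, Dict, List

Cell = Dict[str, Any]


def merge_outputs(text_cells: List[Cell], stored_cells: List[Cell]) -> List[Cell]:
    """Take inputs from the text notebook and outputs from the stored notebook.

    Simplified assumption: a code cell keeps its outputs only when an unused
    stored code cell has exactly the same source.
    """
    remaining = [c for c in stored_cells if c.get("cell_type") == "code"]
    merged: List[Cell] = []
    for cell in text_cells:
        new_cell = dict(cell)
        if cell.get("cell_type") == "code":
            # Default for new or edited cells: no outputs yet.
            new_cell.setdefault("outputs", [])
            new_cell.setdefault("execution_count", None)
            for i, old in enumerate(remaining):
                if old.get("source") == cell.get("source"):
                    new_cell["outputs"] = old.get("outputs", [])
                    new_cell["execution_count"] = old.get("execution_count")
                    del remaining[i]  # each stored cell is used at most once
                    break
        merged.append(new_cell)
    return merged


stored = [
    {"cell_type": "code", "source": "x = 1", "outputs": [], "execution_count": 1},
    {
        "cell_type": "code",
        "source": "print(x)",
        "outputs": [{"output_type": "stream", "name": "stdout", "text": "1\n"}],
        "execution_count": 2,
    },
]
text = [
    {"cell_type": "markdown", "source": "# Title"},
    {"cell_type": "code", "source": "x = 2"},      # edited in the text file
    {"cell_type": "code", "source": "print(x)"},   # unchanged
]
merged = merge_outputs(text, stored)
for cell in merged:
    print(cell["cell_type"], cell.get("execution_count"))
```

In this sketch the edited cell loses its outputs while the unchanged one keeps them, which is the behaviour a reload in JupyterLab would need to reproduce on the TypeScript side.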
Excellent! I believe I have resolved both of these issues on For JupyterLab, I'd honestly rather wait for core functionality that monitors edited notebooks for changes and automatically reloads them (or prompts for resolution if they are unsaved). This is definitely the right solution for users as it supports multiple-editors for notebooks in all scenarios (e.g. user has the notebook open in both JupyterLab and VS Code). |
Thank you @jjallaire! I have tested v0.2.116 and indeed it does solve the two previous issues. Now we might want to improve speed (the round trip takes 3-4 secs per notebook, compared to 100ms for pandoc, and 10ms for the other formats). Also, not all my sample notebooks are stable yet over a round trip. I think I am seeing consecutive markdown cells being merged (but I agree this is hard to avoid), and also raw cells are being converted to plain markdown cells. I'll leave that up to you (I can live with that and skip these sample notebooks in the tests). There's one point for which I would need your input. Locally I have installed quarto with
Do you know how I can instead get the 'latest' version (as I'd like to install |
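One way to keep round-trip tests useful despite the merging of consecutive markdown cells is to normalise both notebooks before comparing them. The sketch below is only an illustration on plain dicts: `merge_consecutive_markdown` is a hypothetical helper (not part of Jupytext's test suite), and joining the sources with a blank line is an assumption about how the merged cell looks.

```python
from typing import Any, Dict, List

Cell = Dict[str, Any]


def merge_consecutive_markdown(cells: List[Cell]) -> List[Cell]:
    """Collapse runs of adjacent markdown cells into a single markdown cell."""
    merged: List[Cell] = []
    for cell in cells:
        previous_is_markdown = bool(merged) and merged[-1]["cell_type"] == "markdown"
        if cell["cell_type"] == "markdown" and previous_is_markdown:
            # Assumption: merged sources are separated by one blank line.
            joined = merged[-1]["source"] + "\n\n" + cell["source"]
            merged[-1] = {**merged[-1], "source": joined}
        else:
            merged.append(dict(cell))
    return merged


original = [
    {"cell_type": "markdown", "source": "A"},
    {"cell_type": "markdown", "source": "B"},
    {"cell_type": "code", "source": "1 + 1"},
    {"cell_type": "markdown", "source": "C"},
]
normalised = merge_consecutive_markdown(original)
print([cell["source"] for cell in normalised])
```

Comparing `merge_consecutive_markdown(before)` with `merge_consecutive_markdown(after)` would then ignore this particular difference while still catching other changes.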
Great to hear! On performance, the issue there is Deno startup time -- modulo that we'd be sub 25ms. This has been an unpleasant surprise for us as our understanding was that Deno was intended for CLI tools (which clearly need to start up in < 50ms at the worst). We have one of two recourses here:
We'll be pursuing these over the next few months so I expect the performance issue to be resolved in that timeframe. Yes, Quarto uses Pandoc 'raw' markdown (e.g. I can think of two ways to get the "latest" version programmatically right now:
|
Just chiming in on this discussion to share another method:
Hope it helps |
Hi @cderv , thank you for joining the conversation, this is very helpful. I like the If I take a step back, @jjallaire I think we're not far from being able to deliver a first version with support for It is a good thing that you plan to address the latency of Now there is one thing that is not included in the branch, it is the support for the |
We will 100% address that latency soon (it also comes up in a couple of places in the RStudio IDE). I think it would be great to have the |
Yes sure - not a big deal indeed, I have added that in the branch. I have also added a paragraph in the documentation, feel free to change it or expand it. If you don't mind I prefer to warn the users that the round trip is not always stable (as we try to make it stable for the other formats). Finally, if you or @matthew-brett are willing to test the
(and then you'll need to restart Jupyter) |
Hi @mwouts, yes it's fine to make a note about performance and stability for round-trips. We have started to take a deeper look at performance issues (denoland/deno#11916) and hopefully will have this much closer to 1 second very soon (and hopefully well below that in the not too distant future). Perhaps we should wait to merge this until we at least get the perf down below 1 second (so it will be less remark-worthy) |
Yes, good idea, please let me know! We will just need to change the minimal version of Also that will let the beta testers give us some feedback before we do the merge 😄 Also I still need to find out how to run |
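A minimum-version check of this kind could look roughly like the sketch below. It assumes the `quarto` executable is on the PATH and prints a dotted numeric version for `--version`; `MINIMUM_VERSION` is a placeholder (the version tested earlier in the thread), not the value finally chosen, and the helper names are hypothetical rather than Jupytext's implementation.

```python
import shutil
import subprocess
from typing import Optional, Tuple

# Placeholder only: the thread does not state the final minimum version.
MINIMUM_VERSION = (0, 2, 116)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '0.2.116' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def quarto_version(executable: str = "quarto") -> Optional[Tuple[int, ...]]:
    """Return the installed version, or None if it cannot be determined."""
    path = shutil.which(executable)
    if path is None:
        return None  # executable not found on the PATH
    try:
        result = subprocess.run(
            [path, "--version"], capture_output=True, text=True, check=True
        )
        return parse_version(result.stdout)
    except (OSError, subprocess.CalledProcessError, ValueError):
        return None  # command failed or printed something unexpected


def meets_minimum(version: Optional[Tuple[int, ...]]) -> bool:
    """True when a version was found and is at least MINIMUM_VERSION."""
    return version is not None and version >= MINIMUM_VERSION


print(meets_minimum(quarto_version()))
```

Tuples of integers compare element by element, so `(0, 2, 116) > (0, 2, 99)` holds even though the string comparison would not.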
@mwouts I think your workflow step should look like this:
- name: Install Quarto
  env:
    GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}
  run: |
    # download the latest release
    gh release download --repo quarto-dev/quarto-cli --pattern '*.deb'
    # install it
    sudo apt install ./*.deb
I don't know if you can use Let me know how it goes - it should work from my quick test. |
An update: I've confirmed that the actual user code for I am going to keep working w/ the Deno folks to see if we can compress that 330ms even further (not sure what mitigations/fixes are still available but we'll try). Will keep you posted on this and I think we will likely be okay to merge one way or another later this week. |
Okay I don't think we can easily wring out much more than we have but hopefully the ~ 3.5x speed up we got will make things much more snappy (btw if you were calling I'll give a PR for tweaks to the Quarto language in the docs (will remove the perf qualifiers -- feel free to put them back in if you think they are still necessary). LMK your thoughts about the |
Thank you @jjallaire for the speed improvement and @cderv for the tip on the CI - I hope to find some time tonight to fix the CI as advised. @jjallaire sure the documentation is yours, please go ahead, and thanks for addressing the issue! Regarding the |
Yes, I think it's fine to fix the format version number at 0.2 (or even v1.0) and if there are ever changes we can increment it (note I expect very few if any changes to the core format). There is another very inexpensive way to get the version (which is what we use in RStudio). This is described here: #846 (review). As noted, if there is no |
Okay, scratch all of that about not calling I still think that the format version number should probably be 1.0 as I think it's quite stable right now. |
A couple of additional notes on version checking:
|
Excellent! Here are the latest changes in the
I can merge this PR and publish a new version (Probably Jupytext 1.12.0) when you give your go for it. |
@mwouts green light from here (I think it's sufficient to have a more terse reference to quarto in the top-level docs so no need for edits there). Very happy that we've landed this, thanks so much for making it happen! |
@mwouts Glad CI is working now!
About this, for Windows, I maintain a Scoop bucket with a Quarto manifest (https://github.com/cderv/r-bucket#quarto-cli). If you don't know Scoop, it is a package manager for Windows, which allows one to do However, you can also use the same as with Linux: This makes me wonder if we should provide an action to use in GHA to install Quarto. Probably worth it once the first release is out, in addition to the dev version. 🤔 |
@cderv Even now I think it would be useful to have a GHA to install Quarto. |
Ok, I'll work on something then. This will be a simple action with probably no parameters until it makes sense to choose a specific release instead of the development version. |
Excellent! So my plan is to ship a new version tonight, then. With respect to the current branch I am just going to revert the tentative addition of |
The new version (As a follow-up I'll try to install |
You can now try https://github.com/quarto-dev/quarto-actions and follow changes there. |
@mwouts I am seeing what appears to be a warning in the Jupyter console when I save (the save still seems to work):
Here's the commit where ids were added to the cell schema: jupyter/nbformat@9ede360 |
Hi @jjallaire, I think this might be caused by an out-of-date version of Can you run I expect that the error should disappear with |
Here is the output of
|
May I reopen the discussion here? I am trying to convert my .qmd file into an .ipynb file, but some syntax was not converted properly. For example:
|
Hi @ntluong95 , the conversion between Can you check if the issue you are seeing is documented at https://github.com/quarto-dev/quarto-cli/issues, and if not, report it there? Thanks |
I wonder whether you would also consider supporting the new Quarto qmd format.
Why?
Because Quarto is looking like a very nice system for writing books and reports with executable documents, such as Jupyter and R notebooks. Qmd format is Quarto's cross-language text format for notebooks. It assumes .Rmd files are R notebooks, so Qmd format is the natural format for Jupyter notebooks in text form. We (@stefanv and I) are writing our new edition of a statistics book using Quarto. It would be a significant gain in usability if we could use the Jupytext / Notebook integration to edit and execute our documents in native Qmd format (although we are using Rmd at the moment).
What?
It's a slightly modified version of .Rmd format, where the cell metadata goes in comments at the top of the cell, as in: