V0.8.4 #117

Merged
merged 31 commits into from
Oct 29, 2018
45fee4e
First markdown cell exported as a docstring in sphinx format #107
mwouts Oct 18, 2018
620a8f5
Set cell_marker on first cell with setdefault #107
mwouts Oct 22, 2018
f7e6015
Filter the notebook metadata in the text representation #105 #110
mwouts Oct 22, 2018
4050460
Update mirror representations #105
mwouts Oct 22, 2018
bd56c59
Update tests - filtered notebook metadata #105 #110
mwouts Oct 22, 2018
b560c4b
Version bump
mwouts Oct 22, 2018
6030c80
Test metadata filtering
mwouts Oct 23, 2018
550cff8
Reflowing code and text
mwouts Oct 23, 2018
bced502
Cell and notebook metadata filter #105 #106 #110
mwouts Oct 23, 2018
b52e8ff
cell_marker is another jupytext cell metadata
mwouts Oct 23, 2018
c4845b2
Preserve jupytext specific cell metadata
mwouts Oct 23, 2018
dceade8
jupytext metadata should not be compared
mwouts Oct 23, 2018
7f981e9
Cell and notebook metadata filters in the content manager
mwouts Oct 23, 2018
a8468bf
Fix test for Python 2.7
mwouts Oct 24, 2018
167e9bc
New config option additional_metadata_on_text_files #110
mwouts Oct 24, 2018
ce52b74
Document the filtering of cell and notebook metadata
mwouts Oct 24, 2018
dccc735
Preserve language magic arguments #111
mwouts Oct 24, 2018
4be2cc5
Newline after version is printed
mwouts Oct 24, 2018
d70e091
Version 0.8.4-rc0
mwouts Oct 24, 2018
6bb7b61
Update README.md
mwouts Oct 24, 2018
73bfedf
Update HISTORY.rst
mwouts Oct 24, 2018
ad2127c
Update README.md
mwouts Oct 25, 2018
733675a
Allow blank spaces in jupytext.formats #105
mwouts Oct 25, 2018
d110433
Metadata filtering config is a dictionary #110
mwouts Oct 25, 2018
c291395
Arguments to %%R now supported in Rmd and py formats #111
mwouts Oct 25, 2018
344ed54
v0.8.4-rc1
mwouts Oct 25, 2018
dadb0dc
Test that metadata filtering string is converted to dict
mwouts Oct 26, 2018
3cdb1b3
Update HISTORY.rst
mwouts Oct 26, 2018
f5bf4ed
Tell which file is created/updated/replaced #113
mwouts Oct 29, 2018
3d24bc8
freeze_metadata option #110
mwouts Oct 29, 2018
097a92d
Version 0.8.4
mwouts Oct 29, 2018
10 changes: 10 additions & 0 deletions HISTORY.rst
@@ -3,6 +3,16 @@
 Release History
 ---------------

+0.8.4 (2018-10-29)
+++++++++++++++++++++++
+
+**Improvements**
+
+- Notebook metadata is filtered - only the most common metadata are stored in the text representation (#105)
+- New config option ``freeze_metadata`` on the content manager and on the command line interface (defaults to ``False``). Use this option to avoid creating a YAML header or cell metadata if there was none initially. (#110)
+- Language magic arguments are preserved in R Markdown, and also supported in ``light`` and ``percent`` scripts (#111, #114, #115)
+- First markdown cell exported as a docstring when using the Sphinx format (#107)
+
 0.8.3 (2018-10-19)
 ++++++++++++++++++++++

44 changes: 34 additions & 10 deletions README.md
@@ -177,8 +177,9 @@ jupytext --test --update notebook.ipynb -to py:percent
 Note that `jupytext --test` compares the resulting notebooks according to its expectations. If you wish to proceed to a strict comparison of the two notebooks, use `jupytext --test-strict`, and use the flag `-x` to report with more details on the first difference, if any.

 Please note that
-- When you associate a Jupyter kernel with your text notebook, that information goes to a YAML header at the top of your script or Markdown document. And Jupytext itself may create a `jupytext` entry in the notebook metadata.
-- Cell metadata are available in `light` and `percent` formats for all cell types. Sphinx Gallery scripts in `sphinx` format do not support cell metadata. R Markdown and R scripts in `spin` format support cell metadata for code cells only. Markdown documents do not support cell metadata. And a few cell metadata (`autoscroll`, `collapsed`, `scrolled`, `trusted`) are never included in the text representation, but are still preserved by the paired notebooks, and the `--update` conversion.
+- When you associate a Jupyter kernel with your text notebook, that information goes to a YAML header at the top of your script or Markdown document. And Jupytext itself may create a `jupytext` entry in the notebook metadata. Have a look at the [`freeze_metadata` option](#cell-and-notebook-metadata-filtering) if you want to avoid this.
+- Cell metadata are available in `light` and `percent` formats for all cell types. Sphinx Gallery scripts in `sphinx` format do not support cell metadata. R Markdown and R scripts in `spin` format support cell metadata for code cells only. Markdown documents do not support cell metadata.
+- By default, a few cell metadata are not included in the text representation of the notebook, and only the most standard notebook metadata are exported. Learn more in the [metadata filtering](#cell-and-notebook-metadata-filtering) section.
 - Representing a Jupyter notebook as a Markdown or R Markdown document has the effect of splitting markdown cells with two consecutive blank lines into multiple cells (as the two blank line pattern is used to separate cells).

 ## Format specifications
@@ -253,14 +254,6 @@ The `spin` format implements these [specifications](https://rmarkdown.rstudio.co
 - Markdown cells are commented with `#' `.
 - Code cells are exported verbatim. Cell metadata are signalled with `#+`. Cells end with a blank line, an explicit start of cell marker, or a markdown cell.

-## Extending the `light` and `percent` formats to more languages
-
-You want to extend the `light` and `percent` format to another language? Please let us know! In principle that is easy, and you will only have to:
-- document the language extension and comment by adding one line to `_SCRIPT_EXTENSIONS` in `languages.py`.
-- contribute a sample notebook in `tests\notebooks\ipynb_[language]`.
-- add two tests in `test_mirror.py`: one for the `light` format, and another one for the `percent` format.
-- Make sure that the tests pass, and that the text representations of your notebook, found in `tests\notebooks\mirror\ipynb_to_script` and `tests\notebooks\mirror\ipynb_to_percent`, are valid scripts.
-
 ## Jupyter Notebook or Jupyter Lab?

 Jupytext works very well with the Jupyter Notebook editor, and we recommend that you get used to Jupytext within `jupyter notebook` first.
@@ -282,6 +275,37 @@ c.ContentsManager.comment_magics = True # or False

 Also, you may want some cells to be active only in the Python, or R Markdown representation. For this, use the `active` cell metadata. Set `"active": "ipynb"` if you want that cell to be active only in the Jupyter notebook. And `"active": "py"` if you want it to be active only in the Python script. And `"active": "ipynb,py"` if you want it to be active in both, but not in the R Markdown representation...

+## Cell and notebook metadata filtering
+
+The text representation of the notebook focuses on the part of the notebook that you have written. That is also the part of the notebook that should go under version control. Outputs and metadata that are (re)constructed automatically when the notebook is executed do not need to enter the text representation.
+
+To that end, the cell metadata `autoscroll`, `collapsed`, `scrolled`, `trusted` and `ExecuteTime` are not included in the text representation. And only the required notebook metadata, `kernelspec`, `language_info` and `jupytext`, are saved when a notebook is exported as text.
+
+When a paired notebook is loaded, Jupytext reconstructs the filtered metadata using the `.ipynb` file. Please keep in mind that the `.ipynb` file is typically not distributed across contributors, and that the cell metadata may be lost when an input cell changes (cells are matched according to their contents). Thus, if some cell or notebook metadata are important to your notebook, you should preserve them in the text version. Change the default metadata filtering as follows:
+- If you want to preserve all the notebook metadata except `widgets` and `varInspector` in the YAML header, set the notebook metadata `"jupytext": {"metadata_filter": {"notebook": "all,-widgets,-varInspector"}}`
+- If you want to preserve the `toc` section (in addition to the default YAML header), use `"jupytext": {"metadata_filter": {"notebook": "toc"}}`
+- Lastly, if you want to modify the default cell filter and allow `ExecuteTime` and `autoscroll`, but not `hide_output`, use `"jupytext": {"metadata_filter": {"cells": "ExecuteTime,autoscroll,-hide_output"}}`
+
+A default value for these filters can be set on Jupytext's content manager using, for instance
+```python
+c.ContentsManager.default_notebook_metadata_filter = "all,-widgets,-varInspector"
+c.ContentsManager.default_cell_metadata_filter = "ExecuteTime,autoscroll,-hide_output"
+```
+Help us improve the default configuration: if you are aware of a notebook metadata that should not be filtered, or of a cell metadata that should always be filtered, please open an issue and let us know.
+
+Finally, if you prefer that scripts and Markdown files with no YAML header do not get one (nor additional cell metadata) when opened and saved in Jupyter, use the `freeze_metadata` option of the `jupytext` command line, or set the following option on Jupytext's content manager:
+```python
+c.ContentsManager.freeze_metadata = True
+```
+
+## Extending the `light` and `percent` formats to more languages
+
+You want to extend the `light` and `percent` format to another language? Please let us know! In principle that is easy, and you will only have to:
+- document the language extension and comment by adding one line to `_SCRIPT_EXTENSIONS` in `languages.py`.
+- contribute a sample notebook in `tests\notebooks\ipynb_[language]`.
+- add two tests in `test_mirror.py`: one for the `light` format, and another one for the `percent` format.
+- Make sure that the tests pass, and that the text representations of your notebook, found in `tests\notebooks\mirror\ipynb_to_script` and `tests\notebooks\mirror\ipynb_to_percent`, are valid scripts.
+
 ## Jupytext's releases and backward compatibility

 Jupytext will continue to evolve as we collect more feedback, and discover more ways to represent notebooks as text files. When a new release of Jupytext comes out, we do our best to ensure that it will not break your notebooks. Format changes will not happen often, and we try hard not to introduce breaking changes.
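
The filter strings documented in the README section on cell and notebook metadata filtering can be read as follows: a bare name keeps that entry in addition to the defaults, `all` keeps everything, and a `-name` entry always excludes. Below is a minimal sketch of that reading. The helpers `parse_filter` and `apply_filter` and the sample metadata are hypothetical and are not taken from Jupytext's `metadata_filter` module.

```python
def parse_filter(text):
    """Split a filter string into (keep_all, additional, excluded)."""
    keep_all = False
    additional, excluded = set(), set()
    for item in text.split(','):
        item = item.strip()  # tolerate blank spaces around entries
        if not item:
            continue
        if item == 'all':
            keep_all = True
        elif item.startswith('-'):
            excluded.add(item[1:])
        else:
            additional.add(item)
    return keep_all, additional, excluded


def apply_filter(metadata, text, default_keys):
    """Keep the entries allowed by the filter (assumed semantics, see lead-in)."""
    keep_all, additional, excluded = parse_filter(text)
    return {key: value for key, value in metadata.items()
            if key not in excluded
            and (keep_all or key in additional or key in default_keys)}


# Hypothetical notebook metadata, for illustration only
notebook_metadata = {'kernelspec': {'name': 'python3'},
                     'toc': {'base_numbering': 1},
                     'celltoolbar': 'Edit Metadata',
                     'widgets': {}}
defaults = {'kernelspec', 'language_info', 'jupytext'}

print(sorted(apply_filter(notebook_metadata, 'toc', defaults)))
print(sorted(apply_filter(notebook_metadata, 'all,-widgets,-varInspector', defaults)))
```

With the `toc` filter only `kernelspec` and `toc` survive, while `all,-widgets,-varInspector` also lets `celltoolbar` through and still drops `widgets`.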
19 changes: 6 additions & 13 deletions jupytext/cell_metadata.py
@@ -26,14 +26,14 @@

 _BOOLEAN_OPTIONS_DICTIONARY = [('hide_input', 'echo', True),
                                ('hide_output', 'include', True)]
-_IGNORE_METADATA = [
+_IGNORE_CELL_METADATA = ','.join('-{}'.format(name) for name in [
     # Frequent cell metadata that should not enter the text representation
     # (these metadata are preserved in the paired Jupyter notebook).
     'autoscroll', 'collapsed', 'scrolled', 'trusted', 'ExecuteTime',
     # Pre-jupytext metadata
     'skipline', 'noskipline',
     # Jupytext metadata
-    'lines_to_next_cell', 'lines_to_end_of_cell_marker']
+    'cell_marker', 'lines_to_next_cell', 'lines_to_end_of_cell_marker'])
 _PERCENT_CELL = re.compile(
     r'(# |#)%%([^\{\[]*)(|\[raw\]|\[markdown\])([^\{\[]*)(|\{.*\})\s*$')
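
The hunk above turns the list of ignored names into a single filter string. The snippet below evaluates the same expression on its own to show the resulting value. The variable names are illustrative, and the list is copied from the added lines.

```python
names = ['autoscroll', 'collapsed', 'scrolled', 'trusted', 'ExecuteTime',
         'skipline', 'noskipline',
         'cell_marker', 'lines_to_next_cell', 'lines_to_end_of_cell_marker']

# Same expression as in the diff: prefix each name with '-' and join with commas
ignore_cell_metadata = ','.join('-{}'.format(name) for name in names)
print(ignore_cell_metadata)
```

The result uses the same `-name` exclusion syntax as the user-facing filters described in the README.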

@@ -68,7 +68,6 @@ def metadata_to_rmd_options(language, metadata):
     :return:
     """
     options = (language or 'R').lower()
-    metadata = filter_metadata(metadata)
     if 'name' in metadata:
         options += ' ' + metadata['name'] + ','
         del metadata['name']
@@ -237,9 +236,6 @@ def rmd_options_to_metadata(options):
         else:
             if update_metadata_from_rmd_options(name, value, metadata):
                 continue
-            if name == 'active':
-                metadata[name] = value.replace('"', '').replace("'", '')
-                continue
             try:
                 metadata[name] = _py_logical_values(value)
                 continue
@@ -252,7 +248,7 @@ def rmd_options_to_metadata(options):
     if ('active' in metadata or metadata.get('run_control', {}).get('frozen') is True) and 'eval' in metadata:
         del metadata['eval']

-    return language, metadata
+    return metadata.get('language') or language, metadata


 def md_options_to_metadata(options):
@@ -283,7 +279,9 @@ def try_eval_metadata(metadata, name):
     value = metadata[name]
     if not isinstance(value, (str, unicode)):
         return
-    if value.startswith('"') or value.startswith("'"):
+    if (value.startswith('"') and value.endswith('"')) or (value.startswith("'") and value.endswith("'")):
+        if name in ['active', 'magic_args', 'language']:
+            metadata[name] = value[1:-1]
         return
     if value.startswith('c(') and value.endswith(')'):
         value = '[' + value[2:-1] + ']'
@@ -304,11 +302,6 @@ def json_options_to_metadata(options, add_brackets=True):
         return {}


-def filter_metadata(metadata):
-    """Filter technical metadata"""
-    return {k: metadata[k] for k in metadata if k not in _IGNORE_METADATA}
-
-
 def metadata_to_json_options(metadata):
     """Represent metadata as json text"""
     return json.dumps(metadata)
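
The change to `try_eval_metadata` replaces the special case for `active` that was removed from `rmd_options_to_metadata`: a value is unquoted only when it is quoted on both ends and the option is one of `active`, `magic_args` or `language`. The sketch below isolates that branch. The function name `unquote_text_option` is hypothetical, and `str` stands in for the `(str, unicode)` check, which assumes Python 3.

```python
TEXT_OPTIONS = ['active', 'magic_args', 'language']


def unquote_text_option(metadata, name):
    """Strip matching outer quotes for options that are plain text."""
    value = metadata[name]
    if not isinstance(value, str):
        return
    quoted = (value.startswith('"') and value.endswith('"')) or \
             (value.startswith("'") and value.endswith("'"))
    if quoted and name in TEXT_OPTIONS:
        metadata[name] = value[1:-1]


# Hypothetical chunk options, for illustration only
metadata = {'active': '"ipynb,py"', 'magic_args': "'-i x'", 'fig.cap': '"A caption"'}
for key in list(metadata):
    unquote_text_option(metadata, key)
print(metadata)
```

Other quoted options such as `fig.cap` keep their quotes, so they can still be written back unchanged to the R Markdown chunk header.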
13 changes: 8 additions & 5 deletions jupytext/cell_reader.py
@@ -76,7 +76,6 @@ class BaseCellReader(object):

     cell_type = None
     language = None
-    default_language = 'python'
     default_comment_magics = None
     metadata = None
     content = []
@@ -95,6 +94,7 @@ class BaseCellReader(object):
     def __init__(self, ext, comment_magics=None):
         """Create a cell reader with empty content"""
         self.ext = ext
+        self.default_language = _SCRIPT_EXTENSIONS.get(ext, {}).get('language', 'python')
         self.comment_magics = comment_magics if comment_magics is not None else self.default_comment_magics

     def read(self, lines):
@@ -106,8 +106,7 @@ def read(self, lines):
         self.metadata_and_language_from_option_line(lines[0])

         if self.metadata and 'language' in self.metadata:
-            self.language = self.metadata['language']
-            del self.metadata['language']
+            self.language = self.metadata.pop('language')

         # Parse cell till its end and set content, lines_to_next_cell
         pos_next_cell = self.find_cell_content(lines)
@@ -202,10 +201,14 @@ def find_cell_content(self, lines):
         # Cell content
         source = lines[cell_start:cell_end_marker]

-        self.content = self.uncomment_code_and_magics(source)
+        if not is_active(self.ext, self.metadata) or \
+                ('active' not in self.metadata and self.language and self.language != self.default_language):
+            self.content = uncomment(source, self.comment if self.ext != '.R' else '#')
+        else:
+            self.content = self.uncomment_code_and_magics(source)

         # Exactly two empty lines at the end of cell (caused by PEP8)?
-        if (self.ext == '.py' and explicit_eoc and last_two_lines_blank(source)):
+        if self.ext == '.py' and explicit_eoc and last_two_lines_blank(source):
             self.content = source[:-2]
             self.metadata['lines_to_end_of_cell_marker'] = 2
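
In this file the reader's default language is no longer hard-coded to `'python'` but looked up from the file extension. Here is a small sketch of that lookup and of the `dict.pop` idiom that replaces the read-then-delete pair. The `SCRIPT_EXTENSIONS` excerpt is hypothetical: only the `language` and `comment` keys are confirmed by the diff, and the entries are assumptions.

```python
# Hypothetical excerpt; the real `_SCRIPT_EXTENSIONS` lives in `languages.py`
SCRIPT_EXTENSIONS = {'.py': {'language': 'python', 'comment': '#'},
                     '.R': {'language': 'R', 'comment': '#'},
                     '.jl': {'language': 'julia', 'comment': '#'}}


def default_language(ext):
    """Language for an extension, falling back to Python for unknown ones."""
    return SCRIPT_EXTENSIONS.get(ext, {}).get('language', 'python')


print(default_language('.R'))   # known script extension
print(default_language('.md'))  # not a script extension: falls back to 'python'

# `pop` returns the value and removes the key in one step
metadata = {'language': 'R', 'name': 'plot'}
language = metadata.pop('language')
print(language, metadata)
```

The chained `.get(ext, {})` keeps the lookup safe for extensions such as `.md` or `.Rmd` that have no entry in the dictionary.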

59 changes: 42 additions & 17 deletions jupytext/cell_to_text.py
@@ -3,9 +3,9 @@
 import re
 from copy import copy
 from .languages import cell_language
-from .cell_metadata import filter_metadata, is_active, \
-    metadata_to_rmd_options, metadata_to_json_options, \
-    metadata_to_double_percent_options
+from .cell_metadata import is_active, _IGNORE_CELL_METADATA
+from .cell_metadata import metadata_to_rmd_options, metadata_to_json_options, metadata_to_double_percent_options
+from .metadata_filter import filter_metadata
 from .magics import comment_magic, escape_code_start
 from .cell_reader import LightScriptCellReader
 from .languages import _SCRIPT_EXTENSIONS
@@ -31,13 +31,29 @@ def comment_lines(lines, prefix):
 class BaseCellExporter(object):
     """A class that represent a notebook cell as text"""
     default_comment_magics = None
+    parse_cell_language = True

-    def __init__(self, cell, default_language, ext, comment_magics=None):
+    def __init__(self, cell, default_language, ext, comment_magics=None, cell_metadata_filter=None):
         self.ext = ext
         self.cell_type = cell.cell_type
         self.source = cell_source(cell)
-        self.metadata = filter_metadata(cell.metadata)
-        self.language = cell_language(self.source) or default_language
+        self.unfiltered_metadata = cell.metadata
+        self.metadata = filter_metadata(copy(cell.metadata), cell_metadata_filter, _IGNORE_CELL_METADATA)
+        self.language, magic_args = cell_language(self.source) if self.parse_cell_language else (None, None)
+
+        if self.language:
+            if magic_args:
+                if ext.endswith('.Rmd'):
+                    if "'" in magic_args:
+                        magic_args = '"' + magic_args + '"'
+                    else:
+                        magic_args = "'" + magic_args + "'"
+                self.metadata['magic_args'] = magic_args
+
+            if not ext.endswith('.Rmd'):
+                self.metadata['language'] = self.language
+
+        self.language = self.language or default_language
         self.default_language = default_language
         self.comment = _SCRIPT_EXTENSIONS.get(ext, {}).get('comment', '#')
         self.comment_magics = comment_magics if comment_magics is not None else self.default_comment_magics
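
For R Markdown, the constructor above wraps the magic arguments in whichever quote character does not already occur in them, so that they can sit in a chunk header as a single option value. The sketch below isolates that choice. The function name `quote_magic_args` and the sample arguments are hypothetical.

```python
def quote_magic_args(magic_args):
    """Quote magic arguments, preferring single quotes unless one is present."""
    if "'" in magic_args:
        return '"' + magic_args + '"'
    return "'" + magic_args + "'"


print(quote_magic_args('-i df -w 5'))
print(quote_magic_args("-o 'result'"))
```

As written in the diff, arguments that contain both quote characters are wrapped in double quotes without escaping, so that case may deserve a test.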
@@ -96,8 +112,8 @@ class MarkdownCellExporter(BaseCellExporter):
     """A class that represent a notebook cell as Markdown"""
     default_comment_magics = False

-    def __init__(self, cell, default_language, ext, comment_magics=None):
-        BaseCellExporter.__init__(self, cell, default_language, ext, comment_magics)
+    def __init__(self, *args, **kwargs):
+        BaseCellExporter.__init__(self, *args, **kwargs)
         self.comment = ''

     def code_to_text(self):
@@ -119,8 +135,8 @@ class RMarkdownCellExporter(BaseCellExporter):
     """A class that represent a notebook cell as Markdown"""
     default_comment_magics = True

-    def __init__(self, cell, default_language, ext, comment_magics=None):
-        BaseCellExporter.__init__(self, cell, default_language, ext, comment_magics)
+    def __init__(self, *args, **kwargs):
+        BaseCellExporter.__init__(self, *args, **kwargs)
         self.comment = ''

     def code_to_text(self):
@@ -158,6 +174,12 @@ class LightScriptCellExporter(BaseCellExporter):
     """A class that represent a notebook cell as a Python or Julia script"""
     default_comment_magics = True

+    def __init__(self, *args, **kwargs):
+        BaseCellExporter.__init__(self, *args, **kwargs)
+        for key in ['endofcell']:
+            if key in self.unfiltered_metadata:
+                self.metadata[key] = self.unfiltered_metadata[key]
+
     def is_code(self):
         # Treat markdown cells with metadata as code cells (#66)
         if self.cell_type == 'markdown' and self.metadata:
@@ -169,10 +191,8 @@ def is_code(self):
     def code_to_text(self):
         """Return the text representation of a code cell"""
         active = is_active(self.ext, self.metadata)
-        if active and self.language != self.default_language:
+        if self.language != self.default_language and 'active' not in self.metadata:
             active = False
-            self.metadata['active'] = 'ipynb'
-            self.metadata['language'] = self.language

         source = copy(self.source)
         escape_code_start(source, self.ext, self.language)
@@ -232,8 +252,8 @@ class RScriptCellExporter(BaseCellExporter):
     """A class that can represent a notebook cell as a R script"""
     default_comment_magics = True

-    def __init__(self, cell, default_language, ext, comment_magics=None):
-        BaseCellExporter.__init__(self, cell, default_language, ext, comment_magics)
+    def __init__(self, *args, **kwargs):
+        BaseCellExporter.__init__(self, *args, **kwargs)
         self.comment = "#'"

     def code_to_text(self):
@@ -267,6 +287,7 @@ class DoublePercentCellExporter(BaseCellExporter):
     """A class that can represent a notebook cell as an
     Hydrogen/Spyder/VScode script (#59)"""
     default_comment_magics = False
+    parse_cell_language = False

     def code_to_text(self):
         """Not used"""
@@ -303,10 +324,14 @@ class SphinxGalleryCellExporter(BaseCellExporter):
     default_cell_marker = '#' * 79
     default_comment_magics = True

-    def __init__(self, cell, default_language, ext, comment_magics=None):
-        BaseCellExporter.__init__(self, cell, default_language, ext, comment_magics)
+    def __init__(self, *args, **kwargs):
+        BaseCellExporter.__init__(self, *args, **kwargs)
         self.comment = '#'

+        for key in ['cell_marker']:
+            if key in self.unfiltered_metadata:
+                self.metadata[key] = self.unfiltered_metadata[key]
+
     def code_to_text(self):
         """Not used"""
         pass
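
Two patterns recur in the exporter subclasses above. Constructors forward `*args, **kwargs` so that the new `cell_metadata_filter` argument reaches the base class without each signature being repeated, and format-specific keys are copied back from `unfiltered_metadata` after the generic filter has removed them. The sketch below uses stand-in classes to show both. `BaseExporter`, `SphinxLikeExporter` and the `ignored` parameter are hypothetical simplifications, not Jupytext's classes.

```python
from copy import copy


class BaseExporter(object):
    """Stand-in for the base exporter: keeps raw metadata and a filtered copy."""

    def __init__(self, metadata, ignored=('cell_marker', 'trusted')):
        self.unfiltered_metadata = metadata
        self.metadata = {key: value for key, value in copy(metadata).items()
                         if key not in ignored}


class SphinxLikeExporter(BaseExporter):
    """Stand-in for a subclass that needs one filtered key back."""

    def __init__(self, *args, **kwargs):
        # Forward everything, so new base-class arguments need no change here
        BaseExporter.__init__(self, *args, **kwargs)
        for key in ['cell_marker']:
            if key in self.unfiltered_metadata:
                self.metadata[key] = self.unfiltered_metadata[key]


exporter = SphinxLikeExporter({'cell_marker': '"""', 'trusted': True, 'tags': ['demo']})
print(exporter.metadata)
```

The filtered copy drops both `cell_marker` and `trusted`, and the subclass restores only the key its format actually uses.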