ENH: Enable short_caption in to_latex #35668

Merged on Oct 17, 2020 (39 commits)
Changes from 34 commits
Commits
13d07ae
Extract helper method for caption & label macro
ivanovmg Aug 11, 2020
8912c81
Enable short_caption for df.to_latex
ivanovmg Aug 11, 2020
3f71b46
Replace unwanted pytest.warns with tm.assert...
ivanovmg Aug 11, 2020
22d2ca4
Fix missing f-string placeholder
ivanovmg Aug 11, 2020
01152c1
Apply black
ivanovmg Aug 11, 2020
6adb05c
Optionally unpack caption=(caption, short_caption)
ivanovmg Aug 18, 2020
4725d73
Merge branch 'master' into feature/latex-shortcaption
ivanovmg Aug 18, 2020
a7b64c0
Add edge cases for caption testing
ivanovmg Aug 18, 2020
a2216a1
Pass through black
ivanovmg Aug 18, 2020
6c93de4
Remove typing and short_caption from to_latex
ivanovmg Aug 19, 2020
45de6eb
Merge branch 'master' into feature/latex-shortcaption
ivanovmg Sep 7, 2020
e70aafa
DOC: add parameters to LatexFormatter docstring
ivanovmg Sep 7, 2020
a0e3f53
TYP: remove type ignore for column_format
ivanovmg Sep 8, 2020
6725ca8
REF: move short caption parsing to LatexFormatter
ivanovmg Sep 8, 2020
8466421
DOC: add whatsnew for position and short caption
ivanovmg Sep 10, 2020
0cc0664
DOC: add issue number for short caption
ivanovmg Sep 10, 2020
c60a705
Merge branch 'master' into feature/latex-shortcaption
ivanovmg Sep 10, 2020
27891a3
CLN: update error message
ivanovmg Sep 13, 2020
e43b52f
TST: ensure that error message is tested
ivanovmg Sep 13, 2020
fbea9eb
TST: add tests for bad tuples
ivanovmg Sep 13, 2020
29b37e2
Merge branch 'master' into feature/latex-shortcaption
ivanovmg Sep 13, 2020
ed0132c
DOC: add/update versionadded, versionchanged tags
ivanovmg Sep 13, 2020
ed4b705
TST: add assertions in caption setter to help mypy
ivanovmg Sep 13, 2020
3453d43
DOC: add reason for caption type ignore
ivanovmg Sep 13, 2020
16884bc
DOC: add reason for strrows arg-type ignore
ivanovmg Sep 13, 2020
6fd52ca
DOC: add missing empty line before versionadded
ivanovmg Sep 13, 2020
5acfc44
Merge branch 'master' into feature/latex-shortcaption
ivanovmg Sep 14, 2020
ae1babe
REF: replace caption setter with initialize method
ivanovmg Sep 22, 2020
ddec116
Merge branch 'master' into feature/latex-shortcaption
ivanovmg Sep 22, 2020
15227b3
TST: align longtable test with the recent changes
ivanovmg Sep 22, 2020
336bfb5
Merge branch 'master' into feature/latex-shortcaption
ivanovmg Sep 23, 2020
b30d2d7
TST: add for list [full_caption, short_caption]
ivanovmg Sep 23, 2020
f1781f6
Merge branch 'master' into feature/latex-shortcaption
ivanovmg Sep 25, 2020
f4b18ff
Merge branch 'master' into feature/latex-shortcaption
ivanovmg Sep 30, 2020
a18c85d
Merge branch 'master' into feature/latex-shortcaption
ivanovmg Oct 7, 2020
e957c37
REF: use string concat for caption macro
ivanovmg Oct 7, 2020
09d9c85
REF: move method to module level
ivanovmg Oct 7, 2020
559ca2a
REF: drop property _short_caption_macro
ivanovmg Oct 7, 2020
0fd73e8
Merge branch 'master' into feature/latex-shortcaption
ivanovmg Oct 17, 2020
26 changes: 26 additions & 0 deletions doc/source/whatsnew/v1.2.0.rst
@@ -96,6 +96,32 @@ For example:
buffer = io.BytesIO()
data.to_csv(buffer, mode="w+b", encoding="utf-8", compression="gzip")

Support for short caption and table position in ``to_latex``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

:meth:`DataFrame.to_latex` now allows one to specify
a floating table position (:issue:`35281`)
and a short caption (:issue:`36267`).

New keyword ``position`` is implemented to set the position.

.. ipython:: python

data = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
table = data.to_latex(position='ht')
print(table)

Usage of keyword ``caption`` is extended.
Besides taking a single string as an argument,
Contributor

I think we could add a link to to_latex here: https://pandas.pydata.org/docs/user_guide/io.html (as a follow-on, since it probably needs a short section as well).

one can optionally provide a tuple of ``(full_caption, short_caption)``
to add a short caption macro.

.. ipython:: python

data = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
table = data.to_latex(caption=('the full long caption', 'short caption'))
print(table)

.. _whatsnew_120.read_csv_table_precision_default:

Change in default floating precision for ``read_csv`` and ``read_table``
18 changes: 14 additions & 4 deletions pandas/core/generic.py
@@ -3002,6 +3002,9 @@ def to_latex(
.. versionchanged:: 1.0.0
Added caption and label arguments.

.. versionchanged:: 1.2.0
Added position argument, changed meaning of caption argument.

Parameters
----------
buf : str, Path or StringIO-like, optional, default None
@@ -3063,11 +3066,16 @@ def to_latex(
centered labels (instead of top-aligned) across the contained
rows, separating groups via clines. The default will be read
from the pandas config module.
-caption : str, optional
-The LaTeX caption to be placed inside ``\caption{{}}`` in the output.
+caption : str or tuple, optional
+Tuple (full_caption, short_caption),
+which results in ``\caption[short_caption]{{full_caption}}``;
+if a single string is passed, no short caption will be set.

.. versionadded:: 1.0.0
Member

Suggested change
-.. versionadded:: 1.0.0
+.. versionchanged:: 1.2.0

Member Author

But I think I should keep .. versionadded for consistency. Shouldn't I?

Member

Sure, you can have both. Add a one-liner describing the changes.

Member Author

Added these tags for caption, position (PR #35284) and to_latex function itself.


.. versionchanged:: 1.2.0
Optionally allow caption to be a tuple ``(full_caption, short_caption)``.

label : str, optional
The LaTeX label to be placed inside ``\label{{}}`` in the output.
This is used with ``\ref{{}}`` in the main ``.tex`` file.
@@ -3076,6 +3084,8 @@ def to_latex(
position : str, optional
The LaTeX positional argument for tables, to be placed after
``\begin{{}}`` in the output.

.. versionadded:: 1.2.0
{returns}
See Also
--------
@@ -3086,8 +3096,8 @@ def to_latex(
Examples
--------
>>> df = pd.DataFrame(dict(name=['Raphael', 'Donatello'],
-... mask=['red', 'purple'],
-... weapon=['sai', 'bo staff']))
+... mask=['red', 'purple'],
+... weapon=['sai', 'bo staff']))
>>> print(df.to_latex(index=False)) # doctest: +NORMALIZE_WHITESPACE
\begin{{tabular}}{{lll}}
\toprule
2 changes: 1 addition & 1 deletion pandas/io/formats/format.py
@@ -1021,7 +1021,7 @@ def to_latex(
multicolumn: bool = False,
multicolumn_format: Optional[str] = None,
multirow: bool = False,
-caption: Optional[str] = None,
+caption: Optional[Union[str, Tuple[str, str]]] = None,
label: Optional[str] = None,
position: Optional[str] = None,
) -> Optional[str]:
71 changes: 63 additions & 8 deletions pandas/io/formats/latex.py
@@ -2,7 +2,7 @@
Module for formatting output data in Latex.
"""
from abc import ABC, abstractmethod
-from typing import IO, Iterator, List, Optional, Type
+from typing import IO, Iterator, List, Optional, Tuple, Type, Union

import numpy as np

@@ -275,6 +275,8 @@ class TableBuilderAbstract(ABC):
Use multirow to enhance MultiIndex rows.
caption: str, optional
Table caption.
short_caption: str, optional
Table short caption.
label: str, optional
LaTeX label.
position: str, optional
@@ -289,6 +291,7 @@ def __init__(
multicolumn_format: Optional[str] = None,
multirow: bool = False,
caption: Optional[str] = None,
short_caption: Optional[str] = None,
label: Optional[str] = None,
position: Optional[str] = None,
):
@@ -298,6 +301,7 @@ def __init__(
self.multicolumn_format = multicolumn_format
self.multirow = multirow
self.caption = caption
self.short_caption = short_caption
self.label = label
self.position = position

@@ -384,8 +388,23 @@ def _position_macro(self) -> str:

@property
def _caption_macro(self) -> str:
-r"""Caption macro, extracted from self.caption, like \caption{cap}."""
-return f"\\caption{{{self.caption}}}" if self.caption else ""
+r"""Caption macro, extracted from self.caption.
+
+With short caption:
+\caption[short_caption]{caption_string}.
+
+Without short caption:
+\caption{caption_string}.
+"""
+if self.caption:
+return f"\\caption{self._short_caption_macro}{{{self.caption}}}"
Member

Suggested change
-return f"\\caption{self._short_caption_macro}{{{self.caption}}}"
+return f"\\caption{self._short_caption or ''}{{{self.caption}}}"

This can be simplified to get rid of the property

Member Author

short_caption_macro is either an empty string or [short caption text] (in square brackets).
Putting this logic into f-string seems to be rather complicated.
IMHO having a separate property (_short_caption_macro) with clear intent is more readable.

Member Author

@WillAyd, does my answer make sense?
@jreback, @simonjayhawkins, @toobaz, any plans on merging this?
Or does it require additional work?

Contributor

@ivanovmg am looking now, we have 200 open PRs; these take time

Contributor

The way you have it is OK; the f-string is actually pretty complicated to grok because of the substitutions. If you can make it simpler I would take that (e.g. maybe use concatenation).

Member Author

@ivanovmg ivanovmg Oct 7, 2020

I switched to string joining (with the substitution done in place) and removed the property _short_caption_macro, as @WillAyd suggested. Would you consider it more readable?

return ""

@property
Member

See above. I think this can be removed altogether.

def _short_caption_macro(self) -> str:
if self.short_caption:
return f"[{self.short_caption}]"
return ""
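
The later commits listed above (e957c37 "use string concat for caption macro", 559ca2a "drop property _short_caption_macro") are not part of the 34-commit diff shown here. The following is only a minimal sketch of what the concatenation approach discussed in the thread could look like, written as a standalone function rather than a builder property. The name `caption_macro` and its signature are illustrative assumptions, not the merged pandas code.

```python
def caption_macro(caption: str, short_caption: str = "") -> str:
    r"""Build a LaTeX caption macro (hypothetical standalone helper).

    Returns \caption[short]{full} when a short caption is given,
    \caption{full} when it is not, and "" when there is no caption.
    """
    if not caption:
        return ""
    # The optional [short] argument is emitted only when a short caption exists.
    short_part = f"[{short_caption}]" if short_caption else ""
    # Plain concatenation avoids the triple-brace escaping an f-string needs.
    return r"\caption" + short_part + "{" + caption + "}"


print(caption_macro("the full long caption", "short caption"))
print(caption_macro("the full long caption"))
```

Keeping the bracketed part in one local variable preserves the intent of the dropped property without the extra indirection.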

@property
def _label_macro(self) -> str:
@@ -596,15 +615,32 @@ def env_end(self) -> str:


class LatexFormatter(TableFormatter):
-"""
+r"""
Used to render a DataFrame to a LaTeX tabular/longtable environment output.

Parameters
----------
formatter : `DataFrameFormatter`
longtable : bool, default False
Use longtable environment.
column_format : str, default None
The columns format as specified in `LaTeX table format
<https://en.wikibooks.org/wiki/LaTeX/Tables>`__ e.g 'rcl' for 3 columns
multicolumn : bool, default False
Use \multicolumn to enhance MultiIndex columns.
multicolumn_format : str, default 'l'
The alignment for multicolumns, similar to `column_format`
multirow : bool, default False
Use \multirow to enhance MultiIndex rows.
caption : str or tuple, optional
Tuple (full_caption, short_caption),
which results in \caption[short_caption]{full_caption};
if a single string is passed, no short caption will be set.
label : str, optional
The LaTeX label to be placed inside ``\label{}`` in the output.
position : str, optional
jreback marked this conversation as resolved.
The LaTeX positional argument for tables, to be placed after
``\begin{}`` in the output.

See Also
--------
@@ -619,18 +655,18 @@ def __init__(
multicolumn: bool = False,
multicolumn_format: Optional[str] = None,
multirow: bool = False,
-caption: Optional[str] = None,
+caption: Optional[Union[str, Tuple[str, str]]] = None,
label: Optional[str] = None,
position: Optional[str] = None,
):
self.fmt = formatter
self.frame = self.fmt.frame
self.longtable = longtable
-self.column_format = column_format # type: ignore[assignment]
+self.column_format = column_format
self.multicolumn = multicolumn
self.multicolumn_format = multicolumn_format
self.multirow = multirow
-self.caption = caption
+self.caption, self.short_caption = self._split_into_long_short_caption(caption)
self.label = label
self.position = position

@@ -658,6 +694,7 @@ def builder(self) -> TableBuilderAbstract:
multicolumn_format=self.multicolumn_format,
multirow=self.multirow,
caption=self.caption,
short_caption=self.short_caption,
label=self.label,
position=self.position,
)
@@ -670,8 +707,26 @@ def _select_builder(self) -> Type[TableBuilderAbstract]:
return RegularTableBuilder
return TabularBuilder

def _split_into_long_short_caption(
self, caption: Optional[Union[str, Tuple[str, str]]]
) -> Tuple[str, str]:
if caption:
if isinstance(caption, str):
long_caption = caption
short_caption = ""
else:
try:
long_caption, short_caption = caption
except ValueError as err:
msg = "caption must be either a string or a tuple of two strings"
raise ValueError(msg) from err
Member

Can you add a test that hits this?

Member Author

It is there, if you mean raising ValueError.

Line 561.

        # test that wrong number of params is raised
        with pytest.raises(ValueError):
            df.to_latex(caption=(the_caption, the_short_caption, "extra_string"))

Member Author

I updated the test to ensure that the error message matches expectations.
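
Checking only the exception type would also pass if some unrelated ValueError were raised, which is presumably why the test was tightened to match the message. A minimal sketch of that kind of check using only the standard library follows. `unpack_caption` is a hypothetical stand-in for the PR's helper, and the pandas test suite uses pytest rather than `unittest`.

```python
import unittest


def unpack_caption(caption):
    """Hypothetical stand-in: unpack (full_caption, short_caption)."""
    try:
        full_caption, short_caption = caption
    except ValueError as err:
        msg = "caption must be either a string or a tuple of two strings"
        raise ValueError(msg) from err
    return full_caption, short_caption


class TestBadCaptionTuple(unittest.TestCase):
    def test_wrong_number_of_items(self):
        msg = "caption must be either a string or a tuple of two strings"
        for bad in [("full", "short", "extra_string"), ("full",)]:
            # assertRaisesRegex checks both the exception type and its text.
            with self.assertRaisesRegex(ValueError, msg):
                unpack_caption(bad)
```

The class can be run with `python -m unittest` from the file that contains it.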

else:
long_caption = ""
Contributor

Can you make this a free function?

Member Author

I moved this function to module level.

short_caption = ""
return long_caption, short_caption
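
The request for a free function was addressed in commit 09d9c85 ("move method to module level"), which falls outside the diff shown here. Below is a sketch of a module-level version under that assumption. The name `split_caption` is illustrative, and the body mirrors the method above rather than quoting the merged code.

```python
from typing import Optional, Tuple, Union


def split_caption(
    caption: Optional[Union[str, Tuple[str, str]]]
) -> Tuple[str, str]:
    """Split caption into (full_caption, short_caption).

    Hypothetical module-level variant of the method in this diff.
    """
    if not caption:
        # None or an empty value: no caption at all.
        return "", ""
    if isinstance(caption, str):
        # A plain string carries no short caption.
        return caption, ""
    try:
        full_caption, short_caption = caption
    except ValueError as err:
        msg = "caption must be either a string or a tuple of two strings"
        raise ValueError(msg) from err
    return full_caption, short_caption


print(split_caption("only full"))
print(split_caption(("full", "short")))
print(split_caption(None))
```

Because the function does not use `self`, moving it out of the class changes no behaviour and lets it be tested without constructing a formatter.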

@property
-def column_format(self) -> str:
+def column_format(self) -> Optional[str]:
"""Column format."""
return self._column_format
