Split handling of HTML attributes & style CSS properties #1211

Merged 3 commits on Jun 28, 2024
11 changes: 7 additions & 4 deletions CHANGELOG.md
@@ -24,9 +24,9 @@ This can also be enabled programmatically with `warnings.simplefilter('default',
* support for quadratic and cubic Bézier curves with [`FPDF.bezier()`](https://py-pdf.github.io/fpdf2/fpdf/Shapes.html#fpdf.fpdf.FPDF.bezier) - thanks to @awmc000
* feature to identify the Unicode script of the input text and break it into fragments when different scripts are used, improving [text shaping](https://py-pdf.github.io/fpdf2/TextShaping.html) results
* [`FPDF.image()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.image): now handles `keep_aspect_ratio` in combination with an enum value provided to `x`
* file names are mentioned in errors when `fpdf2` fails to parse a SVG image
* [`FPDF.write_html()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.write_html): now supports CSS page break properties: [documentation](https://py-pdf.github.io/fpdf2/HTML.html#page-breaks)
* [`FPDF.write_html()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.write_html): spacing before lists can now be adjusted via the `HTML2FPDF.list_vertical_margin` attribute - thanks to @lcgeneralprojects
* [`FPDF.write_html()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.write_html): spacing before lists can now be adjusted via the `tag_styles` attribute - thanks to @lcgeneralprojects
* file names are mentioned in errors when `fpdf2` fails to parse a SVG image
### Fixed
* [`FPDF.local_context()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.local_context) used to leak styling during page breaks, when rendering `footer()` & `header()`
* [`fpdf.drawing.DeviceCMYK`](https://py-pdf.github.io/fpdf2/fpdf/drawing.html#fpdf.drawing.DeviceCMYK) objects can now be passed to [`FPDF.set_draw_color()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.set_draw_color), [`FPDF.set_fill_color()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.set_fill_color) and [`FPDF.set_text_color()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.set_text_color) without raising a `ValueError`: [documentation](https://py-pdf.github.io/fpdf2/Text.html#text-formatting).
@@ -38,10 +38,13 @@ This can also be enabled programmatically with `warnings.simplefilter('default',
* default values for `top_margin` and `bottom_margin` in `HTML2FPDF._new_paragraph()` calls are now correctly converted into chosen document units.
* In [text_columns()](https://py-pdf.github.io/fpdf2/extColumns.html), paragraph top/bottom margins didn't correctly trigger column breaks; [issue #1214](https://github.com/py-pdf/fpdf2/issues/1214)
### Removed
* an obscure and undocumented [feature](https://github.com/py-pdf/fpdf2/issues/1198) of [`FPDF.write_html()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.write_html), which used to magically pass local variables as arguments.
* an obscure and undocumented [feature](https://github.com/py-pdf/fpdf2/issues/1198) of [`FPDF.write_html()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.write_html), which used to magically pass instance attributes as arguments.
### Deprecated
* `fpdf.TitleStyle` has been renamed into `fpdf.TextStyle`
* [`FPDF.write_html()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.write_html): `tag_indents` introduced in the last version - Now the indentation can be provided through the `tag_styles` parameter, using the `.l_margin` of `TextStyle` instances
### Changed
* [`FPDF.table()`](https://py-pdf.github.io/fpdf2/Tables.html) now raises an error when a single row is too high to be rendered on a single page
* [`FPDF.write_html()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.write_html): `tag_indents` can now be non-integer. Indentation of HTML elements is now independent of font size and bullet strings.
* [`FPDF.write_html()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.write_html): indentation of HTML elements can now be non-integer (float), and is now independent of font size and bullet strings.
* improved performance of font glyph selection by using functools cache

## [2.7.9] - 2024-05-17
24 changes: 19 additions & 5 deletions docs/HTML.md
@@ -97,16 +97,16 @@ pdf.write_html("""
<p>Hello world!</p>
</section>
""", tag_styles={
"h1": FontFace(color=(148, 139, 139), size_pt=32),
"h2": FontFace(color=(148, 139, 139), size_pt=24),
"h1": FontFace(color="#948b8b", size_pt=32),
"h2": FontFace(color="#948b8b", size_pt=24),
})
pdf.output("html_styled.pdf")
```
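
The hunk above swaps the RGB tuple `(148, 139, 139)` for the hex string `"#948b8b"`. As a quick sanity check that both notations describe the same color, here is a minimal sketch of a hex-to-RGB conversion. `hex_to_rgb` is a hypothetical helper written for illustration, not the parser `fpdf2` itself uses; it also expands the three-digit shorthand form such as `"#ccc"`.

```python
def hex_to_rgb(color: str) -> tuple:
    """Convert "#rrggbb" or "#rgb" into an (r, g, b) tuple of ints in 0-255.

    Hypothetical helper for illustration only; not part of fpdf2.
    """
    digits = color.lstrip("#")
    if len(digits) == 3:
        # Shorthand notation: each digit is doubled ("ccc" becomes "cccccc")
        digits = "".join(ch * 2 for ch in digits)
    if len(digits) != 6:
        raise ValueError(f"Unsupported color string: {color!r}")
    # Read the string two hex digits at a time: red, green, blue
    return tuple(int(digits[i : i + 2], 16) for i in (0, 2, 4))


heading_color = hex_to_rgb("#948b8b")
light_grey = hex_to_rgb("#ccc")
```

Here `heading_color` matches the tuple that the previous version of the example passed to `FontFace`.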

Similarly, the indentation of several HTML tags (`<blockquote>`, `<dd>`, `<li>`) can be set globally, for the whole HTML document, by passing `tag_indents` to `FPDF.write_html()`:
Similarly, the indentation of several HTML tags (`<blockquote>`, `<dd>`, `<li>`) can be set globally, for the whole HTML document, by passing `tag_styles` to `FPDF.write_html()`:

```python
from fpdf import FPDF
from fpdf import FPDF, TextStyle

pdf = FPDF()
pdf.add_page()
@@ -115,10 +115,23 @@ pdf.write_html("""
<dt>Term</dt>
<dd>Definition</dd>
</dl>
""", tag_indents={"dd": 5})
<blockquote>
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed non risus.
Suspendisse lectus tortor, dignissim sit amet, adipiscing nec, ultricies sed, dolor.
Cras elementum ultrices diam.
</blockquote>
""", tag_styles={
"dd": TextStyle(l_margin=5),
"blockquote": TextStyle(color="#ccc", font_style="I",
t_margin=5, b_margin=5, l_margin=10),
})
pdf.output("html_dd_indented.pdf")
```

⚠️ Note that this styling is currently only supported for a subset of all HTML tags,
and that some [`FontFace`](https://py-pdf.github.io/fpdf2/fpdf/fonts.html#fpdf.fonts.FontFace) or [`TextStyle`](https://py-pdf.github.io/fpdf2/fpdf/fonts.html#fpdf.fonts.TextStyle) properties may not be honored.
However, **Pull Requests are welcome** to implement missing features!


## Supported HTML features

@@ -143,6 +156,7 @@ pdf.output("html_dd_indented.pdf")
* `<td>`: cells (with `align`, `bgcolor`, `width`, `rowspan`, `colspan` attributes)

### Page breaks

_New in [:octicons-tag-24: 2.7.10](https://github.com/py-pdf/fpdf2/blob/master/CHANGELOG.md)_

Page breaks can be triggered explicitly using the [break-before](https://developer.mozilla.org/en-US/docs/Web/CSS/break-before) or [break-after](https://developer.mozilla.org/en-US/docs/Web/CSS/break-after) CSS properties.
5 changes: 3 additions & 2 deletions fpdf/__init__.py
@@ -9,7 +9,7 @@
* `fpdf.enums.YPos`
* `fpdf.errors.FPDFException`
* `fpdf.fonts.FontFace`
* `fpdf.fpdf.TitleStyle`
* `fpdf.fonts.TextStyle`
* `fpdf.prefs.ViewerPreferences`
* `fpdf.template.Template`
* `fpdf.template.FlexTemplate`
@@ -25,7 +25,7 @@
FPDF_FONT_DIR as _FPDF_FONT_DIR,
FPDF_VERSION as _FPDF_VERSION,
)
from .fonts import FontFace
from .fonts import FontFace, TextStyle
from .html import HTMLMixin, HTML2FPDF
from .prefs import ViewerPreferences
from .template import Template, FlexTemplate
@@ -74,6 +74,7 @@
"Template",
"FlexTemplate",
"TitleStyle",
"TextStyle",
"ViewerPreferences",
# Deprecated classes:
"HTMLMixin",
8 changes: 8 additions & 0 deletions fpdf/enums.py
@@ -245,6 +245,14 @@ def style(self):
name for name, value in self.__class__.__members__.items() if value & self
)

def add(self, value: "TextEmphasis"):
return self | value

def remove(self, value: "TextEmphasis"):
return TextEmphasis.coerce(
"".join(s for s in self.style if s not in value.style)
)

@classmethod
def coerce(cls, value):
if isinstance(value, str):
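
The `add()` and `remove()` helpers introduced in this hunk can be tried out in isolation. The sketch below is a simplified stand-in built on the standard library's `enum.IntFlag`: the class name `Emphasis` and its reduced `coerce()` are assumptions made for illustration, not the actual `fpdf.enums.TextEmphasis` implementation.

```python
from enum import IntFlag


class Emphasis(IntFlag):
    """Hypothetical, simplified stand-in for fpdf.enums.TextEmphasis."""

    B = 1  # bold
    I = 2  # italics
    U = 4  # underline

    @property
    def style(self) -> str:
        # Concatenate the names of all flags set on this value, e.g. "BU"
        return "".join(
            name for name, member in type(self).__members__.items() if member & self
        )

    @classmethod
    def coerce(cls, value: str) -> "Emphasis":
        # Reduced version: only handles strings made of flag letters
        result = cls(0)
        for letter in value.upper():
            result |= cls[letter]
        return result

    def add(self, value: "Emphasis") -> "Emphasis":
        return self | value

    def remove(self, value: "Emphasis") -> "Emphasis":
        # Same approach as the diff: rebuild from the remaining style letters
        return type(self).coerce(
            "".join(s for s in self.style if s not in value.style)
        )


bold_italics = Emphasis.coerce("BI")
only_bold = bold_italics.remove(Emphasis.I)
bold_underline = Emphasis.B.add(Emphasis.U)
```

Under these assumptions, `remove()` gives the same result as `self & ~value`; going through the style string presumably lets the real class reuse its existing `coerce()` logic.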
78 changes: 77 additions & 1 deletion fpdf/fonts.py
@@ -7,7 +7,7 @@
in non-backward-compatible ways.
"""

import re
import re, warnings

from bisect import bisect_left
from collections import defaultdict
@@ -31,6 +31,7 @@ def __deepcopy__(self, _memo):
except ImportError:
hb = None

from .deprecation import get_stack_level
from .drawing import convert_to_device_color, DeviceGray, DeviceRGB
from .enums import FontDescriptorFlags, TextEmphasis
from .syntax import Name, PDFObject
@@ -109,6 +110,81 @@ def combine(default_style, override_style):
)


class TextStyle(FontFace):
"""
Subclass of `FontFace` that allows specifying vertical & horizontal spacing
"""

def __init__(
self,
font_family: Optional[str] = None, # None means "no override"
# Whereas "" means "no emphasis"
font_style: Optional[str] = None,
font_size_pt: Optional[int] = None,
color: Union[int, tuple] = None, # grey scale or (red, green, blue),
fill_color: Union[int, tuple] = None, # grey scale or (red, green, blue),
underline: bool = False,
t_margin: Optional[int] = None,
l_margin: Optional[int] = None,
b_margin: Optional[int] = None,
):
super().__init__(
font_family,
((font_style or "") + "U") if underline else font_style,
font_size_pt,
color,
fill_color,
)
self.t_margin = t_margin or 0
self.l_margin = l_margin or 0
self.b_margin = b_margin or 0

def __repr__(self):
return (
super().__repr__()[:-1]
+ f", t_margin={self.t_margin}, l_margin={self.l_margin}, b_margin={self.b_margin})"
)

def replace(
self,
/,
font_family=None,
emphasis=None,
font_size_pt=None,
color=None,
fill_color=None,
t_margin=None,
l_margin=None,
b_margin=None,
):
return TextStyle(
font_family=font_family or self.family,
font_style=self.emphasis if emphasis is None else emphasis.style,
font_size_pt=font_size_pt or self.size_pt,
color=color or self.color,
fill_color=fill_color or self.fill_color,
t_margin=self.t_margin if t_margin is None else t_margin,
l_margin=self.l_margin if l_margin is None else l_margin,
b_margin=self.b_margin if b_margin is None else b_margin,
)
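
`TextStyle.replace()` above uses two different fallback idioms: `x or self.x` for the font properties, and `self.x if x is None else x` for the margins. The difference matters when the override is `0`. The following sketch illustrates it with a hypothetical two-field `Style` dataclass; it is a simplified model of the pattern, not the real `TextStyle` class.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Style:
    """Hypothetical, simplified stand-in for a TextStyle-like object."""

    size_pt: float = 12
    l_margin: float = 0

    def replace(
        self, size_pt: Optional[float] = None, l_margin: Optional[float] = None
    ) -> "Style":
        return Style(
            # "or" fallback: any falsy override (None or 0) keeps the current value
            size_pt=size_pt or self.size_pt,
            # "is None" fallback: only None keeps the current value,
            # so an explicit 0 is accepted as a real override
            l_margin=self.l_margin if l_margin is None else l_margin,
        )


base = Style(size_pt=14, l_margin=10)
no_margin = base.replace(l_margin=0)  # margin is reset to 0
same_size = base.replace(size_pt=0)  # size override is ignored
```

In this model a margin can be reset to zero, which is a meaningful value for spacing, whereas a zero font size is treated as "no override".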


class TitleStyle(TextStyle):
def __init__(self, *args, **kwargs):
warnings.warn(
(
"fpdf.TitleStyle is deprecated since 2.7.10."
" It has been replaced by fpdf.TextStyle."
),
DeprecationWarning,
stacklevel=get_stack_level(),
)
super().__init__(*args, **kwargs)


__pdoc__ = {"TitleStyle": False} # Replaced by TextStyle
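
The `TitleStyle` shim above keeps the old name importable while steering callers to `TextStyle`. A minimal sketch of the same pattern follows, using hypothetical `NewStyle` / `OldStyle` classes and a fixed `stacklevel=2` in place of fpdf2's internal `get_stack_level()` helper (both are assumptions for illustration).

```python
import warnings


class NewStyle:
    """Hypothetical replacement class (plays the role of TextStyle)."""

    def __init__(self, l_margin: float = 0):
        self.l_margin = l_margin


class OldStyle(NewStyle):
    """Hypothetical deprecated alias (plays the role of TitleStyle)."""

    def __init__(self, *args, **kwargs):
        warnings.warn(
            "OldStyle is deprecated, use NewStyle instead",
            DeprecationWarning,
            stacklevel=2,  # assumption: point the warning at the direct caller
        )
        super().__init__(*args, **kwargs)


# Record warnings instead of printing them, so they can be inspected
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = OldStyle(l_margin=5)

# The deprecated subclass still passes isinstance checks against the new class
still_accepted = isinstance(legacy, NewStyle)
```

Because the deprecated class subclasses its replacement, a type check such as the `isinstance(level, TextStyle)` test in `set_section_title_styles()` keeps accepting instances created through the old name.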


class CoreFont:
# RAM usage optimization:
__slots__ = ("i", "type", "name", "up", "ut", "cw", "fontkey", "emphasis")
93 changes: 31 additions & 62 deletions fpdf/fpdf.py
@@ -85,7 +85,7 @@ class Image:
YPos,
)
from .errors import FPDFException, FPDFPageFormatException, FPDFUnicodeEncodingException
from .fonts import CoreFont, CORE_FONTS, FontFace, TTFFont
from .fonts import CoreFont, CORE_FONTS, FontFace, TextStyle, TitleStyle, TTFFont
from .graphics_state import GraphicsStateMixin
from .html import HTML2FPDF
from .image_datastructures import (
@@ -142,36 +142,6 @@ class Image:
}


class TitleStyle(FontFace):
def __init__(
self,
font_family: Optional[str] = None, # None means "no override"
# Whereas "" means "no emphasis"
font_style: Optional[str] = None,
font_size_pt: Optional[int] = None,
color: Union[int, tuple] = None, # grey scale or (red, green, blue),
underline: bool = False,
t_margin: Optional[int] = None,
l_margin: Optional[int] = None,
b_margin: Optional[int] = None,
):
super().__init__(
font_family,
((font_style or "") + "U") if underline else font_style,
font_size_pt,
color,
)
self.t_margin = t_margin
self.l_margin = l_margin
self.b_margin = b_margin

def __repr__(self):
return (
super().__repr__()[:-1]
+ f", t_margin={self.t_margin}, l_margin={self.l_margin}, b_margin={self.b_margin})"
)


class ToCPlaceholder(NamedTuple):
render_function: Callable
start_page: int
@@ -307,7 +277,7 @@ def __init__(
self._toc_placeholder = None # optional ToCPlaceholder instance
self._outline = [] # list of OutlineSection
self._sign_key = None
self.section_title_styles = {} # level -> TitleStyle
self.section_title_styles = {} # level -> TextStyle

self.core_fonts_encoding = "latin-1"
"Font encoding, Latin-1 by default"
@@ -413,25 +383,24 @@ def write_html(self, text, *args, **kwargs):

Args:
text (str): HTML content to render
image_map (function): an optional one-argument function that map <img> "src"
to new image URLs
li_tag_indent (int): [**DEPRECATED since v2.7.8**]
numeric indentation of <li> elements - Set tag_indents instead
dd_tag_indent (int): [**DEPRECATED since v2.7.8**]
numeric indentation of <dd> elements - Set tag_indents instead
table_line_separators (bool): enable horizontal line separators in <table>
ul_bullet_char (str): bullet character preceding <li> items in <ul> lists.
li_prefix_color (tuple | str | drawing.Device* instance):
color for bullets or numbers preceding <li> tags.
This applies to both <ul> & <ol> lists.
heading_sizes (dict): [**DEPRECATED since v2.7.8**]
font size per heading level names ("h1", "h2"...) - Set tag_styles instead
pre_code_font (str): [**DEPRECATED since v2.7.8**]
font to use for <pre> & <code> blocks - Set tag_styles instead
warn_on_tags_not_matching (bool): control warnings production for unmatched HTML tags
tag_indents (dict):
mapping of HTML tag names to numeric values representing their horizontal left identation
tag_styles (dict): mapping of HTML tag names to colors
image_map (function): an optional one-argument function that maps `<img>` "src" to new image URLs
li_tag_indent (int): [**DEPRECATED since v2.7.9**]
numeric indentation of `<li>` elements - Set `tag_styles` instead
dd_tag_indent (int): [**DEPRECATED since v2.7.9**]
numeric indentation of `<dd>` elements - Set `tag_styles` instead
table_line_separators (bool): enable horizontal line separators in `<table>`. Defaults to `False`.
ul_bullet_char (str): bullet character preceding `<li>` items in `<ul>` lists.
Can also be configured using the HTML `type` attribute of `<ul>` tags.
li_prefix_color (tuple, str, fpdf.drawing.DeviceCMYK, fpdf.drawing.DeviceGray, fpdf.drawing.DeviceRGB): color for bullets
or numbers preceding `<li>` tags. This applies to both `<ul>` & `<ol>` lists.
heading_sizes (dict): [**DEPRECATED since v2.7.9**]
font size per heading level names ("h1", "h2"...) - Set `tag_styles` instead
pre_code_font (str): [**DEPRECATED since v2.7.9**]
font to use for `<pre>` & `<code>` blocks - Set `tag_styles` instead
warn_on_tags_not_matching (bool): control warnings production for unmatched HTML tags. Defaults to `True`.
tag_indents (dict): [**DEPRECATED since v2.7.10**]
mapping of HTML tag names to numeric values representing their horizontal left indentation - Set `tag_styles` instead
tag_styles (dict[str, fpdf.fonts.TextStyle]): mapping of HTML tag names to `fpdf.TextStyle` or `fpdf.FontFace` instances
"""
html2pdf = self.HTML2FPDF_CLASS(self, *args, **kwargs)
with self.local_context():
@@ -5033,18 +5002,18 @@ def set_section_title_styles(
After calling this method, calls to `FPDF.start_section` will render section names visually.

Args:
level0 (TitleStyle): style for the top level section titles
level1 (TitleStyle): optional style for the level 1 section titles
level2 (TitleStyle): optional style for the level 2 section titles
level3 (TitleStyle): optional style for the level 3 section titles
level4 (TitleStyle): optional style for the level 4 section titles
level5 (TitleStyle): optional style for the level 5 section titles
level6 (TitleStyle): optional style for the level 6 section titles
level0 (TextStyle): style for the top level section titles
level1 (TextStyle): optional style for the level 1 section titles
level2 (TextStyle): optional style for the level 2 section titles
level3 (TextStyle): optional style for the level 3 section titles
level4 (TextStyle): optional style for the level 4 section titles
level5 (TextStyle): optional style for the level 5 section titles
level6 (TextStyle): optional style for the level 6 section titles
"""
for level in (level0, level1, level2, level3, level4, level5, level6):
if level and not isinstance(level, TitleStyle):
if level and not isinstance(level, TextStyle):
raise TypeError(
f"Arguments must all be TitleStyle instances, got: {type(level)}"
f"Arguments must all be TextStyle instances, got: {type(level)}"
)
self.section_title_styles = {
0: level0,
@@ -5115,7 +5084,7 @@ def start_section(self, name, level=0, strict=True):
)

@contextmanager
def _use_title_style(self, title_style: TitleStyle):
def _use_title_style(self, title_style: TextStyle):
if title_style:
if title_style.t_margin:
self.ln(title_style.t_margin)
@@ -5177,7 +5146,7 @@ def table(self, *args, **kwargs):
relative to the page, when it's not using the full page width.
borders_layout (str, fpdf.enums.TableBordersLayout): optional, default to ALL. Control what cell
borders are drawn.
cell_fill_color (int, tuple, fpdf.drawing.DeviceGray, fpdf.drawing.DeviceRGB): optional.
cell_fill_color (int, tuple, fpdf.drawing.DeviceCMYK, fpdf.drawing.DeviceGray, fpdf.drawing.DeviceRGB): optional.
Defines the cells background color.
cell_fill_mode (str, fpdf.enums.TableCellFillMode): optional. Defines which cells are filled
with color in the background.