Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Indent HTML lists correctly (Issue 1073) #1170

Merged
merged 45 commits into from
Jun 15, 2024

Conversation

lcgeneralprojects
Copy link

@lcgeneralprojects lcgeneralprojects commented May 17, 2024

Fixes #1073
Implements paragraph indentation via adjustments of pdf.x instead of using a natural number of whitespaces.
Breaks up list items into individual paragraphs.

Checklist:

  • The GitHub pipeline is OK (green),
    meaning that both pylint (static code analyzer) and black (code formatter) are happy with the changes of this PR.

  • A unit test is covering the code added / modified by this PR

  • This PR is ready to be merged

  • In case of a new feature, docstrings have been added, with also some documentation in the docs/ folder

  • A mention of the change is present in CHANGELOG.md

Not sure how to actually name the HTML2FPDF.list_pseudo_margin attribute. It is used for determining the height of the \n line created when a <ul> or <ol> starting tag is handled.

Should the re-implementation of paragraph indentation be reflected in a doctstring, even though the tag_indents parameter of write_html() has not been touched? If so, where should it be placed?

By submitting this pull request, I confirm that my contribution is made under the terms of the GNU LGPL 3.0 license.

Some debugging and polish needed.
Some variables need tweaking.
Needs testing.
Code reuse unsatisfactory.
Some variables need tweaking.
Needs testing.
Code reuse unsatisfactory.
Potentially significant issues with tests:
1. test_html_ln_outside_p - IndexError: list index out of range.
2. test_html_ol_ul_line_height - actual distance between lines differs slightly from expected.

Code reuse unsatisfactory.
Need feedback for handling <dd> and <blockquote>.

Potentially significant issues with tests:
1. test_html_ol_ul_line_height - actual distance between lines differs slightly from expected.

Need feedback for whether or not the new indentation that contradicts old tests is satisfactory.

Code reuse unsatisfactory.
Need feedback.
Bug present: bullets are made one per line instead of one per paragraph.
Saving progress before introducing a `Bullet` class.
Feature implemented.
Testing and adjustments of tests needed.
Prevented from `Paragraph.top_margin` being added to `pdf.y` of first lines of paragraphs with bullets.
Prevented from `Paragraph.top_margin` being added to `pdf.y` of first lines of paragraphs with bullets.
`<ul>` and `<ol>` tags now cause a creation of a paragraph with the string `\n` being used to generate a fragment of the height `list_pseudo_margin`.
Adjusted defaults for `li_tag_indent`.
fpdf/fpdf.py Outdated Show resolved Hide resolved
Changed `Paragraph.generate_bullet_frag()` into `generate_bullet_frag_and_tl`, and made it also generate the bullet text line.
Dealing with the issue of inappropriately large distance between `<dt>` and their child `<dd>` elements when `Paragraph.top_margin` is 0.
Changed `Paragraph.generate_bullet_frag()` into `generate_bullet_frag_and_tl`, and made it also generate the bullet text line.
@lcgeneralprojects lcgeneralprojects marked this pull request as ready for review May 20, 2024 07:37
@gmischler gmischler changed the title Issue 1073 Indent HTML lists correctly (Issue 1073) May 20, 2024
Copy link
Collaborator

@gmischler gmischler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work so far, but the devil is in the detail...

As I'm sure you've noticed, the interplay between the HTML parser, text regions, line wrapping, and rendering is non-trivial. I've added some pointers of how to fix the parts that don't quite add up yet.

fpdf/line_break.py Outdated Show resolved Hide resolved
fpdf/html.py Outdated Show resolved Hide resolved
fpdf/html.py Show resolved Hide resolved
fpdf/text_region.py Show resolved Hide resolved
fpdf/text_region.py Outdated Show resolved Hide resolved
fpdf/text_region.py Show resolved Hide resolved
fpdf/text_region.py Show resolved Hide resolved
fpdf/html.py Outdated Show resolved Hide resolved
test/html/test_html.py Show resolved Hide resolved
…`list_vertical_margin`.

Removed the `MultiLineBreak.indent` attribute.
Added a test for long `<ol>` bullets.
fpdf/html.py Outdated Show resolved Hide resolved
test/html/test_html.py Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
test/html/test_html.py Outdated Show resolved Hide resolved
Added `test_html_list_vertical_margin`.
Fixed the non-assignment of `HTML2FPDF.list_vertical_margin` when the constructor argument `list_vertical_margin` is not None.
fpdf/text_region.py Outdated Show resolved Hide resolved
test/html/test_html.py Outdated Show resolved Hide resolved
lcgeneralprojects and others added 2 commits June 12, 2024 22:05
…agraph()`. Updated relevant docstring and `TextRegion.md` documentation.

Adjusted `test_bulleted_paragraphs` to remove mentions of `rel_y_displacement`, changed instances of string `"rel_x_displacement"` to `bullet_r_margin` and introduced usage of `case["bullet_r_margin"]` in the test.
Copy link
Collaborator

@gmischler gmischler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All right, just two last clean-up items left, and then this will be ready to merge!

Quite an impressive list of fixes in the change log, which resulted in many intricate details to take care about.

docs/TextRegion.md Outdated Show resolved Hide resolved
test/html/test_html.py Outdated Show resolved Hide resolved
@lcgeneralprojects
Copy link
Author

Not sure how that happened, considering that I was accepting changes from master when merging.

@gmischler
Copy link
Collaborator

Not sure how that happened, considering that I was accepting changes from master when merging.

I think I've merged #1198 after your last rebase. Just do another one.

@gmischler
Copy link
Collaborator

Wait, in 69c4107 you seem to have reverted all of your changes to test_html.py. You'll want to restore those, except for test_html_customize_ul().

@gmischler
Copy link
Collaborator

When you do a rebase and there are conflicts, you usually need to resolve those manually.

The files will contain marked sections that show the differences, and you can chose the right parts for each section.

@lcgeneralprojects
Copy link
Author

lcgeneralprojects commented Jun 15, 2024

I am aware. I opted to go for blanket-accepting changes from master because I thought that it will just resolve that particular conflict and not just replace the entire file.

Trying to figure out how to re-do the merge without destroying the history. (Although, yeah, I'm aware that I can just copy the relevant stuff and paste it manually. I would still rather try to find out how to deal with situations like these.)

@gmischler
Copy link
Collaborator

Don't worry about the history. We flatten everything into a single commit anyway when merging a PR.

# Conflicts:
#	test/html/html_customize_ul.pdf
#	test/html/test_html.py
Correctly this time.
@gmischler gmischler merged commit 1547d4d into py-pdf:master Jun 15, 2024
11 checks passed
@gmischler
Copy link
Collaborator

Merged.

Thanks for the useful contribution, @lcgeneralprojects ! 👍

@Lucas-C
Copy link
Member

Lucas-C commented Jun 17, 2024

@allcontributors please add @lcgeneralprojects for bug, code

Copy link

@Lucas-C

I've put up a pull request to add @lcgeneralprojects! 🎉

@andersonhc
Copy link
Collaborator

@lcgeneralprojects

Congrats on successfully completing this pull request. This was no small feat! You changed some complex parts of the code and despite the numerous iterations you showed patience and dedication to push through.

Thank you for your hard work and perseverance. Looking forward to more collaborations in the future!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Long HTML list entries are not correctly indented
4 participants