[IMP] convert_html_fragment: prevent lxml wraps and false positives #349

Merged

chienandalu

@chienandalu chienandalu commented Oct 31, 2023

When the fragment has no common root node, lxml wraps the content under one.
This content, for instance:

<p><p/><p></p>

is parsed as:

<div><p><p/><p></p></div>

To avoid this, we force a custom wrapper tag onto every parsed string, so every
XML fragment receives the same treatment and we can safely strip the wrapper later.

We don't want to update any fragment that is unchanged after all the
replacements have been checked. In those cases we return the original string.

At the end, we just trim our initial custom wrapper tag and return our
treated string.
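
The wrap, compare and unwrap flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code merged in this PR: it uses the standard library's `xml.etree.ElementTree` instead of lxml so it runs without dependencies, and the wrapper tag name, the `convert_fragment` helper and the two attribute renames are hypothetical stand-ins.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper tag; not necessarily the name used in openupgradelib.
WRAPPER = "fragment-wrapper"

# Hypothetical subset of Bootstrap 4 -> 5 attribute renames, for illustration only.
ATTR_RENAMES = {"data-html": "data-bs-html", "data-display": "data-bs-display"}


def convert_fragment(html_string):
    """Rename attributes in a fragment; return the input untouched if nothing changed."""
    # Always wrap, so single-root and multi-root fragments get the same treatment.
    root = ET.fromstring(f"<{WRAPPER}>{html_string}</{WRAPPER}>")
    changed = False
    for node in root.iter():
        for old, new in ATTR_RENAMES.items():
            if old in node.attrib:
                node.set(new, node.attrib.pop(old))
                changed = True
    if not changed:
        # No replacement applied: hand back the original string rather than
        # a re-serialized (and possibly reformatted) copy.
        return html_string
    serialized = ET.tostring(root, encoding="unicode")
    # Trim the wrapper's opening and closing tags.
    return serialized[len(f"<{WRAPPER}>"):-len(f"</{WRAPPER}>")]
```

Returning the input string when nothing changed is what avoids false positives: a parse and serialize round trip can alter formatting (for example collapsing `<p></p>` into a self-closing tag) even when no replacement was applied.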

I made this test case and tested it in a v16 shell:

from openupgradelib.openupgrade_160 import convert_string_bootstrap_4to5

# test_regular_no_change
html_string = "<div><p></p><p></p></div>"
html_string_parsed = convert_string_bootstrap_4to5(html_string)
assert html_string == html_string_parsed

# test_multi_root_no_change
html_string = "<p></p><p></p>"
html_string_parsed = convert_string_bootstrap_4to5(html_string)
assert html_string == html_string_parsed

# test_regular_with_change
html_string = "<div><p data-html='test'></p><p data-display='test'></p></div>"
expected_output = '<div><p data-bs-html="test"/><p data-bs-display="test"/></div>'
# Turn off pretty-print for easier comparison
html_string_parsed = convert_string_bootstrap_4to5(html_string, pretty_print=False)
assert expected_output == html_string_parsed

# test_multi_root_with_change
html_string = "<p data-html='test'></p><p data-display='test'></p>"
expected_output = '<p data-bs-html="test"/><p data-bs-display="test"/>'
# Turn off pretty-print for easier comparison
html_string_parsed = convert_string_bootstrap_4to5(html_string, pretty_print=False)
assert expected_output == html_string_parsed

cc @Tecnativa TT44169 TT45734

please take a look @pedrobaeza

@chienandalu chienandalu force-pushed the convert_html_fragment-avoid-false-positives branch from eda241c to 04a8873 Compare October 31, 2023 12:05
@chienandalu chienandalu marked this pull request as ready for review October 31, 2023 12:08
@pedrobaeza pedrobaeza left a comment

Thanks for the patch

@pedrobaeza pedrobaeza merged commit ad1f95c into OCA:master Oct 31, 2023
3 checks passed
@pedrobaeza pedrobaeza deleted the convert_html_fragment-avoid-false-positives branch October 31, 2023 19:31