Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTML API: Add method to create fragment at node #7348

Conversation

sirreal
Copy link
Member

@sirreal sirreal commented Sep 13, 2024

Trac ticket: https://core.trac.wordpress.org/ticket/62357

The HTML API has two ways of creating HTML processors, create_fragment and
create_full_parser.

create_fragment is a static method that creates a fragment parser from a very reasonable default context node, a BODY element.

However, it may be necessary to create a fragment parser with a different context node, for example when working with HTML inside of an SVG or TABLE tag may be parsed very differently.

One strong motivation for this is the implementation of a set_inner_html method which requires parsing an HTML fragment with the appropriate context node in order to change its contents.

Add a create_fragment_at_current_node( string $html_fragment ) method. This method must be called when the processor is paused at a #tag with some strict constraints:

  • Must be paused on a #tag.
  • The opening and closing tags must appear in the HTML input (no virtual tokens).
  • No "self-contained" elements are allowed ( IFRAME, SCRIPT, TITLE, etc.).
 OK, but incomplete, skipped, or risky tests!
-Tests: 2525, Assertions: 4375, Skipped: 422.
+Tests: 2708, Assertions: 4535, Skipped: 465.

This supports work on set_inner_html in #7326 and #7742.

#7777 is a follow-up that leverages this new method ::create_fragment to make all fragments from a full parser, which is closer to the specification.


This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.

Copy link

Test using WordPress Playground

The changes in this pull request can previewed and tested using a WordPress Playground instance.

WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser.

Some things to be aware of

  • The Plugin and Theme Directories cannot be accessed within Playground.
  • All changes will be lost when closing a tab with a Playground instance.
  • All changes will be lost when refreshing the page.
  • A fresh instance is created each time the link below is clicked.
  • Every time this pull request is updated, a new ZIP file containing all changes is created. If changes are not reflected in the Playground instance,
    it's possible that the most recent build failed, or has not completed. Check the list of workflow runs to be sure.

For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation.

Test this pull request with WordPress Playground.

@sirreal sirreal force-pushed the html-api/add-spawn-fragment-parser-method branch from 7e0e922 to 8ccd0cb Compare October 10, 2024 15:09
@sirreal sirreal force-pushed the html-api/add-spawn-fragment-parser-method branch from 9f5a895 to 42f3df4 Compare November 8, 2024 18:58
@sirreal sirreal marked this pull request as ready for review November 12, 2024 19:53
Copy link

github-actions bot commented Nov 12, 2024

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Core Committers: Use this line as a base for the props when committing in SVN:

Props jonsurrell, bernhard-reiter, gziolo.

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

@sirreal
Copy link
Member Author

sirreal commented Nov 13, 2024

This behavior can be tested with the playground and the html api debugger plugin.

Interesting snippets:

Context Html
<svg> <rect width="10" height="10" /><foreignobject><style>* {fill: red;}</style>
<!DOCTYPE quirks-mode><body> <p><table>
<!DOCTYPE html><body> <p><table>
<table> <td>cell generates tbody > tr > td
<body> <td>cell is ignored
<body> <p class="CASE-inSENSITIVE">uppercase me.</p><style>.case-insensitive {text-transform: uppercase;}</style>
<!DOCTYPE html><body> <p class="CASE-SENSITIVE">don't uppercase me.</p><style>.case-sensitive {text-transform: uppercase;}</style>

Co-authored-by: Bernie Reiter <96308+ockham@users.noreply.github.com>
@ockham
Copy link
Contributor

ockham commented Nov 20, 2024

How does this relate to create_fragment's $context argument? The latter is currently hard-wired to <body>.

It seems like create_fragment_at_current_node will allow to bypass this limitation (as it sets the new parser's context_node). Is this the only way we'll be able to create a fragment with a non-<body> context?

If yes: Should we deprecate create_fragment's $context arg?
If no: Would it make sense to move some of the logic from create_fragment_at_current_node to create_fragment, in order to have the latter support different contexts?

@sirreal
Copy link
Member Author

sirreal commented Nov 20, 2024

Those are all good questions and they're not addressed in this PR, but are in #7777.

I wanted to keep these two PRs independent because ::create_fragment is still the most common way to create a fragment and I didn't want to refactor it in the same changeset.

This method serves as an "advanced" way of creating fragments (with a different doctype, node, or in quirks mode), while ::create_fragment remains unchanged.

#7777 changes this so that instead of create_fragment_at_current_node calling create_fragment and then changing a bunch of internal state, the processor always starts from a full HTML processor and then creates a fragment processor from that. This basically aligns with the HTML specification. This will allow more contexts to be used in ::create_fragment as well. It will create a full processor, find the last tag opener in the context, and create the fragment at that location.

There's some relevant conversation on #7777 #7777 (review).

@sirreal sirreal requested a review from ockham November 20, 2024 17:10
Copy link
Contributor

@ockham ockham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving. Thank you for your explanations and inline comments, they've been instrumental in understanding how this works. The increased test coverage by html5lib is pretty impressive 👍

How would you like to land this? Should the commit message say Fixes #62357 and then we re-open the ticket and close it again with #7777, or should we do something else?

@sirreal
Copy link
Member Author

sirreal commented Nov 20, 2024

Thank you!

How would you like to land this?

I don't feel too strongly, I this can be landed closing #62357, then #7777 could be against another ticket like "allow more fragment context in ::create_fragment".

#7777 supersedes #7141. That PR was associated with #61576, but that was a general "HTAML API in 6.7" ticket. We can sort out #7777 later.

@ockham
Copy link
Contributor

ockham commented Nov 21, 2024

Committed to Core in https://core.trac.wordpress.org/changeset/59444/.

@ockham ockham closed this Nov 21, 2024
@sirreal sirreal deleted the html-api/add-spawn-fragment-parser-method branch November 21, 2024 13:37
$fragment_processor->state->context_node = array( $fragment_processor->context_node->node_name, array() );

$attribute_names = $this->get_attribute_names_with_prefix( '' );
if ( null !== $attribute_names ) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs say that get_attribute_names_with_prefix returns null

when no tag opener is matched.

This got me thinking -- should we disallow create_fragment_at_current_node from being called when the processor is paused on a tag closer?

(If we do, then we should be able to remove this guard, as it would be guaranteed that we're at a tag opener.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good thought! This won't behave as expected on closers, and it seems like a strange thing to do in general.

#7859

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants