Optimize the loading of images using stored URL metrics #884

Merged
Changes from 33 commits
45 commits
4758013
Fix prefix for output buffer template filter
westonruter Nov 15, 2023
5fb6fda
Add small and medium breakpoints from Gutenberg to go along with mobile
westonruter Nov 16, 2023
bd1153d
Add ilo_get_lcp_elements_by_minimum_viewport_widths()
westonruter Nov 16, 2023
a990982
Use array_filter() to simplify breakpoint merge
westonruter Nov 16, 2023
b18a0c7
Remove fetchpriority from images when different breakpoints have diff…
westonruter Nov 17, 2023
25ea308
Update ilo_get_lcp_elements_by_minimum_viewport_widths() to account f…
westonruter Nov 17, 2023
c88f358
WIP: Breadcrumb calculation on server
westonruter Nov 18, 2023
c20f967
Set appropriate fetchpriority and otherwise add preload links
westonruter Nov 18, 2023
08635fa
Use preg_replace() with limit 1 for injection or preload links at end…
westonruter Nov 28, 2023
ec55c9a
Improve variable naming
westonruter Nov 28, 2023
6e0f809
Add todos for doing breadcrumbs exclusively on server
westonruter Nov 28, 2023
e429983
Account for HEAD closing tag possibly being upper-case
westonruter Nov 28, 2023
8974ac8
Disable on Customizer preview and non-GET responses
westonruter Nov 28, 2023
acde9b4
Introduce ILO_HTML_Tag_Processor
westonruter Nov 28, 2023
024719d
Allow callable not just Closure
westonruter Nov 28, 2023
38cb821
Prevent error when no ilo_url_metrics post
westonruter Nov 28, 2023
234e2fe
Clarify logic in ilo_optimize_template_output_buffer
westonruter Nov 28, 2023
1585a78
Set attribute when server-side heuristics were correct
westonruter Nov 28, 2023
292d1c4
Prevent setting fetchpriority on IMG when not all breakpoints have da…
westonruter Nov 28, 2023
fca7f8d
Remove needless if statement
westonruter Nov 28, 2023
3cda887
Disable background image detection until implemented on server
westonruter Nov 28, 2023
f00addc
Use tag instead of tagName in breadcrumbs
westonruter Nov 28, 2023
ce1db2f
Prevent adding media query to preload link when just min-width:0
westonruter Nov 28, 2023
d74420d
Add preload links always and consolidate code paths
westonruter Nov 29, 2023
f0ee6b2
Add since and private access tags
westonruter Nov 29, 2023
1e495db
Update return tag phpdoc for ilo_construct_preload_links()
westonruter Nov 29, 2023
8065801
Add missing since and private access tags to ilo_get_lcp_elements_by_…
westonruter Nov 29, 2023
131a9a0
Move GH comment into code comment
westonruter Nov 29, 2023
ee83ba0
Remove needless remove_fetchpriority_attribute method
westonruter Nov 29, 2023
3d99128
Prevent removing fetchpriority when all breakpoints do not have URL m…
westonruter Nov 29, 2023
d42601c
Use wp_trigger_error() in ILO_HTML_Tag_Processor and improve phpdoc
westonruter Nov 29, 2023
4db5161
Add var phpdoc tag to class constant
westonruter Nov 29, 2023
2c3cf3a
Use a generator instead of a callback
westonruter Nov 29, 2023
3d7b5fa
Fix comment typo
westonruter Nov 30, 2023
957d2c4
Improve naming of void tags and add missing tags which close P
westonruter Nov 30, 2023
e8a36b4
Remove todo related to user being logged-in
westonruter Dec 1, 2023
0cca4dd
Add note about how ILO_HTML_Tag_Processor is needed until WP_HTML_Pro…
westonruter Dec 1, 2023
83d0362
Move class file include to load.php
westonruter Dec 1, 2023
605fb02
Add TODO to remove loading attribute
westonruter Dec 1, 2023
74145a4
Optimize looking up LCP element by breadcrumb
westonruter Dec 1, 2023
2758c0d
Make ilo_construct_preload_links() easier to read
westonruter Dec 1, 2023
9d82615
Never include loading=lazy on the LCP image common across all breakpo…
westonruter Dec 2, 2023
9bfcdc9
Account for the same element being LCP on different breakpoints
westonruter Dec 4, 2023
185ebf6
Clarify LCP element language in comment
westonruter Dec 5, 2023
1d259b5
Add comment explaining why for loop is used
westonruter Dec 5, 2023
@@ -0,0 +1,296 @@
<?php
/**
* Image Loading Optimization: ILO_HTML_Tag_Processor class
*
* @package performance-lab
* @since n.e.x.t
*/

/**
* Processor leveraging WP_HTML_Tag_Processor to gather breadcrumbs, which can be queried while iterating the open_tags() generator.
*
* @since n.e.x.t
* @access private
*/
final class ILO_HTML_Tag_Processor {

/**
* HTML elements that are self-closing.
*
* @link https://www.w3.org/TR/html5/syntax.html#serializing-html-fragments
* @link https://github.com/ampproject/amp-toolbox-php/blob/c79a0fe558a3c042aee4789bbf33376cca7a733d/src/Html/Tag.php#L206-L232
*
* @var string[]
*/
const SELF_CLOSING_TAGS = array(
'AREA',
'BASE',
'BASEFONT',
'BGSOUND',
'BR',
'COL',
'EMBED',
'FRAME',
'HR',
'IMG',
'INPUT',
'KEYGEN',
'LINK',
'META',
'PARAM',
'SOURCE',
'TRACK',
'WBR',
);

/**
* The set of HTML tags whose presence will implicitly close a <p> element.
* For example '<p>foo<h1>bar</h1>' should parse the same as '<p>foo</p><h1>bar</h1>'.
*
* @link https://www.w3.org/TR/html-markup/p.html
* @link https://github.com/ampproject/amp-toolbox-php/blob/c79a0fe558a3c042aee4789bbf33376cca7a733d/src/Html/Tag.php#L262-L293
*
* @var string[]
*/
const P_CLOSING_TAGS = array(
'ADDRESS',
'ARTICLE',
'ASIDE',
'BLOCKQUOTE',
'DIR',
'DL',
'FIELDSET',
'FOOTER',
'FORM',
'H1',
'H2',
'H3',
'H4',
'H5',
'H6',
'HEADER',
'HR',
'MENU',
'NAV',
'OL',
'P',
'PRE',
'SECTION',
'TABLE',
'UL',
);

/**
* Open stack tags.
*
* @var string[]
*/
private $open_stack_tags = array();

/**
* Open stack indices.
*
* @var int[]
*/
private $open_stack_indices = array();

/**
* Processor.
*
* @var WP_HTML_Tag_Processor
*/
private $processor;

/**
* Constructor.
*
* @param string $html HTML to process.
*/
public function __construct( string $html ) {
$this->processor = new WP_HTML_Tag_Processor( $html );
}

/**
* Gets all open tags in the document.
*
* A generator is used so that when iterating at a specific tag, additional information about the tag at that point
* can be queried from the class. Similarly, mutations may be performed when iterating at an open tag.
*
* @return Generator<string> Tag name of current open tag.
*/
public function open_tags(): Generator {
$p = $this->processor;

/*
* The keys for the following two arrays correspond to each other. Given the following document:
*
* <html>
* <head>
* </head>
* <body>
* <p>Hello!</p>
* <img src="lcp.png">
* </body>
* </html>
*
* Upon processing the IMG element, the two arrays should be equal to the following:
*
* $open_stack_tags = array( 'HTML', 'BODY', 'IMG' );
* $open_stack_indices = array( 0, 1, 1 );
*/
$this->open_stack_tags = array();
$this->open_stack_indices = array();
while ( $p->next_tag( array( 'tag_closers' => 'visit' ) ) ) {
$tag_name = $p->get_tag();
if ( ! $p->is_tag_closer() ) {

// Close an open P tag when a P-closing tag is encountered.
// TODO: There are quite a few more cases of optional closing tags: https://html.spec.whatwg.org/multipage/syntax.html#optional-tags
// Nevertheless, given WordPress's legacy of XHTML compatibility, the lack of closing tags may not be common enough to warrant worrying about any of them.
if ( in_array( $tag_name, self::P_CLOSING_TAGS, true ) ) {
$i = array_search( 'P', $this->open_stack_tags, true );
Member

this is probably fine enough, but be aware that there are certain markers/boundaries in the stack that effectively isolate regions of HTML. for example, if we're currently inside a TEMPLATE element and encounter one of these P-closing elements, they don't close the P from the outside of the template.

<p>
	This template is isolated.
	<template><section><p>Hidden</p></section></template>
	The outer P remains.
</p>

if ( false !== $i ) {
array_splice( $this->open_stack_tags, $i );
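// Keep the counter at the closed P's own level so that the next sibling continues from its index.
array_splice( $this->open_stack_indices, count( $this->open_stack_tags ) + 1 );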
}
}

$level = count( $this->open_stack_tags );
$this->open_stack_tags[] = $tag_name;

if ( ! isset( $this->open_stack_indices[ $level ] ) ) {
$this->open_stack_indices[ $level ] = 0;
} else {
++$this->open_stack_indices[ $level ];
}

// TODO: We should consider not collecting metrics when the admin bar is shown and the user is logged-in.
// Only increment the tag index at this level if it isn't the admin bar, since the presence of the
// admin bar can throw off the indices.
if ( 'DIV' === $tag_name && $p->get_attribute( 'id' ) === 'wpadminbar' ) {
--$this->open_stack_indices[ $level ];
}

// Now that the breadcrumbs are constructed, yield the tag name so that they can be queried if desired.
// Other mutations may be performed to the open tag's attributes by the caller at this point as well.
yield $tag_name;

// Immediately pop off self-closing tags.
if ( in_array( $tag_name, self::SELF_CLOSING_TAGS, true ) ) {
array_pop( $this->open_stack_tags );
}
} else {
// If the closing tag is for a self-closing tag, we ignore it since it was already handled above.
if ( in_array( $tag_name, self::SELF_CLOSING_TAGS, true ) ) {
continue;
}

// Since SVG and MathML can have a lot more self-closing/empty tags, potentially pop off the stack until getting to the open tag.
$did_splice = false;
if ( 'SVG' === $tag_name || 'MATH' === $tag_name ) {
$i = array_search( $tag_name, $this->open_stack_tags, true );
if ( false !== $i ) {
array_splice( $this->open_stack_tags, $i );
$did_splice = true;
Member

if encountering an element which isn't an HTML Void element, then regardless of the tag, if the self-closing flag is present, then it closes the tag. all self-closing flags on HTML elements are invalid and ignored, but all on HTML foreign elements are authoritative and obeyed.

<p>There are no empty <div/> tags. This is inside the DIV</p>.
<wp-group>Is a custom element, and so <wp-group /> is self-closing while others aren't.</wp-group>
if (
	WP_HTML_Processor::is_void( $tag_name ) || 
	(
		! WP_HTML_Processor::is_html_element( $tag_name ) &&
		$p->has_self_closing_flag()
	)
) {
	// this tag immediately closes as soon as we jump to the next tag.
}

this requires having a list of all HTML elements, which the HTML Processor currently doesn't do because it doesn't support any, but we'll have to add it. this can lead to common parsing failures because the invalid self-closing flag has become more popular post-React where it's valid and normative in JSX.

Member Author

Very interesting. I wasn't aware of the WP_HTML_Processor::has_self_closing_flag() method.

Nevertheless, maybe having a list of all HTML elements isn't needed here because: (1) if math or svg are ancestors, we can assume that all tags with self-closing flags will close the tag, and (2) custom elements always have hyphens in them, so if present we can also honor the self-closing tag.

However, I just checked your example, and the second case with a custom element doesn't appear to hold. I adapted your example to use a span instead of div (since a div closes an open p):

<p>There are no empty <span/> tags. This is inside the SPAN</p>.
<wp-group>Is a custom element, and so <wp-group /> is self-closing while others aren't.</wp-group>

And Chrome renders this as:

image

So perhaps such self-closing foreign elements are limited to MathML and SVG contexts?

Member Author

For example:

<svg>Is a custom element, and so <g /> is self-closing while others aren't.</svg>

image

Member

by golly, you're right. I have been misinterpreting "foreign element" for a long time now! thankfully I haven't gotten to the place in the HTML Processor where that matters.

thank you very much for pointing this out.
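
The rule this thread converges on can be sketched in JavaScript as follows. This is a minimal illustration and not code from the PR: `closesImmediately` and its parameters are hypothetical names, the void-tag list is copied from `SELF_CLOSING_TAGS` above, and treating any open SVG or MATH element as foreign content is a simplifying assumption that ignores HTML integration points such as `foreignObject`.

```javascript
// Void elements, copied from the SELF_CLOSING_TAGS constant in the diff above.
const VOID_TAGS = new Set( [
	'AREA', 'BASE', 'BASEFONT', 'BGSOUND', 'BR', 'COL', 'EMBED', 'FRAME', 'HR',
	'IMG', 'INPUT', 'KEYGEN', 'LINK', 'META', 'PARAM', 'SOURCE', 'TRACK', 'WBR',
] );

/**
 * Decides whether an open tag closes itself right away (hypothetical helper).
 *
 * @param {string}   tagName            Upper-case tag name of the open tag.
 * @param {boolean}  hasSelfClosingFlag Whether the tag was written as <tag />.
 * @param {string[]} openStack          Upper-case names of open ancestors, outermost first.
 * @return {boolean} True when no closing tag should be expected.
 */
function closesImmediately( tagName, hasSelfClosingFlag, openStack ) {
	// Void elements never have a closing tag, with or without the flag.
	if ( VOID_TAGS.has( tagName ) ) {
		return true;
	}

	// Assumption: foreign content means the tag is, or is nested in, SVG or MATH.
	const inForeignContent =
		tagName === 'SVG' ||
		tagName === 'MATH' ||
		openStack.includes( 'SVG' ) ||
		openStack.includes( 'MATH' );

	// The self-closing flag is only honored in foreign content. On HTML
	// elements, custom elements included, it is ignored.
	return inForeignContent && hasSelfClosingFlag;
}
```

Under these assumptions, `<span/>` and `<wp-group />` in HTML content stay open, while `<g />` inside `<svg>` closes immediately, which matches the Chrome behavior reported above.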

}
}

if ( ! $did_splice ) {
$popped_tag_name = array_pop( $this->open_stack_tags );
if ( $popped_tag_name !== $tag_name && function_exists( 'wp_trigger_error' ) ) {
wp_trigger_error(
__METHOD__,
esc_html(
sprintf(
/* translators: 1: Popped tag name, 2: Closing tag name */
__( 'Expected popped tag stack element %1$s to match the currently visited closing tag %2$s.', 'performance-lab' ),
$popped_tag_name,
$tag_name
)
)
);
}
}
array_splice( $this->open_stack_indices, count( $this->open_stack_tags ) + 1 );
}
}
}

/**
* Gets breadcrumbs for the current open tag.
*
* Breadcrumbs are constructed to match the format from detect.js.
*
* @return array<array{tag: string, index: int}> Breadcrumbs.
*/
public function get_breadcrumbs(): array {
$breadcrumbs = array();
foreach ( $this->open_stack_tags as $i => $breadcrumb_tag_name ) {
$breadcrumbs[] = array(
'tag' => $breadcrumb_tag_name,
'index' => $this->open_stack_indices[ $i ],
);
}
return $breadcrumbs;
}

/**
* Returns the value of a requested attribute from a matched tag opener if that attribute exists.
*
* This is a wrapper around the underlying WP_HTML_Tag_Processor method of the same name since only a limited number of
* methods can be exposed to prevent moving the pointer in such a way that the breadcrumb calculation is invalidated.
*
* @see WP_HTML_Tag_Processor::get_attribute()
*
* @param string $name Name of attribute whose value is requested.
* @return string|true|null Value of attribute or `null` if not available. Boolean attributes return `true`.
*/
public function get_attribute( string $name ) {
return $this->processor->get_attribute( $name );
}

/**
* Updates or creates a new attribute on the currently matched tag with the passed value.
*
* This is a wrapper around the underlying WP_HTML_Tag_Processor method of the same name since only a limited number of
* methods can be exposed to prevent moving the pointer in such a way that the breadcrumb calculation is invalidated.
*
* @see WP_HTML_Tag_Processor::set_attribute()
*
* @param string $name The attribute name to target.
* @param string|bool $value The new attribute value.
* @return bool Whether an attribute value was set.
*/
public function set_attribute( string $name, $value ): bool {
return $this->processor->set_attribute( $name, $value );
}

/**
* Removes an attribute from the currently-matched tag.
*
* This is a wrapper around the underlying WP_HTML_Tag_Processor method of the same name since only a limited number of
* methods can be exposed to prevent moving the pointer in such a way that the breadcrumb calculation is invalidated.
*
* @see WP_HTML_Tag_Processor::remove_attribute()
*
* @param string $name The attribute name to remove.
* @return bool Whether an attribute was removed.
*/
public function remove_attribute( string $name ): bool {
return $this->processor->remove_attribute( $name );
}

/**
* Returns the string representation of the HTML Tag Processor.
*
* This is a wrapper around the underlying WP_HTML_Tag_Processor method of the same name since only a limited number of
* methods can be exposed to prevent moving the pointer in such a way that the breadcrumb calculation is invalidated.
*
* @see WP_HTML_Tag_Processor::get_updated_html()
*
* @return string The processed HTML.
*/
public function get_updated_html(): string {
return $this->processor->get_updated_html();
}
}
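
The stack bookkeeping performed by `open_tags()` can be sketched in JavaScript as follows. This is a rough illustration under stated assumptions, not the PR's implementation: `walkOpenTags` is a hypothetical name, the input is a pre-tokenized list of `{ name, closer }` objects instead of raw HTML, the generator yields the breadcrumbs themselves instead of the tag name, the admin bar and SVG/MathML special cases are omitted, and mismatched closing tags are popped without any error reporting. The sketch also assumes that the sibling counter at the level of an implicitly closed P is kept.

```javascript
// Both lists are copied from the class constants in the diff above.
const VOID_TAGS = new Set( [
	'AREA', 'BASE', 'BASEFONT', 'BGSOUND', 'BR', 'COL', 'EMBED', 'FRAME', 'HR',
	'IMG', 'INPUT', 'KEYGEN', 'LINK', 'META', 'PARAM', 'SOURCE', 'TRACK', 'WBR',
] );
const P_CLOSING_TAGS = new Set( [
	'ADDRESS', 'ARTICLE', 'ASIDE', 'BLOCKQUOTE', 'DIR', 'DL', 'FIELDSET',
	'FOOTER', 'FORM', 'H1', 'H2', 'H3', 'H4', 'H5', 'H6', 'HEADER', 'HR',
	'MENU', 'NAV', 'OL', 'P', 'PRE', 'SECTION', 'TABLE', 'UL',
] );

/**
 * Yields the breadcrumbs of every open tag (hypothetical sketch).
 *
 * @param {Array<{name: string, closer?: boolean}>} tokens Tags in document order.
 */
function* walkOpenTags( tokens ) {
	const tags = []; // Names of the currently open elements, outermost first.
	const indices = []; // indices[ level ] is the index of the latest element seen at that depth.

	for ( const { name, closer = false } of tokens ) {
		if ( closer ) {
			if ( VOID_TAGS.has( name ) ) {
				continue; // Already popped right after it was opened.
			}
			tags.pop();
			// Forget counters of deeper levels, but keep this level's counter.
			indices.length = Math.min( indices.length, tags.length + 1 );
			continue;
		}

		// An open P is implicitly closed by a P-closing tag.
		if ( P_CLOSING_TAGS.has( name ) ) {
			const i = tags.indexOf( 'P' );
			if ( i !== -1 ) {
				tags.length = i;
				indices.length = Math.min( indices.length, i + 1 );
			}
		}

		const level = tags.length;
		tags.push( name );
		indices[ level ] = level < indices.length ? indices[ level ] + 1 : 0;

		yield tags.map( ( tag, i ) => ( { tag, index: indices[ i ] } ) );

		if ( VOID_TAGS.has( name ) ) {
			tags.pop(); // Void elements have no children and no closing tag.
		}
	}
}

// The example document from the docblock of open_tags().
const exampleTokens = [
	{ name: 'HTML' },
	{ name: 'HEAD' },
	{ name: 'HEAD', closer: true },
	{ name: 'BODY' },
	{ name: 'P' },
	{ name: 'P', closer: true },
	{ name: 'IMG' },
	{ name: 'BODY', closer: true },
	{ name: 'HTML', closer: true },
];
for ( const breadcrumbs of walkOpenTags( exampleTokens ) ) {
	console.log( breadcrumbs.map( ( b ) => `${ b.tag },${ b.index }` ).join( ' ' ) );
}
```

For the IMG element, the sketch yields HTML at index 0, BODY at index 1 and IMG at index 1, which agrees with the two arrays shown in the `open_tags()` comment.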
45 changes: 34 additions & 11 deletions modules/images/image-loading-optimization/detection/detect.js
@@ -7,6 +7,8 @@ const consoleLogPrefix = '[Image Loading Optimization]';

const storageLockTimeSessionKey = 'iloStorageLockTime';

const adminBarId = 'wpadminbar';

/**
* Checks whether storage is locked.
*
@@ -78,8 +80,8 @@ function error( ...message ) {

/**
* @typedef {Object} Breadcrumb
- * @property {number} index - Index of element among sibling elements.
- * @property {string} tagName - Tag name.
* @property {number} index - Index of element among sibling elements.
* @property {string} tag - Tag name.
*/

/**
@@ -104,19 +106,36 @@ function error( ...message ) {
/**
* Gets element index among siblings.
*
* @todo Eliminate this in favor of doing all breadcrumb generation exclusively on the server.
Member Author

To be done in a subsequent PR.

*
* @param {Element} element Element.
* @return {number} Index.
*/
function getElementIndex( element ) {
if ( ! element.parentElement ) {
return 0;
}
- return [ ...element.parentElement.children ].indexOf( element );
const children = [ ...element.parentElement.children ];
let index = children.indexOf( element );
if ( children.includes( document.getElementById( adminBarId ) ) ) {
// TODO: Should detection just be turned off when is_user_logged_in()?
--index;
}
if (
children.includes(
document.querySelector( '.skip-link.screen-reader-text' )
Member Author

This should be cached.

Member Author

I'm going to leave this as-is for this PR. In the next PR I intend to eliminate client-side breadcrumb construction, since it is too easy for JavaScript DOM mutations to make client-generated breadcrumbs inapplicable on the server.

Member

Once you switch to generate breadcrumbs on the server, maybe a shorter index or hash passed to the front end?

Member Author

Yes, exactly. Currently I'm thinking that when detection is needed that the server would add breadcrumbs via a data attribute, for example:

<img data-ilo-breadcrumbs="html,0 body,1 main,2 figure,10 img,0" ...>

When detection is not needed, such data attributes would not be present.
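
On the client, reading such an attribute could look roughly like the sketch below. It assumes the space-separated `tag,index` format from the example above, which is only a proposal in this thread. `parseBreadcrumbs` is a hypothetical name, and upper-casing the tag is an assumption made so that the result lines up with `element.tagName`.

```javascript
/**
 * Parses a proposed data-ilo-breadcrumbs attribute value into Breadcrumb objects.
 * Hypothetical helper; the attribute format is taken from the comment above.
 *
 * @param {string} value For example "html,0 body,1 main,2 figure,10 img,0".
 * @return {Array<{tag: string, index: number}>} Breadcrumbs, outermost element first.
 */
function parseBreadcrumbs( value ) {
	return value
		.trim()
		.split( /\s+/ )
		.filter( ( part ) => part !== '' ) // An empty attribute yields no breadcrumbs.
		.map( ( part ) => {
			const [ tag, index ] = part.split( ',' );
			return { tag: tag.toUpperCase(), index: parseInt( index, 10 ) };
		} );
}

// Usage in a browser, assuming the server added the attribute to the image:
// const breadcrumbs = parseBreadcrumbs( img.dataset.iloBreadcrumbs ?? '' );
```

Shipping the breadcrumbs from the server in this way would mean the client never has to derive indices from a DOM that scripts may already have mutated.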

Member

Can you add a TODO comment? :)

Member Author

There is already one:

* @todo Eliminate this in favor of doing all breadcrumb generation exclusively on the server.

Member Author

Done in the next PR: #892

)
) {
--index;
}
return index;
}
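
One way to address the "This should be cached" note from the review is sketched below. It is a hypothetical variant and not what the PR does: the elements to ignore are looked up once and passed in as a set, and ignored siblings are filtered out before the index is computed. As a result, only ignored siblings that precede the element shift its index. The names `collectIgnoredElements` and `getElementIndexIgnoring` are assumptions introduced for illustration.

```javascript
/**
 * Looks up the elements that should not count as siblings, once per detection run.
 *
 * @param {Document} doc Document.
 * @return {Set<Element>} Admin bar and skip link, when present.
 */
function collectIgnoredElements( doc ) {
	return new Set(
		[
			doc.getElementById( 'wpadminbar' ),
			doc.querySelector( '.skip-link.screen-reader-text' ),
		].filter( Boolean ) // Drop lookups that found nothing.
	);
}

/**
 * Gets the element index among its siblings, skipping ignored siblings.
 *
 * @param {Element}      element Element.
 * @param {Set<Element>} ignored Elements to leave out of the count.
 * @return {number} Index.
 */
function getElementIndexIgnoring( element, ignored ) {
	if ( ! element.parentElement ) {
		return 0;
	}
	const siblings = [ ...element.parentElement.children ].filter(
		( child ) => child === element || ! ignored.has( child )
	);
	return siblings.indexOf( element );
}
```

A caller would build the set once, for example `const ignored = collectIgnoredElements( document );`, and reuse it for every breadcrumb.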

/**
* Gets breadcrumbs for a given element.
*
* @todo Eliminate this in favor of doing all breadcrumb generation exclusively on the server.
*
* @param {Element} leafElement
* @return {Breadcrumb[]} Breadcrumbs.
*/
@@ -127,7 +146,7 @@ function getBreadcrumbs( leafElement ) {
let element = leafElement;
while ( element instanceof Element ) {
breadcrumbs.unshift( {
- tagName: element.tagName,
tag: element.tagName,
index: getElementIndex( element ),
} );
element = element.parentElement;
@@ -238,23 +257,27 @@ export default async function detect( {

// Obtain the admin bar element because we don't want to detect elements inside of it.
const adminBar =
- /** @type {?HTMLDivElement} */ doc.getElementById( 'wpadminbar' );
/** @type {?HTMLDivElement} */ doc.getElementById( adminBarId );

// We need to capture the original elements and their breadcrumbs as early as possible in case JavaScript is
// mutating the DOM from the original HTML rendered by the server, in which case the breadcrumbs obtained from the
// client will no longer be valid on the server. As such, the results are stored in an array and not any live list.
const breadcrumbedImages = doc.body.querySelectorAll( 'img' );

// We do the same for elements with background images which are not data: URLs.
- const breadcrumbedElementsWithBackgrounds = Array.from(
- doc.body.querySelectorAll( '[style*="background"]' )
- ).filter( ( /** @type {Element} */ el ) =>
- /url\(\s*['"](?!=data:)/.test( el.style.backgroundImage )
- );
// TODO: Re-enable background image support when server-side is implemented.
// const breadcrumbedElementsWithBackgrounds = Array.from(
// doc.body.querySelectorAll( '[style*="background"]' )
// ).filter( ( /** @type {Element} */ el ) =>
// /url\(\s*['"](?!=data:)/.test( el.style.backgroundImage )
// );

/** @type {Map<Element, Breadcrumb[]>} */
const breadcrumbedElementsMap = new Map(
- [ ...breadcrumbedImages, ...breadcrumbedElementsWithBackgrounds ].map(
[
...breadcrumbedImages /*, ...breadcrumbedElementsWithBackgrounds*/,
].map(
// TODO: Instead of generating breadcrumbs here, rely instead on server-generated breadcrumbs that are added to a data attribute by the server.
( element ) => [ element, getBreadcrumbs( element ) ]
)
Comment on lines 266 to 281
Member Author

These changes will be undone in subsequent PRs which will (1) move to do all breadcrumbing on the server, and (2) actually implement support for background-image.

);