Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PHP 8.2 Improvements #533

Merged
merged 24 commits into from
Sep 20, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
24 commits
Select commit Hold shift + click to select a range
b491d83
CI: run tests on PHP 8.2
swissspidy Aug 24, 2022
525e918
Avoid dynamic properties in the `Document` class
swissspidy Aug 24, 2022
167da1d
Avoid dynamic properties in `BaseTransformerConfiguration`
swissspidy Aug 24, 2022
05c7cc0
Do not use deprecated `utf8_decode()` function in tests
swissspidy Aug 24, 2022
a25c899
Support PHP 8.1 in `composer.json`
swissspidy Aug 24, 2022
9d17198
Stop marking 8.1 tests as experimental
swissspidy Aug 24, 2022
419c696
Revert "Avoid dynamic properties in the `Document` class"
swissspidy Aug 24, 2022
6d3dfea
Avoid dynamic properties in `Document`
swissspidy Aug 24, 2022
6e6a8ae
Ignore PHPUnit cache file
swissspidy Aug 24, 2022
e72abc9
Fix more getters
swissspidy Aug 24, 2022
c9183d9
More fixes
swissspidy Aug 24, 2022
cef1d12
More fixes
swissspidy Aug 25, 2022
ca8d06b
Update Element class
swissspidy Aug 25, 2022
76809be
Undo change to composer.json
swissspidy Aug 29, 2022
7a81724
Fix incorrect `tagName` access in `MinifyHtml`
swissspidy Sep 1, 2022
3873095
Remove now unneeded `isset()` check
swissspidy Sep 1, 2022
b9ce62b
Merge branch 'main' into try/532-php82
swissspidy Sep 7, 2022
3d6cf89
Use private members instead of properties array
westonruter Sep 19, 2022
5f3ecf0
Use delta for float comparison in testItCanGetNumericDimensions
westonruter Sep 19, 2022
4130385
Avoid unnecessary use of setter
westonruter Sep 19, 2022
f5b41e7
Revert "Use private members instead of properties array"
westonruter Sep 19, 2022
d535106
Use properties array instead of using setter
westonruter Sep 19, 2022
ec7430d
Prevent invoking getter recursively for links property
westonruter Sep 19, 2022
012bef9
Add warnings and update addInlineStyleByteCount
westonruter Sep 20, 2022
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions .github/workflows/test.yml
Original file line number Diff line number Diff line change
Expand Up @@ -12,15 +12,15 @@ jobs:
strategy:
fail-fast: false
matrix:
php: ['8.0', '7.4', '7.3', '7.2', '7.1', '7.0', '5.6']
php: ['8.1','8.0', '7.4', '7.3', '7.2', '7.1', '7.0', '5.6']
coverage: [false]
random: [false]
include:
- php: '8.0'
coverage: true
- php: '8.0'
random: true
- php: '8.1'
- php: '8.2'
experimental: true

steps:
Expand Down
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,2 +1,3 @@
/vendor/
composer.lock
.phpunit.result.cache
166 changes: 117 additions & 49 deletions src/Dom/Document.php
Original file line number Diff line number Diff line change
Expand Up @@ -108,6 +108,13 @@ final class Document extends DOMDocument
*/
const XPATH_INLINE_STYLE_ATTRIBUTES_QUERY = './/@style';

/**
* Associative array for lazily-created, cached properties for the document.
*
* @var array
*/
private $properties = [];

Comment on lines +111 to +117
Copy link
Member

@westonruter westonruter Sep 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized a potential large simplification here. Why if each of the properties were instead just declared as null?

Suggested change
/**
* Associative array for lazily-created, cached properties for the document.
*
* @var array
*/
private $properties = [];
public $xpath
public $html;
public $head;
public $body;
public $charset;
public $viewport;
public $ampElements;
public $ampCustomStyle;
public $ampCustomStyleByteCount;
public $inlineStyleByteCount
public $links;

The phpdoc notwithstanding.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you did this in 525e918 but then reverted. Did you try private instead of public? This would model what is done with the $links property already.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the problem with that approach. It's that private access in the class won't cause the getter to be invoked. I think I have a way around that, and that is to manually invoke __get() for methods on Document. I'm going to give it a try, though the commit may not be pushed until Monday.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bah. Doing this makes PHPStan blow up with a lot of "Access to private property" errors.

/**
* Associative array of options to configure the behavior of the DOM document abstraction.
*
Expand Down Expand Up @@ -142,13 +149,6 @@ final class Document extends DOMDocument
*/
private $cssMaxByteCountEnforced = -1;

/**
* Resource hint manager to manage resource hint <link> tags in the <head>.
*
* @var LinkManager|null
*/
private $links;

/**
* List of document filter class names.
*
Expand Down Expand Up @@ -314,7 +314,7 @@ public static function fromNode(DOMNode &$node)
private function reset()
{
// Drop references to old DOM document.
unset($this->xpath, $this->head, $this->body);
unset($this->properties['xpath'], $this->properties['head'], $this->properties['body']);

// Reference of the document itself doesn't change here, but might need to change in the future.
return $this;
Expand Down Expand Up @@ -587,6 +587,8 @@ private function normalizeDocumentStructure($content)

/**
* Normalize the structure of the document if it was already provided as a DOM.
*
* Warning: This method may not use any magic getters for html, head, or body.
*/
public function normalizeDomStructure()
{
Expand Down Expand Up @@ -618,8 +620,8 @@ public function normalizeDomStructure()
throw FailedToRetrieveRequiredDomElement::forHeadElement($head);
}

$this->head = $head;
$html->appendChild($this->head);
$this->properties['head'] = $head;
$html->appendChild($head);

if ($oldDocumentElement->nodeName === Tag::BODY) {
$body = $oldDocumentElement;
Expand All @@ -634,23 +636,23 @@ public function normalizeDomStructure()
throw FailedToRetrieveRequiredDomElement::forBodyElement($body);
}

$this->body = $body;
$html->appendChild($this->body);
$this->properties['body'] = $body;
$html->appendChild($body);

if ($oldDocumentElement !== $this->body && $oldDocumentElement !== $this->head) {
$this->body->appendChild($oldDocumentElement);
if ($oldDocumentElement !== $body && $oldDocumentElement !== $this->head) {
$body->appendChild($oldDocumentElement);
}
} else {
$head = $this->getElementsByTagName(Tag::HEAD)->item(0);
if (!$head) {
$this->head = $this->createElement(Tag::HEAD);
$this->documentElement->insertBefore($this->head, $this->documentElement->firstChild);
$this->properties['head'] = $this->createElement(Tag::HEAD);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not continue to use $this->head and $this->body? Is there a reason to switch to using $this->properties?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for the sake of performance?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran into some issues when I originally tried that, but I might have been doing something wrong. I can give it another try 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just tested again. Even on PHP 8.1 I am getting lots of Undefined property warnings in this case.

Even when doing something seemingly simple like this:

// Throws warning
$this->html = $html;
return $this->html;

// Works
$this->html = $html;
return $html;

// Also works
$this->properties['html'] = $html;
return $this->properties['html'];

I also just analyzed the AmpProject\Dom\DocumentTest::testNormalizeDomStructure() test case for this.

  1. Calling normalizeDomStructure() calls $this->head
  2. This triggers __get()
  3. The getter for head runs, and it sees that it isn't set yet, which causes a call to normalizeDomStructure() again. The property hasn't been set yet.
  4. That one then calls moveInvalidHeadNodesToBody at the end
  5. That one then triggers Undefined property: AmpProject\Dom\Document::$head.

No such issues happen with the current $properties approach.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the issue at question is this note:

PHP will not call an overloaded method from within the same overloaded method. That means, for example, writing return $this->foo inside of __get() will return null and raise an E_WARNING if there is no foo property defined, rather than calling __get() a second time. However, overload methods may invoke other overload methods implicitly (such as __set() triggering __get()).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other words, if a caller is trying to access $x->foo which invokes a magic getter, the getter must not also try to access $x->foo although it can access $x->bar. So there can't be any recursive reference to the current magic property being gotten.

$this->documentElement->insertBefore($this->properties['head'], $this->documentElement->firstChild);
}

$body = $this->getElementsByTagName(Tag::BODY)->item(0);
if (!$body) {
$this->body = $this->createElement(Tag::BODY);
$this->documentElement->appendChild($this->body);
$this->properties['body'] = $this->createElement(Tag::BODY);
$this->documentElement->appendChild($this->properties['body']);
}
}

Expand All @@ -660,15 +662,20 @@ public function normalizeDomStructure()

/**
* Move invalid head nodes back to the body.
*
* Warning: This method may not use any magic getters for html, head, or body.
*/
private function moveInvalidHeadNodesToBody()
{
// Walking backwards makes it easier to move elements in the expected order.
$node = $this->head->lastChild;
$node = $this->properties['head']->lastChild;
while ($node) {
$nextSibling = $node->previousSibling;
if (! $this->isValidHeadNode($node)) {
$this->body->insertBefore($this->head->removeChild($node), $this->body->firstChild);
if (!$this->isValidHeadNode($node)) {
$this->properties['body']->insertBefore(
$this->properties['head']->removeChild($node),
$this->properties['body']->firstChild
);
}
$node = $nextSibling;
}
Expand All @@ -681,12 +688,14 @@ private function moveInvalidHeadNodesToBody()
* the </body> not valid in AMP, but trailing elements after </html> will get wrapped in additional <html> elements.
* While comment nodes would be allowed in AMP, everything is moved regardless so that source stack comments will
* retain their relative position with the element nodes they annotate.
*
* Warning: This method may not use any magic getters for html, head, or body.
*/
private function movePostBodyNodesToBody()
{
// Move nodes (likely comments) from after the </body>.
while ($this->body->nextSibling) {
$this->body->appendChild($this->body->nextSibling);
while ($this->properties['body']->nextSibling) {
$this->properties['body']->appendChild($this->properties['body']->nextSibling);
}

// Move nodes from after the </html>.
Expand All @@ -695,18 +704,20 @@ private function movePostBodyNodesToBody()
if ($nextSibling instanceof Element && Tag::HTML === $nextSibling->nodeName) {
// Handle trailing elements getting wrapped in implicit duplicate <html>.
while ($nextSibling->firstChild) {
$this->body->appendChild($nextSibling->firstChild);
$this->properties['body']->appendChild($nextSibling->firstChild);
}
$nextSibling->parentNode->removeChild($nextSibling); // Discard now-empty implicit <html>.
} else {
$this->body->appendChild($this->documentElement->nextSibling);
$this->properties['body']->appendChild($this->documentElement->nextSibling);
}
}
}

/**
* Determine whether a node can be in the head.
*
* Warning: This method may not use any magic getters for html, head, or body.
*
* @link https://github.com/ampproject/amphtml/blob/445d6e3be8a5063e2738c6f90fdcd57f2b6208be/validator/engine/htmlparser.js#L83-L100
* @link https://www.w3.org/TR/html5/document-metadata.html
*
Expand Down Expand Up @@ -787,7 +798,7 @@ public function addAmpCustomStyle($style)
}

$this->ampCustomStyle->textContent = $newStyle;
$this->ampCustomStyleByteCount = $newByteCount;
$this->properties['ampCustomStyleByteCount'] = $newByteCount;
}

/**
Expand Down Expand Up @@ -818,6 +829,29 @@ public function getRemainingCustomCssSpace()
);
}

/**
* Get the array of allowed keys of lazily-created, cached properties.
* The array index is the key and the array value is the key's default value.
*
* @return array Array of allowed keys.
*/
protected function getAllowedKeys()
{
return [
'xpath',
Tag::HTML,
Tag::HEAD,
Tag::BODY,
Attribute::CHARSET,
Attribute::VIEWPORT,
'ampElements',
'ampCustomStyle',
'ampCustomStyleByteCount',
'inlineStyleByteCount',
'links',
];
}

/**
* Magic getter to implement lazily-created, cached properties for the document.
*
Expand All @@ -828,8 +862,8 @@ public function __get($name)
{
switch ($name) {
case 'xpath':
$this->xpath = new DOMXPath($this);
return $this->xpath;
$this->properties['xpath'] = new DOMXPath($this);
return $this->properties['xpath'];
case Tag::HTML:
$html = $this->getElementsByTagName(Tag::HTML)->item(0);

Expand All @@ -843,8 +877,8 @@ public function __get($name)
throw FailedToRetrieveRequiredDomElement::forHtmlElement($html);
}

$this->html = $html;
return $this->html;
$this->properties['html'] = $html;
return $this->properties['html'];
case Tag::HEAD:
$head = $this->getElementsByTagName(Tag::HEAD)->item(0);

Expand All @@ -858,8 +892,8 @@ public function __get($name)
throw FailedToRetrieveRequiredDomElement::forHeadElement($head);
}

$this->head = $head;
return $this->head;
$this->properties['head'] = $head;
return $this->properties['head'];
case Tag::BODY:
$body = $this->getElementsByTagName(Tag::BODY)->item(0);

Expand All @@ -873,8 +907,8 @@ public function __get($name)
throw FailedToRetrieveRequiredDomElement::forBodyElement($body);
}

$this->body = $body;
return $this->body;
$this->properties['body'] = $body;
return $this->properties['body'];
case Attribute::CHARSET:
// This is not cached as it could potentially be requested too early, before the viewport was added, and
// the cache would then store null without rechecking later on after the viewport has been added.
Expand Down Expand Up @@ -915,53 +949,87 @@ public function __get($name)
$this->head->appendChild($ampCustomStyle);
}

$this->ampCustomStyle = $ampCustomStyle;
$this->properties['ampCustomStyle'] = $ampCustomStyle;

return $this->ampCustomStyle;
return $this->properties['ampCustomStyle'];

case 'ampCustomStyleByteCount':
if (!isset($this->ampCustomStyle)) {
if (!isset($this->properties['ampCustomStyle'])) {
$ampCustomStyle = $this->xpath->query(self::XPATH_AMP_CUSTOM_STYLE_QUERY, $this->head)->item(0);
if (!$ampCustomStyle instanceof Element) {
return 0;
} else {
$this->ampCustomStyle = $ampCustomStyle;
}

$this->properties['ampCustomStyle'] = $ampCustomStyle;
}

if (!isset($this->ampCustomStyleByteCount)) {
$this->ampCustomStyleByteCount = strlen($this->ampCustomStyle->textContent);
if (!isset($this->properties['ampCustomStyleByteCount'])) {
$this->properties['ampCustomStyleByteCount'] =
strlen($this->properties['ampCustomStyle']->textContent);
}

return $this->ampCustomStyleByteCount;
return $this->properties['ampCustomStyleByteCount'];

case 'inlineStyleByteCount':
if (!isset($this->inlineStyleByteCount)) {
$this->inlineStyleByteCount = 0;
if (!isset($this->properties['inlineStyleByteCount'])) {
$this->properties['inlineStyleByteCount'] = 0;
$attributes = $this->xpath->query(
self::XPATH_INLINE_STYLE_ATTRIBUTES_QUERY,
$this->documentElement
);
foreach ($attributes as $attribute) {
$this->inlineStyleByteCount += strlen($attribute->textContent);
$this->properties['inlineStyleByteCount'] += strlen($attribute->textContent);
}
}

return $this->inlineStyleByteCount;
return $this->properties['inlineStyleByteCount'];

case 'links':
if (! isset($this->links)) {
$this->links = new LinkManager($this);
if (! isset($this->properties['links'])) {
$this->properties['links'] = new LinkManager($this);
}

return $this->links;
return $this->properties['links'];
}

// Mimic regular PHP behavior for missing notices.
trigger_error(self::PROPERTY_GETTER_ERROR_MESSAGE . $name, E_USER_NOTICE);
return null;
}

/**
* Magic setter to implement lazily-created, cached properties for the document.
*
* @param string $name Name of the property to set.
* @param mixed $value Value of the property.
*/
public function __set($name, $value)
{
if (!in_array($name, $this->getAllowedKeys(), true)) {
// Mimic regular PHP behavior for missing notices.
trigger_error(self::PROPERTY_GETTER_ERROR_MESSAGE . $name, E_USER_NOTICE);
return;
}

$this->properties[$name] = $value;
}

/**
* Magic callback for lazily-created, cached properties for the document.
*
* @param string $name Name of the property to set.
*/
public function __isset($name)
{
if (!in_array($name, $this->getAllowedKeys(), true)) {
// Mimic regular PHP behavior for missing notices.
trigger_error(self::PROPERTY_GETTER_ERROR_MESSAGE . $name, E_USER_NOTICE);
return false;
}

return isset($this->properties[$name]);
}

/**
* Make sure we properly reinitialize on clone.
*
Expand Down
Loading