[BUGFIX] Check existing Content-Type is in <head> #961
Force-pushed: b99fea8 to 569a312
There's a Psalm error I need to fix...
Force-pushed: 569a312 to 14f63ee
I've now fixed the Psalm errors. (I haven't been running Psalm locally because I get about 50 more errors there than we get on the build server. I'm not sure why, perhaps a different OS or PHP version, since I'm using the same PHAR.)
You'll probably need to edit the commit message and title before merging, as it includes your (@oliverklee) initial WIP commit with tests.
Force-pushed: 14f63ee to 57aa616
Last force-push was just fixing a typo in a comment, and adding a …
Force-pushed: 57aa616 to 3e6dff5
I've now just moved the exception handling to a higher level for clarity.
Force-pushed: 3e6dff5 to 5e1b0ec
If a valid `Content-Type` `<meta>` element is present, DOM conversion will create a `<head>` element for it even without an explicit `<head>` tag in the HTML. However, to be valid, it must not be in the `<body>` element. As well as with an explicit `<body>` start tag, the `<body>` element also begins whenever a start tag for an element which cannot be in the `<head>` is encountered. This is now checked. Fixes #923.
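The check described in this commit message can be illustrated with a minimal sketch. This is not the code from the PR: the function name `contentTypeMetaIsInHead`, the constant `HEAD_SAFE_ELEMENTS`, and the exact regex are assumptions made for illustration. The list of elements treated as allowed before `<body>` begins is likewise an assumption based on the HTML metadata elements. The sketch also ignores comments and script contents, which a real implementation would need to handle.

```php
<?php

declare(strict_types=1);

// Hypothetical list: start tags that may appear without implicitly opening <body>.
const HEAD_SAFE_ELEMENTS = [
    'html', 'head', 'base', 'link', 'meta', 'noscript', 'script', 'style', 'template', 'title',
];

/**
 * Hypothetical helper: reports whether the first Content-Type <meta> tag
 * would end up in the <head> rather than the <body>.
 */
function contentTypeMetaIsInHead(string $html): bool
{
    // Lazily capture everything before the first Content-Type <meta> tag.
    $pattern = '%^(.*?)<meta(?=\\s)[^>]*\\shttp-equiv=(["\']?)Content-Type\\2[\\s/>]%is';
    if (\preg_match($pattern, $html, $matches) !== 1) {
        return false;
    }

    // Collect the names of all start tags preceding the <meta> tag.
    \preg_match_all('%<([a-z][a-z0-9]*)\\b%i', $matches[1], $startTags);
    foreach ($startTags[1] as $tagName) {
        // Any element that cannot live in <head> means <body> has already begun.
        if (!\in_array(\strtolower($tagName), HEAD_SAFE_ELEMENTS, true)) {
            return false;
        }
    }

    return true;
}

$metaTag = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">';

var_dump(contentTypeMetaIsInHead('<title>Hello</title>' . $metaTag)); // bool(true)
var_dump(contentTypeMetaIsInHead('<p>Hello</p>' . $metaTag));         // bool(false)
```

In the second call, the `<p>` start tag implicitly begins the `<body>`, so the `<meta>` tag that follows it is no longer in a valid position even though no `<body>` tag was written.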
Force-pushed: 5e1b0ec to 678c546
Force-pushed: aa9cca6 to 3c47164
During review for #961, a regex change to capture the HTML before the `Content-Type` tag drew attention to a couple of edge cases that weren't explicitly tested for and would fail if there was a mistake in the regex pattern:

- A newline before the `Content-Type` tag (would fail without the `PCRE_DOTALL`/`s` modifier; while caught by another test, `getDomDocumentWithNormalizedHtmlRepresentsTheGivenHtml`, that test was not specifically intended for this purpose);
- A `Content-Type` tag in both the `<head>` and `<body>` (would fail without the lazy/`?` quantifier after `.*`).

Tests have now been added to cover these.
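The two edge cases can be reproduced with a small sketch. The helper `htmlBeforeContentType` and the simplified patterns below are hypothetical and are not the pattern used in the PR. They only show how the `s` modifier and the lazy quantifier change what is captured before the tag.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: returns the HTML captured before the Content-Type <meta> tag,
 * or null if the pattern does not match. The pattern is a parameter so that
 * variants with and without the modifier/quantifier can be compared.
 */
function htmlBeforeContentType(string $html, string $pattern): ?string
{
    return \preg_match($pattern, $html, $matches) === 1 ? $matches[1] : null;
}

$metaTag = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">';

// Edge case 1: a newline before the tag.
// Without `s`, `.` does not match "\n", so the anchored pattern never reaches the tag.
$withNewline = "<head>\n" . $metaTag . '</head>';
var_dump(htmlBeforeContentType($withNewline, '%^(.*?)<meta\\s[^>]*Content-Type%i'));  // NULL
var_dump(htmlBeforeContentType($withNewline, '%^(.*?)<meta\\s[^>]*Content-Type%is')); // "<head>\n"

// Edge case 2: a tag in both <head> and <body>.
// A greedy `.*` runs up to the LAST tag, so the captured prefix wrongly includes body content.
$inBoth = '<head>' . $metaTag . '</head><body><p>' . $metaTag . '</p></body>';
$lazy = htmlBeforeContentType($inBoth, '%^(.*?)<meta\\s[^>]*Content-Type%is');
$greedy = htmlBeforeContentType($inBoth, '%^(.*)<meta\\s[^>]*Content-Type%is');
var_dump($lazy === '<head>');                 // bool(true)
var_dump(\str_contains((string) $greedy, '<p>')); // bool(true)
```

With the greedy variant, any later check on the captured prefix would see the `<p>` start tag and conclude that the first `Content-Type` tag is in the `<body>`, which is why the lazy quantifier matters here. Note that `str_contains` requires PHP 8.0 or later.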