Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unrecognized elements are silently removed without validation error reporting #1100

Closed
westonruter opened this issue Apr 27, 2018 · 4 comments
Assignees
Labels
Bug Something isn't working Release Validation
Milestone

Comments

@westonruter
Copy link
Member

When coming across an element that is not recognized at all, the following code in process_node will be invoked:

https://github.com/Automattic/amp-wp/blob/1e2cd22f1bfe8ab9d833c901e02111120f52fe72/includes/sanitizers/class-amp-tag-and-attribute-sanitizer.php#L284-L292

In the call to replace_node_with_children it will call remove_node instead of remove_invalid_child, and only the latter will report the removal. This needs to be improved to report invalidity when it is being removed for that reason.

@westonruter westonruter added Bug Something isn't working Validation labels Apr 27, 2018
@westonruter westonruter added this to the v1.0 milestone May 2, 2018
@kienstra
Copy link
Contributor

Question About remove_invalid_child()

Hi @westonruter,
Thanks for your detailed description.

I'm probably not understanding this. But it looks like when replace_node_with_children() calls remove_node(), remove_node() calls remove_invalid_child(). Is that enough to report the removed node?

https://github.com/Automattic/amp-wp/blob/024e450798c35a89c3390c410ab42522a924f712/includes/sanitizers/class-amp-tag-and-attribute-sanitizer.php#L1700-L1709

@kienstra
Copy link
Contributor

I think I understand this now 😄

@hellofromtonya hellofromtonya self-assigned this Jul 24, 2018
@hellofromtonya
Copy link
Contributor

hellofromtonya commented Jul 25, 2018

Recreating the Issue

Using this HTML in a new post:

<h2>Finds and reports these invalid elements</h2>

<imgs src="/none.jpg" width="100" height="100" alt="None"></imgs>

<baz class="baz"><p>Invalid baz parent element.</p></baz>

<h2>Silently removes these elements</h2>

<foo class="foo">Invalid baz tag.</foo>

<foobaz class="foobaz"><zab>Invalid <span>nested elements</span></zab></foobaz>

<bazbar invalid="bazbar"><span>Is an invalid "bar" tag.</span></bazbar>

<div class="parent">
	<p>Nesting valid and invalid elements.</p>
	<invalid class="invalid">Is an invalid "invalid" tag</invalid>
	<bazfoo class="bazfoo">Is an invalid "foo" tag <p id="testing-id" style="width: 100px">This should pass.</p></bazfoo>
</div>

<ul>
	<li>hello</li>
	<lili>world</lili>
</ul>

Then publish the post.

Results

  1. It catches <imgs> and <baz> and reports these 2 elements as invalid.
  2. It does not catch the other invalid elements and silently removes them: <foo>, <foobaz>, <zab>, <bazbar>, <invalid>, <bazfoo>, and <lili>.

screen shot 2018-07-25 at 2 54 15 pm

Tracing Issue via Xdebug

The 2 invalid elements that were caught flowed into replace_node_with_children() and into the if code block to run $this->remove_node( $node );:

https://github.com/Automattic/amp-wp/blob/024e450798c35a89c3390c410ab42522a924f712/includes/sanitizers/class-amp-tag-and-attribute-sanitizer.php#L1662-L1668

The other invalid elements that were not caught flowed into the else code block:

https://github.com/Automattic/amp-wp/blob/024e450798c35a89c3390c410ab42522a924f712/includes/sanitizers/class-amp-tag-and-attribute-sanitizer.php#L1668-L1687

Possible Problem

There is no validation error being generated in the else for the node that's being replaced by the fragment after pushing child nodes back onto the stack.

hellofromtonya pushed a commit that referenced this issue Jul 25, 2018
Closes #1100.

Fixes the problem of silently removing unrecognized elements.  Uses the same code pattern and methodology as `remove_invalid_child`:

- Prevent double-reporting nodes that are rejected.
- Check through should_sanitize_validation_error().
- Replace if true.
- Else, store in the `should_not_replace_nodes` property.
@kienstra
Copy link
Contributor

kienstra commented Aug 3, 2018

Moving To "Ready For Merging"

If it's alright, I'm moving this to "Ready For Merging." If you think this could use functional testing, feel free to move it back.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something isn't working Release Validation
Projects
None yet
Development

No branches or pull requests

4 participants