HTML block content gets transformed into entities #24282

Closed
MatzeKitt opened this issue Jul 30, 2020 · 9 comments · Fixed by #27268
Labels: [Block] HTML (Affects the HTML Block), [Status] In Progress (Tracking issues with work in progress)

Comments

@MatzeKitt
Contributor

MatzeKitt commented Jul 30, 2020

Update: this bug is not isolated to reusable blocks. It affects any core/html block whose content contains the characters &, < or >: they are converted to HTML entities.


Describe the bug
If you create an HTML block with the content 3 < 4, store it as a reusable block, and then reload the page, the < becomes &lt;.

Background: I need to use this HTML block to add some JavaScript. If parts of this JavaScript get encoded as HTML entities, the script stops working, and it can even crash Gutenberg entirely.

To reproduce
Steps to reproduce the behavior:

  1. Add a new HTML block
  2. Add this content: 3 < 4
  3. Store the block as a reusable block

Expected behavior
The block is stored as reusable with the raw content added.

Editor version (please complete the following information):

  • WordPress version: 5.5-RC1-48687
  • Does the website have the Gutenberg plugin installed, or is it using the block editor that comes by default? Yes, the Gutenberg plugin is installed
  • If the Gutenberg plugin is installed, which version is it? 8.6.1

Desktop (please complete the following information):

  • OS: macOS 10.15.6
  • Browser: Safari
  • Version: 13.1.2 (15609.3.5.1.3)
@MatzeKitt MatzeKitt changed the title HTML gets transformed into entities by storing as reusable block HTML block content gets transformed into entities by storing as reusable block Jul 30, 2020
@annezazu annezazu added [Block] HTML Affects the the HTML Block [Feature] Synced Patterns Related to synced patterns (formerly reusable blocks) Needs Testing Needs further testing to be confirmed. labels Aug 14, 2020
@getdave
Contributor

getdave commented Sep 7, 2020

I tested this and can confirm the issue occurs as described. I also noted the following block validation issue in the browser console:

Block validation: Block validation failed for `core/html`...

Content generated by `save` function:

3 &lt; 4

Content retrieved from post body:

3 < 4

The post body when inspected was:

<!-- wp:html -->
3 < 4
<!-- /wp:html -->

This indicates a mismatch between what the save function generates (encoded entities) and what is in the post body (a string of raw HTML).
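To make the mismatch concrete, here is a minimal sketch of the comparison behind that console warning. The function name `validateBlockContent` and the strict string equality are assumptions for illustration only; the editor's real validation is more lenient than a plain string comparison.

```javascript
// Hypothetical, simplified stand-in for block validation.
// Assumption: strict string equality; the real editor tolerates some
// equivalent markup differences that this sketch does not model.
function validateBlockContent( contentFromPostBody, contentFromSave ) {
	var isValid = contentFromPostBody === contentFromSave;
	return {
		isValid: isValid,
		message: isValid
			? 'Block is valid.'
			: 'Block validation failed: save generated "' + contentFromSave +
				'" but the post body contains "' + contentFromPostBody + '".',
	};
}

// The two strings reported in the console output above.
var result = validateBlockContent( '3 < 4', '3 &lt; 4' );
console.log( result.isValid ); // false
console.log( result.message );
```

Identical strings pass this check, so the failure comes purely from the entity encoding on one side of the comparison.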

I can see the block's save passes the block content through <RawHTML>

export default function save( { attributes } ) {
	return <RawHTML>{ attributes.content }</RawHTML>;
}

...which is a wrapper around React's dangerouslySetInnerHTML prop

export default function RawHTML( { children, ...props } ) {
	// The DIV wrapper will be stripped by serializer, unless there are
	// non-children props present.
	return createElement( 'div', {
		dangerouslySetInnerHTML: { __html: children },
		...props,
	} );
}

This also happens when you don't save as a Reusable block. Simply enter the HTML block, save the Post as draft and reload the page. Same issue occurs.

After some quick debugging, I found that this function is where the content 3 < 4 is converted into the entity version:

export function parseWithAttributeSchema( innerHTML, attributeSchema ) {
	return hpqParse( innerHTML, matcherFromSource( attributeSchema ) );
}

Specifically, the html matcher passed to hpqParse seems to be the cause of the change. At this point, returning the .innerHTML of the node causes the content to become encoded:

return match.innerHTML;

...and indeed if we refer to the MDN reference for innerHTML we see:

Note: If a <div>, <span>, or <noembed> node has a child text node that includes the characters (&), (<), or (>), innerHTML returns these characters as the HTML entities "&amp;", "&lt;" and "&gt;" respectively. Use Node.textContent to get a raw copy of these text nodes' contents.
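The effect described in that note can be approximated without a browser. The sketch below is an assumption-laden simulation, not the DOM itself: `serializeTextLikeInnerHTML` is a hypothetical helper that applies the escapes the HTML serialization rules use for ordinary text nodes, which is enough to show why reading `.innerHTML` does not return what was typed.

```javascript
// Hypothetical helper that imitates how innerHTML serializes a plain text
// node. Assumption: only the text-node escapes are modeled (ampersand,
// non-breaking space, less-than, greater-than); no real DOM is involved.
function serializeTextLikeInnerHTML( text ) {
	return text
		.replace( /&/g, '&amp;' ) // Must run first so later entities are not double-escaped.
		.replace( /\u00a0/g, '&nbsp;' )
		.replace( /</g, '&lt;' )
		.replace( />/g, '&gt;' );
}

var rawContent = '3 < 4';
var viaInnerHTML = serializeTextLikeInnerHTML( rawContent );

console.log( viaInnerHTML ); // 3 &lt; 4
console.log( viaInnerHTML === rawContent ); // false
```

In a real document, `textContent` on the same node would return the unescaped `3 < 4`, which is the distinction the MDN note draws.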

@getdave
Contributor

getdave commented Oct 2, 2020

@KittMedia Do you have an example of the JavaScript you need to include that is not working? A reduced test case would be fine.

@MatzeKitt
Contributor Author

Of course:

function conditionalMemberFields() {
	var membershipRadios = document.querySelectorAll( '[name="membership"]' );
	var membershipConditionals = document.querySelectorAll( '.conditional-membernumber' );
	
	if ( ! membershipRadios.length || ! membershipConditionals.length ) {
		return;
	}
	
	for ( var i = 0; i < membershipRadios.length; i++ ) {
		membershipRadios[ i ].addEventListener( 'change', function( event ) {
			for ( var n = 0; n < membershipConditionals.length; n++ ) {
				if ( event.currentTarget.value === 'Ja' ) {
					membershipConditionals[ n ].classList.remove( 'conditional-membernumber' );
				}
				else {
					membershipConditionals[ n ].classList.add( 'conditional-membernumber' );
				}
			}
		} );
	}
}

@KokkieH

KokkieH commented Oct 18, 2020

This doesn't happen only when saving the block as a reusable block. I can also replicate with these steps:

  1. Add the content to a HTML block
  2. Save draft
  3. Exit the editor
  4. Re-open the post for editing.

On reopening, the block displays the HTML entity rather than the character.

Reported by a WordPress.com user in https://wordpress.com/forums/topic/editing-posts-with-markdown-markup/

@getdave
Contributor

getdave commented Nov 25, 2020

Just to confirm: if you try this out by adding an HTML block and pasting in the content above, you'll get a browser console warning that block validation failed because the save function output and the post body don't match. This is because < is converted to &lt; for serialization but doesn't seem to be deserialized back to < for use in the editor.

function conditionalMemberFields() {
	var membershipRadios = document.querySelectorAll( '[name="membership"]' );
	var membershipConditionals = document.querySelectorAll( '.conditional-membernumber' );
	
	if ( ! membershipRadios.length || ! membershipConditionals.length ) {
		return;
	}
	
-	for ( var i = 0; i < membershipRadios.length; i++ ) {
+	for ( var i = 0; i &lt; membershipRadios.length; i++ ) {
		membershipRadios[ i ].addEventListener( 'change', function( event ) {
-			for ( var n = 0; n < membershipConditionals.length; n++ ) {
+			for ( var n = 0; n &lt; membershipConditionals.length; n++ ) {
				if ( event.currentTarget.value === 'Ja' ) {
					membershipConditionals[ n ].classList.remove( 'conditional-membernumber' );
				}
				else {
					membershipConditionals[ n ].classList.add( 'conditional-membernumber' );
				}
			}
		} );
	}
}
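
As a rough illustration of what "deserialized back" would mean for the three characters discussed in this issue, here is a minimal sketch. `decodeBasicEntities` is a hypothetical helper written for this example; it is not taken from Gutenberg and is not a description of the fix merged in #27268.

```javascript
// Hypothetical helper: reverses only the three escapes discussed in this
// issue. The ampersand entity is decoded last so that "&amp;lt;" becomes
// the literal text "&lt;" rather than "<".
function decodeBasicEntities( encoded ) {
	return encoded
		.replace( /&lt;/g, '<' )
		.replace( /&gt;/g, '>' )
		.replace( /&amp;/g, '&' );
}

var stored = 'for ( var i = 0; i &lt; membershipRadios.length; i++ ) {';
console.log( decodeBasicEntities( stored ) );
// for ( var i = 0; i < membershipRadios.length; i++ ) {
```

A general-purpose decoder would also need to handle numeric and other named entities; this sketch deliberately ignores them.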

@ellatrix ellatrix mentioned this issue Nov 25, 2020
6 tasks
@github-actions github-actions bot added the [Status] In Progress Tracking issues with work in progress label Nov 25, 2020
@joehoyle

joehoyle commented Apr 13, 2021

For anyone that might want a not-pretty workaround, when you need to do a JS less-than comparison in a HTML block: Math.sign( a - b ) === -1 can be a substitute. I told you it wasn't pretty! h/t @rmccue
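
A sketch of how that workaround might look in a loop, assuming numeric operands only. The helper names `isLessThan` and `collectIndices` are made up for this example, and the code deliberately contains none of the characters that get encoded.

```javascript
// Hypothetical helper: true when a is strictly smaller than b.
// Assumption: both arguments are numbers; strings compare differently.
function isLessThan( a, b ) {
	return Math.sign( a - b ) === -1;
}

// Collects the indices 0 .. length - 1 without a less-than sign in the source.
function collectIndices( length ) {
	var indices = [];
	for ( var i = 0; isLessThan( i, length ); i++ ) {
		indices.push( i );
	}
	return indices;
}

console.log( collectIndices( 3 ) ); // [ 0, 1, 2 ]
```

Wrapping the comparison in one helper keeps the rest of the script readable and makes it easy to switch back to the normal operator later.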

@ftoppi

ftoppi commented Sep 12, 2021

Hello, the issue is still present in 5.8.1. It also happens with square brackets.

@getdave getdave changed the title HTML block content gets transformed into entities by storing as reusable block HTML block content gets transformed into entities Sep 14, 2021
@getdave
Contributor

getdave commented Sep 14, 2021

Confirmed. Issue still present.

@getdave
Contributor

getdave commented Dec 7, 2022

@MatzeKitt With the merge of #27268 this should now be fixed. Gutenberg 14.8 will be released soon so you should be able to test it using the Gutenberg Plugin by next week.
