Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update cleanForSlug() to remove additional non-word characters #21007

Merged
merged 2 commits into from
Apr 16, 2020

Conversation

earnjam
Copy link
Contributor

@earnjam earnjam commented Mar 19, 2020

Description

This updates the cleanForSlug() function to remove punctuation characters that would get removed by sanitize_title() upon publishing.

This fixes the original issue from #12907 regarding punctuation in the slug. It doesn't resolve the related issues referenced there that were opened for characters in other languages (such as German and Danish) not being converted properly. If we merge this and close #12907, we may want to reopen one of those that was closed as a duplicate in order to keep a more accurate record.

On investigating removing other punctuation, I realized sanitize_title() allows underscores in slugs, so I adjusted the function to allow those, but to remove any other non-hyphen punctuation. I also removed the toLower dependency and switched to .toLowerCase()

How has this been tested?

Used identical strings with a variety of characters in the classic editor and in the current editor to test how they compared. Also updated the unit tests to reflect the new characters that should be removed.

Screenshots

Here is a sample string in the classic editor:
Screen Shot 2020-03-18 at 1 14 05 PM

Vs the block editor after this change:
Screen Shot 2020-03-18 at 1 15 26 PM

Checklist:

  • My code is tested.
  • My code follows the WordPress code style.
  • My code follows the accessibility standards.
  • My code has proper inline documentation.
  • I've included developer documentation if appropriate.
  • I've updated all React Native files affected by any refactorings/renamings in this PR.

@github-actions
Copy link

Size Change: +9 B (0%)

Total Size: 857 kB

Filename Size Change
build/editor/index.js 43.8 kB +3 B (0%)
build/url/index.js 4.02 kB +6 B (0%)
ℹ️ View Unchanged
Filename Size Change
build/a11y/index.js 998 B 0 B
build/annotations/index.js 3.43 kB 0 B
build/api-fetch/index.js 3.39 kB 0 B
build/autop/index.js 2.58 kB 0 B
build/blob/index.js 620 B 0 B
build/block-directory/index.js 6.02 kB 0 B
build/block-directory/style-rtl.css 760 B 0 B
build/block-directory/style.css 760 B 0 B
build/block-editor/index.js 100 kB 0 B
build/block-editor/style-rtl.css 10.9 kB 0 B
build/block-editor/style.css 10.9 kB 0 B
build/block-library/editor-rtl.css 7.21 kB 0 B
build/block-library/editor.css 7.21 kB 0 B
build/block-library/index.js 111 kB 0 B
build/block-library/style-rtl.css 7.42 kB 0 B
build/block-library/style.css 7.43 kB 0 B
build/block-library/theme-rtl.css 669 B 0 B
build/block-library/theme.css 671 B 0 B
build/block-serialization-default-parser/index.js 1.65 kB 0 B
build/block-serialization-spec-parser/index.js 3.1 kB 0 B
build/blocks/index.js 57.5 kB 0 B
build/components/index.js 191 kB 0 B
build/components/style-rtl.css 15.7 kB 0 B
build/components/style.css 15.7 kB 0 B
build/compose/index.js 6.21 kB 0 B
build/core-data/index.js 10.6 kB 0 B
build/data-controls/index.js 1.04 kB 0 B
build/data/index.js 8.2 kB 0 B
build/date/index.js 5.37 kB 0 B
build/deprecated/index.js 771 B 0 B
build/dom-ready/index.js 568 B 0 B
build/dom/index.js 3.06 kB 0 B
build/edit-post/index.js 91.2 kB 0 B
build/edit-post/style-rtl.css 8.47 kB 0 B
build/edit-post/style.css 8.46 kB 0 B
build/edit-site/index.js 5.56 kB 0 B
build/edit-site/style-rtl.css 2.62 kB 0 B
build/edit-site/style.css 2.62 kB 0 B
build/edit-widgets/index.js 4.43 kB 0 B
build/edit-widgets/style-rtl.css 2.58 kB 0 B
build/edit-widgets/style.css 2.58 kB 0 B
build/editor/editor-styles-rtl.css 381 B 0 B
build/editor/editor-styles.css 382 B 0 B
build/editor/style-rtl.css 3.97 kB 0 B
build/editor/style.css 3.96 kB 0 B
build/element/index.js 4.44 kB 0 B
build/escape-html/index.js 733 B 0 B
build/format-library/index.js 6.95 kB 0 B
build/format-library/style-rtl.css 502 B 0 B
build/format-library/style.css 502 B 0 B
build/hooks/index.js 1.93 kB 0 B
build/html-entities/index.js 621 B 0 B
build/i18n/index.js 3.49 kB 0 B
build/is-shallow-equal/index.js 711 B 0 B
build/keyboard-shortcuts/index.js 2.3 kB 0 B
build/keycodes/index.js 1.69 kB 0 B
build/list-reusable-blocks/index.js 2.99 kB 0 B
build/list-reusable-blocks/style-rtl.css 226 B 0 B
build/list-reusable-blocks/style.css 226 B 0 B
build/media-utils/index.js 4.83 kB 0 B
build/notices/index.js 1.58 kB 0 B
build/nux/index.js 3.01 kB 0 B
build/nux/style-rtl.css 616 B 0 B
build/nux/style.css 613 B 0 B
build/plugins/index.js 2.54 kB 0 B
build/primitives/index.js 1.5 kB 0 B
build/priority-queue/index.js 780 B 0 B
build/redux-routine/index.js 2.83 kB 0 B
build/rich-text/index.js 14.3 kB 0 B
build/server-side-render/index.js 2.55 kB 0 B
build/shortcode/index.js 1.7 kB 0 B
build/token-list/index.js 1.27 kB 0 B
build/viewport/index.js 1.61 kB 0 B
build/warning/index.js 1.14 kB 0 B
build/wordcount/index.js 1.18 kB 0 B

compressed-size-action

@aduth
Copy link
Member

aduth commented Mar 31, 2020

to remove punctuation characters that would get removed by sanitize_title() upon publishing.

For posterity's sake, the reference implementation:

https://github.com/WordPress/wordpress-develop/blob/5.3.2/src/wp-includes/formatting.php#L2145
https://github.com/WordPress/wordpress-develop/blob/5.3.2/src/wp-includes/formatting.php#L1215-L1986

Copy link
Member

@aduth aduth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a reasonable change to me.

Do you know of any usage of cleanForSlug which may have relied on this behavior, broken as it may have been, for which these changes might cause some unforeseen incompatibility?

If the idea is for it to adhere to core's equivalent treatment of sanitize_title, it could be a good idea (perhaps separately from here) to include some or all of its test cases:

@earnjam
Copy link
Contributor Author

earnjam commented Mar 31, 2020

I have another branch I started for duplicating sanitize_title() in JS, but this seemed like a quicker win until I could get that complete and more broadly tested.

@gziolo gziolo added [Package] Url /packages/url [Type] Enhancement A suggestion for improvement. labels Apr 16, 2020
@gziolo
Copy link
Member

gziolo commented Apr 16, 2020

Should we add a note in the changelog of the url package? It looks like it can be merged otherwise.

@gziolo gziolo merged commit 7080f02 into master Apr 16, 2020
@gziolo gziolo deleted the update/cleanforslug branch April 16, 2020 06:03
@github-actions github-actions bot added this to the Gutenberg 8.0 milestone Apr 16, 2020
@Soean Soean mentioned this pull request Apr 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Package] Url /packages/url [Type] Enhancement A suggestion for improvement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Some inappropriate characters are included in auto-generated permalinks
3 participants