Use fromCodePoint to convert high value unicode entities #243

ryanjduffy · 2016-11-29T18:15:19Z

Q	A
Bug fix?	yes
Breaking change?	no
New feature?	no
Deprecations?	no
Spec compliancy?	no
Tests added/pass?	yes
Fixed tickets	comma-separated list of tickets fixed by the PR, if any
License	MIT

High code point Unicode entity references are being truncated because String.fromCharCode can only return a single 16-bit (2-byte) code unit, whereas String.fromCodePoint can return a surrogate pair (4 bytes) for code points above U+FFFF. Interestingly, the existing test case includes an appropriately high code point reference, but its expected result is the truncated value rather than the correct one.

With this change, &#x1F4A9; is correctly converted to 💩 instead of the private-use character U+F4A9 (which would be the value of &#xF4A9;).
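
The truncation described above can be reproduced in isolation. This is a minimal sketch, assuming a runtime that ships a native String.fromCodePoint (Node.js 4 or later); the variable names are illustrative only and do not come from the babylon source.

```javascript
// 0x1F4A9 lies above 0xFFFF, so it needs two UTF-16 code units (a surrogate pair).
const codePoint = 0x1f4a9;

// String.fromCharCode keeps only the low 16 bits: 0x1F4A9 becomes 0xF4A9.
const truncated = String.fromCharCode(codePoint);

// String.fromCodePoint emits the full surrogate pair.
const full = String.fromCodePoint(codePoint);

console.log(truncated.length, truncated.charCodeAt(0).toString(16)); // prints: 1 f4a9
console.log(full.length, full.codePointAt(0).toString(16)); // prints: 2 1f4a9
```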

danez · 2016-12-01T20:08:30Z

Thanks for spotting this. Awesome.

You added the polyfill as devDependency although it probably should be a dependency. But even then it won't work I think, because we use rollup to bundle babylon.

Other option would be to drop support for node < 4.
@hzoo @DrewML ideas?

hzoo · 2016-12-01T20:10:21Z

Wouldn't rollup just bundle the polyfill in?

Yeah, we can drop it, but not in this PR; we would need to plan that out.

danez · 2016-12-01T20:11:22Z

Not sure how the bundling works; I would need to check. But then npm/yarn couldn't deduplicate correctly, since we would hard-inline it, and it also seems to modify globals.

hzoo · 2016-12-01T20:28:59Z

Oh - then we need to use a version that gives a reference rather than the actual polyfill - like this kind of thing

var fromCodePoint = require('fromCodePoint');
fromCodePoint();

ryanjduffy · 2016-12-05T15:09:02Z

If the plan is to drop support for node < 4 anyway, what do you think of including an inline copy of the polyfill (the fn without updating String, that is)? The polyfill code hasn't been updated in 2.5 years so there's not much risk of getting out of sync between now and when you'd be able to remove it. Seems like a safe and minimally intrusive alternative.

To avoid modifying String as the polyfill does, I've copied the source from the polyfill and adapted it to return the polyfill function if the native version does not exist. Once support for node versions that lack fromCodePoint is dropped, this polyfill can be removed.
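
A rough illustration of that approach follows. It is a simplified sketch, not the code merged in this PR: the name fromCodePointFallback is hypothetical, and the validation is reduced compared with the original polyfill. It is written in ES5 style on the assumption that it has to run on runtimes that predate fromCodePoint.

```javascript
// Hypothetical, simplified fallback: builds the string from UTF-16 code units
// without touching the global String object.
function fromCodePointFallback() {
  var result = "";
  for (var i = 0; i < arguments.length; i++) {
    var codePoint = Number(arguments[i]);
    // Reject non-integers and values outside the Unicode range.
    if (
      !isFinite(codePoint) ||
      codePoint < 0 ||
      codePoint > 0x10ffff ||
      Math.floor(codePoint) !== codePoint
    ) {
      throw new RangeError("Invalid code point: " + arguments[i]);
    }
    if (codePoint <= 0xffff) {
      // Basic Multilingual Plane: a single code unit suffices.
      result += String.fromCharCode(codePoint);
    } else {
      // Astral code point: split into a high and a low surrogate.
      codePoint -= 0x10000;
      result += String.fromCharCode(
        (codePoint >> 10) + 0xd800,
        (codePoint % 0x400) + 0xdc00
      );
    }
  }
  return result;
}

// Prefer the native implementation when the runtime provides one.
var fromCodePoint = String.fromCodePoint || fromCodePointFallback;
```

Callers use the local fromCodePoint reference instead of String.fromCodePoint, so the global is never patched and the fallback can be deleted once the older runtimes are no longer supported.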

ryanjduffy · 2016-12-08T17:21:25Z

Went ahead and made a copy of the polyfill and dropped the dependency. Made the necessary changes to adapt it to the babylon lint settings and avoid updating String. Let me know what you think.

cc: @mathiasbynens - fyi on copying String.fromcodepoint (and thx for the polyfill!).

mathiasbynens · 2016-12-08T17:32:15Z

LGTM! Thanks for the CC, @ryanjduffy — I appreciate the heads-up 👍

Ref. babel/babel#4315 for dropping Node.js < 4 (in which case the polyfill can be removed).

mathiasbynens · 2016-12-08T17:35:17Z

Btw, since the goal is to decode HTML entities, consider using a library like he.

Ryan Duffy added 2 commits November 29, 2016 11:59

Use fromCodePoint to convert high value unicode entities

6cf4ff7

Include polyfill for String.fromCodePoint

850d467

danez added the Tag: Bug Fix label Dec 1, 2016

Ryan Duffy added 2 commits December 8, 2016 11:14

move license notice to top of file

97c05d9

danez approved these changes Dec 9, 2016

danez merged commit 1c13800 into babel:master Jan 2, 2017

mathiasbynens mentioned this pull request Jan 10, 2017

Remove String.fromCodePoint shim #279

Merged
