feat: Expose TextEncoder and TextDecoder #11
Hi, thanks for starting this. As a start, I'd say don't bother with some of the changes, like those to the license ( Additionally, I'd like you to remove the
b24f788
to
17770b7
Compare
I have started an initial implementation utilising `whatwg-encoding.js`, please let me know if I'm on the right track.
It's looking really good! I've left some comments in any case.
Regarding your original comment, as far as I can see, `package-lock.json` is indeed present in this package.
https://github.com/jsdom/whatwg-encoding/blob/master/package-lock.json
I'm happy to change and use Yarn instead, but this seems to be slightly out of scope.
Heh, interesting. In that case it should be fine then :) sorry for the false alarm.
On that same line, I have converted tests using `new Buffer()` to using `Buffer.from()`. This also feels a bit out of scope; should I revert?
You could do that, but since it's in its own commit and relatively self-contained, leaving it in would work for me.
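For readers unfamiliar with the `new Buffer()` to `Buffer.from()` conversion discussed here, a minimal sketch of the replacement call follows. The sample bytes and variable names are illustrative assumptions, not taken from the PR's tests.

```javascript
// Illustrative only: the sample values are assumptions, not the PR's test data.
// `new Buffer(...)` is deprecated in Node.js; `Buffer.from(...)` is the
// replacement for creating a buffer from existing data.

// From an array of byte values (the UTF-8 bytes for "hé").
const fromBytes = Buffer.from([0x68, 0xc3, 0xa9]);

// From a string with an explicit encoding.
const fromString = Buffer.from("hé", "utf8");

console.log(fromBytes.equals(fromString));
console.log(fromBytes.toString("utf8"));
```

Both calls produce the same three bytes, so tests that previously built fixtures with `new Buffer(...)` should keep the same expectations after the change.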
3e93928
to
e0bf542
Compare
@TimothyGu I have been able to look at this in a bit more detail. There should now be support for streams as well. I have had to stop relying on the There seem to be limitations to bringing a compliant version up. Of those, here are some of the things right now:
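As a point of reference for what "support for streams" means for a decoder, here is a small sketch using the standard `TextDecoder` API as shipped in Node.js and browsers. It is not the code from this PR; the `decodeChunks` helper and the chunk boundaries are assumptions made for illustration.

```javascript
// Sketch of streaming decode with the standard TextDecoder API.
// `decodeChunks` is a hypothetical helper, not part of this PR.
function decodeChunks(chunks, encoding = "utf-8") {
  const decoder = new TextDecoder(encoding);
  let text = "";
  for (const chunk of chunks) {
    // `stream: true` keeps incomplete byte sequences buffered for the next call.
    text += decoder.decode(chunk, { stream: true });
  }
  // A final call without `stream` flushes anything still buffered.
  text += decoder.decode();
  return text;
}

// "€" is E2 82 AC in UTF-8; the chunks below split it on purpose.
const streamed = decodeChunks([
  new Uint8Array([0x61, 0xe2]),
  new Uint8Array([0x82, 0xac, 0x62]),
]);
console.log(streamed);
```

Without `stream: true`, the first chunk would end in an incomplete sequence and be decoded with a replacement character, which is the case a streaming-aware implementation has to handle.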
Thanks for doing the additional investigation!
That's fine. It doesn't use
I think we can copy some of those here.
Let's just say no
It looks like For the UTF-16 issue, as uncomfortable as it is for me to say this, I think we could use Node.js' built-in
Why can't jsdom just expose `TextEncoder` and `TextDecoder` from Node.js? Do they not conform to the spec?
This commit allows the RxPlayer to use the `TextEncoder` and `TextDecoder` APIs, when available, respectively to encode JS strings into a UTF-8 byte sequence (`TextEncoder` doesn't seem to be able to encode into any other encoding) and to decode from either UTF-8, UTF-16BE or UTF-16LE into a JS string.
Because `TextEncoder` and `TextDecoder` are not defined in the old browser versions we claim to support, nor in IE11, we still fall back to the custom implementation if either API doesn't exist or if the operation fails.
It is important to note a significant difference between using the `TextDecoder` interface and the previous implementation: when encountering invalid byte sequences in the corresponding encoding, `TextDecoder` replaces them with a "REPLACEMENT CHARACTER" (�). This seems fine and even desirable, but the previous implementation just threw in that same situation. This means that we now have two different behaviors, depending on the current platform / browser.
The functions using the `TextDecoder` APIs are directly defined in the `StringUtils` tools, and thus the new behavior can be directly noticeable by applications using them. Thankfully, nothing is defined in our API documentation about invalid sequences. Even if we can consider that this does not break our API (though it is still unclear to me), it is something to keep in mind, as it might be unexpected for users relying on this API throwing.
Also, I tried to add unit tests, but it appears that "jsdom", on which jest relies to run unit tests while simulating a browser in Node, does not include either API yet. It is under way, though: jsdom/whatwg-encoding#11
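The behavioral difference described in that commit message (replacement character versus throwing) and the fallback pattern can be sketched as follows. This is not the RxPlayer's code: `utf8ToStr` and `customUtf8Decode` are hypothetical names, and the fallback is a deliberately tiny ASCII-only stand-in. The `fatal` option shown at the end is part of the standard `TextDecoder` API and is one way to get throwing behavior back.

```javascript
// Sketch only: `customUtf8Decode` is a hypothetical stand-in for a
// hand-written fallback decoder, not the RxPlayer's actual code.
function customUtf8Decode(bytes) {
  // Placeholder that only handles ASCII, to keep the sketch short.
  let out = "";
  for (const b of bytes) {
    if (b > 0x7f) {
      throw new Error("non-ASCII byte in fallback decoder");
    }
    out += String.fromCharCode(b);
  }
  return out;
}

// Hypothetical helper: prefer TextDecoder, fall back when absent or failing.
function utf8ToStr(bytes) {
  if (typeof TextDecoder === "function") {
    try {
      // Default mode: invalid sequences become U+FFFD instead of throwing.
      return new TextDecoder("utf-8").decode(bytes);
    } catch (e) {
      // Fall through to the custom implementation if decoding fails.
    }
  }
  return customUtf8Decode(bytes);
}

const invalid = new Uint8Array([0x61, 0xff, 0x62]); // 0xFF is never valid UTF-8

// Lenient decoding replaces the bad byte rather than throwing.
const lenient = utf8ToStr(invalid);

// `fatal: true` makes the decoder throw on invalid input.
let strictThrew = false;
try {
  new TextDecoder("utf-8", { fatal: true }).decode(invalid);
} catch (e) {
  strictThrew = true;
}

console.log(JSON.stringify(lenient), strictThrew);
```

Whether a library should opt into `fatal` to preserve its earlier throwing behavior is a design choice; the sketch only shows that both behaviors are reachable through the same API.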
Hey, I'm wondering what the status of this PR is? It looks like it's been sitting idle for almost a year now. The lack of support for
@TimothyGu I was having a look at this again today. Beyond the elements I mentioned in #11 (comment), I have come to realize that the testing harness instrumentation from WPT simply cannot work here without a serious rewrite. Following your suggestions on #11 (comment) I can see there are far too many APIs they rely on, such as As such, this PR, with all its limitations, might at the very least allow people with less complex requirements to have an easier life. I am happy to write a disclaimer to accompany the current code should an initial "substandard" release be done. (At this point I'd argue for a not-so-elegant release rather than no release at all.) A brave soul could look at improving the test harness and support for more complex use cases later on 🧠.
7fb9a1c
to
4e07f9a
Compare
…issuecomment-605658333
16dfc21
to
80596b7
Compare
80596b7
to
3f99fac
Compare
@TimothyGu @fguitton are there further changes needed on this branch, or would it in its current state be a good step forwards?
As discussed on Twitter, what is needed before this is merged is ensuring all tests pass. Currently most are commented out, and none of the ones in subdirectories of encoding/ are even downloaded.
Any updates?
@domenic yeah, I've not had a chance to work on this, been snowed under with other things.
This PR aims to address jsdom/jsdom#2524 by extending this package, as suggested in jsdom/jsdom#2928.
This PR transforms the repo to adopt a structure similar to `whatwg-url`.