provide a way to introduce new encodings for new Buffer('someString', 'someEncoding')
#2835
Comments
And a status badge goes to #2798.
This issue may be seen as a spiritual successor to nodejs/node-v0.x-archive#1772.
I'm inclined to say this is solely on iconv-lite's head for trying to monkey-patch core, and I say this as the author of a module that does something similar. Opinions, anyone?
Monkey-patching is not supported, and never will be, for obvious reasons. That includes both overriding builtin methods and adding new methods to builtins (both Node.js and V8). Such code could and will break in minor or patch versions, or even within the same Node.js version. It's entirely On the other hand, a supported method of defining new encodings seems like a valid idea, given that overriding an already defined (either built-in or third-party) encoding produces an error.
Introducing new encoding options from userland seems valid, so long as it doesn't require any V8 monkey-business. Patching core, especially in this case, isn't exactly supported. Buffer changes were forced on us via V8; there is nothing we can really do.
We could add an API like
On the other hand, I see no actual profit in the above. I am just saying that doing so is possible. As always, partial (~⅓) usage of
That's actually pretty low (18 modules). I expect the total count to be around 50-60 modules. @Mithgol Could you explain how
It's better because it's easier to introduce to an existing project. It feels almost infinitely easier. You just write You don't have to do anything else. For example, you don't have to remember every place where
I don't think it should be supported to change the behavior of core functions from userland. We don't do that anywhere else as far as I'm aware.
I think that allowing a module to add more encodings to Buffer makes so much sense that I can't understand the resistance. The proper solution would be to use the native Node.js features for handling Buffers and encodings. The one thing that is missing is support for new encodings.
That's true, Node.js does not support extending the behaviour of its core functions anywhere else, but that's probably because Node.js does not have obviously limited support for an obviously vast area (such as encodings) anywhere else. Also, most other such core modules are wrappers around third-party libraries, and that's their excuse. For example,
This comment is a nudge after a couple of months.
Yes.
I don't think Buffer should be used for encoding/decoding beyond the few common encodings that it provides for convenience. It's simply outside of its scope of responsibility.
TBH if one ends up with Buffer calls all over the codebase and needs to change the encoding in all of them, it seems like a code smell. It's not node's job to help with bad architectural decisions. And even in your example, you'd still have to carefully "upgrade" all the It's such a rare edge case that adding an API that will likely cause many issues down the road for questionable benefit doesn't seem worth it.
Recommending that this be closed. Modules that want to provide support for other encodings can do so easily without monkey patching node or having node provide any kind of extension mechanism. I don't think we should be encouraging this anti-pattern further.
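For illustration, here is a minimal sketch of what "support for other encodings without monkey patching" could look like in a userland module: explicit encode/decode functions that fall back to Buffer's built-in encodings. Everything in it is an assumption made for the example, not iconv-lite's API: the encoding name 'demo-cyrillic', its three-entry table, and the helper names are hypothetical, and the code assumes a Node.js version that has Buffer.from and Buffer.alloc.

```javascript
'use strict';

// Hypothetical, deliberately tiny single-byte table: bytes 0x80..0x82 map to
// the Cyrillic letters U+0410..U+0412. A real codec would carry a full table.
const DEMO_HIGH_TABLE = { 0x80: '\u0410', 0x81: '\u0411', 0x82: '\u0412' };

// Build an encode/decode pair from a byte -> character table.
function makeSingleByteCodec(highTable) {
  const charToByte = new Map();
  for (const [byte, ch] of Object.entries(highTable)) {
    charToByte.set(ch, Number(byte));
  }
  return {
    encode(str) {
      // One byte per UTF-16 code unit; unmapped characters become '?'.
      const out = Buffer.alloc(str.length);
      for (let i = 0; i < str.length; i++) {
        const code = str.charCodeAt(i);
        if (code < 0x80) {
          out[i] = code; // ASCII passes through unchanged
        } else {
          out[i] = charToByte.has(str[i]) ? charToByte.get(str[i]) : 0x3f;
        }
      }
      return out;
    },
    decode(buf) {
      let str = '';
      for (const byte of buf) {
        // Unmapped high bytes decode to U+FFFD (replacement character).
        str += byte < 0x80 ? String.fromCharCode(byte) : (highTable[byte] || '\ufffd');
      }
      return str;
    },
  };
}

// Registry of userland codecs; Buffer itself is never modified.
const codecs = new Map([['demo-cyrillic', makeSingleByteCodec(DEMO_HIGH_TABLE)]]);

function encode(str, encoding) {
  const codec = codecs.get(encoding);
  if (codec) return codec.encode(str);
  if (Buffer.isEncoding(encoding)) return Buffer.from(str, encoding);
  throw new Error('Unknown encoding: ' + encoding);
}

function decode(buf, encoding) {
  const codec = codecs.get(encoding);
  if (codec) return codec.decode(buf);
  if (Buffer.isEncoding(encoding)) return buf.toString(encoding);
  throw new Error('Unknown encoding: ' + encoding);
}
```

The trade-off discussed in this thread still applies: callers have to use encode(str, name) and decode(buf, name) explicitly instead of passing the encoding name to Buffer.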
Let's close then.
Quick summary of changes that are necessary to do without
@ChALkeR Unfortunately, not all of them, because the list of WHATWG-supported encodings seems to be quite limited. For example, UTF-7 is not supported and hence I cannot use it for Fidonet Unicode substrings. I'd better stay on
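A quick way to check the point about WHATWG encodings is to ask a TextDecoder whether it accepts a label. The sketch below assumes a Node.js version that exposes TextDecoder as a global, which is newer than the versions discussed in this thread. The helper name is hypothetical, and which labels are accepted can also depend on how Node.js was built.

```javascript
'use strict';

// Returns true if the WHATWG TextDecoder accepts the given encoding label.
// Unsupported labels make the constructor throw a RangeError.
function isWhatwgLabelSupported(label) {
  try {
    new TextDecoder(label);
    return true;
  } catch (err) {
    if (err instanceof RangeError) return false;
    throw err; // anything else is unexpected, so rethrow it
  }
}

console.log('utf-8:', isWhatwgLabelSupported('utf-8'));
console.log('utf-7:', isWhatwgLabelSupported('utf-7'));
```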
In the constructor new Buffer('someString', 'someEncoding'), Node.js v4.0.0 itself supports a limited number of encodings: 'ascii', 'utf8', 'utf16le' (aka 'ucs2'), 'base64', 'hex', 'binary'. That's half a dozen. That's not plenty.
And that's why a widely used package, iconv-lite (450+ direct dependents, ≈240+ thousand daily downloads), provides a method (.extendNodeEncodings()) that adds support for many other known encodings to the Buffer API.
However, iconv-lite does not seem to work well enough in Node v4.0.0. Any use of an iconv-lite-provided encoding in a call such as new Buffer('someLatinString', 'encodingName') results in seemingly random output.
It seems to me that either 0fa6c4a was not enough to fix #1547 or a separate deeper issue exists.
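The built-in encoding list mentioned above can be checked with Buffer.isEncoding. This is a small sketch that assumes an unpatched Buffer (that is, iconv-lite's .extendNodeEncodings() has not been called); the helper name and the two non-built-in names used for contrast ('koi8-r', 'utf-7') are chosen here only for illustration.

```javascript
'use strict';

// Report which encoding names the unpatched Buffer recognises.
function builtInSupport(names) {
  const support = {};
  for (const name of names) {
    support[name] = Buffer.isEncoding(name);
  }
  return support;
}

const support = builtInSupport([
  'ascii', 'utf8', 'utf16le', 'ucs2', 'base64', 'hex', 'binary', // listed in the issue
  'koi8-r', 'utf-7',                                             // not built in
]);
console.log(support);
```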
Currently iconv-lite's extend-node.js changes the behaviour of the following methods:
SlowBuffer
Buffer
SlowBuffer.prototype.toString
Buffer.prototype.toString
SlowBuffer.prototype.write
Buffer.prototype.write
SlowBuffer.byteLength
Buffer.byteLength
Buffer.isEncoding
Why were those changes enough in Node v0.10 and v0.12 but not in Node v4.0.0?
I may be wrong, but… it seems to me that in Node v0.12 the Buffer constructor used this.write(subject, encoding) internally, whereas in the current Node v4.0.0 neither the Buffer constructor nor its fromString helper does that. They seem to use binding.createFromString(string, encoding) (where necessary) or allocPool.write(string, poolOffset, encoding), and both of these come from process.binding('buffer') and aren't replaced by iconv-lite. And it won't be easy to replace them from userland, I suppose.
Is my assumption correct?
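The suspected mechanism can be modelled in a few lines. The sketch below is a simplified model, not Node's actual source. The two classes are hypothetical stand-ins for a constructor that calls this.write (as described for v0.12) and one that calls an internal function directly (as described for v4.0.0). nativeWrite stands in for the native binding, and 'demo-upper' is a made-up encoding. The stand-in throws on an unknown encoding, which is a choice made for this model and differs from the random output reported above.

```javascript
'use strict';

// Stand-in for the native binding: it only knows 'utf8'.
function nativeWrite(string, encoding) {
  if (encoding !== 'utf8') throw new TypeError('Unknown encoding: ' + encoding);
  return Array.from(Buffer.from(string, 'utf8'));
}

// Models a constructor that delegates to the patchable prototype method.
class OldStyleBuffer {
  constructor(subject, encoding) {
    this.bytes = [];
    this.write(subject, encoding);
  }
  write(string, encoding) {
    this.bytes = nativeWrite(string, encoding);
  }
}

// Models a constructor that calls the internal function directly.
class NewStyleBuffer {
  constructor(subject, encoding) {
    this.bytes = nativeWrite(subject, encoding); // bypasses this.write
  }
  write(string, encoding) {
    this.bytes = nativeWrite(string, encoding);
  }
}

// Userland patch in the style of extendNodeEncodings(): wrap prototype.write
// so that it understands one extra (made-up) encoding.
function patchWrite(BufferClass) {
  const original = BufferClass.prototype.write;
  BufferClass.prototype.write = function (string, encoding) {
    if (encoding === 'demo-upper') {
      this.bytes = nativeWrite(string.toUpperCase(), 'utf8');
      return;
    }
    return original.call(this, string, encoding);
  };
}

patchWrite(OldStyleBuffer);
patchWrite(NewStyleBuffer);

// The patch reaches the old-style constructor...
console.log(new OldStyleBuffer('abc', 'demo-upper').bytes);

// ...but not the new-style one, which never calls this.write.
try {
  new NewStyleBuffer('abc', 'demo-upper');
} catch (err) {
  console.log('constructor bypassed the patch:', err.message);
}
```

In this model the patched write method still works when called directly on a NewStyleBuffer instance; only the constructor path is out of the patch's reach.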
What should be done in iconv-lite (or in Node.js, or in both) for a multitude of encodings to work in the new Buffer('someString', 'encodingName') constructor correctly?