Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Buffer.isEncoding regards an empty string as a valid encoding #9654

Closed
shinnn opened this issue Nov 17, 2016 · 8 comments
Closed

Buffer.isEncoding regards an empty string as a valid encoding #9654

shinnn opened this issue Nov 17, 2016 · 8 comments
Labels
buffer Issues and PRs related to the buffer subsystem. doc Issues and PRs related to the documentations. question Issues that look for answers.

Comments

@shinnn
Copy link
Contributor

shinnn commented Nov 17, 2016

  • Version: Node.js v7.1.0
  • Platform: Darwin Kernel Version 16.1.0

As stated in the issue title, Buffer.isEncoding('') returns true.

Actually, passing an empty string to encoding options of fs.readFile, Buffer.from or other functions supporting encoding doesn't result in an error, because they use the default encoding utf8 as a fallback in that case.

From this point of view, '' is also a valid encoding and Buffer.isEncoding('') === true looks fine. However, if we approve this behavior of Buffer.isEncoding, I think it's inconsistent that Buffer.isEncoding returns false when it takes other falsy values − false, null, undefined etc.

So, which design of Buffer.isEncoding is most reasonable?

  1. There is no problem with the current behavior of Buffer.isEncoding.
  2. Buffer.isEncoding should return false when it takes ''.
  3. Buffer.isEncoding should return true when it takes any falsy values.

I support the second one and can create a pull request.

I'd like to hear your opinion. Thanks.

@mscdex mscdex added buffer Issues and PRs related to the buffer subsystem. question Issues that look for answers. labels Nov 17, 2016
@BrandonZacharie
Copy link

I also vote for option 2 ('' !== 'utf8').

@targos targos added the doc Issues and PRs related to the documentations. label Jan 8, 2017
@Trott
Copy link
Member

Trott commented Jul 15, 2017

@nodejs/buffer

@Trott
Copy link
Member

Trott commented Aug 10, 2017

I'm going to put a ctc-review label on this to at least try to get some input on whether this should be changed or not. @nodejs/ctc

@bnoordhuis
Copy link
Member

IMO, it would be consistent for Buffer.isEncoding() to accept any encoding that Buffer.from() accepts. That is:

Buffer.from('x', '')     // ok, defaults to utf-8
Buffer.from('x', true)   // ok, defaults to utf-8
Buffer.from('x', false)  // ok, defaults to utf-8
// etc.

The alternative is to make Buffer.from() and friends reject such values but that is almost certainly going to result in (possibly significant) ecosystem breakage.

Another alternative is to simply state in the documentation that Buffer.from() accepts them from legacy reasons but that Buffer.isEncoding() is the one source of truth. Not a bad option either.

@tniessen
Copy link
Member

Another alternative is to simply state in the documentation that Buffer.from() accepts them from legacy reasons but that Buffer.isEncoding() is the one source of truth. Not a bad option either.

I'd definitely support this, Buffer.isEncoding() should only return true for strings that specify an existing encoding, while Buffer.from may accept other inputs for backward compatibility.

@cjihrig
Copy link
Contributor

cjihrig commented Aug 10, 2017

Another alternative is to simply state in the documentation that Buffer.from() accepts them from legacy reasons but that Buffer.isEncoding() is the one source of truth. Not a bad option either.

I'm in favor of that approach.

@rvagg
Copy link
Member

rvagg commented Aug 14, 2017

I'm going to abstain from taking a position on this because the implications could be quite broad, but I will point out that this logic isn't quite right:

However, if we approve this behavior of Buffer.isEncoding, I think it's inconsistent that Buffer.isEncoding returns false when it takes other falsy values − false, null, undefined etc.

Just because '' is falsy doesn't mean we should include other falsy values in reasoning about acceptable categories. The broadest category of acceptable values is Strings, so we should just be restricting our arguments to what acceptable string contents are and just because a particular string is also in the category of "falsy" doesn't mean we should consider those values—similarly if we had an encoding that was also a number doesn't mean we should include numbers in our thinking about acceptable categories. A non-string value is clearly not an encoding type, so I believe we should be limiting the discussion to what string values are acceptable and whether '' can be treated as "default encoding", which seems reasonable, on the surface anyway.

@BridgeAR
Copy link
Member

This got fixed in 452eed9

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
buffer Issues and PRs related to the buffer subsystem. doc Issues and PRs related to the documentations. question Issues that look for answers.
Projects
None yet
Development

No branches or pull requests

10 participants