Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

buffer: fix single-character string filling #9837

Closed
wants to merge 2 commits into from

Conversation

addaleax
Copy link
Member

@addaleax addaleax commented Nov 29, 2016

Checklist
  • make -j8 test (UNIX), or vcbuild test nosign (Windows) passes
  • tests and/or benchmarks are included
  • commit message follows commit guidelines
Affected core subsystem(s)

buffer

Description of change

Fix the fast path for buffer.fill() with a single-character string.

The fast path only works for strings that are equivalent to a
single-byte buffer, but that condition was not checked properly
for the utf8 or utf16le encodings and is always true for the
latin1 encoding.

This change fixes these problems.

Fixes: #9836

/cc @nodejs/buffer
CI: https://ci.nodejs.org/job/node-test-commit/6222/
CI: https://ci.nodejs.org/job/node-test-commit/6224/
CI: https://ci.nodejs.org/job/node-test-commit/6600/

@nodejs-github-bot nodejs-github-bot added the buffer Issues and PRs related to the buffer subsystem. label Nov 29, 2016
if (encoding !== undefined && typeof encoding !== 'string') {
throw new TypeError('encoding must be a string');
}
if (typeof encoding === 'string' && !Buffer.isEncoding(encoding)) {
encoding = internalUtil.normalizeEncoding(encoding);
if (encoding === undefined) {
throw new TypeError('Unknown encoding: ' + encoding);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to throw Unknown encoding: undefined.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bnoordhuis Right, updated!

@addaleax addaleax force-pushed the buffer-fix-single-byte-fill branch from 0690d5f to 641cfa2 Compare November 29, 2016 11:31
@bnoordhuis
Copy link
Member

Test failures look related. I would guess it's an endianness issue.

not ok 27 parallel/test-buffer-fill
  ---
  duration_ms: 0.139
  severity: fail
  stack: |-
    
    assert.js:85
      throw new assert.AssertionError({
      ^
    AssertionError: '愀愀愀愀愀愀愀愀' === 'aaaaaaaa'
        at Object.<anonymous> (/home/iojs/build/workspace/node-test-commit-plinux/nodes/ppcbe-ubuntu1404/test/parallel/test-buffer-fill.js:394:8)

@addaleax addaleax force-pushed the buffer-fix-single-byte-fill branch 5 times, most recently from c7c27c9 to 63e0a5e Compare November 29, 2016 13:14
@addaleax
Copy link
Member Author

I’ve updated the test file here to see whether the failure is related to my changes in lib/buffer.js or not, and it looks like this just exposed the problem:

assert.deepStrictEqual(
    Buffer.allocUnsafeSlow(16).fill('ab', 'utf16le'),
    Buffer.from('61006200610062006100620061006200', 'hex'));

results in

AssertionError: <Buffer 00 61 00 62 00 61 00 62 00 61 00 62 00 61 00 62> deepStrictEqual <Buffer 61 00 62 00 61 00 62 00 61 00 62 00 61 00 62 00>

on big endian machines (example ci run).

Not sure why that is (I think we’re already testing UCS2 on big-endian machines), I’ll take a look later unless somebody else wants to.

@addaleax addaleax force-pushed the buffer-fix-single-byte-fill branch 3 times, most recently from fb755ba to 6790a6f Compare December 6, 2016 02:28
@addaleax
Copy link
Member Author

addaleax commented Dec 6, 2016

Finally got back to this; Updated, new CI @ https://ci.nodejs.org/job/node-test-commit/6451/

Apparently the tests expected a swapped byte order of #fill on big-endian platforms for utf16le, so I needed to fix that up, too. (I think it’s safe to say that the expected behaviour for an encoding called utf16le is to always have little-endian output.)

@nodejs/buffer PTAL

Copy link
Contributor

@trevnorris trevnorris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I'll sign off once the final test is fixed.

} else if (val.length === 1) {
var code = val.charCodeAt(0);
if ((normalizedEncoding === 'utf8' && code < 128) ||
normalizedEncoding === 'latin1') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For backwards compatibility do we still need to check for 'binary', or is that no longer supported? If this is back ported then it'll probably need to be checked there.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@trevnorris I switched the blocks around here so that this is after normalizeEncoding, so there’s no need to worry about alternate names

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool. thanks.

@@ -606,6 +606,9 @@ void Fill(const FunctionCallbackInfo<Value>& args) {

} else if (enc == UCS2) {
node::TwoByteValue str(env->isolate(), args[1]);
if (IsBigEndian())
SwapBytes16(reinterpret_cast<char*>(&str[0]), str_length);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From your comment sounds like just this needs to be removed right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I should have made this clearer, but this is already the working version; i.e. this line was missing and should already have been there afaict

Copy link
Member Author

@addaleax addaleax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New CI since the last one had build failures: https://ci.nodejs.org/job/node-test-commit/6469/

(everything should be green here in theory)

} else if (val.length === 1) {
var code = val.charCodeAt(0);
if ((normalizedEncoding === 'utf8' && code < 128) ||
normalizedEncoding === 'latin1') {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@trevnorris I switched the blocks around here so that this is after normalizeEncoding, so there’s no need to worry about alternate names

@@ -606,6 +606,9 @@ void Fill(const FunctionCallbackInfo<Value>& args) {

} else if (enc == UCS2) {
node::TwoByteValue str(env->isolate(), args[1]);
if (IsBigEndian())
SwapBytes16(reinterpret_cast<char*>(&str[0]), str_length);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I should have made this clearer, but this is already the working version; i.e. this line was missing and should already have been there afaict

Copy link
Member

@jasnell jasnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once @trevnorris is happy with it

@trevnorris
Copy link
Contributor

LGTM

@addaleax
Copy link
Member Author

@bnoordhuis Any further comments here? Otherwise I’d like to land this within the next few days.

Copy link
Member

@bnoordhuis bnoordhuis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a suggestion.

'ЉЉЉЉЉЉЉЉ');
assert.strictEqual(
Buffer.allocUnsafeSlow(16).fill('Љ', 'latin1').toString('latin1'),
'\t'.repeat(16));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: maybe use String#repeat() for the other strings as well, it looks more Obviously Correct.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bnoordhuis done, makes sense

There was a byte-order mismatch for `buffer#fill` on big-endian
platforms. Weirdly, the tests seemed to expect that wrong behaviour.
Fix the fast path for `buffer.fill()` with a single-character string.

The fast path only works for strings that are equivalent to a
single-byte buffer, but that condition was not checked properly
for the `utf8` or `utf16le` encodings and is always true for the
`latin1` encoding.

This change fixes these problems.

Fixes: nodejs#9836
@addaleax addaleax force-pushed the buffer-fix-single-byte-fill branch from 6790a6f to b0a80b7 Compare December 12, 2016 14:33
@addaleax
Copy link
Member Author

@addaleax
Copy link
Member Author

Landed in 9ee915b and d4f00fe

@addaleax addaleax closed this Dec 12, 2016
@addaleax addaleax deleted the buffer-fix-single-byte-fill branch December 12, 2016 15:59
addaleax added a commit that referenced this pull request Dec 12, 2016
There was a byte-order mismatch for `buffer#fill` on big-endian
platforms. Weirdly, the tests seemed to expect that wrong behaviour.

PR-URL: #9837
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
addaleax added a commit that referenced this pull request Dec 12, 2016
Fix the fast path for `buffer.fill()` with a single-character string.

The fast path only works for strings that are equivalent to a
single-byte buffer, but that condition was not checked properly
for the `utf8` or `utf16le` encodings and is always true for the
`latin1` encoding.

This change fixes these problems.

Fixes: #9836
PR-URL: #9837
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
Fishrock123 pushed a commit that referenced this pull request Dec 13, 2016
There was a byte-order mismatch for `buffer#fill` on big-endian
platforms. Weirdly, the tests seemed to expect that wrong behaviour.

PR-URL: #9837
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
Fishrock123 pushed a commit that referenced this pull request Dec 13, 2016
Fix the fast path for `buffer.fill()` with a single-character string.

The fast path only works for strings that are equivalent to a
single-byte buffer, but that condition was not checked properly
for the `utf8` or `utf16le` encodings and is always true for the
`latin1` encoding.

This change fixes these problems.

Fixes: #9836
PR-URL: #9837
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
@italoacasas italoacasas mentioned this pull request Dec 15, 2016
italoacasas added a commit that referenced this pull request Dec 15, 2016
Notable changes:
- buffer:
  - fix single-character string filling (Anna Henningsen) #9837
  - handle UCS2.fill() properly on BE (Anna Henningsen) #9837
- url:
  -  including base argument in originFor (joyeecheung) #10021
  - improve URLSearchParams spec compliance (Timothy Gu) #9484
- http:
  - remove stale timeout listeners (Karl Böhlmark) #9440
- build:
  - fix node_g target (Daniel Bevenius) #10153

PR-URL: #10277
italoacasas added a commit that referenced this pull request Dec 15, 2016
Notable changes

SEMVER-MINOR
- url:
    - add inspect function to TupleOrigin (Safia Abdalla) #10039
- crypto:
    - allow adding extra certs to well-known CAs (Sam Roberts) #9139

SEMVER-PATCH
- buffer:
    - fix single-character string filling (Anna Henningsen) #9837
    - handle UCS2 .fill() properly on BE (Anna Henningsen) #9837
- url:
    - including base argument in originFor (joyeecheung) #10021
    - improve URLSearchParams spec compliance (Timothy Gu) #9484
- http:
    - remove stale timeout listeners (Karl Böhlmark) #9440
- build:
    - fix node_g target (Daniel Bevenius) #10153
- fs:
    - remove unused argument from copyObject() (Ethan Arrowood) #10041
- timers:
    - fix handling of cleared immediates (hveldstra) #9759

PR-URL: #10277
italoacasas added a commit that referenced this pull request Dec 17, 2016
Notable changes:
* **crypto**:
  - Allow adding extra certificates to well-known CAs. (Sam Roberts)
    [#9139](#9139)
* **buffer**:
  - Fix single-character string filling. (Anna Henningsen)
    [#9837](#9837)
  - Handle UCS2 `.fill()` properly on BE. (Anna Henningsen)
    [#9837](#9837)
* **url**:
  - Add inspect function to TupleOrigin. (Safia Abdalla)
    [#10039](#10039)
  - Including base argument in originFor. (joyeecheung)
    [#10021](#10021)
  - Improve URLSearchParams spec compliance. (Timothy Gu)
    [#9484](#9484)
* **http**:
  - Remove stale timeout listeners. (Karl Böhlmark)
    [#9440](#9440)
* **build**:
  - Fix node_g target. (Daniel Bevenius)
    [#10153](#10153)
* **fs**:
  - Remove unused argument from copyObject(). (EthanArrowood)
    [#10041](#10041)
* **timers**:
  - Fix handling of cleared immediates. (hveldstra)
    [#9759](#9759)
* **src**:
  - Add wrapper for process.emitWarning(). (SamRoberts)
    [#9139](#9139)
  - Fix string format mistake for 32 bit node.(Alex Newman)
    [#10082](#10082)
italoacasas added a commit that referenced this pull request Dec 19, 2016
Notable changes:
* **crypto**: The built-in list of Well-Known CAs (Certificate Authorities) can now
              be extended via a NODE_EXTRA_CA_CERTS environment variable. (Sam Roberts)
              [#9139](#9139)
* **buffer**: buffer.fill() now works properly for the UCS2 encoding on Big-Endian
              machines. (Anna Henningsen) [#9837](#9837)
* **url**:
    - Including base argument in URL.originFor() to meet specification compliance. (joyeecheung)
      [#10021](#10021)
    - Improve URLSearchParams to meet specification compliance. (Timothy Gu)
      [#9484](#9484)
* **http**: Remove stale timeout listeners in order to prevent a memory leak when using
            keep alive. (Karl Böhlmark) [#9440](#9440)
cjihrig added a commit to cjihrig/node that referenced this pull request Dec 20, 2016
Notable changes:

* buffer:
  - buffer.fill() now works properly for the UCS2 encoding on
    Big-Endian machines.
    (Anna Henningsen) nodejs#9837
* cluster:
  - disconnect() now returns a reference to the disconnected
    worker. (Sean Villars)
    nodejs#10019
* crypto:
  - The built-in list of Well-Known CAs (Certificate Authorities)
    can now be extended via a NODE_EXTRA_CA_CERTS environment
    variable. (Sam Roberts)
    nodejs#9139
* http:
  - Remove stale timeout listeners in order to prevent a memory leak
    when using keep alive. (Karl Böhlmark)
    nodejs#9440
* tls:
  - Allow obvious key/passphrase combinations. (Sam Roberts)
    nodejs#10294
* url:
  - Including base argument in URL.originFor() to meet specification
    compliance. (joyeecheung)
    nodejs#10021
  - Improve URLSearchParams to meet specification compliance.
    (Timothy Gu) nodejs#9484

PR-URL: nodejs#10277
cjihrig added a commit to cjihrig/node that referenced this pull request Dec 20, 2016
Notable changes:

* buffer:
  - buffer.fill() now works properly for the UCS2 encoding on
    Big-Endian machines.
    (Anna Henningsen) nodejs#9837
* cluster:
  - disconnect() now returns a reference to the disconnected
    worker. (Sean Villars)
    nodejs#10019
* crypto:
  - The built-in list of Well-Known CAs (Certificate Authorities)
    can now be extended via a NODE_EXTRA_CA_CERTS environment
    variable. (Sam Roberts)
    nodejs#9139
* http:
  - Remove stale timeout listeners in order to prevent a memory leak
    when using keep alive. (Karl Böhlmark)
    nodejs#9440
* tls:
  - Allow obvious key/passphrase combinations. (Sam Roberts)
    nodejs#10294
* url:
  - Including base argument in URL.originFor() to meet specification
    compliance. (joyeecheung)
    nodejs#10021
  - Improve URLSearchParams to meet specification compliance.
    (Timothy Gu) nodejs#9484

PR-URL: nodejs#10277
cjihrig added a commit that referenced this pull request Dec 20, 2016
Notable changes:

* buffer:
  - buffer.fill() now works properly for the UCS2 encoding on
    Big-Endian machines.
    (Anna Henningsen) #9837
* cluster:
  - disconnect() now returns a reference to the disconnected
    worker. (Sean Villars)
    #10019
* crypto:
  - The built-in list of Well-Known CAs (Certificate Authorities)
    can now be extended via a NODE_EXTRA_CA_CERTS environment
    variable. (Sam Roberts)
    #9139
* http:
  - Remove stale timeout listeners in order to prevent a memory leak
    when using keep alive. (Karl Böhlmark)
    #9440
* tls:
  - Allow obvious key/passphrase combinations. (Sam Roberts)
    #10294
* url:
  - Including base argument in URL.originFor() to meet specification
    compliance. (joyeecheung)
    #10021
  - Improve URLSearchParams to meet specification compliance.
    (Timothy Gu) #9484

PR-URL: #10277
imyller added a commit to imyller/meta-nodejs that referenced this pull request Dec 21, 2016
    Notable changes:

    * buffer:
      - buffer.fill() now works properly for the UCS2 encoding on
        Big-Endian machines.
        (Anna Henningsen) nodejs/node#9837
    * cluster:
      - disconnect() now returns a reference to the disconnected
        worker. (Sean Villars)
        nodejs/node#10019
    * crypto:
      - The built-in list of Well-Known CAs (Certificate Authorities)
        can now be extended via a NODE_EXTRA_CA_CERTS environment
        variable. (Sam Roberts)
        nodejs/node#9139
    * http:
      - Remove stale timeout listeners in order to prevent a memory leak
        when using keep alive. (Karl Bohlmark)
        nodejs/node#9440
    * tls:
      - Allow obvious key/passphrase combinations. (Sam Roberts)
        nodejs/node#10294
    * url:
      - Including base argument in URL.originFor() to meet specification
        compliance. (joyeecheung)
        nodejs/node#10021
      - Improve URLSearchParams to meet specification compliance.
        (Timothy Gu) nodejs/node#9484

    PR-URL: nodejs/node#10277

Signed-off-by: Ilkka Myller <ilkka.myller@nodefield.com>
@MylesBorins
Copy link
Contributor

@addaleax this lands cleany on v6.x but no v4.x. Could you backport?

MylesBorins pushed a commit that referenced this pull request Jan 23, 2017
There was a byte-order mismatch for `buffer#fill` on big-endian
platforms. Weirdly, the tests seemed to expect that wrong behaviour.

PR-URL: #9837
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
MylesBorins pushed a commit that referenced this pull request Jan 23, 2017
Fix the fast path for `buffer.fill()` with a single-character string.

The fast path only works for strings that are equivalent to a
single-byte buffer, but that condition was not checked properly
for the `utf8` or `utf16le` encodings and is always true for the
`latin1` encoding.

This change fixes these problems.

Fixes: #9836
PR-URL: #9837
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
MylesBorins pushed a commit that referenced this pull request Jan 24, 2017
There was a byte-order mismatch for `buffer#fill` on big-endian
platforms. Weirdly, the tests seemed to expect that wrong behaviour.

PR-URL: #9837
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
MylesBorins pushed a commit that referenced this pull request Jan 24, 2017
Fix the fast path for `buffer.fill()` with a single-character string.

The fast path only works for strings that are equivalent to a
single-byte buffer, but that condition was not checked properly
for the `utf8` or `utf16le` encodings and is always true for the
`latin1` encoding.

This change fixes these problems.

Fixes: #9836
PR-URL: #9837
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
@MylesBorins MylesBorins mentioned this pull request Jan 24, 2017
@addaleax
Copy link
Member Author

I don’t think this needs to be backported, the v4.x equivalent of the tests here are passing.

MylesBorins pushed a commit that referenced this pull request Jan 31, 2017
There was a byte-order mismatch for `buffer#fill` on big-endian
platforms. Weirdly, the tests seemed to expect that wrong behaviour.

PR-URL: #9837
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
MylesBorins pushed a commit that referenced this pull request Jan 31, 2017
Fix the fast path for `buffer.fill()` with a single-character string.

The fast path only works for strings that are equivalent to a
single-byte buffer, but that condition was not checked properly
for the `utf8` or `utf16le` encodings and is always true for the
`latin1` encoding.

This change fixes these problems.

Fixes: #9836
PR-URL: #9837
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
buffer Issues and PRs related to the buffer subsystem.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[buffer] fill() for utf16le is broken
6 participants