Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

buffer: optimize hex parsing. #7602

Closed
wants to merge 3 commits into from
Closed

buffer: optimize hex parsing. #7602

wants to merge 3 commits into from

Conversation

chjj
Copy link
Contributor

@chjj chjj commented Jul 7, 2016

Checklist
  • make -j4 test (UNIX), or vcbuild test nosign (Windows) passes
  • tests and/or benchmarks are included
  • commit message follows commit guidelines
Affected core subsystem(s)

buffer

Description of change

Hex parsing can be a slight bottle-neck for anyone who does Buffer.from(str, 'hex') or buf.write(str, 'hex') frequently. 6 potential comparisons as well as some arithmetic per character wasn't cutting it for me. A single pointer access per character is a lot faster (results in a near 2x speedup on 1kb hex strings).

# Current HEAD:
$ ./out/Release/node benchmark/buffers/buffer-hex.js
buffers/buffer-hex.js len="0" n="10000000": 6006950.59809
buffers/buffer-hex.js len="1" n="10000000": 1524912.32297
buffers/buffer-hex.js len="64" n="10000000": 1213341.44991
buffers/buffer-hex.js len="1024" n="10000000": 248060.16367

# With hex parsing optimized:
$ ./out/Release/node benchmark/buffers/buffer-hex.js
buffers/buffer-hex.js len="0" n="10000000": 6130801.93323
buffers/buffer-hex.js len="1" n="10000000": 1717385.59814
buffers/buffer-hex.js len="64" n="10000000": 1367649.36959
buffers/buffer-hex.js len="1024" n="10000000": 463809.34484

@nodejs-github-bot nodejs-github-bot added the c++ Issues and PRs that require attention from people who are familiar with C++. label Jul 7, 2016
return 10 + (c - 'a');
return static_cast<unsigned>(-1);
}
const int8_t unhex_table[256] =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static const int8_t?

@Trott Trott added buffer Issues and PRs related to the buffer subsystem. benchmark Issues and PRs related to the benchmark subsystem. labels Jul 7, 2016
unsigned a = hex2bin(src[i * 2 + 0]);
unsigned b = hex2bin(src[i * 2 + 1]);
unsigned a = unhex(src[i * 2 + 0]);
unsigned b = unhex(src[i * 2 + 1]);
if (!~a || !~b)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this still work with unhex()’s return type changed? It seems to me some of the speedup may come from the compiler eliminating this branch completely because it knows the upper bits of a and b will never be set:

$ ./node-master -p 'Buffer(10).write("abcdxx", "hex")'
2
$ ./node -p 'Buffer(10).write("abcdxx", "hex")'
3

(aka we need more tests to cover these cases?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@addaleax, good catch. I didn't notice the type difference. It looks like you're right.

Applying this patch:

diff --git a/src/string_bytes.cc b/src/string_bytes.cc
index c216c5d..cb0e78a 100644
--- a/src/string_bytes.cc
+++ b/src/string_bytes.cc
@@ -173,9 +173,9 @@ size_t hex_decode(char* buf,
                   const size_t srcLen) {
   size_t i;
   for (i = 0; i < len && i * 2 + 1 < srcLen; ++i) {
-    unsigned a = unhex(src[i * 2 + 0]);
-    unsigned b = unhex(src[i * 2 + 1]);
-    if (!~a || !~b)
+    uint8_t a = unhex(src[i * 2 + 0]);
+    uint8_t b = unhex(src[i * 2 + 1]);
+    if ((a | b) & 0x80)
       return i;
     buf[i] = (a << 4) | b;
   }

Kills some of the gained perf, but large hex strings are still faster:

$ ./out/Release/node benchmark/buffers/buffer-hex.js
buffers/buffer-hex.js len="0" n="10000000": 5888692.25695
buffers/buffer-hex.js len="1" n="10000000": 1471220.48559
buffers/buffer-hex.js len="64" n="10000000": 1206331.66340
buffers/buffer-hex.js len="1024" n="10000000": 385088.76228

I'll see what else I can do, and probably add a test for bad hex strings.

var common = require('../common');
var assert = require('assert');

var Buffer = require('buffer').Buffer;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const here please

@Fishrock123
Copy link
Contributor

@Fishrock123
Copy link
Contributor

@chjj Please run make lint :)

assert.deepStrictEqual(buf4, new Buffer([0, 0, 0, 0]));
assert.equal(buf4.write('xxabcd', 0, 'hex'), 0);
assert.deepStrictEqual(buf4, new Buffer([0, 0, 0, 0]));
assert.equal(buf4.write('xxab', 1, 'hex'), 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: assert.strictEqual is generally preferred over assert.equal, especially for comparisons with 0. :)

@addaleax
Copy link
Member

addaleax commented Jul 8, 2016

LGTM with nits and a happy linter.

@chjj
Copy link
Contributor Author

chjj commented Jul 8, 2016

Cleaned up and linted.

@addaleax
Copy link
Member

addaleax commented Jul 8, 2016

LGTM, nice work!

@addaleax
Copy link
Member

addaleax commented Jul 8, 2016

@addaleax
Copy link
Member

Landed in 151d316, thanks!

@addaleax addaleax closed this Jul 17, 2016
addaleax pushed a commit that referenced this pull request Jul 17, 2016
PR-URL: #7602
Reviewed-By: Anna Henningsen <anna@addaleax.net>
evanlucas pushed a commit that referenced this pull request Jul 19, 2016
PR-URL: #7602
Reviewed-By: Anna Henningsen <anna@addaleax.net>
evanlucas pushed a commit that referenced this pull request Jul 20, 2016
PR-URL: #7602
Reviewed-By: Anna Henningsen <anna@addaleax.net>
evanlucas added a commit that referenced this pull request Jul 21, 2016
Notable changes:

* **buffer**:
  * Improve performance of Buffer.from(str, 'hex') and Buffer#write(str, 'hex'). (Christopher Jeffrey) #7602
  * Fix creating from zero-length ArrayBuffer. (Ingvar Stepanyan) #7176
* **deps**: Upgrade to V8 5.0.71.xx. (Ben Noordhuis) #7531
* **repl**: Fix issue with function redeclaration. (Prince J Wesley) #7794
* **util**: Fix inspecting of boxed symbols. (Anna Henningsen) #7641

PR-URL: #7782
evanlucas added a commit that referenced this pull request Jul 21, 2016
Notable changes:

* **buffer**:
  * Improve performance of Buffer.from(str, 'hex') and Buffer#write(str, 'hex'). (Christopher Jeffrey) #7602
  * Fix creating from zero-length ArrayBuffer. (Ingvar Stepanyan) #7176
* **deps**:
  * Upgrade to V8 5.0.71.xx. (Ben Noordhuis) #7531
  * Backport V8 instanceof bugfix (Franziska Hinkelmann) #7638
* **repl**: Fix issue with function redeclaration. (Prince J Wesley) #7794
* **util**: Fix inspecting of boxed symbols. (Anna Henningsen) #7641

PR-URL: #7782
evanlucas added a commit that referenced this pull request Jul 21, 2016
Notable changes:

* **buffer**:
  * Improve performance of Buffer.from(str, 'hex') and Buffer#write(str, 'hex'). (Christopher Jeffrey) #7602
  * Fix creating from zero-length ArrayBuffer. (Ingvar Stepanyan) #7176
* **deps**:
  * Upgrade to V8 5.0.71.xx. (Ben Noordhuis) #7531
  * Backport V8 instanceof bugfix (Franziska Hinkelmann) #7638
* **repl**: Fix issue with function redeclaration. (Prince J Wesley) #7794
* **util**: Fix inspecting of boxed symbols. (Anna Henningsen) #7641

PR-URL: #7782
lukesampson pushed a commit to ScoopInstaller/Scoop that referenced this pull request Jul 24, 2016
### Notable changes

* **buffer**:
  * Improve performance of Buffer.from(str, 'hex') and Buffer#write(str, 'hex'). (Christopher Jeffrey) [#7602](nodejs/node#7602)
  * Fix creating from zero-length ArrayBuffer. (Ingvar Stepanyan) [#7176](nodejs/node#7176)
* **deps**:
  * Upgrade to V8 5.0.71.xx. (Ben Noordhuis) [#7531](nodejs/node#7531)
  * Backport V8 instanceof bugfix (Franziska Hinkelmann) [#7638](nodejs/node#7638)
* **repl**: Fix issue with function redeclaration. (Prince J Wesley) [#7794](nodejs/node#7794)
* **util**: Fix inspecting of boxed symbols. (Anna Henningsen) [#7641](nodejs/node#7641)
Fishrock123 added a commit to Fishrock123/node that referenced this pull request Sep 29, 2016
This test was recently (at the time of writing) introduced in
151d316
and could be cleaned up a bit.

Refs: nodejs#7602
PR-URL: nodejs#7773
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Minwoo Jung <jmwsoft@gmail.com>
Reviewed-By: Ilkka Myller <ilkka.myller@nodefield.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
jasnell pushed a commit that referenced this pull request Sep 29, 2016
This test was recently (at the time of writing) introduced in
151d316
and could be cleaned up a bit.

Refs: #7602
PR-URL: #7773
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Minwoo Jung <jmwsoft@gmail.com>
Reviewed-By: Ilkka Myller <ilkka.myller@nodefield.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
MylesBorins pushed a commit that referenced this pull request Sep 30, 2016
PR-URL: #7602
Reviewed-By: Anna Henningsen <anna@addaleax.net>
MylesBorins pushed a commit that referenced this pull request Sep 30, 2016
This test was recently (at the time of writing) introduced in
151d316
and could be cleaned up a bit.

Refs: #7602
PR-URL: #7773
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Minwoo Jung <jmwsoft@gmail.com>
Reviewed-By: Ilkka Myller <ilkka.myller@nodefield.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Fishrock123 added a commit that referenced this pull request Oct 11, 2016
This test was recently (at the time of writing) introduced in
151d316
and could be cleaned up a bit.

Refs: #7602
PR-URL: #7773
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Minwoo Jung <jmwsoft@gmail.com>
Reviewed-By: Ilkka Myller <ilkka.myller@nodefield.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
rvagg pushed a commit that referenced this pull request Oct 18, 2016
PR-URL: #7602
Reviewed-By: Anna Henningsen <anna@addaleax.net>
rvagg pushed a commit that referenced this pull request Oct 18, 2016
This test was recently (at the time of writing) introduced in
151d316
and could be cleaned up a bit.

Refs: #7602
PR-URL: #7773
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Minwoo Jung <jmwsoft@gmail.com>
Reviewed-By: Ilkka Myller <ilkka.myller@nodefield.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
MylesBorins pushed a commit that referenced this pull request Oct 26, 2016
PR-URL: #7602
Reviewed-By: Anna Henningsen <anna@addaleax.net>
MylesBorins pushed a commit that referenced this pull request Oct 26, 2016
This test was recently (at the time of writing) introduced in
151d316
and could be cleaned up a bit.

Refs: #7602
PR-URL: #7773
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Minwoo Jung <jmwsoft@gmail.com>
Reviewed-By: Ilkka Myller <ilkka.myller@nodefield.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
@MylesBorins MylesBorins mentioned this pull request Oct 26, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
benchmark Issues and PRs related to the benchmark subsystem. buffer Issues and PRs related to the buffer subsystem. c++ Issues and PRs that require attention from people who are familiar with C++.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants