buffer: v0.11.x Buffer constructor and slice are 10-20x slower than v0.10.x #7633

raymondfeng · 2014-05-16T03:44:16Z

I'm running a test as follows:

var start = Date.now();
var count = 1000000;

for (var i = 0; i < count; i++) {
  var buf = new Buffer(1024);
}

console.log('Buffer.new %d ms', Date.now() - start);

buf = new Buffer(1024);
start = Date.now();
for (var i = 0; i < count; i++) {
  buf.slice(0,20);
}

console.log('Buffer.slice %d ms', Date.now() - start);

Node 0.10.28

Buffer.new 723 ms Buffer.slice 302 ms
Buffer.new 732 ms Buffer.slice 232 ms

Node 0.11.13

Buffer.new 3671 ms Buffer.slice 4719 ms
Buffer.new 3616 ms Buffer.slice 4663 ms

Please note 0.11.13 is 10-20x slower.

raymondfeng · 2014-05-16T03:49:51Z

I'm on Mac OS X 10.9.2 with 2.6 GHz Intel Core i7.

raymondfeng · 2014-05-16T03:55:38Z

There seems to be big degradation between 0.11.12 and 0.11.13 too.

node v0.11.12

Buffer.new 398 ms Buffer.slice 2761 ms
Buffer.new 389 ms Buffer.slice 2723 ms

node v0.11.13

Buffer.new 3797 ms Buffer.slice 4895 ms
Buffer.new 3743 ms Buffer.slice 4963 ms

trevnorris · 2014-05-16T23:05:53Z

This must have to do with a recent V8 upgrade. I'm investigating it.

trevnorris · 2014-05-16T23:32:04Z

Here's the flame graph for v0.11.12.

And the flame graph for current master.

Really wish I knew what was going on at the moment, but it seems like v8 is doing a crap ton more under the hood.

/cc @bnoordhuis @indutny Any help on what could be going on?

@tjfontaine This performance regression is a serious performance set back.

bnoordhuis · 2014-05-18T14:48:12Z

Part of it is handle zapping but that's easily fixed:

diff --git a/configure b/configure
index 6e48167..d38ceb2 100755
--- a/configure
+++ b/configure
@@ -587,6 +587,7 @@ def configure_v8(o):
   o['variables']['v8_no_strict_aliasing'] = 1  # Work around compiler bugs.
   o['variables']['v8_optimized_debug'] = 0  # Compile with -O0 in debug builds.
   o['variables']['v8_random_seed'] = 0  # Use a random seed for hash tables.
+  o['variables']['v8_enable_handle_zapping'] = 0  # Slow!
   o['variables']['v8_use_snapshot'] = b(not options.without_snapshot)

   # assume shared_v8 if one of these is set?

That still makes master 8-10x slower than v0.10. I did a quick revert of smalloc back to the v0.10 buffer implementation (with assorted fixes to make it work with V8 3.25, see here) and the impression I get is that the Buffer#slice() regression is primarily caused by smalloc. It's still a little slower with old style buffers, but on the order of 1.5-2x instead of 10x and it should be noted that the way I did the upgrade to the new V8 is rather crude; it calls v8::Isolate::GetCurrent() a lot, which is expensive.

I'm reserving judgment about Buffer constructor performance for now. I'm tempted to say the issue lies with V8 because I get only slightly better numbers after the revert but it's possible I botched something. perf-record shows approximately 20% CPU time spent inside v8::internal::CodeStub::GetCode() (the biggest cost factor) before the revert and 10% afterwards.

I ran benchmark/buffers/buffer-creation.js before and after and with v0.10 but that benchmark produces highly variable results. I don't think I trust it very much.

trevnorris · 2014-05-20T03:10:02Z

@bnoordhuis I've bisected down to the v8 commit that's causing the majority of the slowness: v8/v8@a101c7c

It has something to do w/ setting object properties on an object that's been allocated external array data.

See nodejs#7633

Buffer.slice can be expensive. One regression was reported by nodejs#7633. The method should be benchmarked.

Buffer.slice can be expensive. One regression was reported by #7633. The method should be benchmarked.

Mithgol · 2014-06-11T08:58:25Z

I am adding a status indicator (“open” / “closed”) of this issue to ashtuchkin/iconv-lite#65 by mentioning ashtuchkin/iconv-lite#65 here.

trevnorris · 2014-06-17T00:26:25Z

This was fixed in the upgrade to 3.26, and the cherry pick of an upstream fix in d78a378.

Buffer.slice can be expensive. One regression was reported by nodejs/node-v0.x-archive#7633. The method should be benchmarked.

raymondfeng changed the title ~~v0.11.x Buffer constructor and slice implementation are 10-20x slower than v0.10.x~~ buffer: v0.11.x Buffer constructor and slice are 10-20x slower than v0.10.x May 16, 2014

trevnorris added the buffers label May 16, 2014

trevnorris self-assigned this May 16, 2014

trevnorris added the V8 label May 16, 2014

raymondfeng pushed a commit to raymondfeng/node that referenced this issue May 23, 2014

benchmark: Add a test to measure Buffer.slice method

24142b4

See nodejs#7633

raymondfeng pushed a commit to raymondfeng/node that referenced this issue May 23, 2014

benchmark: Add a test to measure Buffer.slice perf

b8bd041

Buffer.slice can be expensive. One regression was reported by nodejs#7633. The method should be benchmarked.

raymondfeng mentioned this issue May 23, 2014

benchmark: Add a test to measure Buffer.slice perf #7674

Closed

raymondfeng pushed a commit to raymondfeng/node that referenced this issue May 27, 2014

benchmark: Add a test to measure Buffer.slice perf

510ecb5

Buffer.slice can be expensive. One regression was reported by nodejs#7633. The method should be benchmarked.

piscisaureus pushed a commit that referenced this issue May 28, 2014

benchmark: Add a test to measure Buffer.slice perf

4c672c8

Buffer.slice can be expensive. One regression was reported by #7633. The method should be benchmarked.

Mithgol mentioned this issue Jun 11, 2014

Very slow on Node v0.11.13 (previous are fine) ashtuchkin/iconv-lite#65

Closed

trevnorris closed this as completed Jun 17, 2014

SheetJSDev mentioned this issue Jun 18, 2014

astral character support in chrome and IE Stuk/jszip#142

Closed

dead-horse referenced this issue in koajs/koa Jun 25, 2014

use Object.setPrototypeOf()

80e3d78

vkurchatkin mentioned this issue Sep 12, 2014

Hanging streams due to internal pause() without corresponding resume() during pipe() #8351

Closed

fdgonthier pushed a commit to opersys/node that referenced this issue Apr 1, 2015

benchmark: Add a test to measure Buffer.slice perf

0b6032e

Buffer.slice can be expensive. One regression was reported by nodejs/node-v0.x-archive#7633. The method should be benchmarked.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

buffer: v0.11.x Buffer constructor and slice are 10-20x slower than v0.10.x #7633

buffer: v0.11.x Buffer constructor and slice are 10-20x slower than v0.10.x #7633

raymondfeng commented May 16, 2014

raymondfeng commented May 16, 2014

raymondfeng commented May 16, 2014

trevnorris commented May 16, 2014

trevnorris commented May 16, 2014

bnoordhuis commented May 18, 2014

trevnorris commented May 20, 2014

Mithgol commented Jun 11, 2014

trevnorris commented Jun 17, 2014

buffer: v0.11.x Buffer constructor and slice are 10-20x slower than v0.10.x #7633

buffer: v0.11.x Buffer constructor and slice are 10-20x slower than v0.10.x #7633

Comments

raymondfeng commented May 16, 2014

Node 0.10.28

Node 0.11.13

raymondfeng commented May 16, 2014

raymondfeng commented May 16, 2014

node v0.11.12

node v0.11.13

trevnorris commented May 16, 2014

trevnorris commented May 16, 2014

bnoordhuis commented May 18, 2014

trevnorris commented May 20, 2014

Mithgol commented Jun 11, 2014

trevnorris commented Jun 17, 2014