buffer: v0.11.x Buffer constructor and slice are 10-20x slower than v0.10.x #7633
I'm on Mac OS X 10.9.2 with a 2.6 GHz Intel Core i7.
There seems to be a big degradation between 0.11.12 and 0.11.13 too.

node v0.11.12

node v0.11.13
This must have to do with a recent V8 upgrade. I'm investigating it.
Here's the flame graph for v0.11.12, and the flame graph for current master. Really wish I knew what was going on at the moment, but it seems like v8 is doing a crap ton more under the hood. /cc @bnoordhuis @indutny Any help on what could be going on?

@tjfontaine This is a serious performance setback.
Part of it is handle zapping, but that's easily fixed:

```diff
diff --git a/configure b/configure
index 6e48167..d38ceb2 100755
--- a/configure
+++ b/configure
@@ -587,6 +587,7 @@ def configure_v8(o):
   o['variables']['v8_no_strict_aliasing'] = 1  # Work around compiler bugs.
   o['variables']['v8_optimized_debug'] = 0  # Compile with -O0 in debug builds.
   o['variables']['v8_random_seed'] = 0  # Use a random seed for hash tables.
+  o['variables']['v8_enable_handle_zapping'] = 0  # Slow!
   o['variables']['v8_use_snapshot'] = b(not options.without_snapshot)
   # assume shared_v8 if one of these is set?
```

That still makes master 8-10x slower than v0.10.

I did a quick revert of smalloc back to the v0.10 buffer implementation (with assorted fixes to make it work with V8 3.25, see here), and the impression I get is that the Buffer#slice() regression is primarily caused by smalloc. It's still a little slower with old-style buffers, but on the order of 1.5-2x instead of 10x. It should be noted that the way I did the upgrade to the new V8 is rather crude; it calls v8::Isolate::GetCurrent() a lot, which is expensive.

I'm reserving judgment about Buffer constructor performance for now. I'm tempted to say the issue lies with V8, because I get only slightly better numbers after the revert, but it's possible I botched something. perf record shows approximately 20% of CPU time spent inside v8::internal::CodeStub::GetCode() (the biggest cost factor) before the revert and 10% afterwards.

I ran benchmark/buffers/buffer-creation.js before and after, and with v0.10, but that benchmark produces highly variable results. I don't think I trust it very much.
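For reference, a minimal sketch of the kind of loop that exposes the Buffer#slice cost. This is not benchmark/buffers/buffer-creation.js or any script from this thread; the iteration count and buffer sizes are arbitrary:

```javascript
'use strict';

// Minimal Buffer#slice micro-benchmark sketch (not node's own benchmark
// script). slice() returns a view over the same memory, so the cost
// being measured is the per-call bookkeeping of creating that view,
// not a copy of the bytes.
const ITERATIONS = 1e6;
const buf = Buffer.alloc(1024);

const start = process.hrtime();
for (let i = 0; i < ITERATIONS; i++) {
  buf.slice(0, 512);
}
const [sec, nsec] = process.hrtime(start);
console.log('slice: %d ops/sec',
            Math.round(ITERATIONS / (sec + nsec / 1e9)));
```

Running the same script under each node binary (e.g. `node-v0.10 bench.js` vs `node-v0.11 bench.js`) is how a regression of this shape is usually compared.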
@bnoordhuis I've bisected down to the v8 commit that's causing the majority of the slowness: v8/v8@a101c7c. It has something to do with setting object properties on an object that's been allocated external array data.
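To illustrate the pattern at issue, here is a hypothetical sketch, not actual node internals: the v0.11 Buffer constructor allocated external array data for an object (via smalloc) and then assigned ordinary named properties such as `length` and `offset` on it, and the bisected V8 commit made those property stores on externally-backed objects much slower. A `Uint8Array` stands in for the external backing store here, and `makeBufferLike` is an invented name:

```javascript
'use strict';

// Hypothetical stand-in for the v0.11 Buffer allocation pattern.
// A Uint8Array plays the role of the smalloc'd external array data.
function makeBufferLike(size) {
  const obj = Object.create(null);
  obj.data = new Uint8Array(size); // stand-in for the external backing store
  // Named property stores on the freshly allocated, externally-backed
  // object -- the kind of store the bisected commit slowed down:
  obj.length = size;
  obj.parent = undefined;
  obj.offset = 0;
  return obj;
}

const b = makeBufferLike(64);
console.log(b.length, b.offset);
```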
Buffer.slice can be expensive. One regression was reported by nodejs#7633. The method should be benchmarked.
I am adding a status indicator (“open” / “closed”) for this issue to ashtuchkin/iconv-lite#65 by mentioning it here.
This was fixed in the upgrade to V8 3.26 and the cherry-pick of an upstream fix in d78a378.
I'm running a test as follows:

Node 0.10.28

Node 0.11.13

Please note that 0.11.13 is 10-20x slower.
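The original test script was not captured above. As a rough illustration only, a hedged sketch of an allocation micro-benchmark one could run under both binaries; on 0.10/0.11 the idiom would be `new Buffer(n)`, for which `Buffer.alloc` is the modern, non-deprecated equivalent, and `benchAlloc` is an invented helper name:

```javascript
'use strict';

// Sketch of a Buffer-allocation micro-benchmark (not the reporter's
// script). Returns operations per second for allocating `size`-byte
// buffers in a tight loop.
function benchAlloc(size, iterations) {
  const start = process.hrtime();
  for (let i = 0; i < iterations; i++) {
    Buffer.alloc(size); // `new Buffer(size)` on 0.10/0.11
  }
  const [sec, nsec] = process.hrtime(start);
  return Math.round(iterations / (sec + nsec / 1e9));
}

console.log('alloc(64):   %d ops/sec', benchAlloc(64, 1e6));
console.log('alloc(1024): %d ops/sec', benchAlloc(1024, 1e6));
```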