3x faster setImmediate #6436
use L.create() factory to create access-optimized linkedlist objects
Save the setImmediate callback arguments into an array instead of a closure, and invoke the callback on the arguments from an optimizable function.

60% faster setImmediate with 0 args (15% if self-recursive)
4x faster setImmediate with 1-3 args, 2x with > 3
seems to be faster with less memory pressure when memory is tight

Changes:
- use L.create() to build faster lists
- use runCallback() from within tryOnImmediate
- create immediate timers with a function instead of new
- just save the arguments and not build closures for the callbacks
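The idea in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `createImmediate`, `schedule`, the `_argv` field and the array standing in for the queue are hypothetical names chosen for the example; only `runCallback` is named in the commit message, and its body here is an assumption about what such a dispatcher might look like.

```javascript
'use strict';

// Hypothetical queue entry: the callback and its arguments are stored as
// plain fields instead of being captured in a per-call closure.
function createImmediate(callback, args) {
  return { _callback: callback, _argv: args };
}

// Call the stored callback directly for small argument counts and fall
// back to apply() for longer argument lists.
function runCallback(immediate) {
  const argv = immediate._argv;
  const argc = argv ? argv.length : 0;
  switch (argc) {
    case 0: return immediate._callback();
    case 1: return immediate._callback(argv[0]);
    case 2: return immediate._callback(argv[0], argv[1]);
    case 3: return immediate._callback(argv[0], argv[1], argv[2]);
    default: return immediate._callback.apply(immediate, argv);
  }
}

// Hypothetical front end: collect the extra arguments into an array and
// queue the entry (a plain array stands in for the linked list).
function schedule(queue, callback, ...args) {
  const immediate = createImmediate(callback, args.length > 0 ? args : null);
  queue.push(immediate);
  return immediate;
}

const queue = [];
const results = [];
schedule(queue, () => results.push('none'));
schedule(queue, (a) => results.push(a), 1);
schedule(queue, (a, b, c, d) => results.push(a + b + c + d), 1, 2, 3, 4);
for (const immediate of queue) runCallback(immediate);
console.log(results);
```

Whether the direct calls are actually faster than a closure depends on the engine version, which is why the PR backs the change with benchmarks.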
Instead of unlinking from the immediate queue immediately on clear, put off the unlink until processImmediate, where it's more efficient.

3x faster clearImmediate processing

The benefits stack with the setImmediate speedups, i.e. a total gain of 3.5x with 0 arguments and 4-5x with 1-3 args.

Changed the code to defer unlinking from the immediate queue until processImmediate consumes the queue anyway.
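A rough sketch of the deferred-unlink approach described above, under simplifying assumptions: a plain array stands in for the linked list, and `setImmediateSketch`, `clearImmediateSketch` and `processImmediateSketch` are hypothetical stand-ins rather than the functions in lib/timers.js.

```javascript
'use strict';

// Hypothetical pending queue; a plain array stands in for the linked list.
let pending = [];

function setImmediateSketch(callback) {
  const immediate = { _callback: callback };
  pending.push(immediate);
  return immediate;
}

// Clearing only marks the entry; it is not removed from the queue here.
function clearImmediateSketch(immediate) {
  if (immediate) immediate._callback = null;
}

// Draining walks every entry anyway, so cleared entries are skipped at
// this point instead of being unlinked one by one at clear time.
function processImmediateSketch() {
  const queue = pending;
  pending = []; // callbacks that schedule more work land in a fresh queue
  let ran = 0;
  for (const immediate of queue) {
    if (immediate._callback === null) continue;
    immediate._callback();
    ran++;
  }
  return ran;
}

const log = [];
const a = setImmediateSketch(() => log.push('a'));
const b = setImmediateSketch(() => log.push('b'));
clearImmediateSketch(a);
const ranCount = processImmediateSketch();
console.log(log, ranCount);
```

The trade-off is that a cleared entry stays referenced until the next drain, in exchange for making the clear itself a single field write.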
Timings for sequential and concurrent setImmediate with and without arguments, and set + clearImmediate.
@@ -6,6 +6,13 @@ function init(list) {
 }
 exports.init = init;

+// create a new linked list
+function create() {
+  var list = { _idleNext: null, _idlePrev: null };
This should be const.
I also wonder if there would be any performance benefit to instead creating a new instance of an object that doesn't inherit from Object.prototype? For example:
function LinkedListNode() {
this._idleNext = null;
this._idlePrev = null;
}
LinkedListNode.prototype = Object.create(null);
function create() {
const list = new LinkedListNode();
init(list);
return list;
}
changed to const.
I tried the Object.create version above, and it's inconclusive. Some test runs (10-20 sec runs)
show a 2% advantage one way, then changing the loop/repeat count (where loops*repeats
is the number of objects created) flips the advantage the other way.
Perhaps the benchmarks could more closely resemble the existing |
function Immediate() { }
this is a breaking change
Ok, nvm, #6206 has not landed yet
I don't think this would be breaking if #6206 had landed, as it doesn't get rid of the Immediate
class, and instead just moves the property assignments into the constructor.
That code changed since the comment was left here.
@mscdex re the benchmarks: oops, I didn't realize misc already had immediate tests. Each test function is called once, so the closure should be built once and reused across all callbacks.
Changing it to
halves the throughput, suggesting that the closure is reused in the above case.
A different question is whether my benchmarks add enough value to keep, or if I should just ditch them.
@@ -502,24 +502,26 @@ Timeout.prototype.close = function() {
 };


-var immediateQueue = {};
-L.init(immediateQueue);
+var immediateQueue = L.create();
I don't really understand why this change is needed or is helpful but I guess it does clean things up a little.
prevents a hidden class change
at top-level it was changed to match processImmediate (and it cleans things up a bit).
In processImmediate it's a speedup, L.create() returns an access-optimized object.
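To make the difference concrete, here is a simplified sketch of the two ways of building the queue object. The `init` and `create` bodies follow the diff in this PR; `append`, `isEmpty` and the node objects are illustrative assumptions for the example and are not copied from Node's linked list module.

```javascript
'use strict';

// Point an object at itself so it acts as the sentinel of an empty
// circular list.
function init(list) {
  list._idleNext = list;
  list._idlePrev = list;
}

// Declare both link fields in the literal, so the object already has its
// final set of properties before init() touches it.
function create() {
  const list = { _idleNext: null, _idlePrev: null };
  init(list);
  return list;
}

// Illustrative helper: insert item at the tail, just before the sentinel.
function append(list, item) {
  item._idleNext = list;
  item._idlePrev = list._idlePrev;
  list._idlePrev._idleNext = item;
  list._idlePrev = item;
}

// Illustrative helper: an empty list is a sentinel that points at itself.
function isEmpty(list) {
  return list._idleNext === list;
}

// Old style: start from {} and let init() add the properties afterwards.
const oldStyle = {};
init(oldStyle);

// New style: the factory returns an object whose fields never change.
const queue = create();
append(queue, { name: 'first', _idleNext: null, _idlePrev: null });
append(queue, { name: 'second', _idleNext: null, _idlePrev: null });

const order = [];
for (let node = queue._idleNext; node !== queue; node = node._idleNext) {
  order.push(node.name);
}
console.log(order, isEmpty(oldStyle), isEmpty(queue));
```

Both styles produce a working list; the claim in the thread is only that the second avoids adding properties to the object after it has been created.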
I bet the object ends up being access-optimized after enough runs anyway. Not to mention we can make objects access-optimized explicitly.
That said - this change improves style anyway and is better coding.
probably, though depends on how many immediates are queued in an event loop cycle.
The immediateQueue is re-created every time, so the optimization would not persist
(and there would be a run-time cost for the conversion).
Out of curiosity, what are the ways of creating access-optimized objects? I know about
objects created with { ... }, the prototype properties of new objects, (the this. properties
as assigned in the constructor?), and assigning an object as the prototype of a function
forces optimization. Any others?
@andrasq conveniently, here is a StackOverflow answer I wrote about a technique petkaantonov used in bluebird (with making an object as a prototype of a function).
This also works with Object.create so I guess that's another one.
this. properties assigned in the constructor is the "standard" way though, it's even "easier" than object literals and it makes the fact it's static obvious to v8 (object literals work fine too usually).
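For reference, the construction styles mentioned in this exchange can be put side by side. This is only a sketch: `fromLiteral`, `ListNode` and `BareNode` are hypothetical names, and the code shows that the three produce the same own properties, not that any of them is faster (the measurements reported earlier in the thread were inconclusive).

```javascript
'use strict';

// 1. Object literal with every property declared up front.
function fromLiteral() {
  return { _idleNext: null, _idlePrev: null };
}

// 2. Constructor that assigns all properties on `this`.
function ListNode() {
  this._idleNext = null;
  this._idlePrev = null;
}

// 3. Constructor whose prototype does not inherit from Object.prototype,
//    as suggested in the earlier review comment.
function BareNode() {
  this._idleNext = null;
  this._idlePrev = null;
}
BareNode.prototype = Object.create(null);

const nodes = [fromLiteral(), new ListNode(), new BareNode()];

// All three end up with the same own properties in the same order.
const keys = nodes.map((node) => Object.keys(node).join(','));
console.log(keys);

// Only the third lacks the methods inherited from Object.prototype.
const inherited = nodes.map((node) => typeof node.hasOwnProperty);
console.log(inherited);
```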
Just to be explicit - this specific change LGTM, even if it's not faster but it probably is given the old code.
that's a great writeup, and a handy reference, thanks!
pushed edits (some more |
Minus Thanks for this fix. |
pushed edits (new Immediate)
LGTM although I meant for new Immediate to take the arguments in the constructor.
Now uses a new L.create() factory to create access-optimized linkedlist objects. PR-URL: #6436 Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com> Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Save the setImmediate() callback arguments into an array instead of a closure, and invoke the callback on the arguments from an optimizable function.

60% faster setImmediate with 0 args (15% if self-recursive)
4x faster setImmediate with 1-3 args, 2x with > 3
seems to be faster with less memory pressure when memory is tight

Changes:
- use L.create() to build faster lists
- use runCallback() from within tryOnImmediate()
- save the arguments and do not build closures for the callbacks

PR-URL: #6436
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Timings for sequential and concurrent setImmediate() with and without arguments, and set + clearImmediate(). PR-URL: #6436 Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com> Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Notable changes:

* buffer: Added `buffer.swap64()` to complement `swap16()` & `swap32()`. (Zach Bjornson) #7157
* build: New `configure` options have been added for building Node.js as a shared library. (Stefan Budeanu) #6994
  - The options are: `--shared`, `--without-v8-platform` & `--without-bundled-v8`.
* crypto: Root certificates have been updated. (Ben Noordhuis) #7363
* debugger: The server address is now configurable via `--debug=<address>:<port>`. (Ben Noordhuis) #3316
* npm: Upgraded npm to v3.10.3 (Kat Marchán) #7515 & (Rebecca Turner) #7410
* readline: Added the `prompt` option to the readline constructor. (Evan Lucas) #7125
* repl / vm: `sigint`/`ctrl+c` will now break out of infinite loops without stopping the Node.js instance. (Anna Henningsen) #6635
* src:
  - Added a `node::FreeEnvironment` public C++ API. (Cheng Zhao) #3098
  - Refactored `require('constants')`, constants are now available directly from their respective modules. (James M Snell) #6534
* stream: Improved `readable.read()` performance by up to 70%. (Brian White) #7077
* timers: `setImmediate()` is now up to 150% faster in some situations. (Andras) #6436
* util: Added a `breakLength` option to `util.inspect()` to control how objects are formatted across lines. (cjihrig) #7499
* v8-inspector: Experimental support has been added for debugging Node.js over the inspector protocol. (Ali Ijaz Sheikh) #6792
  - *Note: This feature is experimental, and it could be altered or removed.*
  - You can try this feature by running Node.js with the `--inspect` flag.

Refs: #7441
PR-URL: #7550
### Notable changes

* **buffer**: Added `buffer.swap64()` to complement `swap16()` & `swap32()`. (Zach Bjornson) [#7157](nodejs/node#7157)
* **build**: New `configure` options have been added for building Node.js as a shared library. (Stefan Budeanu) [#6994](nodejs/node#6994)
  - The options are: `--shared`, `--without-v8-platform` & `--without-bundled-v8`.
* **crypto**: Root certificates have been updated. (Ben Noordhuis) [#7363](nodejs/node#7363)
* **debugger**: The server address is now configurable via `--debug=<address>:<port>`. (Ben Noordhuis) [#3316](nodejs/node#3316)
* **npm**: Upgraded npm to v3.10.3 (Kat Marchán) [#7515](nodejs/node#7515) & (Rebecca Turner) [#7410](nodejs/node#7410)
* **readline**: Added the `prompt` option to the readline constructor. (Evan Lucas) [#7125](nodejs/node#7125)
* **repl / vm**: `sigint`/`ctrl+c` will now break out of infinite loops without stopping the Node.js instance. (Anna Henningsen) [#6635](nodejs/node#6635)
* **src**:
  - Added a `node::FreeEnvironment` public C++ API. (Cheng Zhao) [#3098](nodejs/node#3098)
  - Refactored `require('constants')`, constants are now available directly from their respective modules. (James M Snell) [#6534](nodejs/node#6534)
* **stream**: Improved `readable.read()` performance by up to 70%. (Brian White) [#7077](nodejs/node#7077)
* **timers**: `setImmediate()` is now up to 150% faster in some situations. (Andras) [#6436](nodejs/node#6436)
* **util**: Added a `breakLength` option to `util.inspect()` to control how objects are formatted across lines. (cjihrig) [#7499](nodejs/node#7499)
* **v8-inspector**: Experimental support has been added for debugging Node.js over the inspector protocol. (Ali Ijaz Sheikh) [#6792](nodejs/node#6792)
  - **Note: This feature is _experimental_, and it could be altered or removed.**
  - You can try this feature by running Node.js with the `--inspect` flag.
@mjsalinger perhaps some of the commits got squashed; all the expected edits are there.
I'm setting this as don't land. Please change to LTS watch if you think this should land in LTS.
@thealphanerd if it cleanly applies you should be able to land it
@Fishrock123 it does not land cleanly without other timers changes
@thealphanerd ok, probably not important unless it is needed for other fixes. (Although it would be nice.)
This was originally changed in 6f75b66 but it appears unnecessary and benchmark results show little difference without the extra property. Refs: nodejs#6436
@Fishrock123 I have the same problem. Did you solve it?
My code is here:
var crawler = require('crawler');
var config = require('./config');
var fs = require('fs');
// var videoList = config.videoList;
// var read = require('./read');
var debug = require('debug')('nightmare:crawler');
console.time('program run time'); // start timing the run
// Get the video list from the videoList file.
// readFileSync is synchronous and takes no callback: it returns the file
// contents or throws on error.
var read;
try {
  read = fs.readFileSync(__dirname + '/videoList.json', 'utf8');
  debug('read the videoList file');
  console.log('succeeded!');
} catch (err) {
  console.log(err);
  throw err;
}
var videoList = JSON.parse(read);
var informationList = [];
var c = new crawler({
  // maxConnection : 1000,
  // rateLimit : 1000,
  forceUTF8: true,
  callback: function(error, res, done) {
    if (error) {
      console.log(error);
    } else {
      // information() is not shown in this snippet
      information(res, done);
      done();
    }
    // console.log(informationList);
    // console.log(informationList.length);
  }
});
// Crawl the list of subscription accounts
for (var i = 0; i < videoList.length; i++) {
  c.queue({
    uri: videoList[i].url,
    proxy: 'http://127.0.0.1:61481'
  });
}
This was originally changed in 6f75b66 but it appears unnecessary and benchmark results show little difference without the extra property.

Refs: #6436
PR-URL: #16355
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Anatoli Papirovski <apapirovski@mac.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
This was originally changed in 6f75b66 but it appears unnecessary and benchmark results show little difference without the extra property.

Refs: nodejs/node#6436
PR-URL: nodejs/node#16355
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Anatoli Papirovski <apapirovski@mac.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
Checklist
Affected core subsystem(s)
Description of change
Sped up setImmediate processing by 60% to over 400% through faster linkedlist creation,
faster immediate object creation, not wrapping closures around callbacks, and invoking
the callbacks faster from an optimizable function without using .call.
Sped up clearImmediate by delaying the linked list update until processImmediate pulls the
immediate off the queue, where it's done more efficiently. This speeds up clearImmediate 3x,
with the benefit stacking on top of the setImmediate speedup.
Added benchmarks for setImmediate and clearImmediate.