
events: optimize various functions #601

Closed
mscdex wants to merge 1 commit from the perf-ee-add-remove branch

Conversation

mscdex
Contributor

mscdex commented Jan 25, 2015

Cache events and listeners objects where possible and loop over
Object.keys() instead of using for..in. These changes alone give
~60-65% improvement in the ee-add-remove benchmark.

Swapping out the util type-checking functions for inline
checks gives another ~5-10% improvement.
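
As a rough illustration of the caching and iteration change described above (a simplified sketch, not the actual patch; the removeAll helper is made up for the example):

// Simplified sketch of the pattern: cache the events object once and iterate
// over Object.keys() instead of using for..in. `removeAll` is illustrative only.
var EventEmitter = require('events').EventEmitter;

function removeAll(emitter) {
  var events = emitter._events;      // cached lookup instead of repeated this._events
  if (!events)
    return emitter;
  var keys = Object.keys(events);    // instead of: for (var key in events) { ... }
  for (var i = 0; i < keys.length; ++i)
    emitter.removeAllListeners(keys[i]);
  return emitter;
}

var ee = new EventEmitter();
ee.on('foo', function() {});
removeAll(ee);                       // ee now has no 'foo' listeners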

  // adding it to the listeners, first emit "newListener".
  if (events.newListener)
    this.emit('newListener', type,
              typeof listener.listener === 'function' ?


listener.listener feels awkward; can the initial naming of this be tweaked at all?

Contributor Author

Maybe; I just left it as it was before. What do you suggest?

Member

@mscdex Can you put braces around the consequent?

@quantizor

General comment: There are a lot of !(events = this._events) patterns in the code you introduced that are significantly different from what you're replacing. I'm not sure if there is a formal code style for this sort of expression, but it feels kind of tricky, and it may be better to isolate the assignment on its own line rather than capturing the output inline as a side effect.

if (!(events = this._events))
  events = this._events = {};
else {
  existing = events[type];
Contributor

At this line events is undefined, isn't it?

The events variable can be initialized with

var events = this._events;

and here it can be used like this:

if (!events)
  events = this._events = {};

existing = events[type];

Contributor Author

No, it is defined there. I have the separate else there because of the code following this line (events.newListener can't possibly exist if this._events isn't set, so there's no need to check in that case).
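
(For readers following along, the control flow being described is roughly the following; this is a paraphrased sketch rather than the literal diff, and addListenerSketch is only an illustrative name.)

// Paraphrased sketch (not the literal diff): if `_events` does not exist yet,
// no 'newListener' handler can possibly be registered, so the emit check is
// only needed in the else branch.
function addListenerSketch(emitter, type, listener) {
  var events, existing;
  if (!(events = emitter._events)) {
    events = emitter._events = {};
  } else {
    if (events.newListener) {
      emitter.emit('newListener', type,
                   typeof listener.listener === 'function' ?
                     listener.listener : listener);
      // a 'newListener' handler may itself add listeners, so re-read the object
      events = emitter._events;
    }
    existing = events[type];
  }
  // ...the real addListener goes on to append `listener` to `existing`
  return existing;
}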

Contributor

Oh, I see, I was misled by line 137; I was expecting an equality check. Actually, using an assignment in an if statement is a bad practice. Would you mind trying the code that I proposed in my first comment and checking the performance impact of the changes?

@mscdex
Contributor Author

mscdex commented Jan 26, 2015

There's a lot of !(events = this._events) patterns in the code you introduced that are significantly different than what you're replacing.

I'm not sure what you mean by "significantly different"? The checks are still the same checks, except that cached versions are used where possible (in addition to inline assignment).

@Fishrock123
Contributor

Benchmarks do look good. (+ Tests pass.)

Before:

events/ee-add-remove.js n=250000: 337186.61793

After:

events/ee-add-remove.js n=250000: 529259.95116

@@ -47,7 +47,7 @@ EventEmitter.prototype.setMaxListeners = function setMaxListeners(n) {
 };

 function $getMaxListeners(that) {
-  if (util.isUndefined(that._maxListeners))
+  if (that._maxListeners === void 0)
Contributor

undefined is not re-definable in ES5. Is there a good reason to use void 0 over it?

Contributor Author

It's what util.isUndefined() was using.
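
(For context: void 0 always evaluates to undefined, and the global undefined cannot be reassigned in ES5, so the two spellings behave the same here. A minimal illustration:)

// Minimal illustration: in ES5 the global `undefined` is not writable,
// so `=== undefined` and `=== void 0` are equivalent checks.
var that = { _maxListeners: undefined };

console.log(that._maxListeners === void 0);     // true
console.log(that._maxListeners === undefined);  // true
console.log({}._maxListeners === undefined);    // true for absent properties too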

Member

+0.1 for undefined.

Contributor

+1 for undefined

@Fishrock123
Contributor

Mostly LGTM, but I'd like to see other potential comments.

@micnic
Contributor

micnic commented Jan 28, 2015

For me it also looks good, except for the variable assignments inside if statements. As @yaycmyk said, they are misleading while reading the code; the cached values should be assigned somewhere before the if statement, as I proposed in this discussion.

     return this;
   }

   // emit removeListener for all listeners on all events
   if (arguments.length === 0) {
-    for (key in this._events) {
+    var keys = Object.keys(events);
Member

I suppose Object.keys() could have a negative impact on EventEmitter objects with many events because of the array it creates (instead of iterating over an object like for..in does). Probably an uncommon case, but it might be good to capture it in a benchmark.

Contributor Author

I suppose we could manually keep track of the length of this._events and use some length value as the cut-off between using Object.keys() and using a for-in?
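
Something along these lines, perhaps (a hypothetical sketch; neither FOR_IN_THRESHOLD nor an explicit event count exists in the patch):

// Hypothetical sketch of the suggested cut-off; the threshold value and the
// idea of tracking an event count are illustrative, not part of the patch.
var FOR_IN_THRESHOLD = 64;

function eachEventName(events, eventCount, fn) {
  if (eventCount >= FOR_IN_THRESHOLD) {
    // many events: avoid allocating a potentially large keys array
    for (var key in events)
      fn(key);
  } else {
    var keys = Object.keys(events);
    for (var i = 0; i < keys.length; ++i)
      fn(keys[i]);
  }
}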

Contributor

This is based on a false assumption: V8's for-in eagerly creates an array of the properties before the loop body is even entered. In the optimized case this array is cached, but that's not the case for events anyway.

So basically you never want to use a for-in; even in the case where you need prototype properties, you are better off doing it manually with Object.keys and Object.getPrototypeOf (which could be turned into a getInheritedKeys function to make it a fast idiom everywhere).
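
A getInheritedKeys along those lines might look like this (a sketch of the suggestion, not code from this patch):

// Sketch of the suggested helper: collect own and inherited enumerable property
// names with Object.keys()/Object.getPrototypeOf() instead of a for..in loop.
function getInheritedKeys(obj) {
  var keys = [];
  for (var o = obj; o !== null; o = Object.getPrototypeOf(o)) {
    var own = Object.keys(o);
    for (var i = 0; i < own.length; ++i) {
      if (keys.indexOf(own[i]) === -1)   // skip names already seen lower in the chain
        keys.push(own[i]);
    }
  }
  return keys;
}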

Member

V8 for-in eagerly creates an array of the properties before the loop body is even entered.

That's only the case for "uncommon" objects, like object proxies or objects with interceptors. The objects that EventEmitter creates are simple objects with an enum cache that for..in iterates over.

Contributor

That's only the case for "uncommon" objects, like object proxies or objects with interceptors. The objects that EventEmitter creates are simple objects with an enum cache that for..in iterates over.

A normalized object is also an uncommon object, which this._events definitely is, as there is a delete call on it literally 18 lines below.

Contributor

So, maybe we should consider getting rid of the deletes?

Contributor Author

@vkurchatkin That was tried in the past, in the middle of v0.10 (unfortunately), and it caused memory leaks (especially if you keep using uniquely named events).

Contributor

That's not the point; this._events is being treated like a hash table - if it's not normalized by delete, it will soon be normalized by adding enough event names to it (16 was the limit on the heuristic the last time I checked). Just forget for-in forever and be happy. You know that Object.keys uses the enum cache as well, right?

Member

you know that Object.keys uses the enum cache as well, right?

It still creates a new array every time though.

What you say about delete obj.key and for..in is true for Crankshaft but not TurboFan. I'm not sure what we should be optimizing for. Crankshaft is still the default but it's effectively in maintenance mode. TurboFan is going to replace it some day but that may still be far off. Or it may be next month, I don't know. :-(

Contributor

Object normalization is not related to Crankshaft, and you cannot do a simple map check with normalized objects to see if a property was deleted during an iteration of the loop, so I am sure for-in will always be very slow for normalized objects unless it has a very simple body that can be analyzed to never delete properties - but I am not holding my breath at all.

@bnoordhuis
Member

Some comments but I like the general thrust.

mscdex force-pushed the perf-ee-add-remove branch from a13ac82 to f93f44a on January 28, 2015 23:20
@mscdex
Contributor Author

mscdex commented Jan 28, 2015

Ok, I have made the suggested changes.

I also went ahead and optimized some of the other functions. The changes to EventEmitter.listenerCount() show ~14% improvement and the changes to emitter.listeners() show ~195% improvement for 10 listeners.

I benchmarked listenerCount() and listeners() using (new) benchmarks based on ee-add-remove (same number of listeners, same number of iterations, etc.).

All tests still pass.

EDIT: Ok, it seems now that for some reason if the array is large enough (~>=50 elements in my testing), array.slice() somehow ends up becoming faster than a manual copy. I'm looking into that.

mscdex force-pushed the perf-ee-add-remove branch from f93f44a to d292fd2 on January 29, 2015 01:29
@mscdex
Contributor Author

mscdex commented Jan 29, 2015

Ok I made a tweak to the emitter.listeners() implementation that only does the manual array copy for small numbers of listeners (<50 currently). For >=50 listeners it switches to array.slice().
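
The switch being described is roughly the following (a sketch; the helper name is illustrative, 50 is the cut-off mentioned above):

// Sketch of the described copy strategy: manual element-by-element copy for
// small arrays, Array#slice() once the array is large enough.
function copyListeners(list) {
  var n = list.length;
  if (n >= 50)
    return list.slice();
  var copy = new Array(n);
  for (var i = 0; i < n; ++i)
    copy[i] = list[i];
  return copy;
}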

mscdex changed the title from "events: optimize adding and removing listeners" to "events: optimize various functions" on Jan 29, 2015
@Fishrock123
Contributor

Hmm, the new changes are slightly slower on the ee-add-remove benchmark.

Unpatched:

events/ee-add-remove.js n=250000: 337186.61793

Previous patch:

events/ee-add-remove.js n=250000: 529259.95116

Current patch:

events/ee-add-remove.js n=250000: 441635.76191

@Fishrock123
Contributor

@mscdex where are you getting your numbers from? It may be prudent to write additional benchmarks for these. :)

@mscdex
Contributor Author

mscdex commented Jan 29, 2015

@Fishrock123 I did write additional benchmarks to test the other functions, but I was going to submit them in a separate PR. However, I did not change ee-add-remove.

The results for me are the same. Not a whole lot changed in the add/remove functions anyway, merely style changes.

if (!this._events)
  this._events = {};
events = this._events;
if (!events)


Always include braces for conditionals, please.

Contributor

Nope, a one-line body is OK without braces.

mscdex force-pushed the perf-ee-add-remove branch from d292fd2 to 05da7a7 on January 29, 2015 17:23
@mscdex
Contributor Author

mscdex commented Jan 29, 2015

I've also now made some improvements to emit() that result in a 3x speedup for fast cases and a slight bump for the slow case for multiple handlers (single handler cases still perform the same):

Benchmark code:

var Benchmark = require('benchmark');
var ee = new (require('events').EventEmitter)();

for (var k = 0; k < 10; k += 1)
  ee.on('dummy', function() {});

var suite = new Benchmark.Suite();
suite.add('emitter#emit', function() {
  // tested code
}, { minSamples: 100 });
suite.on('cycle', function(event) {
  console.log(String(event.target));
});
suite.run({ async: false });

0 args test

ee.emit('dummy');

Results:

  • Before
    • emitter#emit x 1,891,788 ops/sec ±0.17% (197 runs sampled)
  • After
    • emitter#emit x 5,827,473 ops/sec ±0.18% (198 runs sampled)

1 arg test

ee.emit('dummy', 1);

Results:

  • Before
    • emitter#emit x 1,566,929 ops/sec ±0.41% (193 runs sampled)
  • After
    • emitter#emit x 4,017,494 ops/sec ±0.18% (197 runs sampled)

2 args test

ee.emit('dummy', 1, false);

Results:

  • Before
    • emitter#emit x 1,481,542 ops/sec ±0.30% (195 runs sampled)
  • After
    • emitter#emit x 3,863,464 ops/sec ±0.12% (197 runs sampled)

3 args test

ee.emit('dummy', 1, false, null);

Results:

  • Before
    • emitter#emit x 1,435,269 ops/sec ±0.23% (195 runs sampled)
  • After
    • emitter#emit x 3,958,400 ops/sec ±0.16% (197 runs sampled)

4 args test

(this hits the slow case code)

ee.emit('dummy', 1, false, null, 'foo');

Results:

  • Before
    • emitter#emit x 1,370,338 ops/sec ±0.14% (198 runs sampled)
  • After
    • emitter#emit x 1,589,312 ops/sec ±0.13% (198 runs sampled)
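
For reference, the overall shape of the fast/slow-path split benchmarked above is roughly the following (a simplified sketch, not the patch itself; emitNone appears in the diff further down, the other helper name is paraphrased):

// Simplified sketch of the emit() split: argument-count-specific helpers avoid
// building an arguments array for the 0-3 argument cases, while 4+ arguments
// fall back to copying `arguments` and calling apply().
// `handler` is either a single listener function (isFn) or an array of listeners.
function emitNone(handler, isFn, self) {
  if (isFn) {
    handler.call(self);
  } else {
    for (var i = 0; i < handler.length; ++i)
      handler[i].call(self);
  }
}

function emitMany(handler, isFn, self, args) {  // the 4+ argument slow case
  if (isFn) {
    handler.apply(self, args);
  } else {
    for (var i = 0; i < handler.length; ++i)
      handler[i].apply(self, args);
  }
}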

@tjconcept
Contributor

If performance is a priority in iojs/io.js, I think @petkaantonov is the guy. He has done some amazing work by replacing node core stuff with faster (as in 100x) user land versions.
I think he got put off by the attitude towards contributors in joyent/node (petkaantonov/urlparser#10 (comment)), so he might wanna give it another shot here? 😃

@mscdex
Contributor Author

mscdex commented Feb 1, 2015

@tjconcept There is already #643 for incorporating his url module.

Anyway, I think these changes are a good start.

@mscdex
Contributor Author

mscdex commented Feb 1, 2015

@bnoordhuis rebased.

@bnoordhuis
Member

@mscdex It seems to break parallel/test-event-emitter-add-listeners and message/stdin_messages:

=== release test-event-emitter-add-listeners ===
Path: parallel/test-event-emitter-add-listeners
newListener: hello
newListener: foo
start
hello
assert.js:87
  throw new assert.AssertionError({
        ^
AssertionError: [ [Function: listen1] ] deepEqual [ [Function: listen2], [Function: listen1] ]
    at Object.<anonymous> (/home/bnoordhuis/src/v1.x/test/parallel/test-event-emitter-add-listeners.js:67:8)
    at Module._compile (module.js:446:26)
    at Object.Module._extensions..js (module.js:464:10)
    at Module.load (module.js:341:32)
    at Function.Module._load (module.js:296:12)
    at Function.Module.runMain (module.js:487:10)
    at startup (node.js:111:16)
    at node.js:799:3
Command: out/Release/iojs /home/bnoordhuis/src/v1.x/test/parallel/test-event-emitter-add-listeners.js
[01:42|%  90|+ 717|-   1]: release stdin_messages length differs.
# elided, very chatty

@mscdex
Contributor Author

mscdex commented Feb 2, 2015

That's strange ... I did a clean make test yesterday after rebasing and didn't see any errors. I will do more digging tonight.

@bnoordhuis
Member

I landed PR #687 earlier today. It's possible that it's interacting badly with this PR.

mscdex force-pushed the perf-ee-add-remove branch from 4107f84 to faf4841 on February 3, 2015 04:44
@mscdex
Contributor Author

mscdex commented Feb 3, 2015

Ok, tests now pass again :-)

@@ -55,22 +55,67 @@ EventEmitter.prototype.getMaxListeners = function getMaxListeners() {
   return $getMaxListeners(this);
 };

+function emitNone(handler, isFn, self) {
Contributor

A comment above these might be nice.

@Fishrock123
Contributor

ee-add-remove consistently sees an 80+% gain with this patch.

events/ee-add-remove.js n=250000: ./out/Release/iojs: 505750 iojs: 278020 . 81.91%

LGTM. @trevnorris Can you take a look since you were doing #533?

(Side-note: we need some emit() benchmarks.)

@mscdex
Contributor Author

mscdex commented Feb 5, 2015

RE: additional benchmarks, here are some that I created during my testing FWIW.

@Fishrock123
Contributor

Mind PR-ing those soon? :)

@mscdex
Contributor Author

mscdex commented Feb 5, 2015

@Fishrock123 Done: #730.

@trevnorris
Contributor

Haven't taken the time to run the tests myself, but the code LGTM.

@bnoordhuis
Member

@Fishrock123
Contributor

Some results of the new benchmarks added in 847b9d2, with the iterations in #746:

events/ee-add-remove.js n=250000: ./out/Release/iojs: 471160 iojs: 257310 ............ 83.11%
events/ee-emit-multi-args.js n=2000000: ./out/Release/iojs: 4326800 iojs: 1509300 ... 186.68%
events/ee-emit.js n=2000000: ./out/Release/iojs: 6289100 iojs: 1864600 .............. 237.28%
events/ee-listener-count.js n=50000000: ./out/Release/iojs: 234610000 iojs: 229210000 . 2.36%
events/ee-listeners-many.js n=5000000: ./out/Release/iojs: 4197000 iojs: 4659400 ..... -9.92%
events/ee-listeners.js n=5000000: ./out/Release/iojs: 18773000 iojs: 7006500 ........ 167.94%

(I am frequently getting a 5-10% regression on that ee-listeners-many.js benchmark.)

@mscdex
Contributor Author

mscdex commented Feb 6, 2015

Yeah, I wasn't quite sure about the heuristic that I used for ee.listeners(), but it seemed to work well on the system I tested on. Suggestions are welcome :-)

@bnoordhuis
Member

I'm willing to chalk up the events/ee-listeners-many regression as a fluke. The benchmark seems to do the same amount of work with and without this pull request applied: memory usage is about the same, output of --trace_opt --trace_deopt --trace_osr --trace_gc is the same. Everything looks healthy, really.

@Fishrock123 @trevnorris Thoughts?

mscdex mentioned this pull request on Feb 9, 2015
@bnoordhuis
Member

LGTM, for the record. Can I get one more @iojs/collaborators LGTM?

@evanlucas
Contributor

LGTM

bnoordhuis pushed a commit that referenced this pull request Feb 9, 2015
Cache events and listeners objects where possible and loop over
Object.keys() instead of using for..in. These changes alone give
~60-65% improvement in the ee-add-remove benchmark.

The changes to EventEmitter.listenerCount() give ~14%
improvement, and the changes to emitter.listeners() give
significant improvements for <50 listeners
(~195% improvement for 10 listeners).

The changes to emitter.emit() give a 3x speedup for the fast
cases with multiple handlers and a minor speedup for the slow
case with multiple handlers.

Swapping out the util.is* type-checking functions for inline
checks gives another ~5-10% improvement.

PR-URL: #601
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Evan Lucas <evanlucas@me.com>
@bnoordhuis
Member

Thanks Brian, landed in b677b84.
