Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

process: improve nextTick() performance #13446

Merged
merged 1 commit into from
Jun 17, 2017

Conversation

mscdex
Copy link
Contributor

@mscdex mscdex commented Jun 4, 2017

I recently noticed that with the introduction of Async Hooks also came the (perhaps unintentional) reverting of 804d57d. I actually tried a lot of different solutions, but almost all of them resulted in regressions in other nextTick() benchmarks (e.g. a perf boost in no args case, but regression in args case) which I still don't quite understand. One thing I did notice is that the ideal solution of just using a plain object and embedding the symbol properties as computed properties causes TurboFan to kick in and that causes a significant slowdown for nextTick().

I was able to improve the existing non-ES6-based constructor solution, but it turns out using an ES6 class improves performance even more (even surpassing the plain object-based solutions that performed better in whichever benchmarks), so that is what I decided on.

Results:

                                               improvement confidence      p.value
 process/next-tick-breadth-args.js millions=2     27.75 %        *** 1.271176e-20
 process/next-tick-breadth.js millions=2           7.71 %        *** 4.155765e-13
 process/next-tick-depth-args.js millions=12      47.78 %        *** 4.150674e-52
 process/next-tick-depth.js millions=12           47.32 %        *** 7.742778e-31

CI: https://ci.nodejs.org/job/node-test-pull-request/8464/

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • commit message follows commit guidelines
Affected core subsystem(s)
  • process

@mscdex mscdex added performance Issues and PRs related to the performance of Node.js. process Issues and PRs related to the process subsystem. labels Jun 4, 2017
Copy link
Contributor

@refack refack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (just some nits)

@@ -86,8 +89,9 @@ function setupNextTick() {
_runMicrotasks();

if (tickInfo[kIndex] < tickInfo[kLength] ||
emitPendingUnhandledRejections())
emitPendingUnhandledRejections()) {
Copy link
Contributor

@refack refack Jun 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering... It this a style choice or a DEOPT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style choice, I think we tend to use braces for multi-line conditionals from what I've generally seen, this is just one "offender" I ran across while working in this file.

this.domain = process.domain || null;
this[async_id_symbol] = asyncId;
this[trigger_id_symbol] = triggerId;
if (async_hook_fields[kInit] > 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% with so much side effects in a constructor...
If it's not a performance hit I'd rather see these four lines in a static method or factory
If it is a performance hit, I think it should be documented:

// These side effects were folded into the c'tor for performance

(or something to that effect)

Copy link
Contributor Author

@mscdex mscdex Jun 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't move these other statements back into nextTick() without causing a performance regression from what I saw. I believe it's because of Crankshaft's inlining limits, but I did not want to introduce regressions at this moment (we will have to deal with it once TurboFan is everywhere).

Unfortunately internalNextTick() exceeds these limits, so there's not much I can do there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah so probably even a comment will "irritate" Crankshaft
What can you do 🤷‍♂️

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully once V8 (or perhaps they already do in later versions) is able to better optimize computed properties in TurboFan, we can use plain objects again without losing performance.

Copy link
Contributor

@Fishrock123 Fishrock123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm w comments

}
}

function nextTick(callback) {
if (typeof callback !== 'function')
throw new errors.TypeError('ERR_INVALID_CALLBACK');
// on the way out, don't bother. it won't get fired anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we clarify the comment rather than removing it?

Copy link
Contributor Author

@mscdex mscdex Jun 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing it matches internalNextTick() and it doesn't seem like a particularly helpful comment as process._exiting is pretty self-explanatory to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the code says what but I'm wondering if it may be helpful to say why.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW Crankshaft can decide not to inline a function because it has a comment in it 😞 so if we come up with a better comment it probably should go in line 231.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eh, it makes little difference as the comment was already there and that will soon be an obsolete optimization as I understand.

Copy link
Member

@gibfahn gibfahn Jun 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what you mean by a 'why comment.'

AIUI it's a comment explaining why we're doing this (as opposed to explaining what this code does).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gibfahn I understand that part. What I meant was I don't understand what this is in reference to. Is it the process._exiting line? The changes I'm making? Something else? What is the suggested 'why comment' ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the suggested 'why comment' ?

(AIUI) This is the comment, it's a "why comment", it's helpful to preserve it.

    // on the way out, don't bother. it won't get fired anyway.

FWIW I'm sure putting it above the function is fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Fishrock123 @gibfahn I've added something above both functions now, let me know if that is what you had in mind or not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Fishrock123 @gibfahn I've added something above both functions now, let me know if that is what you had in mind or not.

Clearer than the original, thanks.

tickObject);
class TickObject {
constructor(callback, args, triggerId) {
var asyncId = ++async_uid_fields[kAsyncUidCntr];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason this is var over const?

Copy link
Contributor Author

@mscdex mscdex Jun 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it was triggering a deopt.

if (async_hook_fields[kInit] > 0)
emitInit(asyncId, 'TickObject', triggerId, this);
nextTickQueue.push(this);
tickInfo[kLength]++;
}
}

Copy link
Contributor

@refack refack Jun 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// we bail fast `if (process._exiting)` since the loop has effectively stopped
// and there will never be a `nextTick`

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -2,7 +2,7 @@

var common = require('../common.js');
var bench = common.createBenchmark(main, {
millions: [2]
millions: [12]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this increase? You are getting ridiculous statistical confidence. By some quick calculation, this should only improve the confidence by a factor of 2.5. Compensating for that and the previous version should also give you confidence well beyond ***.

Copy link
Contributor Author

@mscdex mscdex Jun 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I always choose something large enough to allow for any re-opts to occur and for results to stabilize. With 2, I was getting results that varied quite a bit in comparison because it was finishing so quickly (at least on my machine).

@mscdex mscdex force-pushed the process-nexttick-perf branch 2 times, most recently from ddc2508 to 447162b Compare June 5, 2017 20:26
@mscdex mscdex mentioned this pull request Jun 7, 2017
3 tasks
@mscdex
Copy link
Contributor Author

mscdex commented Jun 7, 2017

FWIW with V8 5.9 in master I now get these results with this PR as-is:

                                               improvement confidence      p.value
 process/next-tick-breadth-args.js millions=2      6.17 %        *** 2.725997e-04
 process/next-tick-breadth.js millions=2           6.68 %         ** 1.568356e-03
 process/next-tick-depth-args.js millions=12      46.71 %        *** 2.458707e-81
 process/next-tick-depth.js millions=12           43.66 %        *** 3.338435e-66

I'm not sure how breadth-args was so much higher previously, I would've expected results more like I am seeing now in 5.9, where the args and non-args are much closer together.

@mscdex mscdex added the wip Issues and PRs that are still a work in progress. label Jun 8, 2017
@mscdex mscdex force-pushed the process-nexttick-perf branch from 447162b to d68f8ba Compare June 8, 2017 19:16
@mscdex
Copy link
Contributor Author

mscdex commented Jun 8, 2017

Ok, I've made a couple more adjustments:

  • Moved the non-assignment code out of the TickObject class constructor as the difference in code location no longer makes any difference with TurboFan in V8 5.9
  • Switched from an array to a linked list for storing TickObject instances, this increases performance more significantly than the original changes alone. I think it may be possible to make some further optimizations to avoid use of the tickInfo counters shared with C++ land (at least the kIndex counter), however I've decided to leave it alone for now.

Here are the results as of these latest changes:

                                               improvement confidence      p.value
 process/next-tick-breadth-args.js millions=4    135.31 %        *** 3.011128e-73
 process/next-tick-breadth.js millions=4         361.40 %        *** 5.097232e-76
 process/next-tick-depth-args.js millions=12      44.08 %        *** 1.276617e-66
 process/next-tick-depth.js millions=12           43.85 %        *** 1.333610e-49

CI: https://ci.nodejs.org/job/node-test-pull-request/8561/

@mscdex mscdex removed the wip Issues and PRs that are still a work in progress. label Jun 8, 2017
@mscdex
Copy link
Contributor Author

mscdex commented Jun 8, 2017

@refack @Fishrock123 @jasnell @evanlucas Please review these recent changes.

default:
args = new Array(arguments.length - 1);
for (var i = 1; i < arguments.length; i++)
args[i - 1] = arguments[i];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is faster than spread parameters even with V8 5.9?

Copy link
Contributor Author

@mscdex mscdex Jun 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how spread could be used here. Do you mean rest parameters? If so, I tried that and while it's a little faster in one benchmark, it's slower to varying degrees in all others.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's what I meant. Sorry, could never get all the terminologies right.

@refack
Copy link
Contributor

refack commented Jun 9, 2017

So I'm just writing down what I saw (writing down mostly to organize my thoughts):

  1. the linked list queue
  2. inlined setupInit X 2
  3. switch for args X 2
  4. the const microTasksTickObject

Anything else of significance?

@mscdex
Copy link
Contributor Author

mscdex commented Jun 9, 2017

@refack That's mostly it, unless you want to also count the switch to ES6 class for TickObject.

@refack
Copy link
Contributor

refack commented Jun 9, 2017

Thanks for the confirmation ✔️

@mscdex mscdex force-pushed the process-nexttick-perf branch from d68f8ba to 2126cad Compare June 16, 2017 07:24
@mscdex
Copy link
Contributor Author

mscdex commented Jun 16, 2017

Rebased. If there are no objections in the next 24 hours I will assume all who approved the original changes also approve of these latest changes that I previously outlined.

PR-URL: nodejs#13446
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Reviewed-By: Evan Lucas <evanlucas@me.com>
@mscdex mscdex force-pushed the process-nexttick-perf branch from 2126cad to 460ee75 Compare June 17, 2017 14:39
@mscdex mscdex merged commit 460ee75 into nodejs:master Jun 17, 2017
@mscdex mscdex deleted the process-nexttick-perf branch June 17, 2017 14:41
addaleax pushed a commit that referenced this pull request Jun 18, 2017
PR-URL: #13446
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Reviewed-By: Evan Lucas <evanlucas@me.com>
addaleax pushed a commit that referenced this pull request Jun 20, 2017
PR-URL: #13446
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Reviewed-By: Evan Lucas <evanlucas@me.com>
@addaleax addaleax mentioned this pull request Jun 21, 2017
addaleax pushed a commit that referenced this pull request Jun 21, 2017
PR-URL: #13446
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Reviewed-By: Evan Lucas <evanlucas@me.com>
addaleax pushed a commit that referenced this pull request Jun 24, 2017
PR-URL: #13446
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Reviewed-By: Evan Lucas <evanlucas@me.com>
rvagg pushed a commit that referenced this pull request Jun 29, 2017
PR-URL: #13446
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Reviewed-By: Evan Lucas <evanlucas@me.com>
addaleax pushed a commit that referenced this pull request Jul 11, 2017
PR-URL: #13446
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Reviewed-By: Evan Lucas <evanlucas@me.com>
addaleax pushed a commit that referenced this pull request Jul 18, 2017
PR-URL: #13446
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Reviewed-By: Evan Lucas <evanlucas@me.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Issues and PRs related to the performance of Node.js. process Issues and PRs related to the process subsystem.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants