
node: improve GetActiveRequests performance #3375

Closed
wants to merge 1 commit

Conversation

trevnorris
Contributor

V8 is faster at setting object properties in JS than in C++, even when it
requires calling into JS from native code. Make
process._getActiveRequests() faster by doing this when populating the
array containing request objects.

Simple benchmark:

    for (let i = 0; i < 22; i++)
      fs.open(__filename, 'r', function() { });
    let t = process.hrtime();
    for (let i = 0; i < 1e6; i++)
      process._getActiveRequests();
    t = process.hrtime(t);
    console.log((t[0] * 1e9 + t[1]) / 1e6);

Results between the two:

    Previous:  4406 ns/op
    Patched:    829 ns/op     4.3x faster

R=@bnoordhuis

Have another addition of improving the same for active handles, but wanted to solicit feedback on the approach early.
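
To illustrate the approach (a rough sketch only, not the exact patch): rather than one Object::Set() call per request from C++, the loop fills a small argv buffer and flushes it to a tiny JS helper that pushes its arguments onto the result array. The accessor name push_values_to_array_fn below is a placeholder, not an identifier from src/node.cc.

    // Sketch against node's internal Environment/ReqWrap types; v8:: and
    // node:: namespace qualifiers are omitted for brevity.
    void GetActiveRequests(const FunctionCallbackInfo<Value>& args) {
      Environment* env = Environment::GetCurrent(args);
      Local<Context> ctx = env->context();

      static const size_t argc = 5;
      Local<Value> argv[argc];
      size_t i = 0;

      Local<Array> ary = Array::New(env->isolate());
      // Hypothetical accessor for a JS helper along the lines of
      // function() { for (var j = 0; j < arguments.length; j++) this.push(arguments[j]); }
      Local<Function> fn = env->push_values_to_array_fn();

      for (auto w : *env->req_wrap_queue()) {
        if (w->persistent().IsEmpty())
          continue;
        argv[i++ % argc] = w->object();
        // Once argc objects are buffered, append them all with a single
        // call into JS instead of argc separate Object::Set() calls.
        if ((i % argc) == 0)
          fn->Call(ctx, ary, argc, argv).ToLocalChecked();
      }

      // Append whatever is left over from the last partial batch.
      const size_t remainder = i % argc;
      if (remainder > 0)
        fn->Call(ctx, ary, remainder, argv).ToLocalChecked();

      args.GetReturnValue().Set(ary);
    }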

trevnorris added the c++ label (Issues and PRs that require attention from people who are familiar with C++) on Oct 14, 2015.
@rvagg
Member

rvagg commented Oct 15, 2015

Is "index" the right word to use here? you're just appending and not actually using an index

@ronkorving
Contributor

While performance improvements are nice, this definitely adds more code than it removes. Is the use case here continuous monitoring? This is still an undocumented API (for what reason, I don't know); it seems it would help to make it public. I've used it too and find it very useful, though usually only at shutdown time.

@trevnorris
Contributor Author

@rvagg That's an artifact of an early implementation. I'll find a better name.

@ronkorving The use case is to not have a crappy implementation. This is a technique that I plan to expand through core. Search for Object::Set() and you'll see how heavily we rely on this slow operation. Here was simply the easiest location to kick off.

LOC should have little to no effect on a PR. Added complexity I can understand, and can agree with depending on the circumstance. This, though, is fairly straightforward.

@ronkorving
Contributor

Is there any way we could help the V8 team (I say "we"... as if I really could) to make Object::Set as fast as JIT-compiled JavaScript?


    for (auto w : *env->req_wrap_queue()) {
      if (w->persistent().IsEmpty() == false) {
        argv[i++ % 5] = w->object();
Member

Can you use ARRAY_SIZE(argv) instead of hard-coding it in several places? If it gets unwieldy, I suggest writing it as:

    static const size_t argc = 5;
    Local<Value> argv[argc];
    // ...
    argv[i++ % argc] = ...;
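
The ARRAY_SIZE(argv) form mentioned above would look roughly like this (sketch; only the subscript changes):

    Local<Value> argv[5];
    // ...
    argv[i++ % ARRAY_SIZE(argv)] = w->object();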

@bnoordhuis
Member

This could be extended to numerous other places but I assume that's your plan anyway. :-)

@trevnorris
Contributor Author

@bnoordhuis Comments addressed (I think). Yes, I am planning on extending this across core, but figured this would be a good low-impact place to start. :)

    for (auto w : *env->req_wrap_queue()) {
      if (w->persistent().IsEmpty() == false) {
        argv[i++ % argc] = w->object();
        if ((i % argc) == 0) {
Member

Shouldn't this check that i > 0? It's going to make a superfluous JS call now if I read it right.

EDIT: Never mind, didn't read it right. It's never zero.

Contributor Author

This is the most succinct way I found, but even looking back at it over the weekend I had to remind myself what the logic was doing. If you have something more readable in mind, I'm open to suggestions. :)

@trevnorris
Contributor Author

@bnoordhuis made most suggested changes and added a simple test.

CI: https://ci.nodejs.org/job/node-test-pull-request/536/

    ary->Set(i++, w->object());
    if (i % argc > 0) {
      HandleScope scope(env->isolate());
      fn->Call(ctx, ary, i % argc, argv).ToLocalChecked();
Member

This is the repetition I mean. I'd do the i % argc only once and cache the result in a const size_t remainder.
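
A minimal sketch of that suggestion, using the same variables as the hunk above:

    // Compute i % argc once and reuse it for both the check and the call.
    const size_t remainder = i % argc;
    if (remainder > 0) {
      HandleScope scope(env->isolate());
      fn->Call(ctx, ary, remainder, argv).ToLocalChecked();
    }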

Contributor Author

Ah, got it. I was trying to make it work with the for loop above.

@trevnorris
Contributor Author

@bnoordhuis comment addressed.

@jasnell
Member

jasnell commented Oct 21, 2015

Nice change, although it makes me sad that Object::Set is so slow. LGTM so long as CI is green and setPropByIndex is refactored to get away from the val0, ... val7 pattern.

@jasnell
Member

jasnell commented Oct 21, 2015

LGTM if CI is green

@bnoordhuis
Member

LGTM

trevnorris added a commit that referenced this pull request Oct 21, 2015
V8 is faster at setting object properties in JS than in C++, even when it
requires calling into JS from native code. Make
process._getActiveRequests() faster by doing this when populating the
array containing request objects.

Simple benchmark:

    for (let i = 0; i < 22; i++)
      fs.open(__filename, 'r', function() { });
    let t = process.hrtime();
    for (let i = 0; i < 1e6; i++)
      process._getActiveRequests();
    t = process.hrtime(t);
    console.log((t[0] * 1e9 + t[1]) / 1e6);

Results between the two:

    Previous:  4406 ns/op
    Patched:    690 ns/op     5.4x faster

PR-URL: #3375
Reviewed-By: James Snell <jasnell@gmail.com>
Reviewed-By: Ben Noordhuis <ben@strongloop.com>
@trevnorris
Contributor Author

None of the failures are related to the PR. Landed on 494227b. Thanks much!

trevnorris closed this on Oct 21, 2015.
trevnorris added a commit that referenced this pull request Oct 22, 2015
@MylesBorins
Contributor

@trevnorris what are your thoughts on adding this commit to LTS?

@MylesBorins
Contributor

@trevnorris ping

@trevnorris
Contributor Author

@thealphanerd Oop, sorry. It's a micro-performance optimization; merging it is unlikely to prevent future conflicts. So it can land, but I don't think it's necessary.
