Use of function section reordering in node binary for performance #16131

sathvikl · 2017-10-10T20:30:28Z

Version: node v6.9.5
Platform: Linux Ubuntu SMP x86_64 GNU/Linux
Subsystem: Linker, Build

The node build system does not currently use the gold linker. Gold provides a --section-ordering-file option that lets the user tell the linker the order in which to place frequently used functions.

The changes would be:

diff --git a/common.gypi b/common.gypi
index 8a3179d..d2f7ee5 100644
--- a/common.gypi
+++ b/common.gypi
@@ -114,7 +114,8 @@
         'variables': {
           'v8_enable_handle_zapping': 0,
         },
-        'cflags': [ '-O3' ],
+        'cflags': [ '-O3', '-fuse-ld=gold', '-ffunction-sections', '-fdata-sections' ],
+        'ldflags': ['-fuse-ld=gold', '-Wl,--section-ordering-file=/../stable-node/hot-function-ordering.txt'],
         'conditions': [
           ['target_arch=="x64"', {
             'msvs_configuration_platform': 'x64',

The hot-function-ordering file itself was generated with the hfsort tool, which is part of the Facebook/HHVM project.

The tool takes a cycle-accurate Linux perf profile as input and runs the Pettis-Hansen ordering algorithm on it.
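To illustrate the general idea, here is a deliberately simplified sketch of profile-driven function ordering. It is not hfsort's implementation and not the full Pettis-Hansen algorithm: it visits call-graph edges from heaviest to lightest and glues the two chains they connect, without recomputing merged edge weights or choosing chain orientation. The names `order_functions` and `to_section_names` and the sample call weights are made up for illustration. It also assumes that `-ffunction-sections` places each function in a `.text.<symbol>` section and that the ordering file lists one section name per line.

```python
from collections import defaultdict


def order_functions(call_weights):
    """Greedy chain merging over a weighted call graph (simplified sketch).

    call_weights maps (caller, callee) pairs to a sample count.
    Returns function names, with heavily connected functions placed together.
    """
    # Treat the call graph as undirected: sum weights in both directions.
    edges = defaultdict(int)
    for (caller, callee), weight in call_weights.items():
        if caller != callee:
            edges[frozenset((caller, callee))] += weight

    # Every function starts in its own single-element chain.
    chain_of = {name: [name] for pair in edges for name in pair}

    # Heaviest edges first; ties are broken by name so the result is stable.
    ranked = sorted(edges.items(), key=lambda kv: (-kv[1], sorted(kv[0])))

    for pair, _ in ranked:
        a, b = sorted(pair)
        chain_a, chain_b = chain_of[a], chain_of[b]
        if chain_a is chain_b:
            continue  # already placed together
        # Append the shorter chain to the end of the longer one.
        if len(chain_b) > len(chain_a):
            chain_a, chain_b = chain_b, chain_a
        chain_a.extend(chain_b)
        for name in chain_b:
            chain_of[name] = chain_a

    # Emit each distinct chain once, hottest chain first.
    ordered, seen = [], set()
    for pair, _ in ranked:
        chain = chain_of[next(iter(pair))]
        if id(chain) not in seen:
            seen.add(id(chain))
            ordered.extend(chain)
    return ordered


def to_section_names(functions):
    """Map symbol names to the section names -ffunction-sections would use."""
    return ['.text.' + name for name in functions]
```

For a C++ binary such as node, the symbol names would be the mangled names reported by the profiler; writing the result of `to_section_names` to a file, one entry per line, gives something of the shape the linker option expects.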

I tried this methodology with the Ghost.js benchmark (which I created) and the acme-air benchmark and saw a throughput speed-up of 2.5% to 3.5%.

Would it make sense to make this change in the build system? I could send a PR and see whether there are any adverse effects on the known set of benchmarks. I haven't seen any regressions on the micro benchmarks so far.

joyeecheung · 2017-10-11T13:32:34Z

I think this depends on whether/how many of the platforms we support are supported by hfsort. cc @nodejs/build

jasnell · 2017-10-11T13:37:02Z

Ping @addaleax @bnoordhuis @mhdawson

mhdawson · 2017-10-11T14:07:49Z

Can you expand on what the gold linker is, and how it differs from the one we are using?

addaleax · 2017-10-11T14:11:15Z

@mhdawson It’s an alternative linker for ELF platforms, i.e. it’s not cross-platform. It’s also what we’d use for things like LTO (#7400) if we ever want that (which I think we do)

bnoordhuis · 2017-10-11T17:26:41Z

#1409 might be relevant; that was about PGO, profile-guided optimization. It stalled on a lack of representative benchmarks, but that was before we had a benchmarking WG.

gibfahn · 2017-10-12T19:11:50Z

I don't see a problem with adding it as long as:

- It doesn't introduce any new dependencies for users (I'm pretty sure it won't).
- People can still build node without it (so we fall back to the linkers we currently use if there is no gold linker).
- It doesn't massively bloat the Makefile (pretty sure it won't).

If you could raise a PR that would be great. Once/if the PR lands we'd need to install the gold linker on test and release platforms, and make sure release builds are using it.

uttampawar · 2017-10-13T21:13:22Z

@gibfahn As @sathvikl said, one dependency it has is on the 'hfsort' tool from the HHVM project. The linker change itself is small, but we need to generate a sorted hot-function file for a particular workload/benchmark.

gibfahn · 2017-10-15T15:16:23Z

@uttampawar But that's a build-time dependency, right? Anyone downloading the binaries won't need this.

uttampawar · 2017-10-23T17:41:03Z

@gibfahn Yes, it is a build-time dependency. To get the benefit of code-layout optimization, the build needs to be tuned for a workload, which means the resulting binary is specific to that workload/benchmark. If it turns out that a binary is beneficial across most or all workloads, or at least shows no observed regression, then yes, we can provide that binary.

sathvikl · 2017-10-24T22:24:02Z

The changes are entirely in the gyp build files, to use the gold linker if it exists.
If you are familiar with the gyp build scripts, can you please share thoughts on how to check which linker the system has?

The changes will be specific to linux-x64.

gibfahn · 2017-10-26T18:38:25Z

> If you are familiar with the gyp build scripts, can you please share thoughts on how to check what linker does the system have ?

I'm afraid I'm not, people like @bnoordhuis @targos @addaleax @refack might know more.

Speaking of which, we should probably have an @nodejs/gyp team for that kind of question. People normally cc @nodejs/build, but it's not really our area of expertise.

refack · 2017-10-26T19:10:45Z

There are a couple of detections possible:

- We inject gas_version and llvm_version in config.gypi.
- We should inject gcc_version, since we already detect it in node/configure (lines 656 to 657 in 75f1087):

    elif clang_version < '3.4.2' if is_clang else gcc_version < '4.9.4':
      warn('C++ compiler too old, need g++ 4.9.4 or clang++ 3.4.2 (CXX=%s)' % CXX)

For the platform, you can use a condition similar to
[ 'OS in "linux freebsd openbsd solaris android aix"', {}]

> Speaking of which, we should probably have an @nodejs/gyp team

Also for CCing on GYP and .gyp changes

refack · 2017-10-26T19:12:01Z

P.S. ./configure could also do feature detection, and inject that into config.gypi
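A minimal sketch of what such feature detection might look like, assuming the linker is installed under the name `ld.gold` and identifies itself with the string "GNU gold" in its `--version` output. The helper names `looks_like_gold` and `has_gold_linker` are hypothetical and do not exist in node's configure script. The sketch is written for Python 3.

```python
import shutil
import subprocess


def looks_like_gold(version_output):
    """Return True if `--version` output identifies the GNU gold linker."""
    return 'GNU gold' in version_output


def has_gold_linker(linker='ld.gold'):
    """Best-effort check that a gold linker is on PATH (hypothetical helper)."""
    path = shutil.which(linker)
    if path is None:
        return False
    try:
        result = subprocess.run(
            [path, '--version'],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=10,
        )
    except (OSError, subprocess.SubprocessError):
        # Treat anything we cannot run as "not available".
        return False
    return looks_like_gold(result.stdout)
```

The boolean result could then be written into config.gypi next to the existing version variables, so that the gyp files only add the gold-specific flags when the linker is present.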

jasnell · 2017-10-27T18:04:05Z

fwiw, I'm +1 on moving forward with this. Obviously I'd prefer to have something that could work across all of the platforms we support but if it's limited to linux then that's ok. I am also slightly concerned about making sure the training benchmark doesn't end up optimizing for a specific type of workload at the cost of regressing on others. I highly doubt that will be a problem but it would be good to have benchmarks that watch for that.

In a chat with @uttampawar, he indicated that one option would be to expose a configure flag that allows folks to build with their own training data which would lessen the likelihood of issues there. As long as the process for capturing that training data is also documented, then I'm all for it.
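One way such a configure flag could be wired up is sketched below. The option name `--section-ordering-file` and the helper names are assumptions made for illustration, not an existing node configure option. The compiler and linker flags are the ones from the diff at the top of this issue.

```python
import argparse
import os


def parse_args(argv):
    """Parse a hypothetical configure option carrying the ordering file path."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--section-ordering-file',
                        dest='section_ordering_file',
                        default=None,
                        help='file listing hot sections in link order')
    return parser.parse_args(argv)


def section_ordering_flags(options):
    """Return extra cflags/ldflags, or an empty dict if the option is unset."""
    if not options.section_ordering_file:
        return {}
    # Use an absolute path so the linker finds the file from any directory.
    path = os.path.abspath(options.section_ordering_file)
    return {
        'cflags': ['-fuse-ld=gold', '-ffunction-sections', '-fdata-sections'],
        'ldflags': ['-fuse-ld=gold',
                    '-Wl,--section-ordering-file=' + path],
    }
```

With this shape, a build without the option is unchanged, and users who have captured their own training data pass the generated file explicitly.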

Would love to see a PR soon :-)

gibfahn · 2017-10-27T23:39:11Z

> I am also slightly concerned about making sure the training benchmark doesn't end up optimizing for a specific type of workload at the cost of regressing on others. I highly doubt that will be a problem but it would be good to have benchmarks that watch for that.

Adding support for it in ./configure for xLinux seems like a good first step. We can do some trial builds and see how it goes. Sounds like it might be useful to have as a build option even if we don't do it by default.

> In a chat with @uttampawar, he indicated that one option would be to expose a configure flag that allows folks to build with their own training data which would lessen the likelihood of issues there. As long as the process for capturing that training data is also documented, then I'm all for it.

So people would have to build their own versions of Node to fix performance regressions? If it actually does regress certain use cases I'm not sure "build it yourself" is a reasonable answer.

jasnell · 2017-10-27T23:46:14Z

Certainly. We should make sure this doesn't add regressions but we can only do so with use cases we can anticipate.

uttampawar · 2017-10-31T00:36:06Z

@gibfahn If we can add this option and watch for regressions in a few use cases, that's great, but as @jasnell said, we can only do so for a few cases. My suggestion is that, for every other use case out there, we still provide a build target (using the gold linker) for users who are willing to build, train, and rebuild node.js to get more performance for their own application. It's like supporting a PGO compiler option in our build scripts.

sathvikl · 2017-11-08T21:38:01Z

@gibfahn PR is at #16891

apapirovski · 2018-11-29T04:42:18Z

I'm going to close this out given the lack of movement in over a year and given that the associated PR had been unattended since February.

mscdex added the build and performance labels Oct 10, 2017

sathvikl mentioned this issue Oct 30, 2017

Add Ghost.js workload to the benchmarking nodejs/benchmarking#159

Closed

sathvikl mentioned this issue Nov 8, 2017

Add support for including GNU Gold linker's section ordering #16891

Closed

1 task

apapirovski closed this as completed Nov 29, 2018
