Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

process: specialize building and storage of process.config #24816

Closed
wants to merge 1 commit into from

Conversation

joyeecheung
Copy link
Member

Instead of treating config.gypi as a JavaScript file, specialize
the processing in js2c and make the serialized result a real JSON
string (with 'true' and 'false' converted to boolean values) so
we don't have to use a custom deserializer during bootstrap.

In addition, store the JSON string separately in NativeModuleLoader,
and keep it separate from the map of the builtin source code, so
we don't have to put it onto NativeModule._source and delete it
later, though we still preserve it in process.binding('natives'),
which we don't use anymore.

This patch also makes the map of builtin source code and the
config.gypi string available through side-effect-free getters
in C++.

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • commit message follows commit guidelines

@nodejs-github-bot nodejs-github-bot added the lib / src Issues and PRs related to general changes in the lib or src directory. label Dec 3, 2018
@joyeecheung
Copy link
Member Author

@joyeecheung joyeecheung requested review from addaleax and devsnek and removed request for addaleax December 3, 2018 22:34
@@ -9,7 +9,7 @@ const common = require('../common');
const assert = require('assert');

const isMainThread = common.isMainThread;
const kMaxModuleCount = isMainThread ? 58 : 80;
const kMaxModuleCount = isMainThread ? 59 : 81;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incremented because previously we use getInternalBinding to load native_module which bypasses the caching and process.moduleLoadList bookkeeping. This patch uses internalBinding instead so it is now being counted into the process.moduleLoadList

@joyeecheung
Copy link
Member Author

@joyeecheung
Copy link
Member Author

joyeecheung commented Dec 4, 2018

@bnoordhuis
Copy link
Member

If you're going through all that preprocessing trouble anyway, can't you turn config.gypi into a real JS object literal, write that to a file and include it in the build? That gets rid of the overhead of deserializing it, too.

@joyeecheung
Copy link
Member Author

joyeecheung commented Dec 4, 2018

@bnoordhuis Do you mean prefixing it with a module.exports = so that it can be required? We will still need to parse and compile the content of that string, and we still need to delete it from NativeModule._source so it always needs some special treatment, I am not sure what the benefit is compared to JSON.parse?

@bnoordhuis
Copy link
Member

I mean that you write a .js file that contains this:

module.exports = {
  target_defaults: {
    // ...
  },
  variables: {
    // ...
  },
};

I.e., don't store it as a string, store it as a JS object literal.

@joyeecheung
Copy link
Member Author

@bnoordhuis But it will still be stored, in the binary, as a string, just like the source code of other builtin modules? So essentially we go from doing this

// C++
source_[id].ToStringChecked();  // pass it into to JS
// JS
JSON.parse(someStringFromCpp);

to

// C++
auto str = source[id].ToStringChecked();
ScriptCompiler::Source source(str, ...);
auto func = ScriptCompiler::CompileFunctionInContext(...source...);  // return to JS

// JS
const fn = compileFunction(id);
fn(process, ...);  // where you get the module.exports

@bnoordhuis
Copy link
Member

Right, but the plan is to move to precompiled snapshots eventually, isn't it?

@joyeecheung
Copy link
Member Author

@bnoordhuis Eventually, maybe (I believe we are currently stuck on gyp refactoring)...but the result of JSON.parse can also be part of the snapshot? And it is somewhat clearer to me if the config gets some special treatment like this because it has to be treated specially in the end since it’s not require-able, and it’s eassentially data, not code.

@joyeecheung
Copy link
Member Author

@bnoordhuis
Copy link
Member

but the result of JSON.parse can also be part of the snapshot?

Okay, fair enough. My point was more that if you're putting in the work to make it better, why not go the extra mile to make it great? The only reason it's using JSON.parse() at all is that it was the least amount of effort back when process.config was added.

@joyeecheung
Copy link
Member Author

Okay, fair enough. My point was more that if you're putting in the work to make it better, why not go the extra mile to make it great? The only reason it's using JSON.parse() at all is that it was the least amount of effort back when process.config was added.

I think the reason would be, I personally think using JSON.parse is a better approach already? If ew do it with an object literal, aside from adding the module.exports prefix in js2c.py, I think we'd also need to restore the delete NativeModule._source.config after setting up process.config, or exclude it from the list of module that can be required in another way somehow - to me it's easier to understand if config does not show up in that list in the first place.

And this brings us back to #24816 (comment) - we will be compiling and executing code to grab the data out instead of simply parsing it as data. It does unify the processing but config is already just different from other native modules - the generated content completely depends on how you run configure, and it's only executable by us when we bootstrap, and doesn't contain any code that will be executed again after that.

Or is there any reason that it's better to make the code of config executable? Like...making it require-able if you do a require('./config') in the repl from the project directory without building the binary? But I don't know what's the use case of that..?

@joyeecheung
Copy link
Member Author

joyeecheung commented Dec 5, 2018

Also, I think if we keep it as JSON, we are still able to replace

node/Makefile

Lines 338 to 339 in 6ccc80c

node_use_openssl = $(call available-node,"-p" \
"process.versions.openssl != undefined")

with something done via python. Currently, if you touch something in the docs, and you just had a bad build, this will error when you try building the binary again via make test because the dependency in the Makefile is kind of messy ATM.

EDIT: or maybe that's still possible if we do a rough search/regexp matching..but having it as JSON is simpler

@joyeecheung
Copy link
Member Author

Also...if we are writing config.gypi as config.js, then we will need to rewrite this somehow:

node/tools/install.py

Lines 21 to 23 in 6ccc80c

def load_config():
s = open('config.gypi').read()
return ast.literal_eval(s)

Instead of treating config.gypi as a JavaScript file, specialize
the processing in js2c and make the serialized result a real JSON
string (with 'true' and 'false' converted to boolean values) so
we don't have to use a custom deserializer during bootstrap.

In addition, store the JSON string separately in NativeModuleLoader,
and keep it separate from the map of the builtin source code, so
we don't have to put it onto `NativeModule._source` and delete it
later, though we still preserve it in `process.binding('natives')`,
which we don't use anymore.

This patch also makes the map of builtin source code and the
config.gypi string available through side-effect-free getters
in C++.
@joyeecheung
Copy link
Member Author

joyeecheung commented Dec 6, 2018

@joyeecheung
Copy link
Member Author

@joyeecheung
Copy link
Member Author

@joyeecheung
Copy link
Member Author

I plan to land this after the 7 day wait for single approval is passed and when I manage to get the CI green, because it's conflicting with another patch of mine to share the code cache among worker threads.

Regarding rewriting config.gypi as a JS object literal, I think we can leave that to another patch if we want to do it, but mind that would involve changes to install.py which currently assumes that config.gypi can be evaluated as a python dict.

@targos
Copy link
Member

targos commented Dec 8, 2018

I plan to land this after the 7 day wait for single approval is passed

Let's try to ping @nodejs/python first.

@joyeecheung
Copy link
Member Author

Landed in 44a5fe1

joyeecheung added a commit that referenced this pull request Dec 10, 2018
Instead of treating config.gypi as a JavaScript file, specialize
the processing in js2c and make the serialized result a real JSON
string (with 'true' and 'false' converted to boolean values) so
we don't have to use a custom deserializer during bootstrap.

In addition, store the JSON string separately in NativeModuleLoader,
and keep it separate from the map of the builtin source code, so
we don't have to put it onto `NativeModule._source` and delete it
later, though we still preserve it in `process.binding('natives')`,
which we don't use anymore.

This patch also makes the map of builtin source code and the
config.gypi string available through side-effect-free getters
in C++.

PR-URL: #24816
Reviewed-By: Gus Caplan <me@gus.host>
@MylesBorins
Copy link
Contributor

This doesn't land cleanly on v11.x, could it please be backported

@BridgeAR
Copy link
Member

BridgeAR commented Jan 9, 2019

Ping @joyeecheung

@joyeecheung
Copy link
Member Author

joyeecheung commented Jan 11, 2019

The following ancestor commits of 44a5fe1 are not on v11.x-staging

    - 14707b681d lib: fix nits in lib/internal/bootstrap/cache.js, https://github.com/nodejs/node/pull/24581
         lib/internal/bootstrap/cache.js
    - 7778c035a0 src: use STL containers instead of v8 values for static module data, https://github.com/nodejs/node/pull/24384
         lib/internal/bootstrap/cache.js
         src/env.h
         src/node_native_module.cc
         src/node_native_module.h
         src/node_union_bytes.h
         tools/js2c.py
    - bd765d61d7 src: compile native modules and their code cache in C++, https://github.com/nodejs/node/pull/24221
         lib/internal/bootstrap/cache.js
         src/env.h
         src/node_native_module.cc
         src/node_native_module.h
    - 350bef6a10 lib: add crypto dependant modules cannotUseCache, https://github.com/nodejs/node/pull/24100
         lib/internal/bootstrap/cache.js
    - 9c0c3f1443 lib: fix code cache generation, https://github.com/nodejs/node/pull/23855
         lib/internal/bootstrap/cache.js
    - 63b06551f4 src,lib: make process.binding('config') internal, https://github.com/nodejs/node/pull/23400
         lib/internal/bootstrap/loaders.js
         lib/internal/bootstrap/node.js
    - edcb950090 src: use NativeModuleLoader to compile all the bootstrappers, https://github.com/nodejs/node/pull/24775
         lib/internal/bootstrap/loaders.js
         lib/internal/bootstrap/node.js
    - 1db808ca5f lib: fix comment nits in bootstrap\loaders.js, https://github.com/nodejs/node/pull/24641
         lib/internal/bootstrap/loaders.js
    - a89b873e63 lib: remove duplicated noop function, https://github.com/nodejs/node/pull/24770
         lib/internal/bootstrap/node.js
    - 976065d9cb lib: do not register DOMException in a module, https://github.com/nodejs/node/pull/24708
         lib/internal/bootstrap/node.js
    - 36f483b79b lib: move setupAllowedFlags() into per_thread.js, https://github.com/nodejs/node/pull/24704
         lib/internal/bootstrap/node.js
         lib/internal/process/per_thread.js
    - 7b8058a39e process: refactor the bootstrap mode branching for readability, https://github.com/nodejs/node/pull/24673
         lib/internal/bootstrap/node.js
    - 0f18a40374 lib: change callbacks to arrow function, https://github.com/nodejs/node/pull/24625
         lib/internal/bootstrap/node.js
    - 771585fcf5 lib: refactor setupInspector in bootstrap/node.js, https://github.com/nodejs/node/pull/24446
         lib/internal/bootstrap/node.js
    - 5877836a33 src: remove pushValueToArray and SetupProcessObject, https://github.com/nodejs/node/pull/24264
         lib/internal/bootstrap/node.js
         src/env.h
    - 4ebb3f35e2 process: simplify check in previousValueIsValid(), https://github.com/nodejs/node/pull/24836
         lib/internal/process/per_thread.js
    - 87b808f761 src,lib: move `natives` and `constants` to `internalBinding()`, https://github.com/nodejs/node/pull/23663
         lib/internal/process/per_thread.js
    - 61a89630ee inspector: split the HostPort being used and the one parsed from CLI, https://github.com/nodejs/node/pull/24772
         src/env.h
    - ee80aaab13 http2: set js callbacks once, https://github.com/nodejs/node/pull/24063
         src/env.h
    - fe303b9b2d tls: include elliptic curve X.509 public key info, https://github.com/nodejs/node/pull/24358
         src/env.h
    - 5f9b624766 src: re-sort the symbol macros, https://github.com/nodejs/node/pull/24382
         src/env.h
    - d4654d89be deps: introduce `llhttp`, https://github.com/nodejs/node/pull/24059
         src/env.h
    - f01518edfd src: improve StreamBase write throughput, https://github.com/nodejs/node/pull/23843
         src/env.h
    - 5c5bb36b74 src: avoid extra `Persistent` in `DefaultTriggerAsyncIdScope`, https://github.com/nodejs/node/pull/23844
         src/env.h
    - aa943d098e http: make parser choice a runtime flag, https://github.com/nodejs/node/pull/24739
         src/node_binding.cc
    - 3d668262a0 src: move C++ binding/addon related code into node_binding{.h, .cc}, https://github.com/nodejs/node/pull/24701
         src/node_binding.cc
    - 0c65314e0e src: fix type mismatch warnings from missing priv, https://github.com/nodejs/node/pull/24737
         src/node_native_module.cc
         src/node_native_module.h
    - 804138093a src: use NativeModuleLoader to compile per_context.js, https://github.com/nodejs/node/pull/24660
         src/node_native_module.cc
         src/node_native_module.h
    - 333783643e console: lazy load process.stderr and process.stdout, https://github.com/nodejs/node/pull/24534
         test/parallel/test-bootstrap-modules.js
    - f895b5a58e src: cache the result of GetOptions() in JS land, https://github.com/nodejs/node/pull/24091
         test/parallel/test-bootstrap-modules.js
    - 26b58eabc6 tools: prepare tools/js2c.py for Python 3, https://github.com/nodejs/node/pull/24798
         tools/js2c.py

addaleax pushed a commit that referenced this pull request Jan 14, 2019
Instead of treating config.gypi as a JavaScript file, specialize
the processing in js2c and make the serialized result a real JSON
string (with 'true' and 'false' converted to boolean values) so
we don't have to use a custom deserializer during bootstrap.

In addition, store the JSON string separately in NativeModuleLoader,
and keep it separate from the map of the builtin source code, so
we don't have to put it onto `NativeModule._source` and delete it
later, though we still preserve it in `process.binding('natives')`,
which we don't use anymore.

This patch also makes the map of builtin source code and the
config.gypi string available through side-effect-free getters
in C++.

PR-URL: #24816
Reviewed-By: Gus Caplan <me@gus.host>
@addaleax
Copy link
Member

I’ve backported this to v11.x while resolving a tiny merge conflict in the counter in test/parallel/test-bootstrap-modules.js in 0e2fbe4.

@refack refack added build Issues and PRs related to build files or the CI. python PRs and issues that require attention from people who are familiar with Python. labels Jan 14, 2019
refack pushed a commit to refack/node that referenced this pull request Jan 14, 2019
Instead of treating config.gypi as a JavaScript file, specialize
the processing in js2c and make the serialized result a real JSON
string (with 'true' and 'false' converted to boolean values) so
we don't have to use a custom deserializer during bootstrap.

In addition, store the JSON string separately in NativeModuleLoader,
and keep it separate from the map of the builtin source code, so
we don't have to put it onto `NativeModule._source` and delete it
later, though we still preserve it in `process.binding('natives')`,
which we don't use anymore.

This patch also makes the map of builtin source code and the
config.gypi string available through side-effect-free getters
in C++.

PR-URL: nodejs#24816
Reviewed-By: Gus Caplan <me@gus.host>
@BridgeAR BridgeAR mentioned this pull request Jan 16, 2019
BridgeAR pushed a commit to BridgeAR/node that referenced this pull request Jan 16, 2019
Instead of treating config.gypi as a JavaScript file, specialize
the processing in js2c and make the serialized result a real JSON
string (with 'true' and 'false' converted to boolean values) so
we don't have to use a custom deserializer during bootstrap.

In addition, store the JSON string separately in NativeModuleLoader,
and keep it separate from the map of the builtin source code, so
we don't have to put it onto `NativeModule._source` and delete it
later, though we still preserve it in `process.binding('natives')`,
which we don't use anymore.

This patch also makes the map of builtin source code and the
config.gypi string available through side-effect-free getters
in C++.

PR-URL: nodejs#24816
Reviewed-By: Gus Caplan <me@gus.host>
@MylesBorins MylesBorins mentioned this pull request Jan 24, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build Issues and PRs related to build files or the CI. lib / src Issues and PRs related to general changes in the lib or src directory. python PRs and issues that require attention from people who are familiar with Python.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants