workers: implement --max-worker-threads command line option #32606
b851f4c to 1d2c945
1d2c945 to d1e4236
I’m adding the semver-major label because this adds a warning where previously none was emitted. Like I said above, I think it’s worth considering not turning this on by default either… I’ll think about that a bit more. /cc @nodejs/workers
I'm not a fan of this being activated by default. I think the warning should only be emitted when someone explicitly enables the feature (via command line option or whatever future method).
495f2a0 to cc4bfab
I propose:
Ok, updated such that:
@addaleax ... ok, took another round of edits to clean up the atomics usage. Quite a bit more reliable now. Also added a test for the nested workers emitting warning on the main thread. Please take another look when you have a moment :-)
1ce6f54 to 1327c0f
Utility helper that makes working with uv_cpu_info easier.
Signed-off-by: James M Snell <jasnell@gmail.com>
Creating too many active worker threads at one time can lead to significant performance degradation of the entire Node.js process. This adds a worker thread counter that causes a warning to be emitted when the limit is exceeded. Workers can still be created beyond the limit, however. The warning is similar in spirit to the too-many-event-handlers warning emitted by EventEmitter.
By default, the limit is one less than four times the total number of CPUs available, calculated at system start. The `--max-worker-threads` command-line option can be used to set a non-default value. The option is permitted in `NODE_OPTIONS` and must be a number greater than zero.
The counter and the option are per-process in order to account for Workers that create their own Workers.
The warning is emitted once each time the limit is exceeded, so it may be emitted more than once per process. That is, if the limit is 2 and 5 workers are created, only a single warning is emitted. If the number of active workers falls back below 2 and the limit is subsequently exceeded again, the warning is emitted again.
Signed-off-by: James M Snell <jasnell@gmail.com>
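The warn-once-per-exceedance behaviour described in the commit message can be illustrated with a small user-land sketch. This is not the PR's implementation (which lives in Node.js core): `defaultLimit` and `WorkerLimitTracker` are hypothetical names, the warning text is made up, and re-arming only when the count drops strictly below the limit is an assumption based on the "falls back below 2" wording.

```javascript
'use strict';
const os = require('os');

// Hypothetical helper: the default described in the commit message,
// one less than four times the CPU count (clamped to at least 1).
function defaultLimit(cpuCount = os.cpus().length) {
  return Math.max(1, cpuCount * 4 - 1);
}

// Hypothetical tracker: warns once each time the active count rises above
// the limit, and re-arms once the count falls strictly below the limit.
class WorkerLimitTracker {
  constructor(limit = defaultLimit(),
              warn = (msg) => process.emitWarning(msg)) {
    this.limit = limit;
    this.active = 0;
    this.warned = false;
    this.warn = warn;
  }

  // Record a newly created worker. Creation is never blocked.
  add() {
    this.active++;
    if (this.active > this.limit && !this.warned) {
      this.warned = true;
      this.warn(`${this.active} active workers exceeds limit ${this.limit}`);
    }
  }

  // Record a worker that has exited.
  remove() {
    if (this.active > 0) this.active--;
    if (this.active < this.limit) this.warned = false;
  }
}
```

With a limit of 2, five calls to `add()` produce one warning; dropping to one active worker and then climbing past the limit again produces a second one.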
29bc7cb to 2c99cd5
I can see why having a circuit breaker for runaway threads is useful but:
The auto-calculation there is only one of the options in this. I'm kicking off a performance study that is going to be looking at multiple kinds of workloads so we can hopefully narrow in on a better heuristic. One thing we could definitely do to give us wiggle room on the heuristic is to mark this experimental for the time being.
It doesn't for the time being. Definitely open to ideas there.
I spent a lot of time thinking about auto-tuning in the context of libuv's thread pool and I came to the conclusion that it's hopeless. A program doesn't have enough insight into the system to make educated guesses. It's hard even for the kernel and that has a perfect view of system utilization (but still misses the ability to predict the future.)
Yep, which is why this prioritizes allowing the user to set a threshold and only uses a warning that still allows the Workers to be created. It's a diagnostic option.. which, btw, we could handle in other ways if the warning is not sufficient. Currently, there is no way of actively tracking the total number of Workers created across the process (async_hooks only provides detail on the Workers created in the current thread). Another approach we could take is to add some diagnostic tracking APIs to either the worker_threads or perf_hooks modules that would allow the main thread to report on Workers across the process.
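For illustration only, a process-wide count can be approximated in user land with a shared counter. This is a rough sketch, not anything proposed in this PR: the helper names are hypothetical, and it assumes every thread (including nested workers) is handed the same `Int32Array`, for example through `workerData`, and calls the helpers itself.

```javascript
'use strict';

// Hypothetical user-land counter: a single Int32 slot in shared memory.
function createSharedCounter() {
  return new Int32Array(new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT));
}

// Call when a worker starts; returns the new process-wide count.
// Atomics.add returns the previous value, hence the + 1.
function workerStarted(counter) {
  return Atomics.add(counter, 0, 1) + 1;
}

// Call when a worker exits; returns the new process-wide count.
function workerExited(counter) {
  return Atomics.sub(counter, 0, 1) - 1;
}

// Read the current count from any thread that holds the counter.
function activeWorkers(counter) {
  return Atomics.load(counter, 0);
}
```

The obvious drawback is that it only counts workers that cooperate, which is part of why built-in diagnostic tracking would be more reliable.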
Marking this in-progress while discussion is ongoing to keep it from landing until resolved.
Definitely +1 on the idea, but I agree that the default should be 0, as any guesses here are nothing more than guesses IMO.
Also, perhaps we can name this something "weaker" than "max-worker-threads"? To me the name implies that we won't be able to create more than the specified number of workers, not that we will just get a warning about it.
// too much CPU contention. The default max-worker-threads is
// 4 times the total number of CPUs available but may be set
- // too much CPU contention. The default max-worker-threads is
- // 4 times the total number of CPUs available but may be set
+ // too much CPU contention. By default max-worker-threads
+ // check is disabled but may be set
    uv_free_cpu_info(info_, count_);
  }
  int count() const { return count_; }
  operator bool() const {
Nit: I think we usually try to put empty lines in between methods. Could you also run make format-cpp for consistency?
Sigh...
C:\Users\jasne\Projects\node>vcbuild format-cpp
Error: invalid command line option `format-cpp`.
Yeah, when I update this I'll switch over to the linux box and tweak the formatting.
// Check that when --max-worker-threads is negative,
// the option value is auto-calculated based on the
// number of CPUs
Nit: (here and below)
- // number of CPUs
+ // number of CPUs.
  list.push(makeWorker(workers));
  await Promise.all(list);
  workers.forEach((i) => i.terminate());
}
Nit: simplification and a bit fewer promises (and same applies to the other test if you decide to change)
function makeWorker() {
  return new Promise((res) => {
    const worker = new Worker(expr, { eval: true });
    // Resolve with the worker itself once it is up and running.
    worker.once('online', () => res(worker));
  });
}
async function doTest() {
  const promises = [];
  for (let n = 0; n < cpu_count; n++)
    promises.push(makeWorker());
  return Promise.all(promises)
    .then((workers) => workers.forEach((i) => i.terminate()));
}
If we do that, then I’d go one step further and use
// Assumes: const { once } = require('events');
async function makeWorker() {
  const worker = new Worker(expr, { eval: true });
  await once(worker, 'online');
  return worker;
}
🙂
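Putting the two suggestions together, a self-contained version of the test helper might look like the sketch below. The `expr` source string and the `startAndStopWorkers` name are placeholders, not taken from the PR's test files; the sketch also waits for `terminate()` to settle, which the snippets above do not do.

```javascript
'use strict';
const { once } = require('events');
const { Worker } = require('worker_threads');

// Placeholder worker source: keeps the worker alive until it is terminated.
const expr = 'setInterval(() => {}, 1000);';

// Start one worker and resolve once it reports that it is online.
async function makeWorker() {
  const worker = new Worker(expr, { eval: true });
  await once(worker, 'online');
  return worker;
}

// Start `count` workers concurrently, then terminate all of them.
async function startAndStopWorkers(count) {
  const workers = await Promise.all(
    Array.from({ length: count }, () => makeWorker()));
  await Promise.all(workers.map((w) => w.terminate()));
  return workers.length;
}
```

Using `events.once` keeps the helper free of hand-written promise wrappers, and awaiting `terminate()` makes sure the process can exit cleanly.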
All, I'm going to take this in a slightly different direction. As I mentioned above, the intent here is largely to improve process-wide visibility of the number of workers that are being created since, currently, they are only visible to the immediate parent thread that created them. Instead of emitting a warning, here's what I'm thinking:
Via the perf_hooks API, introduce a new The tracker takes its design inspiration from async hooks. However, unlike async hooks, it is not limited to reporting on only the handles associated with the thread's event loop.
const { WorkerHook } = require('perf_hooks');
const hook = new WorkerHook({
  init(id, parent_id, handled) {
    console.log(`Worker ${id} created by thread ${parent_id}.`);
    // handled is described below...
  },
  destroy(id, parent_id, handled) {
    console.log(`Worker ${id} destroyed`);
  }
});
hook.enable();
// Atomically get the number of workers process-wide without creating a hook
console.log(WorkerHook.processWorkerCount);
Let's say that Main Thread creates The
const hook = WorkerHook({
  init(id, parent_id, handled) { /** handled will be false **/ }
});
hook.enable()
Main Thread:
const hook = WorkerHook({
  init(id, parent_id, handled) { /** handled will be true **/ }
});
hook.enable()
However... if disable the If multiple The API here serves two key goals:
Several discussion points:
Closing this until I can get back to it.
There are two commits here:
--max-worker-threads command line flag.
From the second commit message:
/cc @addaleax
Checklist
make -j4 test (UNIX), or vcbuild test (Windows) passes