workers: initial implementation #2133
Conversation
Awesome, thank you for picking this up!
'use strict';

if (!process.features.experimental_workers) {
  throw new Error('Experimental workers are disabled');
Nit: Strictly speaking, they are not disabled, but not enabled.
@petkaantonov "io.js master merged the required libuv fix regarding to closing stdio handles on Windows" Could you provide a link to the libuv issue up there? :) |
// process-relative uptime base, initialized at start-up
static double prog_start_time;
static bool debugger_running;
// Needed for potentially non-thread-safe process-globas
s/globas/globals
That happened: libuv/libuv@60e515d...c619f37. I believe a libuv release is also imminent.
@piscisaureus yeah the task means that current […]
// process.isWorkerInstance
READONLY_PROPERTY(process,
                  "isWorkerInstance",
                  Boolean::New(env->isolate(), env->is_worker_instance()));
Is there a case when isMainInstance == isWorkerInstance? I assume no, and probably having both is just for convenience.
If the main thread is guaranteed to have a process.threadId of 0, there is no need to have process.isMainInstance nor process.isWorkerInstance. It also would help reduce the API surface.

On a related point, should it be process.threadId or process.tid, as discussed in the last PR?
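A minimal sketch of the check that suggestion implies, assuming a hypothetical process.threadId that is always 0 on the main thread (neither property name is settled in this PR):

```js
'use strict';

// Hypothetical: process.threadId === 0 marks the main thread, so separate
// isMainInstance / isWorkerInstance booleans become redundant.
const isMainThread = process.threadId === 0;

if (isMainThread) {
  console.log('main thread');
} else {
  console.log(`worker thread ${process.threadId}`);
}
```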
Yeah they are for convenience and readability
I'm fine with having both. Matches how cluster works. This is a nit, but possibly consider how to shorten the names, or maybe just swap Instance for Thread.
Force-pushed from 24fe97b to 5574ea0
@kzc The issue you reported was actually known all along in this comment:

"somewhere else" means queuing the […]. A significantly simpler solution (without this problem) now occurs to me where WorkerContexts to be […]
@petkaantonov - it's been some time since I looked at your thread worker code but I vaguely recall it already had a cleanup queue that was intended to be called asynchronously. The problem was a nested event loop handler within a dispose function. Nesting event loops is something that really should be avoided - it creates complexity and makes it difficult to reason about the ordering of events and the correctness of a solution.
@kzc btw did you want to know the worker's threadId from the worker object on its owner thread as well? As in […]?

And yeah, I'll change process.threadId to process.tid for better symmetry with process.pid :)
For my needs just having […].

This brings up a good point - in your implementation, can a given worker instance potentially be scheduled on different threads during its lifetime, or are workers always pinned to a specific thread? If not pinned, the worker instance could benefit from having a unique worker id (never reused for the lifetime of the process across all workers), which is different from a threadId.
Worker is exclusively tied to a thread. I am not sure what benefit there would be from being able to schedule it on different threads; it would be very complex to implement, as you would need to facilitate the ownership transfer of a V8 isolate and so on. However, tying a worker to a specific CPU core will be possible if/when libuv merges libuv/libuv#280.
The use-after-free and nested event loops should be fixed now.
@petkaantonov - Just curious... instead of posting delete tasks to the main thread with […]

After […]
Very cool stuff, but is this not going to be solved/replaced by Vats "some day"? I'm probably missing something, but hope this isn't overlooked.
Force-pushed from cf80d53 to 1e0b6b1
You seem to imply that all strawman proposals will eventually be implemented, but that is not the case.
I just assume a lot, out of ignorance :)
@petkaantonov - I tested your "Fix use-after-free" patch on Linux with valgrind as per the instructions here. It appears to work correctly. You may consider getting rid of the async WorkerContext reaper on the main thread and adopting something like this instead, which I think is easier to understand and should put less of a burden on the main thread, since it would no longer have to poll the WorkerContext queue:
Unfortunately the Mac OS X BADF/select problem mentioned in the last PR still exists. I think it's a libuv issue. There's also an unrelated Linux issue outlined below. Using the latest workers implementation as of 1e0b6b1fd5fc93986d056798f47804d0a15a9bec and this patch:
running this command repeatedly:
on a 4 core Linux VM it experiences this error roughly once per 50 runs:
on a 4 core Mac it experiences these errors roughly once per 20 runs:
Ignore the deprecation lines - they are of no consequence to this issue.
return nullptr;

// Allocate enough space to include the null terminator
size_t len = StringBytes::StorageSize(string_value, UTF8) + 1;
Shouldn't StorageSize take an isolate?
@isiahmeadows Workers should be in core to support transferring ArrayBuffers or other resources in Node. But I'm not sure whether this PR implements these features.
@bnoordhuis Is there any place to follow your work on integrating web workers for v7?
As the Atomics and SharedArrayBuffer API has landed in stage 2 of the TC39 process and V8 is implementing it, I think Node.js should have a thread API in some form as a core module, to support shared memory correctly. #bnoordhuis have you checked that shared-memory API? Would it be possible with your implementation?
@bnoordhuis have you checked that shared-memory API? Would it be possible with your implementation? I just wonder why I used # instead of @ :P
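For context, a minimal sketch of the shared-memory semantics being discussed, using the standard SharedArrayBuffer and Atomics APIs (this only illustrates the JS-level API; nothing here is part of this PR):

```js
'use strict';

// A SharedArrayBuffer can be viewed by multiple threads at once;
// Atomics provides race-free reads and writes on that shared memory.
const sab = new SharedArrayBuffer(4); // 4 bytes = one Int32 slot
const counter = new Int32Array(sab);

// Atomically increment and read the shared counter. If `sab` were
// shared with a worker, both sides could do this safely.
Atomics.add(counter, 0, 1);
console.log(Atomics.load(counter, 0)); // 1
```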
Force-pushed from c133999 to 83c7a88
// -p, --print
Here are a few blocks that are marked as "changes" when actually just some formatting / ordering changed. Probably not a good idea to invite possible merge conflicts because of indentation fixes?!
I'll point out that this is a really old PR, and it would be easiest to rewrite it from scratch again.
I've been working on and off on a rewrite of this pull request but after discussion with other collaborators I've come to the conclusion that multi-threading support adds too many new failure modes for not enough benefit. The primary motivation for the rewrite was improved IPC performance but I'm fairly confident by now that we can also accomplish that using more traditional means like shared memory and more efficient serialization. I'll go ahead and close this. Thanks for your hard work, Petka.
For those who want to write Node.js code in a multithreaded program: NW.js implemented this by enabling Node.js in Web Workers: https://nwjs.io/blog/v0.18.4/
Hi @bnoordhuis. Can I ask what the latest plan/thinking is for shared memory in Node (i.e. implement workers, or somehow allow transferring SharedArrayBuffers with cluster, or a different API altogether)? The latest version seems to have SharedArrayBuffer (and Atomics), but there is no way to currently use this, IIUC? Also, what would be the best way to help out with this?
Also… could you mention what exactly it is that you have discarded? Multi-threading with the full Node API available in each thread, or something more lightweight like a WebWorkers-style API?
@addaleax I tried to summarize the state of this issue as well as the different types of concurrency and their pros and cons in the context of Node, and I also kept posting updates about this pull request (mostly thanks to comments from @matheusmoreira - thanks for that) in this answer on Stack Overflow: Which would be better for concurrent tasks on node.js? Fibers? Web-workers? or Threads? If anything is incorrect or outdated please let me know.
Shared memory is not my primary focus right now; reducing the overhead of serializing/deserializing is. I ran a lot of benchmarks and in most non-contrived cases the overhead of converting to and from JSON is significantly greater (as in 70/30 or 80/20 splits) than sending it to another process. Once I get the overhead of serdes down, I'm going to look into shared memory support. It's a minefield of platform-specific quirks and limitations, so it's probably going to take a while to get it merged in libuv and iron out the bugs. If you want to help out, this is probably a good place to start. V8 5.5 or 5.6 will make it a lot easier to do efficient serdes, so that's what I'm currently waiting for.

The former, the node-with-threads approach. WebWorkers-style parallelism is still an option and not terribly hard to implement, but I didn't see a point in pursuing that in core; there are already add-ons that do.
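A rough illustration of how one might measure the serialization share of the IPC cost Ben describes — a hand-rolled micro-benchmark under assumed payload shapes, not the benchmark actually used here:

```js
'use strict';

// Time JSON serialization/deserialization of a typical message payload.
// The claim above: for most realistic payloads, stringify + parse
// dominates the cost of actually sending the bytes to another process.
const payload = {
  id: 42,
  items: Array.from({ length: 1000 }, (_, i) => ({ i, name: `item-${i}` })),
};

const start = process.hrtime();
for (let n = 0; n < 1000; n++) {
  JSON.parse(JSON.stringify(payload));
}
const [sec, nsec] = process.hrtime(start);
console.log(`1000 serdes round-trips: ${(sec * 1e3 + nsec / 1e6).toFixed(1)} ms`);
```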
That'd be useful except none of the modules I've seen using true threads (instead of processes) actually support […]
Out of curiosity, could you elaborate as to why this is?
I’m pretty sure Ben is referring to the added […]
@addaleax Ah nice, a serializer that doesn't use JSON strings? Sidenote: something cluster's messaging could make use of too, I imagine (would be nice, as it's quite slow now imho).
I mean, I haven’t used it myself yet, but that’s sure what it sounds like. 😄

Yeah, I had that thought, too. But as far as the current slowness is concerned: #10557 seems to fix quite a bit of that. :)
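For reference, a small sketch of the cluster messaging being discussed; each worker.send()/process.send() payload goes through a JSON-based serialization step before crossing the process boundary, which is where the slowness mentioned above lives:

```js
'use strict';
const cluster = require('cluster');

if (cluster.isMaster) {
  const worker = cluster.fork();
  // The payload is serialized before crossing the process boundary.
  worker.send({ cmd: 'ping', payload: { hello: 'world' } });
  worker.on('message', (msg) => {
    console.log('reply from worker:', msg);
    worker.kill();
  });
} else {
  process.on('message', (msg) => {
    process.send({ cmd: 'pong', echoed: msg.payload });
  });
}
```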
Is this still being worked on?
@NawarA ... not at this time. If someone wanted to volunteer to pick it up, I think that would be welcome, but it's quite a task and there would be much to do.
@jasnell it would be useful if someone could set up a meta-tracking issue like #6980 to break down the problem into what is required in order to get this done - then volunteers can start picking tasks up. Based on the above, it's not even clear what the desired end state is. It would be good to clarify how Node users will eventually be able to use SharedArrayBuffer (e.g. shared memory between workers? shared memory between processes? using the cluster module?).
#1159 retargeted to master