Communication between workers/windows via Streams API #244

/cc @sicking @domenic

A separate thread to continue discussion started at #97 (comment).
We've finalized the ReadableStream design. Let's revisit the plans based on the up-to-date design. The two plans proposed are:

…

The key difference between them is whether the initiator creates/sees the stream for the peer or not. (a) has the following issues to solve:

…
Because of these reasons, I proposed (b). Regarding the third point, I think it's no longer problematic. We need to fail when chunks, or an error object passed to controller.error(), are not structured-cloneable, but apart from those, the ReadableStream class itself doesn't hold anything externally provided. The queuing strategy and underlying source logic are now held by the ReadableStreamDefaultController instance. So, we need to neuter the ReadableStream in the original worker (its public interface; if that's difficult, maybe just error it), create a ReadableStream in the destination worker, and connect them by some mechanism that transfers the protocol between the newly created ReadableStream and the ReadableStreamDefaultController (with structured cloning). We need to investigate the pipeTo() optimization story before moving forward, though. #359
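A minimal userland sketch of the "connect them by some mechanism" part, using a MessageChannel to carry a pull/enqueue protocol; all helper names and message shapes here (sendReadable, receiveReadable, the 'pull' message) are invented for illustration:

```js
// Sketch of plan (b): lock (neuter) the original stream and rebuild it
// in the destination context over a MessagePort.
function sendReadable(readable, worker) {
  const { port1, port2 } = new MessageChannel();
  const reader = readable.getReader();  // locks the original stream
  port1.onmessage = async () => {       // destination asks for the next chunk
    try {
      const { value, done } = await reader.read();
      port1.postMessage({ value, done });  // chunk is structured-cloned here
    } catch (e) {
      port1.postMessage({ reason: String(e) });  // errors may not be cloneable
    }
  };
  worker.postMessage({ port: port2 }, [port2]);
}

function receiveReadable(port) {
  return new ReadableStream({
    pull(controller) {
      return new Promise(resolve => {
        port.onmessage = ({ data }) => {
          if (data.reason) controller.error(new Error(data.reason));
          else if (data.done) controller.close();
          else controller.enqueue(data.value);
          resolve();
        };
        port.postMessage('pull');  // request exactly one chunk per pull()
      });
    }
  });
}
```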
I feel kind of strongly we should just allow transferring the ReadableStream. If someone wants a WritableStream they should be able to create an identity transform, write to it, and then postMessage() the readable side. I still like just requiring postMessage() to lock the ReadableStream and then drain the underlying source, as mentioned previously. This lets the browser grab a C++ underlying source and optimize off-thread, but it also works for JS underlying sources. Interposing a writable step seems like it would make this harder, not easier.
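The identity-transform workaround might look like this; a sketch assuming ReadableStream transfer is supported, which is exactly what this thread proposes:

```js
// Sketch: getting "WritableStream transfer" out of ReadableStream transfer.
const { readable, writable } = new TransformStream();  // identity transform
worker.postMessage({ readable }, [readable]);          // transfer the readable leg
const writer = writable.getWriter();
writer.write('hello');  // writes come out of the transferred readable side
```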
Sorry, I see your last comment in #276 now. Maybe we have more agreement than I realized.
I think both plans are valuable, but we've heard several times from multiple people at Mozilla that plan (a) is something they are very interested in, so we should probably prioritize it higher. I think we're in agreement on that. And as @wanderview says there's an easy polyfill of (b) on top of (a).
I think just locking it is enough of a neuter. We never need to unlock it either IMO.
Definitely the stream, as that allows the browser to grab a lock.
What about if you postMessage the stream back to the original realm?
That would be a new object, just like when you postMessage any other non-primitive back and forth.
Actually, where should the error-checking be for when someone further up the pipe chain enqueues chunks that contain things which can't be structured-cloned? Would there be a third …
It's a good question. I think what would happen is that, as the browser reads the chunks in the background, it tries to structured-clone them. Any errors go to "report the exception" (i.e. fire an error event on the global scope; show up in the console). At that point we probably cancel the source readable stream, and make the readable stream on the other side of the boundary errored.
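In rough pseudo-code, the background behaviour described above could look like the following; the pump and its message shapes are invented for illustration, with reportError() standing in for "report the exception":

```js
// Sketch of a background pump for a transferred ReadableStream.
async function pump(reader, port) {
  for (;;) {
    const { value, done } = await reader.read();
    if (done) return port.postMessage({ done: true });
    try {
      port.postMessage({ value });          // structured clone happens here
    } catch (e) {                           // e.g. DataCloneError
      reportError(e);                       // error event on the global scope
      reader.cancel(e);                     // cancel the source readable stream
      port.postMessage({ errored: true });  // error the far side of the boundary
      return;
    }
  }
}
```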
That seems over-restrictive, and not really necessary. Also, you can't really tell whether something is cloneable until you try to clone it (e.g. it may have a throwing getter property), so we couldn't perform this check until we're ready to do the clone/transfer. The larger issue this might be considered blocked on is whatwg/html#935; see in particular my comment whatwg/html#935 (comment).
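To illustrate why cloneability can't be checked up front, here is a chunk that looks cloneable but only fails once the clone algorithm actually reads it:

```js
// A throwing getter makes cloneability undetectable before the clone.
const chunk = {
  a: 1,
  get b() { throw new Error('surprise'); }  // throws during serialization
};
// Any up-front "is this cloneable?" check would have to invoke the getter,
// which is itself an observable side effect.
```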
Is cloning synchronously on enqueue, instead of on dequeue, not feasible or performant enough? I guess the queue has to stay in the underlyingSource's thread so that desiredSize is always known synchronously, but can you not serialize on enqueue and deserialize on dequeue? Actually, I've always wondered why structured clone isn't exposed as an API to JavaScript. Does any implementation of it exist outside of the engines themselves? realistic-structured-clone tries, but it says it's really incomplete (it can't handle Map, Set, ArrayBuffer, etc.).
Structured cloning has to happen synchronously when the data is put into the queue, because otherwise you could do:

```js
x = { a: 1 };
writer.write(x);
x.a = 2;
```

Or:

```js
x = { a: 1 };
writer.write(x);
someButton.onclick = () => { x.a = 2; };
```

This is why structured cloning happens synchronously in functions like postMessage().
Yeah, that makes sense. I was focusing on the opposite case, transferred ReadableStreams, where the source enqueues things and they would be cloned/serialized synchronously at enqueue time, before the destination realm reads/dequeues them. But yeah, for a transferred WritableStream, it would be like you said: the destination realm would be enqueueing things at writer.write() time, putting the cloned/serialized chunk into the queue on the underlyingSink's thread, before it actually gets dequeued at sink.write() time.
As [[storedError]] is observable, can be an arbitrary object, and is very likely an uncloneable Error, it can't be sent to a new realm reliably. So just forbid errored streams. Still needs clearer semantics of when structured cloning occurs and how DataCloneErrors are reported. Cloning needs polyfilling somehow too. Related to: whatwg#244, whatwg#276
There's still the big question of when exactly various writable stream …
Assuming we want chunks to be transferred where possible, rather than copied, we have three main options on the table.

1. Opportunistic greedy transfer. Any part of the chunk that is transferable is transferred; everything else is copied.
Example: …
Advantages: …
Disadvantages: …

2. Provided by strategy. An extra function is added to the strategy which provides a list of objects that are to be transferred when the stream is transferred. (A hypothetical sketch of this option follows the list.)
Example: …
Advantages: …
Disadvantages: …

3. New meta-protocol. Objects which are capable of being transferred contain metadata saying how to do it.
Example: …
Advantages: …
Disadvantages: …
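To make option (2) concrete, a hypothetical extra strategy member might look like this; transferList() is an invented name, not a real or proposed API surface:

```js
// Hypothetical strategy hook for option (2).
const ts = new TransformStream({}, {
  highWaterMark: 4,
  size(chunk) { return chunk.data.byteLength; },
  transferList(chunk) {
    return [chunk.data.buffer];  // the parts to transfer along with the stream
  }
});
```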
Regarding (2): the TransformStream constructor already has strategy options, so having a strategy that participates in postMessage() doesn't sound so strange.
For (3), I'm assuming the protocol would be a function of some sort (either one that returns the transferable parts, or one that somehow "does the transfer"). Doesn't that reintroduce the disadvantages of (2) to (3) as well? To me, the transfer-twice problem is the worst issue here. Everything else in the disadvantages columns seems solvable.
I've been wondering whether a meta-protocol as simple as a Symbol.transferKeys property, listing which of an object's keys should be transferred, could work. For a nested object you'd end up with something like:

```js
{
  key: 'outer',
  value: {
    label: 'car',
    data: new Uint8Array([1, 2]),
    [Symbol.transferKeys]: ['data']
  },
  [Symbol.transferKeys]: ['value']
}
```

It seems overly simplistic, but I haven't come up with a case where it wouldn't work yet.
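Harvesting the transfer list from such metadata could then be a simple recursive walk. A sketch, using a local symbol as a stand-in since no such well-known symbol exists:

```js
// Stand-in for the proposed well-known symbol.
const transferKeys = Symbol('transferKeys');

// Collect transferable parts by following [transferKeys] metadata.
function collectTransferables(obj, out = []) {
  for (const key of obj[transferKeys] ?? []) {
    const value = obj[key];
    if (ArrayBuffer.isView(value)) out.push(value.buffer);  // e.g. Uint8Array
    else if (value instanceof ArrayBuffer) out.push(value);
    else if (typeof value === 'object' && value !== null) {
      collectTransferables(value, out);  // recurse into nested metadata
    }
  }
  return out;
}
```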
Thinking about it further, I realised it can be a problem even with a single transfer. Consider these steps: …

The intent is that …
Hmm, I see. That seems to basically work, as far as I can tell... So as much as I like (2)'s ergonomics, it seems like both (1) and your version of (3) are more workable. Let me comment on why I think the disadvantages are OK in both cases, before going to sleep: On (1):
Not a big deal. I think we'd integrate it into StructuredSerializeWithTransfer, which already is doing object-graph crawling.
Since this is opt-in on the stream level, this seems fine. (I.e., we aren't applying this to every ReadableStream.) It seems like an OK thing to say that passing your chunks to a transferred readable stream means that not only will you never see the readable stream again, you'll never see your chunks again. On (3):
Hmm, OK, this is an issue. Somehow I missed this earlier. Right now I'm feeling …
I think you've answered this :).
Not so bad. The hardest thing is picking a place to park the symbol. (I'm not sure putting web platform stuff on Symbol is a great idea.) Once we've done that, it's similar spec work to (1); we need to insert some steps into StructuredSerializeWithTransfer. It does have more wide-ranging effects, but the spec and implementation work should be similar, I think: just add some extra auto-discovery-of-transferableness steps inside the existing graph-crawling.
I had a discussion with @yutakahirano about transferring strategies. He proposed the mental model of a transfer consisting of creating a special kind of TransformStream with one leg in the source context and one leg in the destination context. In the destination context you receive one of these legs directly from the MessageEvent, and in the source context the leg is piped to/from the source stream. I think this model is very helpful.

We talked about various ways of attaching a strategy to a WritableStream after it had arrived in the destination context, for example with an addStrategy() method. We discussed changing …

I brought up the issue that

```js
const chunk = new ArrayBuffer(10);
writer.write(chunk);
console.log(chunk.byteLength);
```

should log 10. This means the chunk needs to be cloned into the target thread immediately. However, something still needs to be queued on the sending side in order for backpressure to work properly. I think we need some kind of placeholder that represents the chunk in the queue on the sending side until the chunk is read on the receiving side.

We concluded that if we do not have … We also discussed what happens if you try to transfer locked streams. Ideally …
This is interesting, but seems pretty un-ergonomic.
Even if we put it in, I'm not sure how it would solve the essential problem that you can't transfer functions across threads :(.
Agreed. This and your subsequent conclusions all make sense. An alternate approach would be to do two transfers: one same-realm that transfers from the producer's control into the internal queue on the sender side, and one that transfers from the sender side to the receiver side. Not sure if that's better.
I'm not quite sure I follow this case.
Oh, definitely. To be transferable you need a [[Detached]] slot, and the contract is you're supposed to set that to true once you get transferred, so that future attempts to transfer (or clone) throw. So on a spec level at least this is pretty much a given.
I still see IsDisturbed as kind of weird. But maybe the reasons we had for using it in fetch also apply here? I dunno.
I'll be happy if postMessage doesn't care about IsDisturbed.
I think in practice implementations will want to do only one transfer, to avoid walking the object graph twice. Unless it makes the standard hugely more complicated, I'd rather spec it the way browsers will implement it.
Let's say a developer transfers a readable stream to a worker, and wants to have 64 KB of buffering in the worker in addition to whatever buffering is configured in the main page. Then it would work just as well to write

```js
const rs = transferredReadableStream.pipeThrough(
    new TransformStream({}, new ByteLengthQueuingStrategy({ highWaterMark: 65536 })));
```

as to write

```js
rs.addStrategy(new ByteLengthQueuingStrategy({ highWaterMark: 65536 }));
```

So there's no need for an API like addStrategy().
Yes. I've thought about it some more and I don't think there's a need for it to look at IsDisturbed. |
My current plan is to do the work in two stages. The first stage will only clone chunks. This will be inefficient for ArrayBuffers, and transferring a stream-of-streams will still not be supported. As a second stage, a new syntax accepting a transferList will be added. The new syntax can be used with an empty transferList to get "unshared object" semantics, meaning that changing an object after passing it to a stream will have no impact on the object that is returned from the stream. This is a similar idea to how the byte stream API protects against concurrent modification of the data, and is nice from the point of view of enforcing correctness.
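Purely for illustration, since the stage-2 syntax was not settled here, an options-bag style is one way it could look; writer.write() takes no such argument today:

```js
// Hypothetical stage-2 shape, not a real API.
const writer = transferredWritable.getWriter();
const chunk = new Uint8Array([1, 2, 3]);
writer.write(chunk, { transfer: [] });  // empty transferList: the chunk is
chunk[0] = 9;                           // cloned, so this later mutation has no
                                        // effect on what the stream delivers
```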
Hi @ricea - What is the timeline for implementing the plan described in the comment above? |
I have a PR in progress for stage 1 (#1053), which I hope to land soon. It can be tested in Chrome using the --enable-experimental-web-platform-features flag. Stage 2 is still in the early design stages, and I wouldn't expect it to be implemented this year.
Tagging @guidou for interest. |