Web streams that work across web workers and <iframe>s.
Suppose you want to process some data that you've downloaded from somewhere. The processing is quite CPU-intensive,
so you want to do it inside a worker. No problem, the web has you covered with postMessage!
// main.js
(async () => {
  const response = await fetch('./some-data.txt');
  const data = await response.text();
  const worker = new Worker('./worker.js');
  worker.onmessage = (event) => {
    const output = event.data;
    const results = document.getElementById('results');
    results.appendChild(document.createTextNode(output)); // tadaa!
  };
  worker.postMessage(data);
})();

// worker.js
self.onmessage = (event) => {
  const input = event.data;
  const output = process(input); // do the actual work
  self.postMessage(output);
};
All is good: your processing does not block the main thread, so your web page remains responsive. However, it takes quite a long time before the results show up: first all of the data needs to be downloaded, then all that data needs to be processed, and finally everything is shown on the page. Wouldn't it be nice if we could already show something as soon as some of the data has been downloaded and processed?
Normally, you'd tackle this by reading the input as a stream, piping it through one or more transform streams and finally displaying the results as they come in.
// main.js
(async () => {
  const response = await fetch('./some-data.txt');
  await response.body
    .pipeThrough(new TransformStream({
      transform(chunk, controller) {
        controller.enqueue(process(chunk)); // do the actual work
      }
    }))
    .pipeTo(new WritableStream({
      write(chunk) {
        const results = document.getElementById('results');
        results.appendChild(document.createTextNode(chunk)); // tadaa!
      }
    }));
})();
Now you can see the first results as they come in, but your processing is blocking the main thread again! Can we get the best of both worlds: process data as it comes in, but off the main thread?
Enter: remote-web-streams. With this library, you can create pairs of readable and writable streams
where you can write chunks to a writable stream inside one context, and read those chunks from a readable stream
inside a different context.
Functionally, such a pair behaves just like an identity transform stream, and you can
use and compose them just like any other stream.
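For comparison, here is a minimal sketch of what "identity transform stream" means within a single context, using only the standard TransformStream. The function name identityDemo is made up for this illustration and is not part of remote-web-streams. A linked pair should behave the same way from the outside, except that the two ends live in different contexts.

```javascript
// A TransformStream constructed without a transformer is an identity transform:
// every chunk written to `writable` comes out of `readable` unchanged.
// `identityDemo` is a hypothetical name used only for this sketch.
async function identityDemo() {
  const { readable, writable } = new TransformStream();

  // Queue two writes and a close. We do not await them yet, because
  // they only complete once the readable side starts consuming chunks.
  const writer = writable.getWriter();
  const writes = Promise.all([
    writer.write('hello'),
    writer.write('world'),
    writer.close()
  ]);

  // Read everything that comes out of the other end.
  const received = [];
  const reader = readable.getReader();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    received.push(value);
  }

  await writes;
  return received; // the same chunks, in the same order
}
```

Calling identityDemo() resolves to the chunks that were written, in the order they were written.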
The basic steps for setting up a pair of linked streams are:
- Construct a RemoteReadableStream. This returns two objects:
  - a MessagePort which must be used to construct the linked WritableStream inside the other context
  - a ReadableStream which will read chunks written by the linked WritableStream
// main.js
import { RemoteReadableStream } from 'remote-web-streams';
const { readable, writablePort } = new RemoteReadableStream();
- Transfer the writablePort to the other context, and instantiate the linked WritableStream in that context using fromWritablePort.
// main.js
const worker = new Worker('./worker.js', { type: 'module' });
worker.postMessage({ writablePort }, [writablePort]);
// worker.js
import { fromWritablePort } from 'remote-web-streams';
self.onmessage = (event) => {
  const { writablePort } = event.data;
  // use the imported function directly (there is no RemoteWebStreams global in a module)
  const writable = fromWritablePort(writablePort);
};
- Use the streams as usual! Whenever you write something to the writable inside one context, the readable in the other context will receive it.
// worker.js
const writer = writable.getWriter();
writer.write('hello');
writer.write('world');
writer.close();
// main.js
(async () => {
  const reader = readable.getReader();
  console.log(await reader.read()); // { done: false, value: 'hello' }
  console.log(await reader.read()); // { done: false, value: 'world' }
  console.log(await reader.read()); // { done: true, value: undefined }
})();
You can also create a RemoteWritableStream. This is the complement to RemoteReadableStream:
- The constructor (in the original context) returns a WritableStream (instead of a readable one).
- You transfer the readablePort to the other context, and instantiate the linked ReadableStream with fromReadablePort inside that context.
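// main.js
import { RemoteWritableStream } from 'remote-web-streams';
const worker = new Worker('./worker.js', { type: 'module' });
// create the writable end here, and a port for the readable end
const { writable, readablePort } = new RemoteWritableStream();
worker.postMessage({ readablePort }, [readablePort]);
const writer = writable.getWriter();
// ...

// worker.js
import { fromReadablePort } from 'remote-web-streams';
self.onmessage = (event) => {
  const { readablePort } = event.data;
  // construct the linked readable stream from the transferred port
  const readable = fromReadablePort(readablePort);
  const reader = readable.getReader();
  // ...
};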
In the basic setup, we create one pair of streams and transfer one end to the worker. However, it's also possible to set up multiple pairs and transfer them all to a worker.
This opens up interesting possibilities. We can use a RemoteWritableStream to write chunks to a worker,
let the worker transform them using one or more TransformStreams, and then read those transformed chunks
back on the main thread using a RemoteReadableStream.
This allows us to move one or more CPU-intensive TransformStreams off the main thread,
and turn them into a "remote transform stream".
To demonstrate these "remote transform streams", we set one up to solve the original problem statement:
- Create a RemoteReadableStream and a RemoteWritableStream on the main thread.
- Transfer the other ends of both streams (the two ports) to the worker. Inside the worker, connect the readable to the writable by piping it through one or more TransformStreams.
- On the main thread, write data to be transformed into the writable and read transformed data from the readable. Pro-tip: we can use .pipeThrough({ readable, writable }) for this!
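The pro-tip works because pipeThrough() accepts any object with a readable and a writable property, not only a TransformStream instance. The sketch below shows this in a single context. The names makeUppercasePair and pipeThroughDemo are hypothetical, and the pair is backed by a local TransformStream that stands in for the worker round trip.

```javascript
// Hypothetical helper: returns a plain { readable, writable } pair.
// Here the pair is backed by a local TransformStream as a stand-in;
// in the remote setup the two ends come from separate stream pairs.
function makeUppercasePair() {
  const { readable, writable } = new TransformStream({
    transform(chunk, controller) {
      controller.enqueue(String(chunk).toUpperCase());
    }
  });
  return { readable, writable }; // a plain object, not a TransformStream
}

// Hypothetical demo: pipes two chunks through the pair and collects the output.
async function pipeThroughDemo() {
  const source = new ReadableStream({
    start(controller) {
      controller.enqueue('hello');
      controller.enqueue('world');
      controller.close();
    }
  });

  const results = [];
  await source
    // chunks go into pair.writable and come back out of pair.readable
    .pipeThrough(makeUppercasePair())
    .pipeTo(new WritableStream({
      write(chunk) {
        results.push(chunk);
      }
    }));
  return results;
}
```

In the remote setup, the writable comes from the RemoteWritableStream and the readable from the RemoteReadableStream, so the same call sends chunks to the worker and receives the transformed chunks back.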
// main.js
import { RemoteReadableStream, RemoteWritableStream } from 'remote-web-streams';
(async () => {
  const worker = new Worker('./worker.js', { type: 'module' });
  // create a stream to send the input to the worker
  const { writable, readablePort } = new RemoteWritableStream();
  // create a stream to receive the output from the worker
  const { readable, writablePort } = new RemoteReadableStream();
  // transfer the other ends to the worker
  worker.postMessage({ readablePort, writablePort }, [readablePort, writablePort]);
  const response = await fetch('./some-data.txt');
  await response.body
    // send the downloaded data to the worker
    // and receive the results back
    .pipeThrough({ readable, writable })
    // show the results as they come in
    .pipeTo(new WritableStream({
      write(chunk) {
        const results = document.getElementById('results');
        results.appendChild(document.createTextNode(chunk)); // tadaa!
      }
    }));
})();
// worker.js
import { fromReadablePort, fromWritablePort } from 'remote-web-streams';
self.onmessage = async (event) => {
  // create the input and output streams from the transferred ports
  const { readablePort, writablePort } = event.data;
  const readable = fromReadablePort(readablePort);
  const writable = fromWritablePort(writablePort);
  // process data
  await readable
    .pipeThrough(new TransformStream({
      transform(chunk, controller) {
        controller.enqueue(process(chunk)); // do the actual work
      }
    }))
    .pipeTo(writable); // send the results back to main thread
};
With this setup, we achieve the desired goals:
- Data is transformed as soon as it arrives on the main thread.
- Transformed data is displayed on the web page as soon as it is transformed by the worker.
- All of the data processing happens inside the worker, so it never blocks the main thread.
The results are shown as fast as possible, and your web page stays snappy. Great success! 🎉
The library works its magic by creating a MessageChannel between the WritableStream and the ReadableStream.
The writable end sends a message to the readable end whenever a new chunk is written,
so the readable end can enqueue it for reading.
Similarly, the readable end sends a message to the writable end whenever it needs more data,
so the writable end can release any backpressure.
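To make the idea concrete, here is a heavily simplified sketch of such a protocol, with both ends in one context. It is not the library's actual implementation: the message shapes ({ type: 'chunk' }, { type: 'close' }, { type: 'pull' }) and the names createLinkedPair and linkedPairDemo are assumptions made for illustration, and error, abort and cancel propagation are left out.

```javascript
// Simplified sketch, NOT the library's real code. Message shapes are assumed.
function createLinkedPair() {
  const { port1: readPort, port2: writePort } = new MessageChannel();

  // Readable end: asks for data with 'pull', enqueues incoming chunks.
  let resolvePull = null;
  const readable = new ReadableStream({
    start(controller) {
      readPort.onmessage = ({ data }) => {
        if (data.type === 'chunk') {
          controller.enqueue(data.chunk);
        } else if (data.type === 'close') {
          controller.close();
          readPort.close(); // closing one port shuts down the whole channel
        }
        // Either way, the outstanding pull request has been answered.
        if (resolvePull) { resolvePull(); resolvePull = null; }
      };
    },
    pull() {
      // Called by the stream when it wants more data: tell the writable end.
      // The returned promise stays pending until a chunk (or close) arrives,
      // so at most one 'pull' message is outstanding at a time.
      return new Promise((resolve) => {
        resolvePull = resolve;
        readPort.postMessage({ type: 'pull' });
      });
    }
  });

  // Writable end: holds each write back until the readable end has pulled.
  let credits = 0;   // number of unanswered 'pull' messages
  let grant = null;  // resolver for a write that is waiting for a credit
  writePort.onmessage = ({ data }) => {
    if (data.type === 'pull') {
      credits++;
      if (grant) { grant(); grant = null; }
    }
  };
  const writable = new WritableStream({
    async write(chunk) {
      // Backpressure: wait here until the other end asks for data.
      if (credits === 0) await new Promise((resolve) => { grant = resolve; });
      credits--;
      writePort.postMessage({ type: 'chunk', chunk });
    },
    close() {
      writePort.postMessage({ type: 'close' });
    }
  });

  return { readable, writable };
}

// Hypothetical demo: write two chunks, close, and read them back.
async function linkedPairDemo() {
  const { readable, writable } = createLinkedPair();
  const writer = writable.getWriter();
  const writes = Promise.all([
    writer.write('hello'),
    writer.write('world'),
    writer.close()
  ]);
  const received = [];
  const reader = readable.getReader();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    received.push(value);
  }
  await writes;
  return received;
}
```

In a real cross-context setup, one of the two ports would be transferred with postMessage, and each end would be constructed separately in its own context.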