Worker modules #550

Closed
domenic opened this issue Jan 22, 2016 · 7 comments
Labels
addition/proposal New features or enhancements topic: script

Comments

@domenic
Member

domenic commented Jan 22, 2016

In the epic #443, we added <script type="module"> to allow execution of scripts using the ES Module syntactic goal and semantics. Previously, in https://www.w3.org/Bugs/Public/show_bug.cgi?id=22700, there was discussion of allowing modules to be the entry point for workers as well. That thread is pretty long though, and the worker-modules idea there was actually a tangent off of the original thread of "inline workers" (i.e. workers without an external file). So let me start a new thread over here specifically focused on worker modules.

My original thought process was to let <script type="module"> sink in for a while, get implemented, and then extend to workers. But @bterlson (Chakra team) pointed out to me that implementers would probably be better served by having an idea what's coming, either so they can implement both at once or so they can at least architect their code to be ready.

I think the best idea is to extend the Worker (and SharedWorker and ServiceWorker and...) constructors: either new Worker(scriptURL, { type: "module" }) or new Worker(scriptURL, { module: true }). The type variant is probably more future-friendly (imagine in the future new Worker(scriptURL, { type: "wasm" }) or similar). But we'd have to decide how much to parallel the <script type=""> attribute, e.g.: do we say that any JavaScript MIME type maps to a classic script worker? do we say that anything that isn't a JavaScript MIME type or "module" is ignored? or throws? Maybe it would be cleaner to pick a new word like new Worker(scriptURL, { mode: "module" }).
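
To make the `type` variant concrete, here is a minimal sketch of the call shape. The helper name `startWorker` and its injectable `WorkerCtor` parameter are hypothetical, introduced only so the shape can be exercised outside a browser; nothing here is specced yet.

```javascript
// Hypothetical helper, not part of any spec: picks between the existing
// one-argument form and the proposed { type: "module" } options bag.
function startWorker(scriptURL, asModule, WorkerCtor = globalThis.Worker) {
  if (typeof WorkerCtor !== "function") {
    throw new Error("No Worker constructor available in this environment");
  }
  // Module entry point: pass the proposed options bag as the second argument.
  if (asModule) {
    return new WorkerCtor(scriptURL, { type: "module" });
  }
  // Classic entry point: the existing call is left untouched.
  return new WorkerCtor(scriptURL);
}
```

Keeping the classic call unchanged means existing code would not need to know about the new option at all.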

Here are some questions that might be debatable, and possible answers for them:

  1. Should we do MIME type checking for module workers? (Classic workers do not; module scripts do, per #443.) I think we should; we should in general add stricter MIME type checking to all new features.
  2. Should we make import parallel importScripts and be synchronous, instead of the async two-stage process we have for module scripts? This would in particular impact code interleaved between import statements: in import './a.js'; foo(); import './b.js';, is performing foo() blocked on fetching b, or does it execute as soon as a is fetched and executed? (It would not prevent the UA from prefetching and crawling a module tree starting at a module worker.) I think we should probably just stick with the two-stage process from module scripts, since I can't see any real use cases for the sync loading and it seems better to be consistent to avoid potential developer confusion between the different models.
  3. Should we disallow importScripts inside a module worker? I think we should probably be fine allowing it; in theory authors should probably use import instead, but in practice maybe they want to mix and match with classic scripts designed for importScripts usage.
  4. Should new worker constructors automatically be modules, possibly with ability to change? It's probably too late for ServiceWorker and SharedWorker, but some of the Houdini stuff might want to go this route.
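
As an illustration of question 1, the check could look roughly like the sketch below. The helper names are hypothetical, and the MIME type list is deliberately abridged; a user agent would consult the full list of JavaScript MIME types.

```javascript
// Abridged list of JavaScript MIME type essences. This is an assumption made
// for the sketch, not the complete list a user agent would use.
const JAVASCRIPT_MIME_TYPES = new Set([
  "application/javascript",
  "application/ecmascript",
  "application/x-javascript",
  "text/javascript",
  "text/ecmascript",
  "text/x-javascript",
]);

// Hypothetical helper: true if a Content-Type header value names a
// JavaScript MIME type.
function isJavaScriptMimeType(contentType) {
  if (typeof contentType !== "string") return false;
  // Drop parameters such as "; charset=utf-8" and compare case-insensitively.
  const essence = contentType.split(";")[0].trim().toLowerCase();
  return JAVASCRIPT_MIME_TYPES.has(essence);
}

// Module workers would reject a script when the check fails; classic
// workers, as noted above, perform no such check.
function shouldAcceptWorkerScript(workerType, contentType) {
  return workerType === "module" ? isJavaScriptMimeType(contentType) : true;
}
```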

I think the actual spec will not be very hard; worker loading is a much smaller portion of the spec, with fewer switches and dials and historical accidents, than script loading. If implementers would like to see a spec, please chime in and I can work on this. Or if you just want to know the general shape of the plans, I think the above sums it up. Let me know.

@domenic domenic added the addition/proposal New features or enhancements label Jan 22, 2016
@annevk annevk added the needs implementer interest Moving the issue forward requires implementers to express interest label Jan 22, 2016
@annevk
Member

annevk commented Jan 22, 2016

  • Yes, MIME types.
  • Having type as an enum with module as only value seems fine. (The wasm folks want to try to use content negotiation / sniffing by the way, to ease migration.)
  • It seems bad for parallelism to make import synchronous so let's not do that.
  • If we allow importScripts() you can get CORS-cross-origin scripts. Not sure we want that.

When I last discussed this I brought up the concern that it was unclear to me how you'd bootstrap a custom Loader in a worker. @dherman had the idea that we could allow loading several scripts, that the user agent executes in order. If you can indeed interweave import statements with script, some variant of importScripts() is enough to accomplish that in theory, but it does not seem very performance conscious.

@domenic
Member Author

domenic commented Jan 22, 2016

> It seems bad for parallelism to make import synchronous so let's not do that.

I don't think this is true. UAs can always preload and prefetch ahead of time as a completely separate process from the specced algorithm. But if we use the same infrastructure as script type="module", they aren't allowed to start executing until everything is loaded (since we don't want to ever block in the middle of a module script). In theory, we could allow worker modules to be different: you could start executing immediately, and when you encounter an import block until the subtree starting at that import is fully fetched, and then continue executing until you hit another import, ...
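
A toy simulation may help show what the two-stage model implies for ordering. Everything here is hypothetical: `twoStageOrder` and the `graph` shape are invented for illustration, and real fetching is asynchronous and parallel. The sketch only records that every fetch is logged before any evaluation starts.

```javascript
// Hypothetical simulation of the two-stage model: `graph` maps a module URL
// to the URLs it imports. Returns the order of "fetch" and "run" events.
function twoStageOrder(graph, entry) {
  const log = [];

  // Stage 1: fetch the entire import tree before anything executes.
  const fetched = new Set();
  const queue = [entry];
  while (queue.length > 0) {
    const url = queue.shift();
    if (fetched.has(url)) continue;
    fetched.add(url);
    log.push(`fetch ${url}`);
    queue.push(...(graph[url] ?? []));
  }

  // Stage 2: evaluate dependencies first, then the module that imports them.
  const evaluated = new Set();
  const evaluate = (url) => {
    if (evaluated.has(url)) return;
    evaluated.add(url);
    for (const dependency of graph[url] ?? []) evaluate(dependency);
    log.push(`run ${url}`);
  };
  evaluate(entry);

  return log;
}
```

For a graph where `worker.js` imports `a.js` and `b.js`, and `b.js` imports `c.js`, all four fetch events come first and `run worker.js` comes last. Under the blocking alternative described above, fetch and run events would instead be interleaved.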

> If we allow importScripts() you can get CORS-cross-origin scripts. Not sure we want that.

Good point. We could change the semantics of importScripts fetches inside module workers I guess. Possibly we could even do new Worker({ type: "module", crossorigin: "one of the three values" }) to more exactly parallel <script type="module" crossorigin="one of the three values"> (with the credentials mode propagating down the entire tree, both for import and importScripts.) Maybe it is simpler to just disallow importScripts though...

@DigiTec

DigiTec commented Jan 22, 2016

I think developers get pretty confused when API behavior changes just based on context. Look how many developers get confused when SVG elements don't have the HTMLElement properties and methods on them. It's just confusing. Especially once we started embedding SVG inside of HTML.

If we want to do things like maybe hide APIs (importScripts and only allow import) or change the defaults on how things are fetched, then it may end up being better to define a ModuleWorker or something like that rather than tweaking a bunch of existing APIs.

I think, as we fully spec this, the decision would fall out from how many APIs we end up tweaking. I would be interested in a specification here. As we are implementing some of the module stuff based on your script type="module" spec and I'm implementing proper micro-task support for Promises in our engine, I'm finding my work runs into workers. Will probably save me some time if I can build the Worker event loop knowing how Modules will plug into it.

@domenic
Member Author

domenic commented Jan 23, 2016

> I think developers get pretty confused when API behavior changes just based on context.

I generally agree. I guess it depends on what you count as "context". I'm not sure I personally see any difference between new WorkerModule("script.js") and new Worker("script.js", { type: "module" }), especially since WorkerModule and Worker instances would have identical interfaces. But I would prefer to keep importScripts, perhaps with security tweaks for CORS.

> I think, as we fully spec this, the decision would fall out from how many APIs we end up tweaking. I would be interested in a specification here.

Sounds good! I'll try to give it a shot next-next week. I plan to spec the following decisions, all negotiable through further discussion:

  • Constructor signature new Worker(scriptURL, { type: "module", crossOrigin }). When using type module, the following apply:
  • Added MIME type checking
  • Both import and importScripts use fetch-mode "cors" and credentials mode according to the crossOrigin option.
  • Two-stage fetching/evaluation, like for <script type="module">, for consistency. (So import is unlike importScripts.)

@DigiTec

DigiTec commented Jan 23, 2016

Look forward to the spec!

One thing to think about. Since you are allowing the worker to be loaded as a module, it means that its imports will be resolved before it ever executes. When it does execute, all of the imports will be ready to execute as well and that will be when importScripts gets to run. So it'll lead to things like:

import './a.js';
importScripts(...); // synchronous
import './b.js';

Where import a/b will be found and linked and fetched before we ever get to the importScripts due to the two stage fetching/evaluation.

Now once we start executing, we have no way of, say, injecting a script type="module". In fact we can ONLY run importScripts. Does this then mean that importScripts continues to import modules, or does it import loose scripts instead? Do we need to upgrade importScripts or have an importModules capability to balance this out?

We do a lot of our testing using a single worker and using postMessage to inform the worker to load new scripts. I would imagine the same for modules. We'd want to postMessage and have the worker import additional modules.

@domenic
Member Author

domenic commented Jan 23, 2016

> Does this then mean the importScripts continues to import modules or does it import loose scripts instead? Do we need to upgrade importScripts or have an importModules capability to balance this out?

I wasn't planning to change the semantics of importScripts; it still does the same thing as before. I think we either keep it the same so that it continues to work with existing libraries that are designed to work with importScripts (like the cache polyfill for service worker), or we just disable it entirely.

I also wasn't planning to add new importModules functionality, but if that's a feature request then let's see if there's interest from vendors across the board. In general runtime loading of modules has been pushed off to the loader API work happening over in whatwg/loader, which still has a ways to go. We could hack in importModules; it's reasonably clear how to spec that. But I think maybe it's better to wait. You can always dynamically construct a worker using data or blob URLs, or create nested workers, for the sort of use cases you describe.
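
As a rough sketch of the data URL route, assuming the proposed `{ type: "module" }` option is available: the helper name `moduleWorkerFromSource` and its injectable `WorkerCtor` parameter are hypothetical and exist only so the example can run outside a browser.

```javascript
// Hypothetical helper: wraps module source text in a data: URL and starts a
// module worker from it, using the options bag proposed in this thread.
function moduleWorkerFromSource(source, WorkerCtor = globalThis.Worker) {
  // Percent-encode the source so it survives as the body of a data: URL.
  const url = "data:text/javascript," + encodeURIComponent(source);
  return new WorkerCtor(url, { type: "module" });
}
```

A test harness could call this once per batch of modules instead of asking a long-lived worker to load more. Import specifiers inside `source` would need to be absolute URLs, since a data: URL gives relative specifiers nothing to resolve against.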

@domenic domenic removed the needs implementer interest Moving the issue forward requires implementers to express interest label Feb 1, 2016
domenic added a commit that referenced this issue Feb 2, 2016
Closes #550. The Worker and SharedWorker constructors now get an options
object, which can be specified as { type: "module", crossOrigin }, where
crossOrigin specifies the credentials mode used for the initial fetch
and for subsequent imports. The module fetching and execution machinery
is entirely reused from that for <script type="module">.

This commit does not include any modifications to importScripts; those
are in a subsequent commit.
domenic added a commit that referenced this issue Feb 11, 2016
Closes #550. The Worker and SharedWorker constructors now get an options
object, which can be specified as { type: "module", credentials }, where
credentials specifies the credentials mode used for the initial fetch
and for subsequent imports. (This only applies to module workers; in
the future we could extend it to classic workers if we wished.)

importScripts will always throw a TypeError in modules, per discussion
in the pull request.

The module fetching and execution machinery is entirely reused from that
for <script type="module">.
domenic added a commit that referenced this issue Feb 11, 2016
Closes #550. The Worker and SharedWorker constructors now get an options
object, which can be specified as { type: "module", credentials }, where
credentials specifies the credentials mode used for the initial fetch
and for subsequent imports. (This only applies to module workers; in
the future we could extend it to classic workers if we wished.)

importScripts will always throw a TypeError in modules, per discussion
in the pull request.

The module fetching and execution machinery is entirely reused from that
for <script type="module">. This also refactors the machinery for
fetching a classic worker script out into its own sub-algorithm of the
"Fetching scripts" section.
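
A minimal sketch of how calling code might use the shape described in these commit messages. Both helper names are hypothetical. The three credentials modes are the ones defined by Fetch, and the TypeError handling follows the commit message's statement that importScripts always throws in modules.

```javascript
// The credentials modes defined by Fetch.
const CREDENTIALS_MODES = ["omit", "same-origin", "include"];

// Hypothetical helper: builds the options bag { type: "module", credentials }.
function moduleWorkerOptions(credentials) {
  if (!CREDENTIALS_MODES.includes(credentials)) {
    throw new TypeError(`Unknown credentials mode: ${credentials}`);
  }
  return { type: "module", credentials };
}

// Hypothetical helper for code shared between classic and module workers.
// Returns false when importScripts throws a TypeError, as it would in a
// module worker; any other error is rethrown.
function tryImportScripts(scope, ...urls) {
  try {
    scope.importScripts(...urls);
    return true;
  } catch (error) {
    if (error instanceof TypeError) return false;
    throw error;
  }
}
```

On the page side this would be used as `new Worker(scriptURL, moduleWorkerOptions("same-origin"))`; inside the worker, `tryImportScripts(self, ...)` lets shared code fall back to `import`-based loading.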
@subhranshudas

I have a question related to ES6 modules and (dedicated) web workers. Let's say I have a module called MyModule which exposes a few functions like meth1(), meth2(), etc.
How can I workerize my entire module MyModule in a web worker, like this:

import MyModule from 'my-module';

const workerizedMod = workerize(MyModule);

workerizedMod.meth1();
workerizedMod.meth2();

where workerizedMod.meth1() and workerizedMod.meth2() run in a worker thread?


4 participants