Add <script type="module"> and module resolution/fetching/evaluation #443

Merged
merged 1 commit into from
Jan 20, 2016


domenic
Member

@domenic domenic commented Dec 22, 2015

This adds support for <script type="module"> for loading JavaScript modules, as well as all the infrastructure necessary for resolving, fetching, parsing, and evaluating module graphs rooted at such <script type="module">s.

This was spurred on by a request over in whatwg/loader#83 (comment). This should take care of whatwg/loader#83, whatwg/loader#84, and whatwg/loader#82. There is not much overlap with the current loader spec, which is primarily concerned with reflective modules and the author-customizable loading pipeline. This patch is much more about changing HTML's processing model for the script element, and integrating module execution the same way HTML currently integrates script execution. It deals with questions like "when do modules execute relative to HTML parsing" or "how does module fetching/parsing/evaluation integrate with the event loop".

This needs substantial review! Preferably from implementers! It is a complicated topic and I am sure I got some details wrong, in addition to the discussion points noted below.

Here are the decisions that I incorporated while speccing this, which might not be immediately obvious. Some are bolded to indicate they need discussion/help/bikeshedding.

  • Always use the UTF-8 decoder, ignoring the charset="" attribute or the Content-Type header.
  • Disallow module responses which don't have a JavaScript MIME type for their Content-Type header. This is basically building in X-Content-Type-Options: nosniff behavior by default for module scripts.
  • Always use the "cors" fetch mode, with the crossorigin="" attribute controlling the credentials mode: omitted => "omit", "anonymous" => "same-origin", "use-credentials" => "include".
  • Modules are memoized per realm.
  • Use <script defer>-like semantics: do not execute until parsing is finished, and execute in order. The async attribute can be used to opt in to execute as soon as possible (but still no blocking).
  • For module resolution:
    • Allow absolute URLs and anything starting with "./" or "../" or "/", with the latter set being then interpreted as relative URLs. Everything else (e.g. bare specifiers like "jquery") fails for now, and no automatic ".js" is appended.
    • We throw a TypeError for module specifiers that are not parseable as absolute URLs and do not start with "./", "../", or "/". (Honestly I could see arguments for any of EvalError, ReferenceError, SyntaxError, or URIError [sic]. Maybe we should repurpose EvalError.)
  • We use the page base URL as the base URL for resolving relative import specifiers for inline module scripts, and the response's URL (not the request's!) as the base URL for resolving relative import specifiers in external module scripts or imported modules in the tree.
  • In contrast, we dedupe fetches based on the request's URL.
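
As a rough illustration of the resolution and credentials rules in the list above, here is a minimal sketch. The function names `resolveModuleSpecifier` and `credentialsModeFor` are made up for this example and do not appear in the patch, and treating any present-but-unrecognized `crossorigin` value like "anonymous" is an assumption based on how the attribute usually behaves.

```javascript
// Hypothetical helper: resolves an import specifier against a base URL,
// following the rules listed above. Not the spec algorithm itself.
function resolveModuleSpecifier(specifier, baseURL) {
  // Absolute URLs are accepted as-is.
  try {
    return new URL(specifier).href;
  } catch {
    // Not parseable as an absolute URL; fall through to the prefix check.
  }
  // Only "./", "../", and "/" prefixes are treated as relative URLs.
  if (
    specifier.startsWith("./") ||
    specifier.startsWith("../") ||
    specifier.startsWith("/")
  ) {
    return new URL(specifier, baseURL).href;
  }
  // Bare specifiers such as "jquery" fail; no ".js" is appended.
  throw new TypeError(`Cannot resolve module specifier "${specifier}"`);
}

// Hypothetical helper: maps the crossorigin attribute value (null when the
// attribute is omitted) to the credentials mode described above.
function credentialsModeFor(crossorigin) {
  if (crossorigin === null || crossorigin === undefined) return "omit";
  // Assumption: every value other than "use-credentials" acts like "anonymous".
  return crossorigin.toLowerCase() === "use-credentials"
    ? "include"
    : "same-origin";
}
```

For example, under these assumptions `resolveModuleSpecifier("../lib/b.js", "https://example.com/app/main.js")` gives `https://example.com/lib/b.js`, while `resolveModuleSpecifier("jquery", ...)` throws a TypeError.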

To make review easier, I am hosting a compiled version: singlepage, multipage. Sections of note (links go to singlepage):

  • Main <script> section: largely contains authoring guidance updates. New diagram is also available, although the dropbox view doesn't show it inline since we are using absolute URLs.
  • <script> processing model section
    • Prepare a script is significantly different, and asynchronously creates "the script's script", instead of letting that be done by "execute a script block".
    • Execute a script block is simpler, and mostly delegates to the "run a classic script" or "run a module script" algorithms, surrounded by DOM-related stuff.
    • There is a new concept of "the script is ready" which generalizes the current spec's "The task that the networking task source places on the task queue once fetching has completed must do x", to allow it to work for module trees additionally.
  • The scripting section gets a few subsections to do the heavy lifting:
    • Definitions contains definitions for classic scripts and module scripts as separate types, and adds the module map to all environment settings objects.
    • Fetching scripts now includes algorithms that "asynchronously complete". Let me know what you think of this formulation. I need the algorithms to exit quickly, then run a bunch of steps in parallel or in reaction to a task, but then call back to their caller with a result.
    • Creating scripts has a pretty straightforward "create a module script" and "create a classic script"
    • Calling scripts has the old "run a classic script" algorithm but also the new "run a module script" algorithm. Fetching is completed by that point.
    • Integration with the JavaScript module system contains HostResolveImportedModule that is just a lookup in the module map manipulated inside "fetching scripts."
  • Script settings for browsing contexts was not changed but was moved under browsing contexts instead of under scripting.

/cc @whatwg/loader @bterlson @ajklein @Constellation

@caridy

caridy commented Dec 23, 2015

Notes from our conversation via IRC, plus some extra thoughts:

  • somehow, this has to connect to the default loader per realm.
  • it seems that <script type="module"> does not go through the loader for resolving or fetching but import statements always do (this separation is already covered by this patch).
  • module's map should be the loader.registry where the url is the key, and this can be set once the url is resolved, so other import statements can resolve to the same registry entry (mostly for circular dependencies).
  • when storing new entries in loader.registry we need to create a new entry to wire up the loader and the key, (e.g.: let entry = new ModuleStatus(<loader>, <key:url>);) and adding it to the registry, (e.g.: <loader>.registry.set(<key:url>, entry);).
  • all operations on the registry instance are sync.
  • once the source is fetched, and the source text module record is created, the entry can be resolved, (e.g.: entry.resolve('instantiate', <source-text-module-record>););
  • errors can also be tracked (e.g.: entry.reject('fetch', <error-fetching>); or entry.reject('instantiate', <error-creating-new-source-text-module-record>);), and this could happen before or after inserting it into the registry.
  • once the source text module record is created and resolved into the entry (as described above), the final step is to put the evaluation phase in motion by calling entry.load("ready"); (we might add an alternative way to do so), this defers the rest of the pipeline to the loader, returning a promise of the evaluation that resolves to an exotic namespace object, which, for the purpose of this patch, is irrelevant.

@domenic
Member Author

domenic commented Dec 23, 2015

And, most importantly:

  • All of the above can be taken care of in the loader spec, or as follow-up PRs here to improve loader spec integration once the loader spec is implementable, and do not block reviewing or merging this PR to complete the "milestone 0" work in the loader roadmap and give implementers and authors something they can work with. :)

@domenic domenic added the addition/proposal New features or enhancements label Dec 24, 2015
@DigiTec

DigiTec commented Dec 28, 2015

I got hung up on type=module. Is it a good idea to bind a loading behavior to the type which today is used by so many tools to specify the content type itself? This is kind of leaning overly heavy on a bunch of toolchains doing the right thing.

Maybe use a new Boolean attribute?

<script module ...>

@domenic
Member Author

domenic commented Dec 28, 2015

We need to use the type attribute so that downlevel browsers do not interpret the contents as scripts.

type="" has never had anything to do with the content type, in reality. It has a number of values which signify "JavaScript", only one of which is the JS mimetype, and any other value signifies "do not execute this". This patch adds a third value, "module", with its own semantics.
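
To make the three-way split concrete, here is a simplified sketch of how a `type` value could be classified. The function name `classifyScriptType` is invented for this example, the list of JavaScript MIME types is written from memory and may be incomplete, and the `language` attribute is ignored entirely.

```javascript
// Values that signify "JavaScript" (assumed list; may not be exhaustive).
const JAVASCRIPT_MIME_TYPES = new Set([
  "application/ecmascript",
  "application/javascript",
  "application/x-ecmascript",
  "application/x-javascript",
  "text/ecmascript",
  "text/javascript",
  "text/javascript1.0",
  "text/javascript1.1",
  "text/javascript1.2",
  "text/javascript1.3",
  "text/javascript1.4",
  "text/javascript1.5",
  "text/jscript",
  "text/livescript",
  "text/x-ecmascript",
  "text/x-javascript",
]);

// Hypothetical classifier: `typeAttr` is the attribute value, or null if absent.
function classifyScriptType(typeAttr) {
  // A missing or empty type attribute means an ordinary JavaScript script.
  if (typeAttr === null || typeAttr === undefined || typeAttr === "") {
    return "classic";
  }
  const normalized = typeAttr.trim().toLowerCase();
  if (normalized === "module") return "module";
  if (JAVASCRIPT_MIME_TYPES.has(normalized)) return "classic";
  // Anything else (shaders, templates, ...) is never executed.
  return "data";
}
```

Under this sketch, `type="handlebars"` and WebGL shader types keep landing in the "do not execute" bucket, exactly as they do today.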

@DigiTec

DigiTec commented Dec 28, 2015

Mozilla and Microsoft both use this attribute though and so do tools. Some of which validate it or rewrite it.

Legacy MS browsers even hit the registry to look up custom scripting languages. So adding a value here is a bit dangerous to older browsers.

I understand the desire to prevent parsing but I think it may have worse side effects than it is saving from.

@domenic
Member Author

domenic commented Dec 28, 2015

I don't think we have any other choice, unless we want to disallow inline module scripts entirely.

I don't think any danger posed by script type="module" could possibly be greater than the danger posed by the many existing pages with type="handlebars". If legacy browsers/Mozilla/Microsoft are already coping with the web at large, which contains many such instances, they will be able to cope with the web using type="module".

@DigiTec

DigiTec commented Dec 28, 2015

There are many usages of type, for instance, all of the WebGL samples generally use a script block with a type set to something which identifies them as either a vertex or fragment shader.

And at least in IE/Edge our mechanism for type is to parse out a text/ application/ prefix, then treat the remainder as a language. For Edge we'll look this up against known languages, which all bind to Chakra. For legacy IE we'll hit the registry and see if we can find an IActiveScript*

If we are bent on using the type for this, should we not identify the module language itself? Saying module seems to imply the one and only, forever, etc... It doesn't imply room for versioning or a language change of the syntax in the future. If this is the ES6 modules proposal, then should we consider a type that at least identifies the version and parsing semantics? es6-module for instance?

You made one other statement about disallowing inline module scripts. Is that a terrible idea? Inline scripts are pretty bad in general. They cause stalls in the pre-parser, etc... This can be mitigated with various defer/async attributes, and we are starting to imply those by saying type=module per your proposal, but this now means that the concept of modules has to be baked into a lot of places, including the pre-parser (at least in our case, and I realize this is a technical detail that may not apply to other implementations). I haven't thought through all of the conditions of using inline scripts but since they don't have a download context, it may be non-trivial for us to implement the remainder of the algorithms. Will have to spend some time thinking through that once I'm back in the office, if Travis hasn't already jumped on it by then.

@domenic
Member Author

domenic commented Dec 28, 2015

There are many usages of type, for instance, all of the WebGL samples generally use a script block with a type set to something which identifies them as either a vertex or fragment shader.

Sure. Nothing about this proposal prevents that from continuing. As long as you don't use one of the reserved values, you can just treat that as a data-storage attribute like data-whatever. That's not changed by introducing one new reserved value.

If we are bent on using the type for this, should we not identify the module language itself? Saying module seems to imply the one and only, forever, etc... It doesn't imply room for versioning or a language change of the syntax in the future. If this is the ES6 modules proposal, then should we consider a type that at least identifies the version and parsing semantics? es6-module for instance?

Versioning on the web is a classic antipattern, so there's no reason to include that here. Every version must be backward-compatible, so there's no need to specify the version. And pretending that "es6" is a coherent conceptual entity is not good; in reality there are varying levels of support in various engines, including parts of ES6 and parts of ES7 and so on. Even some ES5 is not implemented.

We could name it "js-module" if you really think that the word "module" is too generic. But I don't think we should do that. Similarly to how <script> by itself means "JavaScript legacy script", <script type="module"> should mean "JavaScript module script". And those extra few characters do hurt ergonomics slightly. The vast majority of developers will just curse us for making them type them every time---"of course it's a JS module; what other type of module is there?"

If we need in the future to add new ways of processing script texts, we can invent new reserved values for them (e.g. <script type="wasm">). Hopefully those new script texts won't go down JavaScript's path and will support modules from the beginning, so there won't be any need to specify wasm vs. wasm-module or similar.

You made one other statement about disallowing inline module scripts. Is that a terrible idea?

I was at first sympathetic to this idea, but upon reflection I think it'd be surprising to developers and hurt rapid experimentation. E.g. you could not write modules in a jsbin-like environment. Or I guess you could, using data: URLs in src, but that's just punishing authors.

I haven't thought through all of the conditions of using inline scripts but since they don't have a download context, it may be non-trivial for us to implement the remainder of the algorithms.

The spec largely follows the existing spec for non-module inline scripts, so it would be surprising if this causes you problems. I'd love to hear about them if so, so that we can fix them!!

@johnjbarton

I am hoping some discussion of how this proposal fits in to how developers will use modules will be helpful.

It's my understanding that <script type="module"> as proposed here allows devs to use import statements inside of JS embedded in HTML files. This creates a 'root' module, parsed as an ES module, that loads, parses, and registers all of the imports, then runs the body of the script to, e.g., initialize a web app.

The resulting root module is a bit odd: it's anonymous (correct?). What -- if any -- relationship exists between multiple <script type="module"> modules?

As I don't see any exception for the keyword export I suppose it would be allowed, but it seems like it would be confusing to devs: the statement has no meaning since there is no mechanism to import this anonymous module (correct?).

I have seen requests for named modules created with <script type="module"> . There might be arguments for such a feature in future, but I think we should avoid it initially.

Similarly I think support for defer and async should be delayed. We have existing solutions to support the use cases covered by these features.

In fact, the feature unique to this proposal is blocking rendering and synchronously loaded code. To say this another way, <script type="module" src="./foo.js"> is not equivalent to what we can do absent this proposal:

<script>
System.import('./foo.js');
</script>

because the latter is async.

@domenic
Member Author

domenic commented Dec 28, 2015

The resulting root module is a bit odd: it's anonymous (correct?).

In the sense that it cannot be imported, yes.

What -- if any -- relationship exists between multiple <script type="module"> modules?

None. Each creates a separate tree of module dependencies.

As I don't see any exception for the keyword export I suppose it would be allowed, but it seems like it would be confusing to devs: the statement has no meaning since there is no mechanism to import this anonymous module (correct?).

That's a great point. We should be able to make that fail pretty easily. I'll work on that.

I have seen requests for named modules created with <script type="module">. There might be arguments for such a feature in future, but I think we should avoid it initially.

Agreed. It could be useful but it's not necessary to get things off the ground.

Similarly I think support for defer and async should be delayed. We have existing solutions to support the use cases covered by these features.

defer behavior is the default. We could consider adding async-style behavior in the future to allow execution before parsing of the entire page is finished. But yeah, future stuff.

In fact, the feature unique to this proposal is blocking rendering and synchronously loaded code.

That's not accurate. Nothing in this proposal blocks rendering or synchronously loads code.

System.import('./foo.js');

There is no spec for what this would do in a browser, so I don't know quite what you're referring to. It's certainly not something you can do absent this proposal.

@caridy

caridy commented Dec 28, 2015

As I don't see any exception for the keyword export I suppose it would be allowed, but it seems like it would be confusing to devs: the statement has no meaning since there is no mechanism to import this anonymous module (correct?).

That's a great point. We should be able to make that fail pretty easily. I'll work on that.

We could do this by relying on the source text module record returned by ParseModule(), which contains the internal slot [[LocalExportEntries]], [[StarExportEntries]] and [[IndirectExportEntries]]. But I will argue against this. We can expect a module that executes some initialization, and exports some stuff in case they are used in a composited way, and that is ok.

@domenic
Member Author

domenic commented Dec 28, 2015

We can expect a module that executes some initialization, and exports some stuff in case they are used in a composited way, and that is ok.

Hmm, now I am not sure. Could you find any existing Node module or similar that can be used in this dual manner? In practice I cannot see when this would be a good idea...

@johnjbarton

defer behavior is the default.

Ok I see that in your proposal now. That's too bad. It makes complexity the default behavior. No reason to argue on this point, I know it's a lost cause.

System.import('./foo.js');

There is no spec for what this would do in a browser,

I was referring to
http://whatwg.github.io/loader/#system-loader-instance and http://whatwg.github.io/loader/#loader-import

@caridy

caridy commented Dec 29, 2015

We can expect a module that executes some initialization, and exports some stuff in case they are used in a composited way, and that is ok.

Hmm, now I am not sure. Could you find any existing Node module or similar that can be used in this dual manner? In practice I cannot see when this would be a good idea...

I have seen this before with express apps, while they create an express app and listen for incoming traffic, and exporting the app instance for test or for other more generic aggregation.

Aside from that, I think about this as a reflective form of import "foo.js";, which is perfectly valid whether or not foo.js exports something.

@domenic
Member Author

domenic commented Dec 29, 2015

I guess that is convincing. OK, I'll leave it as-is.

@johnjbarton

Would you consider restricting <script type="module"> to appear after <body>? That would ensure that the implicit defer matched the declaration order. The resulting error message could be a great way for new devs to learn that normal script blocks.

@domenic
Member Author

domenic commented Dec 29, 2015

Would you consider restricting <script type="module"> to appear after <body>?

How? You mean in a validator or something?

That would ensure that the implicit defer matched the declaration order.

What does that mean? Are you also proposing doing the same for <script defer>?

The resulting error message could be a great way for new devs to learn that normal script blocks.

Why would an error message for <script type="module"> be a good teaching tool for something about legacy scripts?

@johnjbarton

Overall I believe your goal with the implicit defer setting is to avoid blocking rendering the way HTML's <script> tag blocks. I am suggesting that you can achieve that goal without "secretly" changing the default behavior of the <script> tag as your proposal requires. We would achieve your goal with less mystery for developers and in a way more consistent with HTML's original design.

Would you consider restricting <script type="module"> to appear after <body>?
How? You mean in a validator or something?

Detecting renderable elements in the HTML parser after the module tag does not seem very difficult. The approach most consistent with HTML's accept-all parsing strategy is for the error to result in the script tag emitting console error and failing to execute.

That would ensure that the implicit defer matched the declaration order.
What does that mean?

As you know, defer on the <script> tag means

This Boolean attribute is set to indicate to a browser that the script is meant to be executed after the document has been parsed. 

The clearest way to express this execution order in a language like HTML is to place the tag at the bottom of the document. Using an attribute to express order was lame; making that lameness a default in some cases but not others does not create less lameness.

Are you also proposing doing the same for <script defer>?

I am proposing that the <script> tag's defer attribute be unaffected by your proposal. Specifically, the <script> tag would not have a different default value for defer when type="module". Instead, the effect you wish to create would be achieved by a different and simpler (for developers) solution.

The resulting error message could be a great way for new devs to learn that normal script blocks.

Why would an error message for <script type="module"> be a good teaching tool for something about legacy scripts?

You and I both expect this feature to be widely adopted. Many naive developers will place <script type="module"> in their document the same way they would place legacy script tags. If the result is an error such as "Module tags cannot block HTML rendering, they must appear after all renderable elements", they will be forced to place the new form at the bottom. As a side effect, some will learn that script blocks HTML and even more will simply adopt the habit of placing script tags at the bottom. Both of these secondary benefits further your goal.

@johnjbarton

Sorry one more quick comment: <script defer type="module"> would be valid anywhere and its semantics would be consistent with <script defer>. Only the default case would be forced to the end of the document.

@domenic
Member Author

domenic commented Dec 29, 2015

I don't see any benefits in making the source document order impact execution order, so I think we have a fundamental disagreement.

@johnjbarton

Source document order affects execution order now, so we can't disagree about that.

The only issue here is whether the script tag default for defer should change. I don't think it should.

@meandmycode

Just to play naive a bit here but since this is the first I've seen of a topic I and I'm sure many others have been waiting on, this doesn't restrict developers from doing <script src="es6.js"></script> correct? I know you mentioned avoiding older browsers from trying to run them but.. who cares?

They'll just bail out on the first import or export they find, why is this any different from any other new introduced syntax like arrow functions, get/sets, destructuring or future things like async?

Developers already have to "cope" with that today and it'll always be a minor pain point.. (but we transpile and like any other web feature, watch adoption rates of capable browsers) so what in the future when the import/export spec is upgraded or we add more syntax, add another type for that? it seems like this is going down a route of becoming yet another weird web thing in a few years time.

Thanks,

@domenic
Member Author

domenic commented Dec 29, 2015

this doesn't restrict developers from doing

Developers can do that, but the resulting file will be parsed as a legacy script instead of a module script. Thus there will be a syntax error for the first import or export statement, sloppy mode will be the implicit default, and top-level declarations will create global variables, plus a few other minor differences. I'm not sure if that's what you meant by "bail out". You can still use new ES features implemented in the browser you're programming against though.
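
Two of these differences can be observed from plain JavaScript. The sketch below uses `new Function` as a rough stand-in for "parsed as non-module code"; that is an approximation (a function body is not quite a classic script), and the helper name `parsesAsClassicCode` is made up for this example.

```javascript
// Returns true if `source` parses as a function body, used here as a rough
// stand-in for "parses outside a module" (an assumption, not a spec algorithm).
function parsesAsClassicCode(source) {
  try {
    new Function(source);
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error;
  }
}

// import/export declarations are only valid when parsing a module.
const importParses = parsesAsClassicCode("import x from './x.js';");
const exportParses = parsesAsClassicCode("export const a = 1;");
const plainParses = parsesAsClassicCode("var a = 1;");

// Sloppy-mode code sees the global object as `this` in a plain call,
// while strict code (the implicit default for modules) sees undefined.
const sloppyThis = new Function("return this")();
const strictThis = new Function("'use strict'; return this")();
```

Here `importParses` and `exportParses` come out false and `plainParses` true, while `sloppyThis` is the global object and `strictThis` is undefined.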

why is this any different from any other new introduced syntax

Because it's not just new syntax: it's drastically new semantics, per the above list. It also has a greatly changed execution model, as it's impossible to execute before dependencies are loaded, so you can't execute in a blocking fashion as is done with legacy scripts.

in the future when the import/export spec is upgraded or we add more syntax, add another type for that

There are no plans to add any new script types to JavaScript, although indeed other languages like wasm might want to do so.

@domenic
Member Author

domenic commented Dec 29, 2015

@johnjbarton it sounds like what you want to do is develop a validator or lint tool for your project's HTML that imposes your preferred ordering of scripts-after-elements, since you think it's important that source order reflect execution order. I don't plan to enforce that style in this spec.

@johnjbarton

Please read what I wrote. I am not discussing style. I am objecting -- in as nice a way as I can ;-) -- to your proposal to change the default behavior of the <script> tag. I did this by proposing an alternative with the same effect. You can disagree without characterizing my suggestion as some style thing.

@domenic
Member Author

domenic commented Dec 29, 2015

I am not changing the default behavior of the script tag. I am introducing a new type of script, module scripts, in addition to the two existing ones---legacy script, which executes in a blocking fashion (potentially modified by boolean defer/async modifiers), and opaque script, which does not execute (and does not have boolean modifiers). Module scripts have their own separate third semantics, of executing after parsing is complete and all their dependencies are loaded and executed. (They also do not have boolean modifiers.)
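
A toy model of the three kinds of script described here might look like the following. It assumes legacy scripts without defer/async, ignores fetch timing altogether, and uses invented names (`executionOrder`, the `kind` field); it only illustrates the relative ordering, not the actual processing model.

```javascript
// Hypothetical model: `scripts` lists script elements in document order,
// each with a `name` and a `kind` of "legacy", "module", or "opaque".
function executionOrder(scripts) {
  const duringParsing = []; // legacy scripts block and run where they appear
  const afterParsing = [];  // module scripts wait until parsing is complete
  for (const script of scripts) {
    if (script.kind === "legacy") {
      duringParsing.push(script.name);
    } else if (script.kind === "module") {
      afterParsing.push(script.name);
    }
    // Opaque scripts never execute, so they are skipped.
  }
  return [...duringParsing, ...afterParsing];
}
```

For a document containing a module script, then a legacy script, then a shader block, this model yields the legacy script first and the module script second, regardless of source order.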

@meandmycode

Thanks for the prompt reply @domenic, much appreciated, by bail out I meant older browsers would error rather than wrongly interpreting a modern script.

I agree that there are big changes to how execution works when import and export are introduced, but I didn't think that would mean developers needed to tip off the browser when that was happening, from my understanding, import and export had to be top level and the first expressions within the source, as such I expect that script loading logic would change so the parser would only determine the execution model once it reaches the first expression, if it encountered an import (or export if that matters) then it would follow the newer rules of execution, otherwise the legacy method.

I'm sure this has been discussed to death elsewhere and there is sound logic as to why, but I just wanted to try to understand.

@domenic
Member Author

domenic commented Dec 29, 2015

Yeah, developers definitely need to tip off the browser as to which type of script they are using. import and export do not need to be the first expressions in a module, and in fact do not need to appear in modules at all; you can still have a module that is used for its side effects, but gets the other benefits of modules like automatic strict mode and no global variables from top-level declarations.

@domenic domenic force-pushed the script-type-module branch 2 times, most recently from 9fc8d00 to 9afdf40 Compare January 20, 2016 18:46
This adds support for <script type="module"> for loading JavaScript modules, as well as all the infrastructure necessary for resolving, fetching, parsing, and evaluating module graphs rooted at such <script type="module">s.

Some decisions encoded here include:

- Always use the UTF-8 decoder, ignoring the charset="" attribute or the Content-Type header parameter.
- Always use the "cors" fetch mode, and use the crossorigin="" attribute to decide between "omit"/"same-origin"/"include" for the credentials mode.
- Require that the response be served with a JavaScript MIME type for its Content-Type header.
- Use <script defer>-like semantics by default: do not execute until parsing is finished, and execute in order. The async="" attribute can be used to opt in to an "execute as soon as possible, in any order" semantic. Unlike for classic scripts, in no cases do we block further parsing or script execution until the module tree is finished loading. This applies to both inline and external scripts.
- For module resolution, allow absolute URLs and anything that starts with "./", "../", or "/", with this latter set being interpreted as relative URLs. Everything else (e.g. bare specifiers like "jquery") fails for now. No automatic ".js" is appended.
- Modules are memoized based on their request URL, whereas import specifiers inside the module are resolved based on the module's response URL.
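
The last decision above (memoize by request URL, resolve imports against the response URL) can be sketched roughly as follows. `createModuleMap` and the injected `fetchText` function are hypothetical names for this example; `fetchText` stands in for the network and is assumed to resolve to an object carrying the final response URL and the source text.

```javascript
// Hypothetical sketch of a per-realm module map.
// `fetchText(requestURL)` must return a Promise of { url, text }, where
// `url` is the response URL (which may differ after redirects).
function createModuleMap(fetchText) {
  const map = new Map(); // request URL -> Promise of module entry

  return function fetchModule(requestURL) {
    // Deduplicate on the request URL: one fetch per distinct URL asked for.
    if (!map.has(requestURL)) {
      const entry = fetchText(requestURL).then((response) => ({
        requestURL,
        // Relative import specifiers inside this module resolve against
        // the response URL, not the URL that was requested.
        baseURL: response.url,
        source: response.text,
      }));
      map.set(requestURL, entry);
    }
    return map.get(requestURL);
  };
}
```

With this shape, two imports of the same request URL share one entry, and a module that was redirected to another host resolves `./sibling.js` relative to where it actually came from.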

In the course of adding this functionality, large parts of script execution, fetching, and creation were refactored, including moving around some sections and reorganizing others. Conceptually, scripts are now either "classic scripts", "module scripts", or "data blocks". The result should generally be clearer and easier to follow.
@hax
Contributor

hax commented May 30, 2016

Is there any way to feature-detect script type=module support?

@graingert

@hax no need. Every browser that supports <script type="module" src="/static/new-src.js"></script> should support ignoring <script nomodule src="/static/legacy-src.js"></script>
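
For code that needs to branch at runtime instead of relying on markup, one commonly used check is whether script elements expose the `noModule` property. The sketch below takes the document as a parameter so it can be exercised outside a browser; the function name `supportsModuleScripts` is made up for this example, and the check assumes that browsers shipped `nomodule` together with module script support.

```javascript
// Hypothetical helper: `doc` is a Document (or anything with createElement).
function supportsModuleScripts(doc) {
  const script = doc.createElement("script");
  // HTMLScriptElement gained `noModule` alongside module script support,
  // so its presence is used as a proxy here.
  return "noModule" in script;
}
```

In a page this would be called as `supportsModuleScripts(document)`.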

@hax
Contributor

hax commented Sep 11, 2017

@graingert When I asked this question there was no nomodule attribute.

@stefaneidelloth

stefaneidelloth commented Dec 26, 2017

Is it possible to import the module that has been defined in a <script type="module"> tag to another <script type="module"> tag in the same html file? Or to another JavaScript file that has been loaded by the html file? Also see following SO question: https://stackoverflow.com/questions/47982205/how-to-import-es6-module-that-has-been-defined-in-script-type-module-tag-ins

@annevk
Member

annevk commented Jan 3, 2018

The answer you got there seems correct. https://whatwg.org/faq#adding-new-features and https://whatwg.org/working-mode might be of interest if you wish to pursue the idea of making it possible somehow.

Labels
addition/proposal New features or enhancements topic: script