Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minimum viable module source #36

Merged
merged 47 commits into from
May 9, 2023
Merged

Conversation

guybedford
Copy link
Collaborator

@guybedford guybedford commented May 1, 2023

Here's a first pass at what we've been discussing with respect to a minimum viable module source.

We define %AbstractModuleSource% as an intrinsic base class for module source class instances. It defines the toStringTag getter returning the internal [[SourceModuleRecord]]'s [[SourceClassName]] string (where [[SourceModuleRecord]] also has a [[HostSourceMetadata]] slot). This allows carrying host-context through this object back to the compilation phase with an abstract method that can be called by other specs within HostLoadImportedModule, where [[HostDefined]] on the instance record can be recreated on new instances in this way.

A new GetModuleSource object is used to provide the module source via Cyclic Module Record, and can be redefined by suitable cyclic module subclasses as necessary.

Feedback very welcome!

@guybedford guybedford requested review from nicolo-ribaudo and lucacasonato and removed request for nicolo-ribaudo May 1, 2023 05:50
README.md Outdated Show resolved Hide resolved
spec.emu Show resolved Hide resolved
spec.emu Outdated Show resolved Hide resolved
spec.emu Outdated Show resolved Hide resolved
spec.emu Show resolved Hide resolved
@guybedford guybedford force-pushed the minimum-viable-module-source branch from 13ac1d6 to 2121671 Compare May 1, 2023 22:59
@guybedford
Copy link
Collaborator Author

@nicolo-ribaudo I've updated this PR to use your phase generalization model.

@guybedford guybedford force-pushed the minimum-viable-module-source branch from dae2a89 to 1b222fe Compare May 1, 2023 23:09
README.md Outdated Show resolved Hide resolved
spec.emu Outdated Show resolved Hide resolved
Copy link
Member

@nicolo-ribaudo nicolo-ribaudo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still have to catch up with the discussion that lead to the .sourceType accessor (#36 (comment)), but in the meanwhile I want to write down what I believe was the conclusion from yesterday. I also left some other comments that are mostly unrelated to this.


Yesterday we agreed that:

  • all module sources should extend form the same base class AbstractModuleSource (name tbd? I don't remember if it was the final name).
  • AbstractModuleSource is "abstract", i.e. it cannot be directly instantiated
  • AbstractModuleSource.prototype[@@toStringTag] is a getter that throws when it's called on a receiver that doesn't have the [[ThisIsAModuleSource]] internal slot (name tbd)
  • AbstractModuleSource.prototype[@@toStringTag] would return the string "[object AbstractModuleSource]"
  • All the "concrete" Module Source classes should define their own [@@toStringTag] prototype property, with AbstractModuleSource.prototype[@@toStringTag] returning "[object AbstractModuleSource]" being the fallback for when it's not shadowed (thanks to prototypical inheritance)

This satisfies the following goals:

  • it is be possible to brand-check module sources (i.e. is this object a module source?) using the AbstractModuleSource.prototype[@@toStringTag] getter
  • if a library cares about opaquely handling module sources, it is able to do so without having to use XxxModuleSource/YyyModuleSource-specific APIs
  • if a library only wants to support XxxModuleSource but not YyyModuleSource, it doesn't need to hard-code any YyyModuleSource-specific methods. However, it should still use XxxModuleSource-specific methods (for example, to check that the module source it received is actually an XxxModuleSource and not something else, or for example to handle custom Wasm instantiation).

In practice, this is expressed by the following implementation:

class AbstractModuleSource {
  #isModuleSource;

  constructor() {
    if (new.target === AbstractModuleSource) {
      throw new Error("Abstract class, you must extend it");
  }

  get [Symbol.toStringTag]() {
    if (!(#isModuleSource in this)) throw new Error("Invalid receiver");
    return "[object AbstractModuleSource]";
  }
}

WebAssembly.Module = class Module extends AbstractModuleSource {
  [Symbol.toStringTag] = "WebAssembly.Module";
}

class ModuleSource extends AbstractModuleSource {
  [Symbol.toStringTag] = "ModuleSource";
}

There was then the discussion about how to expose AbstractModuleSource.prototype[@@toStringTag] through ECMA-262 objects, with two different alternatives:

  1. The proposal only defines AbstractModuleSource, and it's exposed as a property of the global object
  2. The proposal defines both AbstractModuleSource and (for JS) ModuleSource, and only ModuleSource is exposed as a property of the global object. AbstractModuleSource.prototype would still be reachable using Object.getPrototypeOf(ModuleSource.prototype), similarly to how it's for AsyncFunction&co.

Personal opinions

  • We never discussed about only having AbstractModuleSourcePrototype and not the AbstractModuleSource function itself, but if there is already a precedent for this and if we go with option (2) it's a neat simplification that I might like.
  • As an aside, I really dislike the source-text string for JS modules because any language that is not compiled to a binary blob is "source text".
  • If we can't go with what I believed we agreed on yesterday, we should really try to avoid having two methods/accessors just for the purpose of checking "is this a module source? which type?" -- we should just go with what %TypedArray% does (https://tc39.es/ecma262/#sec-get-%typedarray%.prototype-@@tostringtag)

spec.emu Outdated Show resolved Hide resolved
spec.emu Outdated Show resolved Hide resolved
spec.emu Outdated Show resolved Hide resolved
@ljharb
Copy link
Member

ljharb commented May 2, 2023

@nicolo-ribaudo not precisely; the best approach for my use cases is if the base abstract class has a getter that returns the string contents of a slot, and subclasses override the slot but not the getter.

guybedford and others added 6 commits May 4, 2023 08:02
@guybedford
Copy link
Collaborator Author

@nicolo-ribaudo I've handled the requests problem by piping the ModuleRequest record through the host resolve hooks.

Instead of explicitly creating ModuleSourceObject, the expectation of the HostLoadImportedModule hook is that when the phase is source, it must then populate the ModuleSourceObject. If it doesn't we then throw the reference error.

As far as I can tell this refactoring does in fact better align with import attributes. It also seems useful to provide the phase information to the host hook, so far as the preloader needs to be aware of the phasing, which seems critical information as well.

Some other minor edits were needed to get the piping to work, there may well be further future simplifications on this stuff (eg ModuleSource Records as first class host hooks, resources / internal registry as first class keys), but for now this seems like a reasonable compromise on the design while aligning with the other spec changes for the most part.

Let me know your thoughts further!

@guybedford
Copy link
Collaborator Author

Per @lucacasonato's feedback I've removed the global ModuleSource for now in this PR. His argument was that defining an object that does not have any use at all would not be reasonable at this point. At the very least we can still define AbstractModuleSource and make this reachable through WebAssembly.Module for the WebAssembly use case. JS sources would then continue to throw for now during link time, while having a path for support in future.

Copy link
Member

@nicolo-ribaudo nicolo-ribaudo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I talked about this pull request with @guybedford privately, specifically about the signature of HostLoadImportedModule, and there is a disagreement between us that we have not yet been able to solve.

(tl;dr: scroll down to the end of this comment, after the <hr>, for concrete next steps)

I do not think that this proposal should change the signature of that host hook, and that its parameters should continue being:

  • referrer: the Module Record of the module that contains the import
  • specifier: a String representing the import (e.g. "foo" in import { x } from "foo"). The import attributes proposal will extend this to be a (specifier, attributes) pair, because attributes are necessary to identify the requested module as much as the specifier string.
  • hostDefined: some host-specific data, opaque to ECMA-262, to be propagated across the whole loading process. This is currently needed by HTML (step 3 of https://html.spec.whatwg.org/#fetch-the-descendants-of-and-link-a-module-script), but I would eventually like to try refactoring the HTML spec to not rely on this.
  • payload: some ECMA-262-specific data, opaque to the host, that is needed by ECMA-262 to continue the loading process after that the host finishes loading the requested module.

Module requests (a fancy word used by the spec to describe the intention behind import statements) with the same specifier should be deduplicated, and we should call the host hook once per (specifier, referrer) pair.

@guybedford would prefer to change the hook signature as follows:

  • referrer: the Module Record of the module that contains the import
  • (specifier, phase): a pair containing the string specifier of the import and the corresponding phase (with this proposal, ~source~ or ~evaluation~).
  • hostDefined: some host-specific data, opaque to ECMA-262, to be propagated across the whole loading process. This is currently needed by HTML (step 3 of https://html.spec.whatwg.org/#fetch-the-descendants-of-and-link-a-module-script), but I would eventually like to try refactoring the HTML spec to not rely on this.
  • payload: some ECMA-262-specific data, opaque to the host, that is needed by ECMA-262 to continue the loading process after that the host finishes loading the requested module.

Module requests with the same specifier and with the same phase should be deduplicated, and we should call the host hook once per (specifier, referrer, phase) triple.

My motivation is that, as we have discussed/discovered when deeply analyzing what "phases" are and how they are different from import attributes, the phase should not affect how an individual module is loaded and thus we should not pass it to the host. This is exactly the same reason as to why we will pass attributes but not the phase to the import hook in the compartments proposal. The responsibility of handling the import phase is completely on ECMA-262, and we should avoid blurring the layering line.

Guy motivations are that:

  1. Passing the phase would allow the host to not populate the [[ModuleSourceObject]] slot of the returned module record unless it's going to be used, so that we avoid allocating an unnecessary object.
  2. Browsers have a preloader that eagerly loads dependencies of modules, to optimize loading times. This preloader needs to know if a module is "fully imported" or if we are only importing its source, to avoid creating HTTP requests for its dependencies unless they are necessary.
  3. The phase is needed by ECMA-262 to resume the loading process, so we need to pass it through.

I believe (1) has been solved by replacing [[ModuleSourceObject]] with a GetModuleSourceObject() "getter". I also believe that this was not necessary, because allocating an object is unobservable and thus implementations could have already deferred that work until the [[ModuleSourceObject]] internal slot was accessed, implementing it internally as a getter. However, I don't dislike the current solution since most of the other extension points of Module Record are already based on abstract methods overwritten by the subclasses.

For (3), we already have the payload whose purpose is exactly that.

My answer for (2) is probably the one that has been less convincing, but:

  • this preloader is not defined anywhere; neither in ECMA-262 nor in HTML. It's something defined internally in the engines, that is not observable (unless you observe the HTTP requests received by your server, obviously). Host hooks are meant to provide an interface for host specs, or to provide an integration API for ECMA-262 (I know that some delegates disagree in this second point, don't hold me accountable on this :P). Implementations that have magic optimizations are still free to read any internal state defined in the ECMA-262 spec, even if that's not officially exposed through host hooks, unless these optimizations would have observable effects from within ECMA-262.
  • the preloader is already speculative and based on heuristics, and it not always "correct" (where by correct I mean "do the optimal amount of work; no more and no less"). As an example, if a long file has a bunch of import statements at the top and a syntax error at the bottom, the preloader might start loading the dependencies declared in those import statements even if ECMA-262 will never call HostLoadImportedModule to load them. There would need to be one more heuristic to skip preloading dependencies of modules imported just as module sources (potentially, since the dependencies might still be needed in case later the source is instantiated with the Module constructor from compartments), but this heuristic can be based on data not directly exposed by ECMA-262 to host specs.

Additionally, given that module types that we initially expect to work with this proposal (i.e. WebAssembly.Module, and nothing else) do not support loading dependencies thorugh the current module loader, there will be no preloaded involved with their dependencies. If passing the information about the phase ends up being necessary (and I don't think it will, due to the reasons above) we can always update the host hook signature once there is a concrete need for it.


I recommended to @guybedford to merge this PR even if I disagree with this layering interface, because this proposal is going to be presented at the next TC39 meeting in one week and we should have the observable semantics merged as soon as possible so that other delegates can read them in advance.

However, I re-reviewed the changes pushed in the past few commits and I don't think that this PR can be merged as-is (read my comments below as for why).

@guybedford would you be ok with reverting all the commits after fab8784 (i.e. git reset --hard fab8784b), only delete the ModuleSource constructor on top of that, merge this PR, and re-open a separate PR with the remaining changes where we can continue the discussion?
If it helps, I have prepared those changes in https://github.com/nicolo-ribaudo/proposal-import-reflection/tree/prepare-pr-for-merging so you could just force-push that branch to this PR.

spec.emu Outdated Show resolved Hide resolved
spec.emu Outdated Show resolved Hide resolved
spec.emu Outdated Show resolved Hide resolved
1. <del>Set _state_.[[PendingModulesCount]] to _state_.[[PendingModulesCount]] + _requestedModulesCount_.</del>
1. <del>For each String _required_ of _module_.[[RequestedModules]], do</del>
1. <ins>For each ModuleRequest Record _required_ of _module_.[[RequestedModules]], do</ins>
1. If _module_.[[LoadedModules]] contains a Record whose [[Specifier]] is _required_<ins>.[[Specifier]]</ins>, then
Copy link
Member

@nicolo-ribaudo nicolo-ribaudo May 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR now duplicates entries in the [[ReqeustedModules]] list in case there are different phases (i.e. import source s from "x"; import x from "x"), however this check still skips calling HostLoadImporteedModule if that module has already been loaded even if for a different phase. Is this intentional?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is by design.

@kriskowal
Copy link
Member

I’m reminded that we at Agoric take a strong position that every proposed new intrinsic like %AbstractModuleSource% or %AbstractModuleSource.prototype% must be reachable by transitive property access starting with globalThis and that this is a good enough reason for globalThis.ModuleSource to exist even if it is initially inert.

@littledan
Copy link
Member

How are we doing on making the call of sourceType vs @@toStringTag type checks? I'd be OK with either outcome personally (as long as we don't have multiple strong type checks).

@guybedford
Copy link
Collaborator Author

@littledan the way that ended up last week was that we now have a toStringTag getter function on AbstractModuleRecord which checks a [[SourceClassName]] internal slot from the object and returns that as its value, exactly mirroring TypedArray as suggested by @nicolo-ribaudo.

@ljharb
Copy link
Member

ljharb commented May 9, 2023

(most builtins have multiple slot checks, but any additional ones added to module things would, i hope, have an actual purpose beyond that)

@guybedford
Copy link
Collaborator Author

@nicolo-ribaudo I believe I have addressed all of your feedback, I think it would be useful to carefully distinguish editorial suggestions from normative changes at this point in the review process. Specifically editorial changes can be discussed after this PR is merged, but I know you'll agree getting this merged is the priority at this point.

I will go ahead and land this today, unless there are any further objections.

Copy link
Member

@nicolo-ribaudo nicolo-ribaudo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very reluctantly approving 😛

After this PR is merged, I plan to propose some changes to the layering. I still have to figure exactly what would satisfy me while still meeting your goals.

I just left a comment deleting a line that you probably copy-pasted from an older version of the PR, if @lucacasonato could commit it and merge this PR it would help us moving quicker.

spec.emu Outdated Show resolved Hide resolved
@nicolo-ribaudo
Copy link
Member

@erights @kriskowal I added exposing %AbstractModuleSource% to the agenda for the next SES meeting, if that's ok.

lucacasonato and others added 2 commits May 9, 2023 12:26
Co-authored-by: Nicolò Ribaudo <nribaudo@igalia.com>
Copy link
Member

@lucacasonato lucacasonato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Landing because it's semantically correct now.

@nicolo-ribaudo will address some editorial things in a follow up.

</li>
<li>
<p><ins>When `[[Phase]]` is set to ~source~, this should not affect the identity or state of the resultant Module Record as if it had been called for the ~evaluation~ phase.</ins></p>
<p><ins>In addition, the ~source~ phase indicates that preloading should not take place in the host for this module.</ins></p>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Preloading" is not an ECMA262 thing, so this should be a note (ie not normative).

I still question the need for passing phase but this does not impact observable semantics to users, as long as there is note that identity and load is not affected. Let's consider this in a follow up.

@lucacasonato lucacasonato merged commit 4defdd7 into main May 9, 2023
@lucacasonato lucacasonato deleted the minimum-viable-module-source branch May 9, 2023 10:29
@bakkot
Copy link

bakkot commented May 10, 2023

Heads up that 262 editors haven't had / probably aren't going to have time to review this before next week, so it's probably not going to be able to achieve stage 3 at this meeting for that reason.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants