Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add context option to webpack config #479

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

samcic
Copy link

@samcic samcic commented Dec 27, 2021

Fixes #478

@ef4
Copy link
Collaborator

ef4 commented Dec 31, 2021

After updating everything for ember 4 compatibility, the unrelated CI failures are taken care of. The failures that remain are pointing out a breaking change here.

Previously, the default context would be the project directory itself. This changes it to the tempdir, which is indeed what I suggested in #478 (comment), but the tests have pointed out that this breaks some supported use cases.

Users can add to the webpack config, and when they do they expect their config to refer to their own project as the context. For example, this test:

https://github.com/ef4/ember-auto-import/blob/1d24f61a3a6f468689e488c5c8e657ad9e22247f/test-scenarios/custom-html-test.ts#L19-L23

covers the case of app adding an additional webpack entrypoint. That relative path breaks under this PR because it's not found in the tempdir.

In another example, an app can add a webpack loader:

https://github.com/ef4/ember-auto-import/blob/1d24f61a3a6f468689e488c5c8e657ad9e22247f/test-scenarios/merged-test.ts#L116

and expect to resolve it relative to the project.

I don't think we should break these use cases, so the context needs to stay as the user's project.

Our challenge is that webpack needs to see files both in the real project directory and in our temp stagingDir (where we synthesize the entrypoint so that webpack doesn't need to try to understand a whole non-embroider ember app, which it cannot do). We want the paths to all those files to be stable so that build output is deterministic. Setting the context to either means the full path to the other will alter the build.

We could go back to the idea of making the tempdir name as stable as possible, but even if we pick the name ourselves, people doing production builds across a fleet of different machines could still get nondeterminism. This problem is probably already present as of your commit 3acee5c, because even though files under the stagingDir are completely relative, files under the real project dir still have a full operating-system path in them that doesn't fall inside the webpack context. It's "more stable" in that it doesn't include a random tempdir, but it does include the true OS path to your project, so if you build it under /Users/a/projects/app vs /Uesrs/b/projects/app you'd get different production hashes.

I think I see an alternative solution, which would be to write a webpack plugin that does the same work as the stagingDir virtually, by directly resolving and loading the things we would have put there.

@samcic
Copy link
Author

samcic commented Jan 1, 2022

@ef4 So as a test I wrote the following plugin with some basic temporary logging:

class DeterministicModuleIdsPlugin {
  apply(compiler) {
    const context = compiler.options.context;
    compiler.hooks.compilation.tap("DeterministicModuleIdsPlugin", compilation => {
      compilation.hooks.moduleIds.tap("DeterministicModuleIdsPlugin", modules => {
        const chunkGraph = compilation.chunkGraph;
        console.log(`\r\n\r\n`);
        for (const module of modules){
          if (module.libIdent) {
            const origId = module.libIdent({ context });
            const knownEndOfTmpStagingDirectoryRoot = "webpack_bundler_ember_auto_import_webpack";
            if (origId.indexOf(knownEndOfTmpStagingDirectoryRoot) > -1) {
              const newId = `.${origId.substring(origId.indexOf(knownEndOfTmpStagingDirectoryRoot) + knownEndOfTmpStagingDirectoryRoot.length)}`;
              chunkGraph.setModuleId(module, newId);
              console.log(`old id: ${origId}`);
              console.log(`new id: ${newId}\r\n`);
            }
          }
        }
        console.log(`\r\n\r\n`);
      });
    });
  }
}

module.exports = DeterministicModuleIdsPlugin;

...and included it in the webpack config via my app:

const DeterministicModuleIdsPlugin = require("./deterministic-module-ids-plugin");

...

  let app = new EmberApp(defaults, {
    autoImport: {
      webpack: {
        plugins: [
          new DeterministicModuleIdsPlugin()
        ],
      }
    }
  });

The build then shows...

image

...and the chunk files are built deterministically 👍 .

The plugin could be considered a bit hacky because it's just removing the temp-directory part of the original module ID. It does this by looking for the location of the "webpack_bundler_ember_auto_import_webpack" string in the module name, which from all I can gather seems to be static.

Regardless, this seems to give module IDs which are the same as if we set the context: stagingDir (as originally implemented in this PR).

Perhaps you have a suggestion for how to more cleanly replace the module IDs (rather than a string replace).

At a high level though, if I understand correctly, we'd just write this plugin as part of ember-auto-import and include it in the default config that we pass to webpack in webpack.ts.

What do you think?

@ef4
Copy link
Collaborator

ef4 commented Jan 1, 2022

Yes, that is the right general direction. The stagingDir currently gives us two things:

  1. It contains a several Javascript files that we synthesize to use as webpack entrypoints.
  2. It contains symlinks to NPM packages that we know the app is using. This step is needed because the deps are not necessarily resolvable normally from the project, because they could be interior dependencies that aren't hoisted by NPM.

To take care of "1", we could use something like https://github.com/sysgears/webpack-virtual-modules to provide the entrypoint files. They can live inside a virtual path under the project, like ./__ember_auto_import__/l.js, etc.

To take care of "2", we can replace the symlinks with resolve.alias or resolve.fallback in the webpack config. We just need to be careful that newly discovered dependencies get into the configuration correctly, without restarting the dev server.

@ef4
Copy link
Collaborator

ef4 commented Jan 2, 2022

Another thought: we might be able to handle "2" with virtual modules too. The idea would be to synthesize a module for each package that's using ember auto import, and place it virtually within that package so it can resolve that package's dependencies.

@samcic
Copy link
Author

samcic commented Jan 3, 2022

@ef4 Thanks again for the help! Perhaps you can help me understand the big picture here. Why do we need to replace the two things that stagingDir is already doing for us? I was under the impression that it's "just" the module ids that are broken (non-deterministic), so I'm wondering why we can't just focus on making them deterministic by using a simple plugin similar to the one I prototyped above that removes the non-deterministic part of the stagingDir path which would normally be used for the module ids.

Said otherwise: Is there some case that my prototype above wouldn't solve?

@ef4
Copy link
Collaborator

ef4 commented Jan 3, 2022

I was responding to your question about "how to more cleanly replace the module IDs". Re-orienting things so that there's no need to replace them seems safest to me.

We control these modules on both "sides": we can put whatever names we want into the webpack config as entry points, and we can provide their actual implementations from wherever we want. So there's no need to temporarily use an id we don't want to keep. We can use a single consistent ID throughout, so that even if other webpack plugins tap into many spots in the build they will never see the wrong ID.

And if we can stick to the public API of a package like webpack-virtual-modules, we will have less to maintain as webpack evolves and changes its internals.

@samcic
Copy link
Author

samcic commented Jan 4, 2022

@ef4 Ok thanks for the clarification, makes sense now 👍 . Many of the concepts involved here are new to me, so I'll need a bit of time to dive into this and produce the next iteration. Will keep you posted!

@samcic
Copy link
Author

samcic commented Jan 12, 2022

@ef4 So I spent some hours just now attempting to get started here, but I think I need a bit of a push in the right direction. Perhaps you can take a look at my latest changed files here. When looking at webpack-virtual-modules, it seems it takes actual javascript code as input for its virtual modules, e.g. from their examples:

var virtualModules = new VirtualModulesPlugin({
  'node_modules/module-foo.js': 'module.exports = { foo: "foo" };',
  'node_modules/module-bar.js': 'module.exports = { bar: "bar" };'
});

I'm wondering how to get this information in our case. In my first attempt I noticed that we're currently doing this in webpack.ts:

    this.opts.bundles.names.forEach((bundle) => {
      entry[bundle] = [
        join(stagingDir, 'l.js'),
        join(stagingDir, `${bundle}.js`),
      ];
    });

...so I thought that the files with paths like join(stagingDir, ${bundle}.js) would actually be on the disk and that I could read their contents to use for generating the virtual modules. After trying that I got "file not found" errors, so I guess that assumption was wrong.

So...I guess I need a tip for what to actually generate the virtual modules from...can you help?

@ef4
Copy link
Collaborator

ef4 commented Jan 13, 2022

Sure, the code you're looking for is here:

https://github.com/ef4/ember-auto-import/blob/63bb10d7755d5c9883a3e275f44d606232758481/packages/ember-auto-import/ts/webpack.ts#L292-L296

That block is responsible for writing the current stagingDir. We would be changing it so that instead of writing those things out of the stagingDir, it sets them up as virtual modules.

@samcic
Copy link
Author

samcic commented Jan 13, 2022

@ef4 Great, thanks for the tip. I'll keep working on it!

@samcic
Copy link
Author

samcic commented Jan 15, 2022

@ef4 Hey again! May I request some more guiding comments? I think I'm getting closer, at least on aspect (1). Would you mind taking a look at my latest changes in webpack.ts and commenting on how best to take the next step?

As it stands, the changes don't seem to have "broken" anything yet, but it's also not yet doing anything useful because the virtual modules aren't yet used as entry points.

I attempted to just replace the stagingDir entry points with the virtual modules (currently commented out on lines 133-135), but when I do so, the tests start to just hang for me (i.e. no error, just endlessly waiting). I considered trying to start debugging what was happening, but I don't have much confidence that the current attempt is "complete" because I'm a bit lost on the part (2) aspect you talked about. Perhaps you could elaborate on that one for me again as well!

Thanks, and apologies for my newbie-behavior here! I'm enjoying the challenge and look forward to being able to make this useful contribution to the project...

@@ -127,8 +130,12 @@ export default class WebpackBundler extends Plugin implements Bundler {
entry[bundle] = [
join(stagingDir, 'l.js'),
join(stagingDir, `${bundle}.js`),
// If I understand correctly, our virtual modules will need to go in here...
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, the virtual modules would replace the existing join(stagingDir, ...) lines.

noParse: (file: string) => file === join(stagingDir, 'l.js'),
// I guess webpack should also be prevented from parsing our virtual modules, so presumably we could
// filter them out like this?
noParse: (file: string) => file === join(stagingDir, 'l.js') || file === `./__ember_auto_import__/l.js`,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and this would now only need to match the l.js in the virtual modules.

this.stagingDir should be able to be entirely refactored away.

@@ -232,7 +245,10 @@ export default class WebpackBundler extends Plugin implements Bundler {
//
// And we otherwise defer to the `skipBabel` setting as implemented by
// `@embroider/shared-internals`.
return dirname(filename) !== stagingDir && shouldTranspile(filename);
return dirname(filename) !== stagingDir
// I presume we also don't want to apply babel to our virtual modules, which I guess could be done as follows...
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -391,6 +424,7 @@ export default class WebpackBundler extends Plugin implements Bundler {
packageName: string;
packageRoot: string;
}): void {
// I guess this is "part 2" of what stagingDir is being used for. Not yet sure how to replace this part...
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The files like ./__ember_auto_import__/${bundle}.js contain imports, and webpack will try to resolve those imported packages relative to that path. It needs to find the right ones. In the stagingDir we did that by making symlinks.

You could instead use resolve.alias or resolve.fallback. See https://webpack.js.org/configuration/resolve/ This is probably a small amount of code to write, but it could have confusing effects on people who were already doing some aliasing.

The alternative I mentioned in #479 (comment) is to actually locate the virtual modules such that they appear to be inside the real package that did the importing, so that the existing dependency resolution would work. This would be very nice because it means there's no confusion about aliases and we'd be closer to how a "normal" webpack build would work (one where webpack looks at the actual app source, instead of ember-auto-import's summary of the app source). This might need wider code changes though, so that you could know the needed imports on a per-package (meaning the app and any addons that are using ember-auto-import) basis.

@samcic
Copy link
Author

samcic commented Feb 11, 2022

@ef4 Thanks for the info! Just wanted to let you know that, although we just ended up using the size option in our project in order to upgrade to Ember 4 with deterministic builds, I'm still committed to contributing to this project. Hoping to find some time this weekend to dive into it again...

@samcic
Copy link
Author

samcic commented Apr 1, 2022

Just an FYI that I'm unfortunately struggling to get back to this work, mainly because we found a very satisfactory workaround (using the size option) and because, as a newbie on this project and all things webpack, I feel that this task is a bit above me. Anyone else that's willing to contribute here is welcome!

@ef4
Copy link
Collaborator

ef4 commented Apr 1, 2022

Yup, agreed, the scope of this is bigger than it looked.

I'm keeping this PR open though because we will definitely want to keep working on it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Chunk file output not deterministic
2 participants