Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build: Generate stable module and chunk IDs and a better chunk hash #11867

Merged
merged 9 commits into from
Mar 14, 2017

Conversation

blowery
Copy link
Contributor

@blowery blowery commented Mar 7, 2017

The goal of the PR is to make chunk hashes, chunk IDs and module IDs stable over time. Stable chunk hashes require stable module IDs and stable chunks IDs. Webpack uses monotonically increasing IDs by default; this results in instability at multiple levels.

Module IDs

react may map to module ID 4 in one build and then map to module ID 5 in the next. Modules that import react would then change, even if they themselves did not. This, in turn, should make the chunk hash change when it should not.

This PR moves us to a stable module name -> module ID mapping, based on the md5 of the module path. This makes module IDs stable across builds.

A couple folks asked if we could we use the NamedModulesPlugin instead. We definitely could, though the Webpack folks are now recommending the HashedModulesPlugin for production, which is what this replicates.

Chunk IDs

As we add more and more chunks with asyncRequire, we add more and more chunks to the mix. Chunk IDs, like Module IDs, are just an increasing ID by default. This means that Chunk A may map to ID 3 in one build and chunk 5 in the next. This means that other chunks that depend on Chunk A would need to update their hash, even if nothing within the chunk changes.

This PR moves us to a stable chunk name -> chunk ID mapping, based on the chunk's name property. It requires named chunks, which all of our chunks are, once commons is removed.

Now when we introduce new chunks, those chunks get a new unique ID and references remain stable

Chunk Hashes

The default webpack chunk hash generator is pretty complicated and depends on things that are no longer necessary when we have stable module IDs and stable chunk IDs.

This PR adds a new chunk hasher that uses an MD5 hash of the content plus the hash seed to generate the hash.

Further Explanation

The module ID generator is lifting mostly from webpack-stable-module-id-and-hash. It works quite well, and could be replaced with the new HashedModuleIdPlugin when we move to webpack@2. There's no way to turn off the chunk hash generator in webpack-stable-module-id-and-hash though, so I just pulled it. Plus we have control if we need to modify it.

The chunk ID generator is based on it, but uses the chunk name instead of
the module path as a key into the chunk ID.

The chunk hash generator is based on webpack-chunk-hash. We could probably just use webpack-chunk-hash directly, but having the code locally will likely be useful.

This also moves the vendor dll into a commons chunk. Otherwise we can't control the module IDs generated by the DLL Reference plugin. The module IDs within the vendor DLL were fine, but the DllRefererncePlugin aliases them to a monotonically increasing ID which is not stable across builds. We need all the module and chunk IDs to be stable to make this all work.

I also removed the commons chunk since it was now down to 4k zipped and only really needed by a few sections. This saves 4k for a bunch of section loads.

Testing

To test this, first make a base build with CALYPSO_ENV=production make build. Copy off public to a safe place (cp -r public base-public). Now start fiddling with things in the source tree and npm dependencies and see how the built code changes (diff -rqu base-public public).

Build calypso off the commit prior to this one (not current master). Compare to this PR. Do the hashes for all of the chunks change? They should.

When changing something in vendor, only vendor should change. Try updating react or moment or react-redux. Expect to see a new vendor chunk and no other changes.

When something in an asyncRequire chunk changes, only that chunk should change. Try modifying code in reader/sidebar/expandable.jsx. What chunks get new hashes? It should be limited to async-load-reader-sidebar.

When you add a new module to the build chunk, only the build chunk should change. Try importing something new from client/boot. What changes? It should just be the build chunk. The new module should get a new unique ID and other module IDs should not be affected.

@matticbot
Copy link
Contributor

@matticbot matticbot added the [Size] XL Probably needs to be broken down into multiple smaller issues label Mar 7, 2017
@blowery blowery added Build [Status] Needs Review The PR is ready for review. This also triggers e2e canary tests and wp-desktop tests automatically. Framework labels Mar 7, 2017
@blowery blowery force-pushed the fix/build/chunk-hash-issues branch from 6d7a988 to 3a7f27f Compare March 7, 2017 20:31
@matticbot matticbot added the [Size] XL Probably needs to be broken down into multiple smaller issues label Mar 7, 2017
@blowery
Copy link
Contributor Author

blowery commented Mar 7, 2017

To test:

@blowery blowery requested a review from ockham March 7, 2017 20:39
@matticbot matticbot added the [Size] XL Probably needs to be broken down into multiple smaller issues label Mar 7, 2017
@blowery blowery requested review from ehg, mtias, gwwar, aduth and samouri March 7, 2017 22:11
return chunkIds;
}

function collectChunkIds( chunk ) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

drop this? not needed.

@blowery
Copy link
Contributor Author

blowery commented Mar 11, 2017

I added a bunch more explanation to the summary of the PR that should help with testing (and I hope understanding).

@matticbot matticbot added the [Size] XL Probably needs to be broken down into multiple smaller issues label Mar 13, 2017
@blowery blowery force-pushed the fix/build/chunk-hash-issues branch from 0e0278f to 2aa04d7 Compare March 13, 2017 15:37
@matticbot matticbot added the [Size] XL Probably needs to be broken down into multiple smaller issues label Mar 13, 2017
@blowery blowery force-pushed the fix/build/chunk-hash-issues branch from 2aa04d7 to d4582bb Compare March 13, 2017 16:20
@matticbot matticbot added the [Size] XL Probably needs to be broken down into multiple smaller issues label Mar 13, 2017
blowery added 5 commits March 13, 2017 18:45
…generators.

The module ID generator is lifting mostly from webpack-stable-module-id-and-hash. It works quite well, and could be replaced with the new HashedModuleIdPlugin when we move to webpack 2.

The chunk ID generator is based on it, but uses the chunk name instead of the module path as a key into the chunk ID.

The chunk hash generator is based on webpack-chunk-hash, with the addition of the dependant chunks in the hash content. This breaks the hash when the chunk ids of dependent chunks change. With the stable chunk module this shouldn't happen, but in case we drop it, add it for safety.
@ockham ockham force-pushed the fix/build/chunk-hash-issues branch from d4582bb to ac9f8a2 Compare March 13, 2017 17:46
@matticbot matticbot added the [Size] XL Probably needs to be broken down into multiple smaller issues label Mar 13, 2017
@ockham
Copy link
Contributor

ockham commented Mar 13, 2017

Rebased on current master, and did the same to #11753 so we can more easily compare their build dirs.

@ockham
Copy link
Contributor

ockham commented Mar 13, 2017

Apply react-redux@5 and see if we see hash movement (#11753)

git co fix/build/chunk-hash-issues
make clean
CALYPSO_ENV=production make build
cp -r public before
git co upgrade/react-redux-take-four
git rebase fix/build/chunk-hash-issues
make clean
CALYPSO_ENV=production make build
cp -r public after
diff -rqu before after
Only in before: async-load-components-post-schedule.4d8ed09eb72085d8724c.js
Only in before: async-load-components-post-schedule.4d8ed09eb72085d8724c.m.js
Only in after: async-load-components-post-schedule.7c30c240386c43af350e.js
Only in after: async-load-components-post-schedule.7c30c240386c43af350e.m.js
Only in after: devdocs.317af305d33c1970aae5.js
Only in after: devdocs.317af305d33c1970aae5.m.js
Only in before: devdocs.ec5db93c9543ae13ffa1.js
Only in before: devdocs.ec5db93c9543ae13ffa1.m.js
Only in before: manifest.ab44162de13bf77b908f.js
Only in before: manifest.ab44162de13bf77b908f.m.js
Only in after: manifest.d2fdc777ace1b52a483a.js
Only in after: manifest.d2fdc777ace1b52a483a.m.js
Files before/style-debug.css and after/style-debug.css differ
Files before/style-debug.css.map and after/style-debug.css.map differ
Files before/style-rtl.css and after/style-rtl.css differ
Files before/style.css and after/style.css differ
Only in after: theme.0bcced3ed0f92569f992.js
Only in after: theme.0bcced3ed0f92569f992.m.js
Only in before: theme.d246ec722ca5ea132e8c.js
Only in before: theme.d246ec722ca5ea132e8c.m.js
Only in before: vendor.116112c83596de24b5be.js
Only in before: vendor.116112c83596de24b5be.m.js
Only in after: vendor.eb8eb98035cb2cb45201.js
Only in after: vendor.eb8eb98035cb2cb45201.m.js

@ockham
Copy link
Contributor

ockham commented Mar 13, 2017

And, as per @blowery's request, diff -rqu master before:

https://gist.github.com/ockham/e5175929d87194b3779aba02457facbd

@blowery blowery self-assigned this Mar 13, 2017
@blowery blowery requested a review from gziolo March 14, 2017 14:20
@samouri
Copy link
Contributor

samouri commented Mar 14, 2017

Tested this out locally in a docker build. All LGTM 👍.

@@ -29,11 +29,11 @@ ChunkFileNames.prototype.apply = function( compiler ) {
this.indent( [
"script.onerror = script.onload = script.onreadystatechange = null;",
"delete installedChunks[ chunkId ];",
"window.__chunkErrors[ " + JSON.stringify( chunkMaps.name ) + "[chunkId]||chunkId ]=new Error();",
"window.__chunkErrors[ " + JSON.stringify( chunkMaps.name, null, ' ' ) + "[chunkId]||chunkId ]=new Error();",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does the null do?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the null is the custom formatter. you have to pass it to get at the third arg, which is the string to use when pretty printing. Dropped it all and went back to plain JSON printing. This was really just for debugging. Makes the JSON easier to read.

@@ -3,6 +3,7 @@
*/
const path = require( 'path' );
const webpack = require( 'webpack' );
const WebpackStableModuleId = require( './server/bundler/webpack/stable-module-id' );
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't even need this file now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dropped

@@ -18,7 +18,7 @@ ChunkFileNames.prototype.apply = function( compiler ) {
"// start chunk loading",
"installedChunks[ chunkId ] = [ callback ];",
"window.__chunkErrors = window.__chunkErrors || {};",
"window.__chunkErrors[ " + JSON.stringify( chunkMaps.name ) + "[chunkId]||chunkId ]=null;",
"window.__chunkErrors[ " + JSON.stringify( chunkMaps.name, null, ' ' ) + "[chunkId]||chunkId ]=null;",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix these to not use spaces for formatting

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@blowery blowery merged commit 7a98dbd into master Mar 14, 2017
@blowery blowery deleted the fix/build/chunk-hash-issues branch March 14, 2017 18:55
@matticbot matticbot removed the [Status] Needs Review The PR is ready for review. This also triggers e2e canary tests and wp-desktop tests automatically. label Mar 14, 2017
@matticbot matticbot added the [Size] XL Probably needs to be broken down into multiple smaller issues label Mar 23, 2017
@samouri samouri mentioned this pull request Jun 12, 2017
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Build Framework [Size] XL Probably needs to be broken down into multiple smaller issues [Type] Performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants