fix(gatsby): refresh browser if webpack rebuild occurs #13871
Conversation
@KyleAMathews @pieh @wardpeet This is the last big PR in the series. Would love a review when you get a chance. I'm looking into the …

Ahh, I see now, it's the …

OK, tests fixed. I was tripped up by the interdependencies between …
Looks fantastic!
This is a huge improvement over what we have now: people will always be loading the latest data and code, whereas currently people can browse around a site seeing old data and running outdated code. <3
```js
  pagePaths,
  webpackCompilationHash
) => {
  const workerPool = new Worker(require.resolve(`./page-data-worker`), {
```
I'm a bit surprised that this makes things faster? What was the difference? Node does use a thread pool for fs. Hmmm, looking at this https://stackoverflow.com/questions/20346097/does-node-js-use-threads-thread-pool-internally there are limits, so for mass updates I can see the limits on fs threads causing trouble.
So, I haven't actually run benchmarks on this, but my thinking is that the JSON deserialization/serialization is computationally expensive, so running that work on multiple workers will result in a performance increase. Especially since no data needs to be passed between processes (except the file path). Basically, this is a CPU optimization, not an I/O one.
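To make the CPU-bound step concrete, here is a minimal sketch of the kind of per-page work a worker would perform (the function name `updatePageDataHash` and the exact page-data shape are assumptions for illustration, not Gatsby's actual code). Only the file path and new hash need to cross the process boundary; the expensive parse/stringify happens inside the worker:

```javascript
// Sketch of the CPU-bound work offloaded to a worker: deserialize the
// page-data JSON, stamp in the new compilation hash, re-serialize.
function updatePageDataHash(pageDataJson, newHash) {
  const pageData = JSON.parse(pageDataJson)
  pageData.webpackCompilationHash = newHash
  return JSON.stringify(pageData)
}

// Hypothetical page-data payload, for demonstration only.
const input = JSON.stringify({
  path: `/blog/`,
  result: { data: { title: `Hello` } },
  webpackCompilationHash: `old-hash`,
})
const output = updatePageDataHash(input, `new-hash`)
console.log(JSON.parse(output).webpackCompilationHash) // prints "new-hash"
```

Spreading many such parse/update/stringify calls across a pool of processes is what makes this a CPU optimization rather than an I/O one.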
Yeah — seems reasonable — does this reuse the workers for rendering pages? It'd be nice not to spawn extra Node.js processes if we don't have to.
No, it's a different worker pool. Interesting idea on sharing them. The challenge is that we'd have to write a new worker.js that references the code for both html rendering and this code, and then export the functions for each. That's totally doable, but might add a bit too much complexity for now. I'd say that if we use jest-worker in any other parts of gatsby, it might make sense to consolidate at that point.
This sounds a lot like a "job" API that hands things off to a worker pool.
@Moocar we'd just have to merge the two files, right, and have different exports for the two? Am I missing something? Doesn't seem too complex.
@KyleAMathews Yep, and then make sure we start and end the worker pool at the right place. More than happy to code this up for this PR.
Thanks, let's do that. This will make things more efficient, and it'll make it really easy to add the worker pool to other tasks as they come up since we'll have the basic infra in place. And since the workers are already spun up, sending tasks to them is really cheap.
@Moocar lemme know about my question but otherwise I'll merge this in later.
Force-pushed from 868fdd3 to ed4a42a
Hmm, not sure about that production-runtime error. Can't reproduce it locally.
I approve this! 😄 Great work, one small nit about a comment.
@Moocar maybe just merge master? I didn't want to use the bot, to make sure I don't break anything. CircleCI:
* move query running into build and develop
* save build compilation hash
* write compilationHash to page data
* reload browser if rebuild occurs in background
* add test to ensure that browser is reloaded if rebuild occurs
* update page-datas when compilation hash changes
* use worker pool to update page data compilation hash
* update tests snapshot
* reset plugin offline whitelist if compilation hash changes
* prettier: remove global Cypress
* separate page for testing compilation-hash
* fix case typo
* mock out static entry test webpackCompilationHash field
* consolidate jest-worker calls into a central worker pool
* feat(gatsby): Page data without compilation hash (#13139). Note: merges to `per-page-manifest`, not master; see #13004 for more info. This PR saves the `page-data.json` during query running. In order to minimize the size of PRs, my strategy is to save the page-data.json along with the normal query results, and then gradually shift functionality over to use `page-data.json` instead of `data.json`. Once all those PRs are merged, we'll be able to go back and delete the static query results, jsonNames, dataPaths, data.json etc. It does mean that we'll be storing double the amount of query results on disk. Hopefully that's ok in the interim. Compilation-hash will be added in future PRs. Sub-PR of #13004.
* Websocket manager use page data (#13389)
* add utils/page-data.read
* read websocket page data from utils/page-data
* Loader use page data (#13409)
* page-data loader working for production-app
* get new loader.getPage stuff working with gatsby develop
* fix static-entry.js test
* remove loadPageDataSync (will be in next PR)
* use array of matchPaths
* Deprecate various loader methods
* remove console.log
* document why fetchPageData needs to check that response is JSON
* in offline, use prefetchedResources.push(...resourceUrls)
* root.js remove else block
* loader.js make* -> create*
* loader drop else block
* pass correct path/resourceUrls to onPostPrefetch
* s/err => null/() => null/
* Extract loadComponent from loadPage
* remove pageData from window object
* update jest snapshots for static-entry (to use window.pagePath)
* add loadPageOr404
* preload 404 page in background
* normalize page paths
* comment out resource-loading-resilience.js (will fix later)
* Remove data json from guess (#13727)
* remove data.json from plugin-guess
* add test for plugin-guess
* Gatsby serve use page data (#13728)
* use match-paths.json for gatsby serve
* remove pages.json
* move query/pages-writer to bootstrap/requires-writer (#13729) and delete data.json
* fix(gatsby): refresh browser if webpack rebuild occurs (#13871)
* move query running into build and develop
* save build compilation hash
* write compilationHash to page data
* reload browser if rebuild occurs in background
* add test to ensure that browser is reloaded if rebuild occurs
* update page-datas when compilation hash changes
* use worker pool to update page data compilation hash
* update tests snapshot
* reset plugin offline whitelist if compilation hash changes
* prettier: remove global Cypress
* separate page for testing compilation-hash
* fix case typo
* mock out static entry test webpackCompilationHash field
* consolidate jest-worker calls into a central worker pool
* page-data.json cleanup PR. Remove jsonName and dataPath (#14167)
* remove json-name and don't save /static/d/ page query results
* in loader.cleanAndFindPath, use __BASE_PATH__, not __PATH_PREFIX__
* loader getPage -> loadPageSync (#14264)
* Page data loading resilience (#14286)
* fetchPageHtml if page resources aren't found
* add page-data to production-runtime/resource-loading-resilience test; also use cypress tasks for blocking resources instead of npm run chunks
* fetchPageHtml -> doesPageHtmlExist
* remove loadPageOr404Sync
* revert plugin-offline to master (#14385)
* Misc per-page-manifest fixes (#14413)
* use path.join for static-entry page-data path
* add pathContext back to static-entry props (fix regression)
* Remove jsonDataPaths from pre-existing redux state (#14501)
* Add context to sentry xhr errors (#14577)
* Add extra context to Sentry xhr errors
* Don't define Sentry immediately as it'll always be undefined then
* mv cpu-core-count.js test to utils/worker/__tests__ (fix bad merge)
* remove sentry reporting
Note: merges to `per-page-manifest`, not master. See #13004 for more info.

Description
This is the final new piece of functionality in #13004. It covers the case where a user is browsing the site and a rebuild occurs in the background. In the old `data.json` world, the user would never see the new changes until they manually refreshed, since everything is statically linked. But in the de-globalized approach, the user might click a link after a rebuild occurs and get a page-data.json that is incompatible with the browser's component code.

To avoid this, we save the webpack `stats.hash`, which is a hash of the entire webpack build. Whenever any source code in the site changes, there will be a new hash. This hash is added to every page-data and html file, so that when the browser loads a new page, it can compare the html's hash to the page's hash and force a refresh if they're different.

Since the page-datas now rely on the webpack compilation hash, I've moved query running out of bootstrap and into `build.js` so that the page queries can be run after the javascript bundle has been built.

Related Issues