[WIP] Replace global data.json with page-data.json per page #13004
cleanup bootstrap moved page-query-runner to index use finishBootstrap() fix query-runner type annotation refactor query-runner/index doc sections queryjobs refactor make query refactor enqueueQueryID -> enqueueExtractedQueryId
Also moved pages-writer.js and redirects-writer.js to src/bootstrap
Nice! 📣
Did it make a difference on build times at all?
On Thu, May 23, 2019, 10:50 PM Matthew Miller wrote:
It was on 96 for performance, but the main win is in the Performance metrics
[image: https://user-images.githubusercontent.com/546754/58305503-abae2e00-7e3b-11e9-9cff-f8085baa58d6.png]
These are now all ticks, and the diagnostics section is empty.
Actually yeah, 15 minutes down to 9. And I expect it to have massively lowered deploy times due to fewer changed files. I haven't run multiple deploys yet to test though.
@me4502 oh wow! That's kinda surprising actually — where'd the 6 minutes get saved? |
**Note: merges to `per-page-manifest`, not master. See gatsbyjs#13004 for more info** ## Description This PR saves the `page-data.json` during query running. In order to minimize the size of PRs, my strategy is to save the page-data.json along with the normal query results, and then gradually shift functionality over to use `page-data.json` instead of `data.json`. Once all those PRs are merged, we'll be able to go back and delete the static query results, jsonNames, dataPaths, data.json etc. It does mean that we'll be storing double the amount of query results on disk. Hopefully that's ok in the interim. Compilation-hash will be added in future PRs. ## Related Issues - Sub-PR of gatsbyjs#13004
OK, we're ready to finally merge
Basically it's a breaking change. And we need users to update both |
@KyleAMathews Seems to just be generally all over the build:
New:
|
Building JavaScript is faster as expected — interesting that running queries & SSR is so much faster 🤔
On Mon, May 27, 2019 at 4:56 PM, Matthew Miller wrote:
@KyleAMathews Seems to just be generally all over the build:
Old:
[2019-05-20T03:34:23.683Z] success open and validate gatsby-configs — 0.015 s
[2019-05-20T03:34:23.944Z] success load plugins — 4.015 s
[2019-05-20T03:34:23.944Z] success onPreInit — 0.006 s
[2019-05-20T03:34:23.944Z] success delete html and css files from previous builds — 0.007 s
[2019-05-20T03:34:23.944Z] success initialize cache — 0.009 s
[2019-05-20T03:34:23.944Z] success start nodes db — 0.005 s
[2019-05-20T03:34:24.203Z] success copy gatsby files — 0.042 s
[2019-05-20T03:34:24.203Z] success onPreBootstrap — 0.014 s
[2019-05-20T03:34:24.203Z] Starting to fetch data from Prismic
[2019-05-20T03:34:24.462Z] Fetch Prismic data: 387.842ms
[2019-05-20T03:35:03.976Z] success source and transform nodes — 35.679 s
[2019-05-20T03:35:03.976Z] success building schema — 1.061 s
[2019-05-20T03:38:25.535Z] success createPages — 193.127 s
[2019-05-20T03:38:25.536Z] success createPagesStatefully — 0.075 s
[2019-05-20T03:38:25.536Z] success onPreExtractQueries — 0.027 s
[2019-05-20T03:38:25.536Z] success update schema — 0.000 s
[2019-05-20T03:38:25.536Z] success extract queries from components — 0.389 s
[2019-05-20T03:38:25.536Z] success run static queries — 0.099 s — 3/3 30.93 queries/second
[2019-05-20T03:39:31.986Z] success run page queries — 75.684 s — 55490/55490 736.00 queries/second
[2019-05-20T03:39:33.397Z] success write out page data — 2.811 s
[2019-05-20T03:39:33.397Z] success write out redirect data — 0.001 s
[2019-05-20T03:42:10.295Z]
[2019-05-20T03:42:10.295Z] success Build manifest and related icons — 0.257 s
[2019-05-20T03:42:10.295Z] success onPostBootstrap — 0.260 s
[2019-05-20T03:42:10.295Z]
[2019-05-20T03:42:10.295Z] info bootstrap finished - 467.029 s
[2019-05-20T03:42:10.295Z]
[2019-05-20T03:43:47.136Z] success Building production JavaScript and CSS bundles — 76.443 s
[2019-05-20T03:47:23.755Z] success Building static HTML for pages — 210.809 s — 55490/55490 276.87 pages/second
[2019-05-20T03:47:23.755Z] Generated public/sw.js, which will precache 12 files, totaling 402207 bytes.
[2019-05-20T03:47:23.755Z] info Done building in 782.100299522 sec
[2019-05-20T03:47:23.755Z] Done in 841.10s.
[2019-05-20T03:47:23.755Z] Done in 841.64s.
New:
[2019-05-24T04:15:58.931Z] success open and validate gatsby-configs — 0.010 s
[2019-05-24T04:16:03.113Z] success load plugins — 4.288 s
[2019-05-24T04:16:03.113Z] success onPreInit — 0.005 s
[2019-05-24T04:16:03.113Z] success delete html and css files from previous builds — 0.006 s
[2019-05-24T04:16:03.113Z] success initialize cache — 0.008 s
[2019-05-24T04:16:03.113Z] success start nodes db — 0.004 s
[2019-05-24T04:16:03.113Z] success copy gatsby files — 0.033 s
[2019-05-24T04:16:03.113Z] success onPreBootstrap — 0.014 s
[2019-05-24T04:16:03.113Z] Starting to fetch data from Prismic
[2019-05-24T04:16:03.371Z] Fetch Prismic data: 391.532ms
[2019-05-24T04:16:36.213Z] success source and transform nodes — 29.421 s
[2019-05-24T04:16:36.213Z] warning Deprecation warning - "noDefaultResolvers" is deprecated. In Gatsby 3, defined fields won't get resolvers, unless explicitly added with a directive/extension.
[2019-05-24T04:16:36.213Z] warning Deprecation warning - "noDefaultResolvers" is deprecated. In Gatsby 3, defined fields won't get resolvers, unless explicitly added with a directive/extension.
[2019-05-24T04:16:36.213Z] warning Deprecation warning - "noDefaultResolvers" is deprecated. In Gatsby 3, defined fields won't get resolvers, unless explicitly added with a directive/extension.
[2019-05-24T04:16:36.213Z] warning Deprecation warning - "noDefaultResolvers" is deprecated. In Gatsby 3, defined fields won't get resolvers, unless explicitly added with a directive/extension.
[2019-05-24T04:16:36.213Z] warning Deprecation warning - "noDefaultResolvers" is deprecated. In Gatsby 3, defined fields won't get resolvers, unless explicitly added with a directive/extension.
[2019-05-24T04:16:36.213Z] warning Deprecation warning - "noDefaultResolvers" is deprecated. In Gatsby 3, defined fields won't get resolvers, unless explicitly added with a directive/extension.
[2019-05-24T04:16:36.213Z] success building schema — 0.507 s
[2019-05-24T04:19:27.717Z] success createPages — 163.235 s
[2019-05-24T04:19:27.717Z] success createPagesStatefully — 0.071 s
[2019-05-24T04:19:27.717Z] success onPreExtractQueries — 0.004 s
[2019-05-24T04:19:27.717Z] success update schema — 0.000 s
[2019-05-24T04:19:27.717Z] success extract queries from components — 0.376 s
[2019-05-24T04:19:27.717Z] success write out requires — 0.291 s
[2019-05-24T04:19:27.717Z] success write out redirect data — 0.001 s
[2019-05-24T04:19:27.717Z] success Build manifest and related icons — 0.117 s
[2019-05-24T04:19:27.717Z] success onPostBootstrap — 0.119 s
[2019-05-24T04:19:27.717Z] info bootstrap finished - 200.574 s
[2019-05-24T04:19:27.717Z] success run static queries — 0.011 s — 2/2 194.43 queries/second
[2019-05-24T04:19:35.850Z] success Building production JavaScript and CSS bundles — 17.572 s
[2019-05-24T04:19:36.108Z] success Rewriting compilation hashes — 0.066 s
[2019-05-24T04:21:42.166Z] success run page queries — 122.355 s — 55588/55588 454.93 queries/second
[2019-05-24T04:24:40.233Z] success Building static HTML for pages — 157.017 s — 55588/55588 381.18 pages/second
[2019-05-24T04:24:40.233Z] Generated public/sw.js, which will precache 12 files, totaling 404337 bytes.
[2019-05-24T04:24:40.233Z] info Done building in 515.287176085 sec
[2019-05-24T04:24:40.233Z] Done in 578.07s.
[2019-05-24T04:24:40.233Z] Done in 578.55s.
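To make the "where did the time go" question easier to answer, here is a rough sketch that diffs a handful of step durations between the two runs. The numbers are copied from the Old and New logs in this comment; the choice of steps and the helper name `diffSteps` are illustrative only and not part of Gatsby.

```javascript
// Step durations in seconds, copied from the Old and New build logs.
const oldBuild = {
  "source and transform nodes": 35.679,
  "createPages": 193.127,
  "run page queries": 75.684,
  "Building production JavaScript and CSS bundles": 76.443,
  "Building static HTML for pages": 210.809,
};
const newBuild = {
  "source and transform nodes": 29.421,
  "createPages": 163.235,
  "run page queries": 122.355,
  "Building production JavaScript and CSS bundles": 17.572,
  "Building static HTML for pages": 157.017,
};

// Hypothetical helper: seconds saved per step, largest saving first.
// A negative value means the step took longer in the new run.
function diffSteps(before, after) {
  return Object.keys(before)
    .map((step) => ({
      step,
      savedSeconds: Number((before[step] - after[step]).toFixed(3)),
    }))
    .sort((a, b) => b.savedSeconds - a.savedSeconds);
}

const stepDiffs = diffSteps(oldBuild, newBuild);
for (const { step, savedSeconds } of stepDiffs) {
  console.log(`${step}: ${savedSeconds} s`);
}

// Overall difference, using the two "Done building in" lines.
const totalSaved = Number((782.100299522 - 515.287176085).toFixed(3));
console.log(`total: ${totalSaved} s`);
```

On these two runs the largest differences are in the JavaScript bundle and static HTML steps, while the "run page queries" line reports a longer duration in the new log than in the old one. Both builds also ran on different days with slightly different page counts (55490 vs 55588), so treat the per-step numbers as indicative rather than a controlled benchmark.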
OK, I'm satisfied again that this is ready to merge to master. I've published a new gatsby@per-page-manifest. I've created the PR to merge I'm also running https://github.com/pieh/develop-runner to make sure that a bunch of sites at least build. |
Hey folks, I'm closing this PR, as the actual code to be merged to master is on #14359 (branch |
* feat(gatsby): Page data without compilation hash (#13139)

  **Note: merges to `per-page-manifest`, not master. See #13004 for more info**

  ## Description

  This PR saves the `page-data.json` during query running. In order to minimize the size of PRs, my strategy is to save the page-data.json along with the normal query results, and then gradually shift functionality over to use `page-data.json` instead of `data.json`. Once all those PRs are merged, we'll be able to go back and delete the static query results, jsonNames, dataPaths, data.json etc.

  It does mean that we'll be storing double the amount of query results on disk. Hopefully that's ok in the interim.

  Compilation-hash will be added in future PRs.

  ## Related Issues

  - Sub-PR of #13004

* Websocket manager use page data (#13389)
* add utils/page-data.read
* read websocket page data from utils/page-data
* Loader use page data (#13409)
* page-data loader working for production-app
* get new loader.getPage stuff working with gatsby develop
* fix static-entry.js test
* remove loadPageDataSync (will be in next PR)
* use array of matchPaths
* Deprecate various loader methods
* remove console.log
* document why fetchPageData needs to check that response is JSON
* in offline, use prefetchedResources.push(...resourceUrls)
* root.js remove else block
* loader.js make* -> create*
* loader drop else block
* pass correct path/resourceUrls to onPostPrefetch
* s/err => null/() => null/
* Extract loadComponent from loadPage
* remove pageData from window object
* update jest snapshots for static-entry (to use window.pagePath)
* add loadPageOr404
* preload 404 page in background
* normalize page paths
* comment out resource-loading-resilience.js (will fix later)
* Remove data json from guess (#13727)
* remove data.json from plugin-guess
* add test for plugin-guess
* Gatsby serve use page data (#13728)
* use match-paths.json for gatbsy serve
* remove pages.json
* move query/pages-writer to bootstrap/requires-writer (#13729)

  And delete data.json

* fix(gatsby): refresh browser if webpack rebuild occurs (#13871)
* move query running into build and develop
* save build compilation hash
* write compilationHash to page data
* reload browser if rebuild occurs in background
* add test to ensure that browser is reloaded if rebuild occurs
* update page-datas when compilation hash changes
* use worker pool to udpate page data compilation hash
* update tests snapshot
* reset plugin offline whitelist if compilation hash changes
* prettier: remove global Cypress
* separate page for testing compilation-hash
* fix case typo
* mock out static entry test webpackCompilationHash field
* consolidate jest-worker calls into a central worker pool
* page-data.json cleanup PR. Remove jsonName and dataPath (#14167)
* remove json-name and don't save /static/d/ page query results
* in loader.cleanAndFindPath, use __BASE_PATH__. Not __PATH_PREFIX__
* loader getPage -> loadPageSync (#14264)
* Page data loading resilience (#14286)
* fetchPageHtml if page resources aren't found
* add page-data to production-runtime/resource-loading-resilience test

  Also use cypress tasks for blocking resources instead of npm run chunks

* fetchPageHtml -> doesPageHtmlExist
* remove loadPageOr404Sync
* revert plugin-offline to master (#14385)
* Misc per-page-manifest fixes (#14413)
* use path.join for static-entry page-data path
* add pathContext back to static-entry props (fix regression)
* Remove jsonDataPaths from pre-existing redux state (#14501)
* Add context to sentry xhr errors (#14577)
* Add extra context to Sentry xhr errors
* Don't define Sentry immediately as it'll always be undefined then
* mv cpu-core-count.js test to utils/worker/__tests__ (fix bad merge)
* remove sentry reporting
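The "write compilationHash to page data" and "reload browser if rebuild occurs in background" items describe a staleness check. A minimal sketch of that idea follows, assuming each page-data object carries a `webpackCompilationHash` string (the field name appears in the commit list above) and that the running app knows the hash it was built with. The function name `shouldReloadForNewBuild` is hypothetical, and the actual loader code may differ.

```javascript
// Hypothetical helper: returns true when freshly fetched page data was
// produced by a different webpack compilation than the running app,
// which suggests the site was rebuilt in the background.
function shouldReloadForNewBuild(pageData, runningCompilationHash) {
  // Without a usable hash on the page data we cannot tell, so do not reload.
  if (!pageData || typeof pageData.webpackCompilationHash !== "string") {
    return false;
  }
  return pageData.webpackCompilationHash !== runningCompilationHash;
}

// Example: the app was built with hash "abc123".
console.log(shouldReloadForNewBuild({ webpackCompilationHash: "abc123" }, "abc123")); // false
console.log(shouldReloadForNewBuild({ webpackCompilationHash: "def456" }, "abc123")); // true
```

In a browser, a `true` result would typically be followed by a full navigation (for example `window.location.reload()`) so the new bundles and page data are loaded together.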
Thank-you everyone!! amazing job!! |
Description
Removes the global `data.json` file and instead provides a `page-data.json` for each page on a gatsby site. See #11982 for background.

This PR is a draft and should not be merged. I'll be providing a series of smaller PRs instead. Reviewers can always reference this PR to see the full context of the changes. That said, if you're reading this PR and spot any problems, then please do leave some feedback!
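As a rough illustration of "a `page-data.json` for each page", the sketch below maps a page path to a per-page file on disk. The `page-data/<path>/page-data.json` layout, the `index` directory for the root page, and the helper name `pageDataFilePath` are assumptions made for this example; this PR description does not specify the exact layout.

```javascript
const path = require("path");

// Hypothetical helper: where the page-data.json for a page might live.
// Assumes a "page-data" directory under the public dir and "index" for "/".
function pageDataFilePath(publicDir, pagePath) {
  // Drop leading and trailing slashes so "/blog/hello/" becomes "blog/hello".
  const trimmed = pagePath.replace(/^\/+|\/+$/g, "");
  const dirName = trimmed === "" ? "index" : trimmed;
  // path.posix keeps forward slashes regardless of the host OS.
  return path.posix.join(publicDir, "page-data", dirName, "page-data.json");
}

console.log(pageDataFilePath("public", "/"));
console.log(pageDataFilePath("public", "/blog/hello/"));
```

Because every page gets its own file, a change to one page only touches that page's JSON, which is the property the deploy-time comments in this thread rely on.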
Benefits/Trade offs
pros
- [...] `data.json` to the browser. For 10'000 page sites, this file was already approaching 500kb compressed.
- [...] `data.json` to search through
- `production-app.js` webpack build no longer references pages or their data. Which will make incremental builds much easier to implement in the future

cons
- [...] `page-data.json`, then the component. Though prefetched pages still do this in the background.
- [...] `loader.getResourceURLsForPathname`
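The cons item about fetching `page-data.json` and then the component can be pictured as a two-step sequence. This is only a sketch: `fetchPageData` and `loadComponent` are injected stand-ins, and the `componentChunkName` and `result` fields are assumed shapes for the page data, not a statement of the real loader API.

```javascript
// Hypothetical sketch of serial page loading: the component cannot be
// requested until the page data says which component chunk to load.
async function loadPage(pagePath, { fetchPageData, loadComponent }) {
  // Step 1: fetch this page's page-data.json.
  const pageData = await fetchPageData(pagePath);
  // Step 2: only now do we know which component to load.
  const component = await loadComponent(pageData.componentChunkName);
  return { component, json: pageData.result };
}

// Usage with in-memory fakes that record the order of calls.
const callOrder = [];
const fakes = {
  fetchPageData: async (pagePath) => {
    callOrder.push(`data:${pagePath}`);
    return { componentChunkName: "component---about", result: { title: "About" } };
  },
  loadComponent: async (chunkName) => {
    callOrder.push(`component:${chunkName}`);
    return () => chunkName;
  },
};

loadPage("/about/", fakes).then((page) => {
  console.log(callOrder.join(" -> "));
  console.log(page.json.title);
});
```

Prefetching hides this extra round trip for pages the user is likely to visit next, since both steps can complete in the background before navigation.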
Sub-PRs
Breaking up these PRs is hard. To make it a bit easier, I'm going to have gatsby saving two sources of data to disk. `data.json` and `static/d/...`, as well as the new `page-data.json`. This will require duplication of memory and disk, so we won't be able to merge to master. But we will be able to gradually introduce new functionality, and once it's ready, delete the old data sources and merge to master. I'm not a fan of long lived non-master branches, but I think it makes sense here.

The long lived branch is per-page-manifest. The PRs that will merge directly into that are:
TODO
- [...] `breadcrumbs.js` file.
- [...] `networkFirst` for workbox for all json files not in `/static`. Reason: page-data.json is currently staleWhileRevalidate, which doesn't make sense since it's dynamic. See offline-plugin/gatsby-node.js. Double check if `networkFirst` is correct, given we're on Workbox 3.

TODO but won't block release
- `public/static/d/....` isn't used except for static query results. So we should ask the user to delete their `public` directory when they run gatsby with this code for the first time? It's not required.

Already reviewed and Merged:
- Merge build-html/develop-html
- gatsby develop: cleanup port detection
- explicitly call db saveState and autoSave
- move redirects-writer to bootstrap
- write match-paths.json
- move src/internal-plugins/query-runner to src/query
- explicitly watch delete page from gatsby develop
- Make query queue constructable
- Refactor query calc query ids
- Get list of /dev-404-page/ pages via graphql
- Split static/page queries
- Move query/page-query-runner to query/index

Merged against `per-page-manifest`
- Page data without compilation hash (ready for review)
- Websocket manager use page data
- loader-use-page-data
- remove-data-json-from-guess
- gatsby-serve-use-page-data
- remove-data-json
- Refresh browser if webpack rebuild occurs
- Remove json-name
- loader.js getPage -> loadPageSync
- Page data loading resilience

Related Issues