refactor(gatsby): new dirty tracking implementation for queries #27504

vladar · 2020-10-16T19:39:52Z

Description

Currently, our expectation in develop is that all "dirty" queries are executed on the next "running queries" step. Then we reset data dependencies and clear the "dirty" state for all of them.

But to make it possible to run queries on demand, we must track query "dirtiness" individually (per page query or static query). This PR does exactly that: a query stays dirty until it is executed.

I also took the chance to refactor or remove old code that became obsolete after the state machine was introduced (e.g. the page-component machine and related stateful utils in query/index.js).

Related Issues

See #7348

vladar · 2020-10-16T19:46:58Z

packages/gatsby/src/redux/reducers/queries.ts

+/**
+ * Tracks query dirtiness. Dirty queries are queries that:
+ *
+ * - depend on nodes or node collections (via `actions.createPageDependency`) that have changed.
+ * - have been recently extracted (or their query text has changed)
+ * - belong to newly created pages (or pages with modified context)
+ *
+ * Dirty queries must be re-ran.
+ */
+export function queriesReducer(


The reducer is the main piece of this PR. Now we track query dirtiness in a single place.
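
A minimal sketch of how bit flags can record several dirtiness reasons on one query. The flag names come from this PR, but the numeric values and the `unsetFlag` / `hasFlag` helpers are assumptions made for illustration, not the reducer's actual code.

```typescript
// Hypothetical flag values: the PR names the flags, this excerpt does not show their values.
const FLAG_DIRTY_PAGE = 0b0001
const FLAG_DIRTY_TEXT = 0b0010
const FLAG_DIRTY_DATA = 0b0100

// Assumed semantics: set one bit without touching the others.
function setFlag(allFlags: number, flag: number): number {
  return allFlags | flag
}

// Hypothetical helper: clear one bit without touching the others.
function unsetFlag(allFlags: number, flag: number): number {
  return allFlags & ~flag
}

// Hypothetical helper: test whether a given bit is set.
function hasFlag(allFlags: number, flag: number): boolean {
  return (allFlags & flag) !== 0
}

// A query can be dirty for several reasons at once.
let dirty = 0
dirty = setFlag(dirty, FLAG_DIRTY_PAGE)
dirty = setFlag(dirty, FLAG_DIRTY_DATA)

// The query needs to run as long as any bit is set.
const needsRun = dirty !== 0
console.log(needsRun, hasFlag(dirty, FLAG_DIRTY_TEXT))
```

Keeping the reasons as separate bits means one reason can be cleared or inspected without losing the others.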

vladar · 2020-10-16T19:49:49Z

packages/gatsby/src/redux/reducers/queries.ts

+    case `CREATE_PAGE`: {
+      const { path, componentPath } = action.payload
+      if (!state.trackedQueries.has(path) || action.contextModified) {
+        const query = registerQuery(state, path)
+        query.dirty = setFlag(query.dirty, FLAG_DIRTY_PAGE)
+      }
+      registerComponent(state, componentPath).pages.add(path)
+      return state


Marking the page query as dirty when a new page is created (e.g. clean cache, new page added statefully, etc.). Existing pages (e.g. loaded from cache or on refresh) are not marked as dirty unless their context has changed.
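
To illustrate the three cases, here is a simplified sketch of the branch above. The state shape, the flag value, and the body of `registerQuery` are assumptions; only the condition mirrors the diff.

```typescript
const FLAG_DIRTY_PAGE = 0b0001 // hypothetical value

interface IQueryState {
  dirty: number
}

interface IPageSketchState {
  trackedQueries: Map<string, IQueryState>
}

// Assumed behaviour: return the tracked entry, creating a clean one if missing.
function registerQuery(state: IPageSketchState, queryId: string): IQueryState {
  let query = state.trackedQueries.get(queryId)
  if (!query) {
    query = { dirty: 0 }
    state.trackedQueries.set(queryId, query)
  }
  return query
}

// Same condition as the CREATE_PAGE branch, without component tracking.
function createPage(
  state: IPageSketchState,
  path: string,
  contextModified = false
): void {
  if (!state.trackedQueries.has(path) || contextModified) {
    const query = registerQuery(state, path)
    query.dirty |= FLAG_DIRTY_PAGE
  }
}

// Two pages restored from cache whose queries have already run.
const pageState: IPageSketchState = {
  trackedQueries: new Map<string, IQueryState>([
    [`/about/`, { dirty: 0 }],
    [`/blog/`, { dirty: 0 }],
  ]),
}

createPage(pageState, `/new/`) // unknown page: dirty
createPage(pageState, `/about/`) // cached, same context: stays clean
createPage(pageState, `/blog/`, true) // cached, context modified: dirty
```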

vladar · 2020-10-16T19:53:33Z

packages/gatsby/src/redux/reducers/queries.ts

+    case `QUERY_EXTRACTED`: {
+      // Note: this action is called even in case of
+      // extraction error or missing query (with query === ``)
+      // TODO: use hash instead of a query text
+      const { componentPath, query } = action.payload
+      const component = registerComponent(state, componentPath)
+      if (component.query !== query) {
+        // Invalidate all pages associated with a component when query text changes
+        component.pages.forEach(queryId => {
+          const query = state.trackedQueries.get(queryId)
+          if (query) {
+            query.dirty = setFlag(query.dirty, FLAG_DIRTY_TEXT)
+          }
+        })
+        component.query = query
+      }


Mark all page queries associated with the component as dirty when the query text changes (e.g. on the first run, when the query is edited, or when recovering from a Babel error).

Page queries remain marked as dirty until they are executed.

Does this change behaviour from current?

As in: if a user has a Babel error and then fixes it without changing the query itself, this would re-run the query (unnecessarily?). Do we do this in master too?

You are right (as always). Addressed here: 089f898

Also tested locally and now it won't re-run queries for recovered components (unless they've actually changed).
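
A rough sketch of the behaviour discussed in this thread, assuming the error is tracked per component and that an extraction result is ignored while the component is in an error state. The handler names, flag values, and state shape are hypothetical; the actual fix is in 089f898.

```typescript
// Hypothetical flag values and state shape, for illustration only.
const FLAG_DIRTY_TEXT = 0b0010
const FLAG_ERROR_EXTRACTION = 0b0001

interface ITrackedComponent {
  query: string
  errors: number
  pages: Set<string>
}

// Dirty flags per page path.
const dirtyByPage = new Map<string, number>([[`/blog/`, 0]])

const blogTemplate: ITrackedComponent = {
  query: ``,
  errors: 0,
  pages: new Set<string>([`/blog/`]),
}

// Assumed handler: extraction failed, remember it on the component.
function extractionFailed(component: ITrackedComponent): void {
  component.errors |= FLAG_ERROR_EXTRACTION
}

// Assumed handler: the component compiles again.
function extractionRecovered(component: ITrackedComponent): void {
  component.errors &= ~FLAG_ERROR_EXTRACTION
}

function queryExtracted(component: ITrackedComponent, text: string): void {
  // While the component is broken the extracted text is empty; ignore it so
  // the last good query text is kept for comparison after recovery.
  if ((component.errors & FLAG_ERROR_EXTRACTION) !== 0) return
  if (component.query === text) return
  component.pages.forEach(path => {
    dirtyByPage.set(path, (dirtyByPage.get(path) ?? 0) | FLAG_DIRTY_TEXT)
  })
  component.query = text
}

queryExtracted(blogTemplate, `{ site { id } }`) // first run: marks /blog/ dirty
dirtyByPage.set(`/blog/`, 0) // pretend the query has been executed
extractionFailed(blogTemplate)
queryExtracted(blogTemplate, ``) // ignored while broken
extractionRecovered(blogTemplate)
queryExtracted(blogTemplate, `{ site { id } }`) // same text: stays clean
const dirtyAfterRecovery = dirtyByPage.get(`/blog/`)
queryExtracted(blogTemplate, `{ site { siteMetadata { title } } }`) // edited: dirty again
const dirtyAfterEdit = dirtyByPage.get(`/blog/`)
```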

vladar · 2020-10-16T20:15:59Z

packages/gatsby/src/redux/reducers/queries.ts

+    case `REPLACE_STATIC_QUERY`: {
+      // Only called when static query text has changed, so no need to compare
+      // TODO: unify the behavior?
+      const query = registerQuery(state, action.payload.id)
+      query.dirty = setFlag(query.dirty, FLAG_DIRTY_TEXT)
+      return state


This is similar to QUERY_EXTRACTED but for static queries. That's where we mark them dirty when static query text changes.

vladar · 2020-10-16T20:22:37Z

packages/gatsby/src/redux/reducers/queries.ts

+    case `CREATE_NODE`:
+    case `DELETE_NODE`: {
+      const node = action.payload
+      const queriesByNode = state.byNode.get(node.id) ?? []
+      const queriesByConnection =
+        state.byConnection.get(node.internal.type) ?? []
+
+      queriesByNode.forEach(queryId => {
+        const query = state.trackedQueries.get(queryId)
+        if (query) {
+          query.dirty = setFlag(query.dirty, FLAG_DIRTY_DATA)
+        }
+      })
+      queriesByConnection.forEach(queryId => {
+        const query = state.trackedQueries.get(queryId)
+        if (query) {
+          query.dirty = setFlag(query.dirty, FLAG_DIRTY_DATA)
+        }
+      })
+      return state
+    }


Marking both static and page queries as dirty when nodes queried during the previous query execution change. Query <-> Node dependencies are registered above via the CREATE_COMPONENT_DEPENDENCY action (the same logic as in the old component-data-dependencies reducer, just combined with other actions in this reducer).

During bootstrap, this is a no-op as we have a clean cache and no data dependencies.
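
A simplified sketch of the dependency lookup, assuming `byNode` is keyed by node id and `byConnection` by node type, as the diff suggests. The `addDependency` helper stands in for CREATE_COMPONENT_DEPENDENCY and its payload shape is hypothetical.

```typescript
const FLAG_DIRTY_DATA = 0b0100 // hypothetical value

interface ISketchNode {
  id: string
  internal: { type: string }
}

interface IDependencyState {
  byNode: Map<string, Set<string>>
  byConnection: Map<string, Set<string>>
  trackedQueries: Map<string, { dirty: number }>
}

const depState: IDependencyState = {
  byNode: new Map(),
  byConnection: new Map(),
  trackedQueries: new Map(),
}

// Add `queryId` to the set stored under `key`, creating the set if needed.
function track(index: Map<string, Set<string>>, key: string, queryId: string): void {
  const queries = index.get(key) ?? new Set<string>()
  queries.add(queryId)
  index.set(key, queries)
}

// Hypothetical stand-in for CREATE_COMPONENT_DEPENDENCY.
function addDependency(
  queryId: string,
  dependency: { nodeId?: string; connection?: string }
): void {
  if (!depState.trackedQueries.has(queryId)) {
    depState.trackedQueries.set(queryId, { dirty: 0 })
  }
  if (dependency.nodeId !== undefined) {
    track(depState.byNode, dependency.nodeId, queryId)
  }
  if (dependency.connection !== undefined) {
    track(depState.byConnection, dependency.connection, queryId)
  }
}

function markDirty(queryIds: Set<string> | undefined): void {
  if (!queryIds) return
  queryIds.forEach(queryId => {
    const query = depState.trackedQueries.get(queryId)
    if (query) query.dirty |= FLAG_DIRTY_DATA
  })
}

// Mirrors the CREATE_NODE / DELETE_NODE branch above.
function nodeChanged(node: ISketchNode): void {
  markDirty(depState.byNode.get(node.id))
  markDirty(depState.byConnection.get(node.internal.type))
}

addDependency(`/blog/`, { connection: `MarkdownRemark` }) // lists all posts
addDependency(`/post-1/`, { nodeId: `node-1` }) // reads a single node
addDependency(`/about/`, { nodeId: `node-2` })
nodeChanged({ id: `node-1`, internal: { type: `MarkdownRemark` } })
```

With an empty `byNode` and `byConnection`, as on a clean cache, `nodeChanged` finds nothing to mark.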

vladar · 2020-10-16T20:33:32Z

packages/gatsby/src/query/index.js

@@ -176,37 +57,14 @@ const createStaticQueryJob = (state, queryId) => {
  const component = state.staticQueryComponents.get(queryId)
  const { hash, id, query, componentPath } = component
  return {
-    id: hash,
+    id: queryId,


Without this change, the PAGE_QUERY_RUN action receives the hash in the path property of its payload, so we can't invalidate the dirty state for static queries. Using queryId also seems more consistent.

vladar · 2020-10-16T20:34:39Z

packages/gatsby/src/query/query-runner.ts

+  boundActionCreators.queryStart({
+    path: queryJob.id,
+    componentPath: queryJob.componentPath,
+    isPage: queryJob.isPage,
+  })
+


Using this new action to reset previous data dependencies per-query.
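
A sketch of the per-query lifecycle this implies, under two assumptions: QUERY_START drops the query's old entries from `byNode` and `byConnection`, and PAGE_QUERY_RUN clears its dirty flags once it has executed. The function names and state shape below are hypothetical.

```typescript
interface IRunState {
  byNode: Map<string, Set<string>>
  byConnection: Map<string, Set<string>>
  trackedQueries: Map<string, { dirty: number }>
}

// Assumed effect of QUERY_START: forget what this query depended on last time.
// Dependencies are registered again while the query runs.
function queryStart(state: IRunState, queryId: string): void {
  state.byNode.forEach(queries => queries.delete(queryId))
  state.byConnection.forEach(queries => queries.delete(queryId))
}

// Assumed effect of PAGE_QUERY_RUN: the query has executed, so it is clean.
function pageQueryRun(state: IRunState, queryId: string): void {
  const query = state.trackedQueries.get(queryId)
  if (query) query.dirty = 0
}

const runState: IRunState = {
  byNode: new Map([[`post-1`, new Set<string>([`/blog/`, `/post-1/`])]]),
  byConnection: new Map([[`MarkdownRemark`, new Set<string>([`/blog/`])]]),
  trackedQueries: new Map([
    [`/blog/`, { dirty: 0b0100 }],
    [`/post-1/`, { dirty: 0 }],
  ]),
}

queryStart(runState, `/blog/`)
const dirtyWhileRunning = runState.trackedQueries.get(`/blog/`)?.dirty
pageQueryRun(runState, `/blog/`)
const dirtyAfterRun = runState.trackedQueries.get(`/blog/`)?.dirty
```

Only the started query loses its dependencies; other queries that depend on the same node keep theirs.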

vladar · 2020-10-16T20:35:38Z

packages/gatsby/src/query/queue.ts

@@ -62,7 +62,7 @@ const createDevelopQueue = (getRunner: () => GraphQLRunner): Queue => {
          if (!queryJob.isPage) {
            websocketManager.emitStaticQueryData({
              result,
-              id: queryJob.id,
+              id: queryJob.hash,


We actually use hash when storing static query results on the frontend (and id/path in the node). Previously this worked because id was equal to hash, but that is no longer the case.
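
A small sketch of the two identifiers after this change, assuming dirty tracking is keyed by the job id while the frontend keys results by hash. The job shape and the values are made up for illustration.

```typescript
interface IStaticQueryJobSketch {
  id: string // queryId: identifies the query in the Node process
  hash: string // identifies the stored result on the frontend
  isPage: boolean
}

// Made-up values for illustration.
const staticJob: IStaticQueryJobSketch = {
  id: `sq--src-components-seo-js`,
  hash: `63159454`,
  isPage: false,
}

const dirtyQueryIds = new Set<string>([staticJob.id])
const frontendResults = new Map<string, unknown>()

// Hypothetical helper: clear dirtiness by id, publish the result by hash.
function finishStaticQuery(job: IStaticQueryJobSketch, result: unknown): void {
  dirtyQueryIds.delete(job.id)
  frontendResults.set(job.hash, result)
}

finishStaticQuery(staticJob, { data: { site: { id: `Site` } } })
```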

vladar · 2020-10-16T20:37:04Z

packages/gatsby/src/redux/actions/internal.ts

-/**
- * Delete dependencies between an array of pages and data. Probably for
- * internal use only. Used when deleting pages.
- * @private
- */
-export const deleteComponentsDependencies = (
-  paths: Array<string>
-): IDeleteComponentDependenciesAction => {
-  return {
-    type: `DELETE_COMPONENTS_DEPENDENCIES`,
-    payload: {
-      paths,
-    },
-  }
-}
-


Not needed anymore because we use the new action QUERY_START to consistently reset data dependencies before running a query.

vladar · 2020-10-16T20:39:14Z

packages/gatsby/src/redux/reducers/components.ts

-        // TODO we want to keep track of whether there's any outstanding queries still
-        // running as this will mark queries as complete immediately even though
-        // a page component could have thousands of pages will processing.
-        // This can be done once we start modeling Pages as well.


Hehe, this TODO sums up this PR pretty well :)

pieh

This is looking good. I did quite a lot of manual testing, and it seems all bases are covered.

vladar added 20 commits October 13, 2020 19:24

Remove dead code

7c8de4e

Consistently use job hash for static-query identification on the frontend (and queryId in node)

34613f9

New reducer to track the state of queries

b9a545f

tmp: output newly calculated dirty queries (still using old calculation for actual query running to compare)

52191d6

Add new QUERY_START action

350c3a5

Remove redundant component-data-dependencies reducer (now handled in the queries reducer)

875402f

Actually use the new query tracking (and remove the old one)

f21d447

Fix data-tracking test

38ea80b

Shape of tracked component state should match component reducer

7e278e0

remove page-component machine (as we track query state in queries reducer now)

0f3a5b6

Remove DELETE_COMPONENTS_DEPENDENCIES action

fad497f

Cleanup

3dadbdd

Cleanup

f1a8fcd

Re-enable previously skipped test

7f5b8dd

Cleanup

2f0949e

Do-not re-run queries with babel extraction errors

2013e1f

WIP: tests for the queries reducer

267e50f

Track babel errors per component (not per page/static query)

d909675

tests for the queries reducer

ab6b2e8

rename test

3146e34

gatsbot bot added the "status: triage needed" label Oct 16, 2020

vladar removed the "status: triage needed" label Oct 16, 2020

Cleanup / update snapshots

e5f1f31

vladar added 2 commits October 17, 2020 03:49

Add missing snapshot

5ed8f4f

fix integration tests?

066a4e1

vladar mentioned this pull request Oct 19, 2020

fix(gatsby): notify when Gatsby cache is incomplete #27549

Merged

vladar added 3 commits October 19, 2020 22:37

Merge branch 'master' into vladar/query-state-refactor

afe30f9

Revert "fix integration tests?"

8e14546

This reverts commit 066a4e1

Restore DELETE_COMPONENTS_DEPENDENCIES as a no-op for BC

a43a0ea

vladar added 2 commits October 22, 2020 03:53

Take into account deletePage/createPage pattern in onCreatePage

16aa4fa

Update test snapshot

02e00fa

vladar marked this pull request as ready for review October 22, 2020 13:26

pieh reviewed Nov 3, 2020

packages/gatsby/src/redux/reducers/queries.ts (resolved)

vladar mentioned this pull request Nov 3, 2020

refactor(gatsby): utility that always returns up-to-date page-data #27807

Closed

vladar added 2 commits November 5, 2020 17:50

Do not mark page query as dirty when component has babel errors

089f898

Use flag constants vs. literal values in tests

cb1536e

vladar requested a review from a team as a code owner November 5, 2020 10:54

pieh reviewed Nov 5, 2020

packages/gatsby/src/redux/reducers/queries.ts (outdated, resolved)

Rename FLAG_ERROR_BABEL to FLAG_ERROR_EXTRACTION

f3c056f

pieh approved these changes Nov 5, 2020

pieh merged commit 9d322a4 into master Nov 5, 2020

delete-merged-branch bot deleted the vladar/query-state-refactor branch November 5, 2020 18:55

This was referenced Nov 9, 2020

fix(gatsby): account for edge case when payload of DELETE_NODE is undefined #27929

Merged

refactor(gatsby): get-page-data util #27939

Merged

perf(gatsby): fix performance regression with query dependency cleaning #28032

Merged
