Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(gatsby): new dirty tracking implementation for queries #27504

Merged
merged 31 commits into from
Nov 5, 2020

Conversation

vladar
Copy link
Contributor

@vladar vladar commented Oct 16, 2020

Description

Currently, our expectation in develop is that all "dirty" queries are executed on the next "running queries" step. Then we reset data dependencies and clear the "dirty" state for all of them.

But to make running queries on demand possible we must track query "dirtiness" individually (per-page/static query). This PR makes exactly this. The query is kept dirty until it is executed.

I also took a chance to refactor/remove old code that was a bit obsolete after state machine (e.g. page-component machine and related stateful utils in query/index.js).

Related Issues

See #7348

@gatsbot gatsbot bot added the status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer label Oct 16, 2020
@vladar vladar removed the status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer label Oct 16, 2020
Comment on lines +44 to +53
/**
* Tracks query dirtiness. Dirty queries are queries that:
*
* - depend on nodes or node collections (via `actions.createPageDependency`) that have changed.
* - have been recently extracted (or their query text has changed)
* - belong to newly created pages (or pages with modified context)
*
* Dirty queries must be re-ran.
*/
export function queriesReducer(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reducer is the main piece of this PR. Now we track query dirtiness in a single place.

Comment on lines 61 to 68
case `CREATE_PAGE`: {
const { path, componentPath } = action.payload
if (!state.trackedQueries.has(path) || action.contextModified) {
const query = registerQuery(state, path)
query.dirty = setFlag(query.dirty, FLAG_DIRTY_PAGE)
}
registerComponent(state, componentPath).pages.add(path)
return state
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marking page query as dirty when a new page is created (i.e. clean cache, new page added statefully, etc). Existing pages (i.e. loaded from cache or on refresh) are not marked as dirty unless their context has changed.

Comment on lines 79 to 94
case `QUERY_EXTRACTED`: {
// Note: this action is called even in case of
// extraction error or missing query (with query === ``)
// TODO: use hash instead of a query text
const { componentPath, query } = action.payload
const component = registerComponent(state, componentPath)
if (component.query !== query) {
// Invalidate all pages associated with a component when query text changes
component.pages.forEach(queryId => {
const query = state.trackedQueries.get(queryId)
if (query) {
query.dirty = setFlag(query.dirty, FLAG_DIRTY_TEXT)
}
})
component.query = query
}
Copy link
Contributor Author

@vladar vladar Oct 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mark all query pages associated with the component as dirty when the query text changes (i.e. for the first run, when the query was edited, when restoring from babel error).

Page queries remain marked as dirty until they are executed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this change behaviour from current?

As in - if user have babel error and then fix it without changing query itself - this would rerun query (unnecessarily?). Do we do this in master too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right (as always). Addressed here: 089f898

Also tested locally and now it won't re-run queries for recovered components (unless they've actually changed).

Comment on lines 106 to 111
case `REPLACE_STATIC_QUERY`: {
// Only called when static query text has changed, so no need to compare
// TODO: unify the behavior?
const query = registerQuery(state, action.payload.id)
query.dirty = setFlag(query.dirty, FLAG_DIRTY_TEXT)
return state
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is similar to QUERY_EXTRACTED but for static queries. That's where we mark them dirty when static query text changes.

Comment on lines +143 to +163
case `CREATE_NODE`:
case `DELETE_NODE`: {
const node = action.payload
const queriesByNode = state.byNode.get(node.id) ?? []
const queriesByConnection =
state.byConnection.get(node.internal.type) ?? []

queriesByNode.forEach(queryId => {
const query = state.trackedQueries.get(queryId)
if (query) {
query.dirty = setFlag(query.dirty, FLAG_DIRTY_DATA)
}
})
queriesByConnection.forEach(queryId => {
const query = state.trackedQueries.get(queryId)
if (query) {
query.dirty = setFlag(query.dirty, FLAG_DIRTY_DATA)
}
})
return state
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marking both - static and page queries as dirty when nodes queried during the previous query execution change. Query <-> Node dependencies are registered above via CREATE_COMPONENT_DEPENDENCY action (that's the same logic as in the old component-data-dependencies, just combined with other actions in this reducer)

During bootstrap, this is a no-op as we have a clean cache and no data dependencies.

@@ -176,37 +57,14 @@ const createStaticQueryJob = (state, queryId) => {
const component = state.staticQueryComponents.get(queryId)
const { hash, id, query, componentPath } = component
return {
id: hash,
id: queryId,
Copy link
Contributor Author

@vladar vladar Oct 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this change PAGE_QUERY_RUN action receives hash in path property of the payload so we can't invalidate the dirty state for static queries. This also seems more consistent.

Comment on lines +78 to +83
boundActionCreators.queryStart({
path: queryJob.id,
componentPath: queryJob.componentPath,
isPage: queryJob.isPage,
})

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using this new action to reset previous data dependencies per-query.

@@ -62,7 +62,7 @@ const createDevelopQueue = (getRunner: () => GraphQLRunner): Queue => {
if (!queryJob.isPage) {
websocketManager.emitStaticQueryData({
result,
id: queryJob.id,
id: queryJob.hash,
Copy link
Contributor Author

@vladar vladar Oct 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We actually use hash when storing static query results on the frontend (and id/path in the node). Previously it was working because id was equal to hash but now it's not the case.

Comment on lines 50 to 65
/**
* Delete dependencies between an array of pages and data. Probably for
* internal use only. Used when deleting pages.
* @private
*/
export const deleteComponentsDependencies = (
paths: Array<string>
): IDeleteComponentDependenciesAction => {
return {
type: `DELETE_COMPONENTS_DEPENDENCIES`,
payload: {
paths,
},
}
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed anymore because we use the new action QUERY_START to consistently reset data dependencies before running a query.

Comment on lines -126 to -129
// TODO we want to keep track of whether there's any outstanding queries still
// running as this will mark queries as complete immediately even though
// a page component could have thousands of pages will processing.
// This can be done once we start modeling Pages as well.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hehe, this TODO sums up this PR pretty well :)

@vladar vladar marked this pull request as ready for review October 22, 2020 13:26
@vladar vladar requested a review from a team as a code owner November 5, 2020 10:54
Copy link
Contributor

@pieh pieh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good. Done quite a lot of manual tests and seems like all bases are covered

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants