Eliminate "generated" cache IDs to avoid normalizing objects without meaningful IDs. #5146
We will need to figure out how to serialize and deserialize these Reference instances, but that can come later.
According to the removed comment in writeToStore.ts, the only reason for escaping arbitrary opaque data as JSON was to avoid potential confusion with IdValue objects, but now that we use makeReference and isReference everywhere, there is no risk of that confusion, so we can just store the data directly.
The tests changed in this commit were mistaken, and have been mistaken for a long time, because they used generated IDs that did not start with a $ character. Ultimately, we want to eliminate the concept of generated IDs, but it's worth fixing existing tests in the meantime.
This paves the way for much more sophisticated cache reconciliation logic, configurable on a per-type, per-field basis.
These changes really show off the improvement that comes from inlining non-entity data, rather than generating fake IDs for such data.
b7554e0 to 3adb970
After originally writing this test, I moved the shallow copying logic inside this.merge, which now always copies its first argument (unless that argument is a previous copy, in which case it remains untouched).
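To make the copying behaviour described above concrete, here is a minimal sketch of a merge helper that copies its first argument unless that argument is one of its own earlier copies. The class name `ShallowCopyingMerger` and the helper `copyOnce` are hypothetical, and the merge is deliberately shallow; this is not the actual `this.merge` from the PR.

```typescript
// Hypothetical sketch: illustrates "copy the first argument unless it is
// already a previous copy". Names and shape are assumptions.
type Obj = Record<string, unknown>;

class ShallowCopyingMerger {
  // Copies this merger has produced so far; these are safe to mutate.
  private readonly pastCopies: Obj[] = [];

  merge(target: Obj, source: Obj): Obj {
    const result = this.copyOnce(target);
    for (const key of Object.keys(source)) {
      result[key] = source[key];
    }
    return result;
  }

  private copyOnce(value: Obj): Obj {
    // A previous copy is returned untouched, so repeated merges reuse it.
    if (this.pastCopies.indexOf(value) >= 0) return value;
    const copy: Obj = { ...value };
    this.pastCopies.push(copy);
    return copy;
  }
}

const merger = new ShallowCopyingMerger();
const original: Obj = { a: 1 };
const first = merger.merge(original, { b: 2 }); // copies `original`
const second = merger.merge(first, { c: 3 });   // reuses the earlier copy

console.log(first !== original);       // true: the input was not mutated
console.log(second === first);         // true: no second copy was made
console.log(JSON.stringify(original)); // {"a":1}
console.log(JSON.stringify(second));   // {"a":1,"b":2,"c":3}
```

With this shape, callers never need to copy defensively before calling `merge`, and a chain of merges allocates only one new object.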
@benjamn it is incredibly exciting to see that there is a concrete path forward towards our new store design within the existing test suite and codebase. The work here is excellent and I suspect will clean up an entire class of 1) store bugs and 2) performance bottlenecks especially for teams with lots of data that doesn't need normalization
let typename: string;
if (isReference(objectOrReference)) {
  object = execContext.contextValue.store.get(objectOrReference.id);
  typename =
I don't know if we currently handle this case, but the root query type doesn't have to have a typename called Query. So object.__typename could be something like RootQuery here. I guess it doesn't matter given we force it to be called Query at the root, but it does mean that further updates to the root type probably won't be tracked:
schema {
  query: RootQuery
  mutation: Mutation
}
type RootQuery {
  siteName: String
}
type Mutation {
  updateSiteName(siteName: String): RootQuery
}
query GetSiteName {
  siteName
}
# then later
mutation updateSiteName {
  updateSiteName(siteName: "Ben's awesome site") {
    __typename
    siteName
  }
}
The mutation would return a payload here of { data: { updateSiteName: { __typename: "RootQuery", siteName: "Ben's awesome site" } } } which should ideally overwrite the original store data of { data: { __typename: "RootQuery", siteName: "Ben's site" } } but I don't think this behaviour is supported?
This may be out of the scope of this PR (most likely is!) but if we are adjusting our normalization strategy to try and eliminate non-entity normalization (YAY), the root query still feels like an interesting point, since it is an entity in some regards (always the entry point, so fields are stable to that object) but not in others: it is not referenceable by a set of primary keys, only by the fact that there can only be one root "Query" type.
I agree that we should get this right. If someone names the root query RootQuery and then wants to define per-field policies for that type, they should be able to call it RootQuery and not Query in their configuration.
I'll see if I can remove the Query assumption, though I suspect we will need to start adding the __typename field to the root query fields, like we do for nested selection sets.
@jbaxleyiii After playing with this a bit, I think I'd like to split it out into a separate PR, building on this one.
Although the Reference class was convenient within a single runtime, it posed some unnecessary challenges for serialization and deserialization.
@benjamn This all looks incredible! I've been testing things out in a few sample apps, throwing a few curve balls here and there, and everything is working as expected. And wow, does this ever make a difference when inspecting cache contents! A big 👍 from me - thanks for tackling this!
As @jbaxleyiii pointed out in this comment, the root query and mutation types do not necessarily have to be called "Query" or "Mutation", and the only way to find their real names is to ask for the __typename property: #5146 (comment)
As part of its normalization strategy, the InMemoryCache has historically generated fake IDs for unidentified objects, based on the object's path within the query, such as ROOT_QUERY.books.0.author. These fake IDs would appear alongside actual entity IDs (such as those returned by dataIdFromObject) in the normalization map. Whenever an unidentified object was about to be replaced by an identified entity, the existing data would (sometimes) be merged with the actual entity data, and the generated object would be deleted from the cache.
Although this implementation strategy made some things easier (because we could assume any object had an ID, even if it wasn't really a proper entity), it also increased the size of the normalized cache and worsened its performance, thanks to the indirection of the fake IDs.
A more natural alternative would be simply to store unidentified data (arrays, objects without IDs, whatever) within its parent object, similar to scalar field values, without any normalization. In this strategy, the cache uses ID references only to refer to normalized entity objects, rather than abusing IDs to store generic data.
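As a rough illustration of that strategy, the sketch below normalizes a result so that only objects with a real ID become separate store entries, while everything else stays inline in its parent. The `normalize` function, the `Store` and `StoreObject` types, and this simplified `dataIdFromObject` (which assumes entities carry `__typename` and `id`) are all hypothetical and are not the InMemoryCache implementation.

```typescript
// Hypothetical sketch: inline unidentified data, reference only real entities.
interface StoreObject { [field: string]: StoreValue }
type StoreValue = string | number | boolean | null | StoreValue[] | StoreObject;
type Store = Record<string, StoreObject>;

// Assumption: an entity is any object with both __typename and id.
function dataIdFromObject(obj: StoreObject): string | undefined {
  const { __typename, id } = obj;
  if (typeof __typename === "string" &&
      (typeof id === "string" || typeof id === "number")) {
    return `${__typename}:${id}`;
  }
  return undefined;
}

function normalize(value: StoreValue, store: Store): StoreValue {
  if (Array.isArray(value)) {
    return value.map(item => normalize(item, store));
  }
  if (value !== null && typeof value === "object") {
    const fields: StoreObject = {};
    for (const key of Object.keys(value)) {
      fields[key] = normalize(value[key], store);
    }
    const id = dataIdFromObject(fields);
    if (id === undefined) return fields;     // no ID: keep the data inline
    store[id] = { ...store[id], ...fields }; // entity: store it under its ID
    return { __ref: id };                    // and refer to it from the parent
  }
  return value; // scalars are stored as-is
}

const store: Store = {};
store["ROOT_QUERY"] = normalize({
  books: [
    { __typename: "Book", title: "Dune",
      author: { __typename: "Author", name: "Frank Herbert" } },
  ],
  viewer: { __typename: "User", id: 7, name: "Ada" },
}, store) as StoreObject;

console.log(Object.keys(store).sort()); // [ 'ROOT_QUERY', 'User:7' ]
console.log(JSON.stringify(store["ROOT_QUERY"]["viewer"])); // {"__ref":"User:7"}
```

The book and its author have no ID, so they live inside ROOT_QUERY rather than under path-based keys such as ROOT_QUERY.books.0.author; only the identified user gets its own entry.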
This was a significant refactoring, and I took extreme care to split the work up into meaningful commits, with apollo-cache-inmemory tests passing at every step of the way (with some tweaks, of course). I was not always confident that such a path existed, yet here it is. The end result is a much more compact and understandable internal representation of normalized data in the cache, with all the same test coverage we had before. If I'd rewritten this code from scratch, I would also have needed to write a ton of new tests, with little intuition about backwards compatibility.
On top of all that, this PR reduces the size of the apollo-cache-inmemory package by a few hundred bytes, and paves the way for more configurable normalization logic, thanks to commits like feb9f36 and c87f447.
This will be a backwards-incompatible change if your code depends on the precise internal representation of normalized data in the cache.
The use of Reference instances instead of IdValue objects (0af6233) may also complicate JSON serialization, though I have some ideas for how to serialize custom data types without imposing new requirements on persistence APIs like IndexedDB or localStorage. See 01e0cc5 where the Reference class was replaced with a { __ref: string } interface type.
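The following sketch shows why a plain { __ref: string } shape is easier to persist than a class instance. The names makeReference and isReference come from the commit messages in this PR, but the bodies below are assumed implementations written for illustration, and ClassReference is a hypothetical stand-in for the earlier class-based approach.

```typescript
// Assumed implementations, for illustration only.
interface Reference { readonly __ref: string }

function makeReference(id: string): Reference {
  return { __ref: id };
}

function isReference(value: unknown): value is Reference {
  return typeof value === "object" && value !== null &&
    typeof (value as { __ref?: unknown }).__ref === "string";
}

// Hypothetical stand-in for a class-based reference.
class ClassReference {
  constructor(public readonly id: string) {}
}

// A plain-object reference survives a JSON round trip unchanged...
const snapshot = {
  ROOT_QUERY: { viewer: makeReference("User:7") },
  "User:7": { __typename: "User", id: 7 },
};
const restored = JSON.parse(JSON.stringify(snapshot));
const plainSurvives: boolean = isReference(restored.ROOT_QUERY.viewer);

// ...whereas a class instance comes back as a bare object without its prototype.
const viaClass = JSON.parse(
  JSON.stringify({ viewer: new ClassReference("User:7") })
);
const classSurvives: boolean = viaClass.viewer instanceof ClassReference;

console.log(plainSurvives); // true
console.log(classSurvives); // false
```

Because the reference is identified structurally, a cache snapshot written to something like localStorage can be read back without a custom revival step.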