Production-only bug causing stale query output: internal results cache contains stale data that is erroneously merged into readFromStore output #9735
Comments
Thanks for the detailed and deep analysis @akallem! Your diagnosis makes sense to me, and you have no doubt saved me a bunch of time/work here. It definitely was/is my intention that no observable behavior should ever depend on the actual frozenness of the objects, to prevent any differences of behavior in production (where we'd like to avoid actually freezing objects, since that makes using them slower in various ways). Besides hiding this bug (presumably), did you ever see benefits from canonization? We had to change the default I'm sure we can find a way to scope the |
No problem @benjamn, I'm glad it's helpful. Thanks for the quick acknowledgement and response. I did indeed reenable Meantime, as you think about the patch for this, it might be helpful to backport the patch to 3.5.x (if you don't already plan to), since I (and others) might not be ready or want to move to 3.6.x just yet.
If you revert the previous commit and run `npm test`, you'll see all the tests this dynamic Object.isFrozen check has been silently protecting from failing, but only (and this is the important part) in development. Since we only actually freeze objects with Object.freeze in development, this Object.isFrozen check does not help in production, so an object that would have been frozen in development gets reused as a mutable copy, potentially acquiring properties it should not acquire (a bug fixed by the previous commit, first reported in issue #9735).
@akallem Thanks to your reproduction, I am confident this will be fixed in |
@benjamn Great! And thanks for the quick resolution. Is it possible for the fix to be backported to v3.5.x as well?
Intended outcome:
Apollo Client should return the same data in production as in development.
Actual outcome:
Under a specific configuration of InMemoryCache and usage of GraphQL fragments (details below), Apollo Client returns correct data in development but incorrect (stale) data in production.
Quick summary of the reason:
Objects cached in the internal results cache (the calls to `wrap` here and here) are erroneously being modified after-the-fact to include fields that aren't in the cache key selection set. The erroneous modification is happening because a single instance of DeepMerger is being used. At an intermediate level during `diffQueryAgainstStore`'s recursion, a deep copy output from DeepMerger is cached in the results cache, but that object can subsequently be modified later in the recursion because it's in DeepMerger's `pastCopies` set.
How to reproduce the issue:
Here's a minimal example reproducing the issue.
Steps to reproduce the bug:
Run the example in development (`npm run start`).
Run the example in production (`npm run build`, then `npm run serve`).
Versions
System:
OS: Linux 5.4 Ubuntu 18.04.6 LTS (Bionic Beaver)
Binaries:
Node: 16.15.0 - ~/.nvm/versions/node/v16.15.0/bin/node
npm: 8.5.5 - ~/.nvm/versions/node/v16.15.0/bin/npm
Browsers:
Chrome: 100.0.4896.75
Firefox: 100.0
npmPackages:
@apollo/client: ^3.6.0 => 3.6.0
Additional Info
Here are more details from my investigation to help you with the resolution.
The bug first appeared in Apollo Client v3.5.0. (v3.4.17 was fine.) In particular, the issue was introduced in this commit. Interestingly, this is a merge commit. Neither parent has the bug! The reason is that the bug only occurs if, among other things, `canonizeResults` is false. One parent to that merge includes commits that changed the default to false. The other parent includes this commit which modified `readFromStore`'s merging logic. Independently they're fine, but together, they introduced the bug.
Why does this bug occur only in production? In development, Apollo Client freezes returned objects so that the app fails loudly if it tries to modify them. In production, this behavior is disabled. It sounds like that couldn't possibly cause an issue, but it does! The reason is that DeepMerger behaves differently if an object is frozen. The behavior was introduced in this commit. In development (object is frozen) it returns a fresh copy, while in production (object is not frozen) it returns what it thinks is a safe-to-modify past copy. It is indeed a past copy, but it is not safe to modify because it has been cached in the results cache.
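To make the frozen-versus-unfrozen divergence concrete, here is a minimal sketch. It is not Apollo Client's actual DeepMerger: the `SketchMerger` class, its `shallowCopyForMerge` method, and the `readWithSharedMerger` function are hypothetical names, and the merge is shallow for brevity. It only assumes what the report describes, namely that a past copy is reused in place unless it is frozen.

```typescript
// Hypothetical, simplified stand-in for a merger that reuses its own past
// copies. This is NOT Apollo Client's DeepMerger, only a sketch of the idea.
type Obj = Record<string, unknown>;

class SketchMerger {
  private pastCopies = new Set<Obj>();

  // Return an object considered safe to mutate: a past copy that is not
  // frozen is reused in place, anything else gets a fresh shallow copy.
  shallowCopyForMerge(value: Obj): Obj {
    if (this.pastCopies.has(value) && !Object.isFrozen(value)) {
      return value;
    }
    const copy: Obj = { ...value };
    this.pastCopies.add(copy);
    return copy;
  }

  merge(target: Obj, source: Obj): Obj {
    const result = this.shallowCopyForMerge(target);
    Object.assign(result, source);
    return result;
  }
}

// Simulate one read: merge, "cache" the intermediate result, then merge again
// with the same merger instance later in the recursion.
function readWithSharedMerger(freezeResults: boolean): Obj {
  const merger = new SketchMerger();
  const intermediate = merger.merge({ id: 1 }, { name: "a" });
  const cached: Obj = freezeResults ? Object.freeze(intermediate) : intermediate;
  merger.merge(cached, { extra: "not in the cached selection set" });
  return cached; // what a results cache would hand back on the next read
}

const devCached = readWithSharedMerger(true);   // frozen, as in development
const prodCached = readWithSharedMerger(false); // not frozen, as in production

console.log(Object.keys(devCached));  // keys: id, name
console.log(Object.keys(prodCached)); // keys: id, name, extra
```

In the frozen case the second merge works on a fresh copy, so the cached object is untouched. In the unfrozen case the second merge writes into the cached object itself, which is the kind of corruption described above.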
One of the easier ways to see that the results cache is getting corrupted is to add a console log here logging a snapshot of the object (e.g., via `JSON.parse(JSON.stringify(returnValue))`) and the object itself. You'll see that additional fields beyond the selection set appear on the object itself that don't appear in the snapshot or the selection set.
Once the results cache returns an object that includes fields that aren't in the selection set, all bets are off. Whether it actually results in erroneous data being returned by Apollo Client to the caller depends on how the GraphQL document was defined.
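The snapshot-versus-live-object comparison mentioned above can be sketched as follows. The `snapshotAndLive` helper and the sample object are hypothetical and not part of Apollo Client; the later mutation is simulated by hand to stand in for whatever modifies the cached object after it was logged.

```typescript
// Hypothetical debugging helper: capture a deep snapshot of a value at log
// time, alongside a reference to the live object for later comparison.
function snapshotAndLive<T>(value: T): { snapshot: T; live: T } {
  return { snapshot: JSON.parse(JSON.stringify(value)) as T, live: value };
}

const cachedResult: Record<string, unknown> = { id: 1, name: "a" };
const logged = snapshotAndLive(cachedResult);

// Simulate a later, unintended in-place modification of the cached object.
cachedResult["extra"] = "added after caching";

// Fields present on the live object but absent from the snapshot were added
// after the object was logged.
const snapshotKeys = Object.keys(logged.snapshot);
const unexpectedKeys = Object.keys(logged.live).filter(
  (key) => snapshotKeys.indexOf(key) === -1
);
console.log(unexpectedKeys); // keys: extra
```

Logging both forms matters because the snapshot is fixed at the moment of the call, while the live reference reflects any later mutation.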
The bug only occurs when `canonizeResults` is false (the default since v3.4.14), `resultCaching` is true (the default), and the GraphQL document includes fragments in a particular way that causes the erroneously modified results cache to actually manifest when reading back the data.
Here's my summary of what I think happened:
`canonizeResults` was defaulted to false on one branch, and the new `readFromStore` behavior was introduced on a separate branch in this commit