Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relay Error Handling Project #4416

Open
itamark opened this issue Aug 23, 2023 · 7 comments
Open

Relay Error Handling Project #4416

itamark opened this issue Aug 23, 2023 · 7 comments
Labels

Comments

@itamark
Copy link
Contributor

itamark commented Aug 23, 2023

Relay Error Handling Project

TL;DR

We are proposing a way for engineers to have a safer, more controlled approach to handling GraphQL field errors in Relay.
The proposal includes the ability to specify how you want errors to be handled on fields - potentially including throwing, returning the error along with successful fields, and an option to preserve the current behavior.
These changes are meant to prevent inaccurate client behavior in cases where an error occurs on a field server-side, as well as more granular control over a user’s experience when something does go wrong.

Problem

A feature of GraphQL is that it coalesces errors to null, providing errors as metadata instead of directly throwing exceptions at the network level. Here are some issues that arise from this behavior:
An ecosystem-wide tradeoff (ecosystem-wide because no GraphQL client has addressed this before):

  • Discard queries with errors or,
  • Not be able to discern whether a null is error or not

Essentially, this leaves users with a choice between:

  • Robustness: meaning, “render at all costs” - so if one field errors, everything still returns and that field is null, or
    • This choice leaves users unable to delineate between errors and true nulls.
    • Users also forget that this is how Relay works - even after they’ve learned about it - and fail to account for that edge case.
  • Correctness: meaning, if something errors, the query throws and nothing is rendered - so that partially incorrect info never shows.

An additional severe pitfall is that some client-side data could write back incorrect info to the server. So if a field errors and is coalesced to 0 (from null) - it could potentially write back incorrect data to the server.

Our Proposed Solution

We are planning on introducing the ability to define how you want an erroring field to behave - giving users the flexibility to either define the existing behavior (return null upon server error), throw entirely (essentially rethrowing the server error client-side, to be handled by error boundaries for instance), or provided the error in the response to be handled separately.

We are currently working on ironing out implementation details, ergonomics, and addressing known edge-cases.
It’s important to note that we are taking performance and other concerns seriously.

You can expect to see more details on this project as our design solidifies, and we welcome any comments or questions here.

Thanks!

@sibelius
Copy link
Contributor

@captbaritone
Copy link
Contributor

captbaritone commented Aug 23, 2023

have you tried @required https://relay.dev/docs/guides/required-directive/#internaldocs-banner ?

One main goal of this project, which @required does not help with, is to address the fact that users can't currently delineate between true null and null-due-to-error in their components.

Longer term, we see a path where this style of explicit error handling could remove the need for @required in most cases. I'll be making a post about this larger vision soon.

@benjie
Copy link

benjie commented Aug 23, 2023

There may be some small overlap between this and the Client Controlled Nullability proposal, e.g. earlier this evening we were discussing "skipping" fields rather than including them as null on error, and potentially using a @skip(on: [ERROR, NULL]) or @skipOnError/@skipOnNull directive:

https://github.com/graphql/client-controlled-nullability-wg/blob/main/notes/2023/2023-08.md

I've uploaded the video from tonights meeting to YouTube but it might take an hour or two to process before it's visible: https://www.youtube.com/watch?v=9XqTNnSLpgs&list=PLP1igyLx8foFPThkIGEUVbne2_DBwgQke

I look forward to reading your post, Jordan!

@itamark
Copy link
Contributor Author

itamark commented Aug 23, 2023

There may be some small overlap between this and the Client Controlled Nullability proposal,

Hi @benjie! Thanks for the link to the meeting. I'll watch it.

Yes, these proposals are definitely related. Thanks to Matt, we were made aware of it early in the design phase. We're actually taking CCN very carefully into consideration to make sure this proposal includes sane interop.

We are planning to attend the graphql wg monthly in September to discuss this, and by then @captbaritone will have published his post and we will be able to present not only the direction of our proposal, but the interop concerns and solutions with CCN.

@mjmahone
Copy link
Contributor

One note on the "render at all costs" current state: not only do you end up attempting to render a surface with an error, you also do not necessarily know the blast radius of the coalesced error. Example:

{
  me {
    ...HasName
    ...HasBestFriend
  }
}
fragment HasName on User {
  name
}
fragment HasBestFriend on User {
  bestFriend {
    ...HasName
  }
}

and a type system like:

type Query {
  me: User
}
type User {
  name: String!
  bestFriend: User!
}

If we have an error on me.bestFriend.name, the response would end up like:

{
  "data": {
    "me": null
  },
  "errors": [{
    "message": "NoName!",
    "path": ["me", "bestFriend", "name"]
  }]
}

Even though we could still render a UI showing the user's name, and even though there is a valid bestFriend, because of the way errors bubble today, the response makes this impossible. And if not handled well, we could end up poisoning Query.me in the normalized store (setting me to null): if you don't set me to null then at render/data-read time, you'll be reading out old (potentially empty/unrelated-to-your-request) data, but if you do then you'll cause all other surfaces making a Query.me request to rerender.

@captbaritone
Copy link
Contributor

@benjie

I look forward to reading your post, Jordan!

I put together a detailed writeup here: graphql/nullability-wg#19

Sorry for the length. Would be very curious to hear your thoughts if/when you have a chance to read it.

@benjie
Copy link

benjie commented Nov 27, 2023

As an update to the above; @captbaritone's post spawned a lot of thought and heavy discussions at GraphQLConf. Skipping over all the intermediate results (interrobang and asterisk), it has led to two competing but similar solutions:

Both of these introduce a way for a field to indicate that it is "null only on error" - i.e. it will never be null unless there's an associated error in the errors list. Mostly they differ in syntax. Lee's version has a global toggle between two modes (via the @strictNullability directive on the schema itself) which changes the meaning of an unadorned type Type to not allow semantic nulls (instead, nullable type would now be Type?). My solution is purely additive in nature (introducing !Type as a semantic non-null type) and does not change the meaning of the unadorned type Type, thus not requiring a toggle.

In the wake of this, the Client Controlled Nullability WG has opted to rename itself to the Nullability WG and has put CCN work on hold whilst we explore the semantic nullability questions.

Thanks again for your excellent write up @captbaritone! 🙌

facebook-github-bot pushed a commit that referenced this issue Feb 5, 2024
Summary:
This PR adds experimental support for `semanticNonNull` as described in apollographql/specs#42. Which is part of a broader effort to explore semantic nullability in GraphQL as explored in [RFC: SemanticNonNull type (null only on error) ](graphql/graphql-spec#1065). This directive-based approach should allow us to experiment with the concepts and identify issues as we work to understand the viability of semantic nullability in GraphQL.

## Experimental

As this is still an experimental implementation, it's designed to be minimally invasive rather than ideal in terms of architecture or performance. As the feature/RFCs stabilize I would imagine we would bake this into the schema crate and data structures as a first class concept.

This flag will not be broadly safe to enable in Relay since by default fields that are null due to error are still surfaced to the user. This is only safe to enable if:

1. Your network layer discards all payloads that have any field errors
2. You enable our [explicit error handling feature](#4416), which is still itself experimental.

## Missing Pieces

- [ ] Documentation about how to use this feature (purposeful, since this is experimental)
- [ ] Support for `semanticNonNullField` which allows a patching an existing type to define it's field's semantic nullability
- [ ] Validation
  - [ ] Invalid use of `levels` will simply panic.
  - [ ] Uses of `semanticNonNullField` will simply be ignored
  - [ ] There is no schema validation ensuring interface types have type-compatible semantic nullability

Pull Request resolved: #4601

Test Plan:
```
cargo test
```

I also spun up a version of [`grats-relay-example`](https://github.com/captbaritone/grats-relay-example) using Grat's [experimental support for `semanticNonNull`](https://grats.capt.dev/docs/guides/strict-semantic-nullability/) and was able to see it working end to end.

https://github.com/facebook/relay/assets/162735/dc979a58-95f3-4e55-9d9b-577afdd798ca

Reviewed By: alunyov

Differential Revision: D53191255

Pulled By: captbaritone

fbshipit-source-id: c09333f2b9475315d81792d33947fd908001c021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants