add errors for response validation #5787

Geal · 2024-08-07T14:35:58Z

When formatting responses, the router validates the data returned by subgraphs and replaces invalid values with null as appropriate. Messages were put in the response extensions outlining the paths at which the nulls were propagated up the response (due to non-nullable types), but no error messages were generated for the invalid values themselves.
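
A minimal sketch of the behaviour described above, assuming a simplified value model: an invalid scalar is replaced with null and an error is recorded alongside it. The `Value`, `Scalar`, and `ValidationError` types and the `validate_scalar` function are hypothetical stand-ins, not the router's actual types, and the message wording is borrowed from an example quoted later in this thread.

```rust
/// Hypothetical stand-in for a JSON value returned by a subgraph.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Bool(bool),
    Int(i64),
    Str(String),
}

/// Hypothetical subset of the GraphQL built-in scalar types.
#[derive(Debug, Clone, Copy)]
enum Scalar {
    Boolean,
    Int,
    String,
}

/// Hypothetical error record: a message plus the response path of the field.
#[derive(Debug, PartialEq)]
struct ValidationError {
    message: String,
    path: Vec<String>,
}

/// Checks `value` against the `expected` scalar type. A valid value is
/// returned unchanged; an invalid one is replaced with `Value::Null` and an
/// error naming the field is pushed to `errors`.
fn validate_scalar(
    value: Value,
    expected: Scalar,
    parent_type: &str,
    field_name: &str,
    path: &[&str],
    errors: &mut Vec<ValidationError>,
) -> Value {
    let valid = match (&value, expected) {
        // Null is left alone here; non-null handling is a separate step.
        (Value::Null, _) => true,
        (Value::Bool(_), Scalar::Boolean) => true,
        // GraphQL `Int` is a signed 32-bit integer.
        (Value::Int(n), Scalar::Int) => {
            *n >= i64::from(i32::MIN) && *n <= i64::from(i32::MAX)
        }
        (Value::Str(_), Scalar::String) => true,
        _ => false,
    };
    if valid {
        return value;
    }
    errors.push(ValidationError {
        message: format!("Invalid value found for field {}.{}", parent_type, field_name),
        path: path.iter().map(|segment| segment.to_string()).collect(),
    });
    Value::Null
}
```

In this sketch a boolean returned for a `String` field becomes null and produces one error, while a valid value passes through without touching `errors`.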

router-perf · 2024-08-07T14:36:30Z

CI performance tests

Geal · 2024-08-07T14:36:43Z

I think error messages could be improved to add more info about the kind of data that was received

garypen · 2024-08-19T09:14:38Z

The only thing required now is to make the error message match the gateway.

Geal · 2024-08-20T14:50:39Z

I'm updating the error messages following https://github.com/apollographql/federation/blob/main/gateway-js/src/resultShaping.ts#L305-L430

Geal · 2024-08-20T14:59:43Z

6733583 follows the gateway's error messages. I'd like to modify them so that they neither show the value we received nor mention the subgraph. The rationale is that we don't want errors to expose sensitive data that a subgraph might have returned by mistake.
6733583 fails the typename_propagation2 test introduced in #3978 because null is not a valid value for the __typename field. I'd argue that this test can actually be removed, as the original issue is already covered by the other tests in the PR.

Geal · 2024-08-20T16:35:22Z

36231f5 updates the error message to expose less information but still give enough for debugging
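
As an illustration of the trade-off discussed above, the sketch below builds a message from schema coordinates only. The function name `redacted_message` and its parameters are hypothetical, and the wording is taken from an example message quoted later in this thread; the actual text in 36231f5 may differ.

```rust
/// Builds a validation error message from the parent type and field name
/// only. The received value is taken as a parameter to make explicit that it
/// is deliberately left out, so data returned by mistake is not echoed back
/// to the client.
fn redacted_message(parent_type: &str, field_name: &str, _received_value: &str) -> String {
    format!("Invalid value found for field {}.{}", parent_type, field_name)
}
```

The parent type and field name come from the schema and the query, so they are already known to the client, while the value comes from the subgraph response.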

Geal · 2024-08-20T16:39:29Z

in the interest of avoiding breaking changes, I'd rather not fail hard on the typename validation, but still add the error message

apollo-router/src/json_ext.rs

Geal · 2024-08-27T10:13:19Z

I removed the __typename validation in 965cfd2 and added a comment for more context

svc-apollo-docs · 2024-10-15T14:15:54Z

✅ Docs Preview Ready

No new or changed pages found.

Geal · 2024-10-15T14:35:35Z

docs and an error code added

goto-bus-stop · 2024-10-17T10:53:13Z

docs/source/errors.mdx

+</Property>
+ <Property name="RESPONSE_VALIDATION_FAILED">
+
+A subgraph returned a field with a different type that mandated by the GraphQL schema.


Suggested change

A subgraph returned a field with a different type that mandated by the GraphQL schema.

A subgraph returned a field with a value that doesn't match the type declared in the GraphQL schema.

(mostly to fix the that/than typo, but i think being explicit about value vs. type is also helpful)

goto-bus-stop

This is called Result Coercion in the graphql spec, should we use RESULT_COERCION_FAILED for the error code?

goto-bus-stop · 2024-10-17T11:13:19Z

I'll push here the test Andrew made in #6143, which validates that integer values fit in the 32-bit Int range.

apollo-router/src/spec/query/tests.rs

goto-bus-stop · 2024-10-17T11:28:39Z

apollo-router/src/spec/query.rs

@@ -388,7 +405,8 @@ impl Query {
                    input,
                    output,
                    path,
-                    field_type,
+                    parent_type,


My change: this call is recursing from Int! to Int, it's not entering a new field, so the parent type remains the same.
Before this change you could get an error message like Invalid value found for field Int!.field, which doesn't make sense

so this is not a huge issue because parent_type is only used for that error message, but I worry that this would break assumptions when we change the code later. Could we move it back to field_type, but use parent_type.inner_named_type() in the error message?

The code should assume that the parent type is the type that the selection is executed on, while the field type is the type that the selection returns. inner_named_type() is also incorrect as it would mention Int.someNumber in the error, when Int is the return type, not the type that has the field someNumber.

Before this change parent_type would sometimes contain the parent type, but sometimes not contain the parent type at all. It could be two entirely different things that must be used differently. So I don't think it could possibly be correct before

addressed
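
The distinction in this thread can be sketched as follows, assuming a simplified type representation. `Type` and `validate_int` are hypothetical and only handle `Int`; the point is that unwrapping `Int!` to `Int` recurses on the field type while the parent type passes through unchanged.

```rust
/// Hypothetical, simplified GraphQL type reference.
#[derive(Debug)]
enum Type {
    Named(String),
    NonNull(Box<Type>),
}

/// Validates an integer value for a field of type `field_type`.
/// `parent_type` is the type the selection is executed on; `field_type` is
/// the type the selection returns.
fn validate_int(
    value: i64,
    field_type: &Type,
    parent_type: &str,
    field_name: &str,
) -> Result<(), String> {
    match field_type {
        // Unwrapping `Int!` to `Int` does not enter a new field, so
        // `parent_type` is passed through unchanged.
        Type::NonNull(inner) => validate_int(value, inner, parent_type, field_name),
        Type::Named(name) if name.as_str() == "Int" => {
            if value >= i64::from(i32::MIN) && value <= i64::from(i32::MAX) {
                Ok(())
            } else {
                // The message names the parent type, never the return type.
                Err(format!(
                    "Invalid value found for field {}.{}",
                    parent_type, field_name
                ))
            }
        }
        // Other named types are out of scope for this sketch.
        Type::Named(_) => Ok(()),
    }
}
```

With this shape, an out-of-range value for a `someNumber: Int!` field on `Query` is reported against `Query.someNumber` rather than `Int!.someNumber` or `Int.someNumber`.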

goto-bus-stop · 2024-10-17T11:35:22Z

I added a small fix as well to get that test to pass & to fix the ID type name issue. The implementation looks good but I think this needs more tests to exercise all cases pointed out by Result Coercion sections in the spec

Geal · 2024-10-18T09:43:23Z

@goto-bus-stop when you say it needs to exercise all cases, do you mean that we could generate false positive errors or that there are invalid values that we may not be detecting yet?

goto-bus-stop · 2024-10-18T10:35:01Z

I'd like to see tests exercising more wrong combinations of types, instead of just one for each type. I think I'm mostly looking for test assurance that we are not missing error cases. Some combinations don't feel all that useful to test but the spec provides some freedom to servers, so I think it's good to nail down what our choices are even for the obvious stuff. For example, if a field returns a boolean true or false, while the schema mandates a String, a GraphQL server is allowed to either cast it to a string "true" or "false" or raise a field error. (We raise a field error, as I think we should)

The case that I was thinking of specifically is a subgraph responding with a float for an Int field, since JSON does not differentiate between the two. This is what the spec says:

GraphQL services may coerce non-integer internal values to integers when reasonable without losing information, otherwise they must raise a field error. Examples of this may include returning 1 for the floating-point number 1.0, or returning 123 for the string "123". In scenarios where coercion may lose data, raising a field error is more appropriate. For example, a floating-point number 1.2 should raise a field error instead of being truncated to 1.

So it leaves a lot of free choice for us. I think we already do the right thing because serde_json's as_i64 does not lose information, and it doesn't cast non-number types, but it's probably the most useful case to test.
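
A sketch of the lossless coercion the quoted spec text permits, using only the standard library. `coerce_float_to_int` is a hypothetical helper, not the router's implementation: it accepts a float only when it is a whole number inside the 32-bit range, and returns `None` where a field error would be raised.

```rust
/// Returns `Some(i32)` when `n` is a finite whole number inside the signed
/// 32-bit range, and `None` when converting would lose information.
fn coerce_float_to_int(n: f64) -> Option<i32> {
    let in_range = n >= f64::from(i32::MIN) && n <= f64::from(i32::MAX);
    if n.is_finite() && n.fract() == 0.0 && in_range {
        // Safe: `n` is integral and within range, so the cast is exact.
        Some(n as i32)
    } else {
        None
    }
}
```

Under these assumptions `1.0` coerces to `1`, while `1.2` and values outside the 32-bit range are rejected instead of being truncated.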

Geal · 2024-10-21T07:58:22Z

honestly I'd be more ok with missing some error cases right now and revisit in a later PR, than wait a few more months without any validation errors

goto-bus-stop · 2024-10-22T09:01:28Z

I found several minor issues by adding tests:

We accept 1234.5678 floating point values for ID types, which the spec arguably ambiguously allows as a return value, but explicitly disallows as an input value for this type. I think the spec is meant to disallow floating point values for IDs in general.
We do not coerce integer values for ID types to strings--this is a known issue
We do not accept 1234.0 integer-valued floating-point values for Int types. The spec is vague on this point, but since JSON numbers are all floats, 1234 and 1234.0 are undeniably equivalent and I believe we should cast integer-valued floats to integers.
There are several other cases where the error message uses the wrong type (producing errors like Invalid value for field String.b which does not make sense)

I'm surprised serde_json doesn't handle 1234.0 -> 1234 conversion in as_i64().

My proposed initial steps for the coercion issues:

Enable ID validation with the current implementation: accepting ints, floats, and strings. The ID validation is currently disabled because of a typo in the code causing the branch to never match. It's highly unlikely that anyone is returning lists and objects and booleans from ID fields, so I think that's safe to do.
Do not address other ID coercion issues as customers may be relying on this. I'm adding FIXME comments on the test cases that we can address later if we know it's safe, or in a major release.
Accept 1234.0 integer-valued floating-point values for Int types, using the coerced i32 value as the value in the response. This is not a breaking change as it expands the accepted set of values. I can do that in a follow-up PR.
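
The first proposed step can be sketched as a simple shape check, assuming a simplified JSON model. `Json` and `is_acceptable_id` are hypothetical names; the check mirrors the lenient rule described above (numbers and strings accepted, everything else rejected) and does not attempt any coercion.

```rust
/// Hypothetical stand-in for a non-null JSON value.
#[derive(Debug)]
enum Json {
    Bool(bool),
    Number(f64),
    Str(String),
    Array(Vec<Json>),
    Object(Vec<(String, Json)>),
}

/// Lenient ID check: any number (integer or floating point) or string is
/// accepted as-is; booleans, lists, and objects are rejected.
fn is_acceptable_id(value: &Json) -> bool {
    match value {
        Json::Number(_) | Json::Str(_) => true,
        Json::Bool(_) | Json::Array(_) | Json::Object(_) => false,
    }
}
```

Note that this deliberately still accepts `1234.5678`, matching the "do not address other ID coercion issues" step rather than a strict reading of the spec.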

goto-bus-stop · 2024-10-22T12:30:29Z

Backing out my ID change to keep this PR focused on adding error messages only. I'll file a separate PR with low-/non-breaking fixes we can make to align with the GraphQL coercion rules

goto-bus-stop

This should be good to go. I reverted my unnecessary changes, so the only things I added are more tests encoding the current behaviour and a fix to the error message.

apollo-bot2 assigned Geal Aug 7, 2024

Geal force-pushed the geal/response-validation-errors branch from 8b2deb0 to b2bddc3 Compare August 7, 2024 14:50

Geal added 3 commits August 20, 2024 11:49

Merge branch 'dev' into geal/response-validation-errors

97a748d

lint

6d62d38

lint

ffde309

Geal added 2 commits August 20, 2024 16:55

align error messages with the gateway

6733583

error messages for invalid typename values

4747e4a

Geal added 2 commits August 20, 2024 17:02

lint

24d46b9

update error messages

36231f5

Merge branch 'dev' into geal/response-validation-errors

9cf401e

Geal added 4 commits August 27, 2024 11:42

Update apollo-router/src/json_ext.rs

cecf18a

revert __typename validation

965cfd2

let null values go through

6f2fcaf

fix

99aef34

Geal marked this pull request as ready for review August 27, 2024 10:13

Geal requested review from a team as code owners August 27, 2024 10:13

abernix requested a review from IvanGoncharov August 27, 2024 10:15

changeset

a59f15d

abernix requested a review from SimonSapin September 23, 2024 08:55

Merge branch 'dev' into geal/response-validation-errors

d23b782

add an error code

6881f9b

goto-bus-stop reviewed Oct 17, 2024

Add a test for out-of-range values for Int

4006b2e

goto-bus-stop mentioned this pull request Oct 17, 2024

Return a specific error when attempting to format an Int and it turns out to be bigger than I32 #6143

Closed

6 tasks

goto-bus-stop previously requested changes Oct 17, 2024

goto-bus-stop added 3 commits October 17, 2024 13:22

Update test from #6143 to match the implementation from #5787

468ae57

Fix parent type when result-coercing non-null types

e6020bb

Fix result coercion for ID fields

45c7e8a

goto-bus-stop reviewed Oct 17, 2024

goto-bus-stop added 2 commits October 22, 2024 09:51

Add more tests for response value coercion/validation

7c7ba29

Move the response validation tests together

c34b7d1

goto-bus-stop added 5 commits October 22, 2024 12:16

Adjust test expectations to match current behaviour

536d9f6

Correct parent type used for top-level fields

877eb2c

lint

4fa8353

Fix snapshot

e206b6d

Revert ID validation fix--moving to separate PR

f0f7300

goto-bus-stop approved these changes Oct 22, 2024
