Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

exposing topologySpreadConstraints on the helm chart #3892

Merged

Conversation

bjoernw
Copy link
Contributor

@bjoernw bjoernw commented Sep 22, 2023

This allows users to set topologySpreadConstraints which can be used to more evenly spread pods out over nodes/AZs/etc. and prevent uneven utilization of federation pods.

Fixes #3891


Checklist

Complete the checklist (and note appropriate exceptions) before the PR is marked ready-for-review.

  • Changes are compatible1
  • Documentation2 completed
  • Performance impact assessed and acceptable
  • Tests added and passing3
    • Unit Tests
    • Integration Tests
    • Manual Tests

Exceptions
Tested helm chart template generation locally

Note any exceptions here
N/A
Notes
N/A

Footnotes

  1. It may be appropriate to bring upcoming changes to the attention of other (impacted) groups. Please endeavour to do this before seeking PR approval. The mechanism for doing this will vary considerably, so use your judgement as to how and when to do this.

  2. Configuration is an important part of many changes. Where applicable please try to document configuration examples.

  3. Tick whichever testing boxes are applicable. If you are adding Manual Tests, please document the manual testing (extensively) in the Exceptions.

@router-perf
Copy link

router-perf bot commented Sep 22, 2023

CI performance tests

  • events_big_cap_high_rate - Stress test for events with a lot of users, deduplication enabled and high rate event with a big queue capacity
  • events_without_dedup - Stress test for events with a lot of users and deduplication DISABLED
  • events - Stress test for events with a lot of users and deduplication ENABLED
  • large-request - Stress test with a 1 MB request payload
  • step - Basic stress test that steps up the number of users over time
  • xlarge-request - Stress test with 10 MB request payload
  • reload - Reload test over a long period of time at a constant rate of users
  • no-graphos - Basic stress test, no GraphOS.
  • xxlarge-request - Stress test with 100 MB request payload
  • step-jemalloc-tuning - Clone of the basic stress test for jemalloc tuning
  • const - Basic stress test that runs with a constant number of users

@garypen garypen self-assigned this Sep 25, 2023
@Meemaw
Copy link
Contributor

Meemaw commented Sep 29, 2023

Hey @garypen any blockers to get this in? Its a relatively small change, but super useful one!

@garypen
Copy link
Contributor

garypen commented Sep 29, 2023

Hey @garypen any blockers to get this in? Its a relatively small change, but super useful one!

Two things:

  • I'm on vacation (back on Monday)
  • I was waiting to re-enable kubeconform (which I think may have just happened)

Expect something to happen next week.

Copy link
Contributor

@garypen garypen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. If you don't mind making those changes we can commit this.

helm/chart/router/values.yaml Outdated Show resolved Hide resolved
Co-authored-by: Gary Pennington <garypen@gmail.com>
@bjoernw
Copy link
Contributor Author

bjoernw commented Oct 4, 2023

@garypen done

@garypen garypen merged commit 6649432 into apollographql:dev Oct 5, 2023
11 checks passed
This was referenced Oct 17, 2023
EverlastingBugstopper referenced this pull request in apollographql/rover Oct 20, 2023
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [apollographql/router](https://togithub.com/apollographql/router) |
minor | `v1.32.0` -> `v1.33.0` |

---

### Release Notes

<details>
<summary>apollographql/router (apollographql/router)</summary>

###
[`v1.33.0`](https://togithub.com/apollographql/router/releases/tag/v1.33.0)

[Compare
Source](https://togithub.com/apollographql/router/compare/v1.32.0-alpha.0...v1.33.0)

#### 🚀 Features

##### Add `hasNext` to SupergraphRequest ([Issue
#&#8203;4016](https://togithub.com/apollographql/router/issues/4016))

Coprocessors multi-part response support has been enhanced to include
`hasNext`, allowing you to determine when a request has completed.

When `stage` is `SupergraphResponse`, `hasNext` if present and `true`
indicates that there will be subsequent `SupergraphResponse` calls to
the co-processor for each multi-part (`@defer`/subscriptions) response.

See the [coprocessor
documentation](https://www.apollographql.com/docs/router/customizations/coprocessor/)
for more details.

By [@&#8203;BrynCooke](https://togithub.com/BrynCooke) in
[https://github.com/apollographql/router/pull/4017](https://togithub.com/apollographql/router/pull/4017)

##### Expose the ability to set topology spread constraints on the helm
chart ([3891](https://togithub.com/apollographql/router/issues/3891))

Give developers the ability to set topology spread constraints that can
be used to guarantee that federation pods are spread out evenly across
AZs.

By [@&#8203;bjoernw](https://togithub.com/bjoernw) in
[https://github.com/apollographql/router/pull/3892](https://togithub.com/apollographql/router/pull/3892)

#### 🐛 Fixes

##### Ignore JWKS keys which aren't supported by the router ([Issue
#&#8203;3853](https://togithub.com/apollographql/router/issues/3853))

If you have a JWKS which contains a key which has an algorithm (alg)
which the router doesn't recognise, then the entire JWKS is disregarded
even if there were other keys in the JWKS which the router could use.

We have changed the JWKS processing logic so that we remove entries with
an unrecognised algorithm from the list of available keys. We print a
warning with the name of the algorithm for each removed entry.

By [@&#8203;garypen](https://togithub.com/garypen) in
[https://github.com/apollographql/router/pull/3922](https://togithub.com/apollographql/router/pull/3922)

##### Fix panic when streaming responses to co-processor ([Issue
#&#8203;4013](https://togithub.com/apollographql/router/issues/4013))

Streamed responses will no longer cause a panic in the co-processor
plugin. This affected defer and stream queries.

By [@&#8203;BrynCooke](https://togithub.com/BrynCooke) in
[https://github.com/apollographql/router/pull/4014](https://togithub.com/apollographql/router/pull/4014)

##### Only reject defer/subscriptions if actually part of a batch
([Issue
#&#8203;3956](https://togithub.com/apollographql/router/issues/3956))

Fix the checking logic so that deferred queries or subscriptions will
only be rejected when experimental batching is enabled and the
operations are part of a batch.

Without this fix, all subscriptions or deferred queries would be
rejected when experimental batching support was enabled.

By [@&#8203;garypen](https://togithub.com/garypen) in
[https://github.com/apollographql/router/pull/3959](https://togithub.com/apollographql/router/pull/3959)

##### Fix requires selection in arrays ([Issue
#&#8203;3972](https://togithub.com/apollographql/router/issues/3972))

When a field has a `@requires` annotation that selects an array, and
some fields are missing in that array or some of the elements are null,
the router would short circuit the selection and remove the entire
array. This relaxes the condition to allow nulls in the selected array

By [@&#8203;Geal](https://togithub.com/Geal) in
[https://github.com/apollographql/router/pull/3975](https://togithub.com/apollographql/router/pull/3975)

##### Fix router hang when opening the explorer, prometheus or health
check page ([Issue
#&#8203;3941](https://togithub.com/apollographql/router/issues/3941))

The Router did not gracefully shutdown when an idle connections are made
by a client, and would instead hang. In particular, web browsers make
such connection in anticipation of future traffic.

This is now fixed, and the Router will now gracefully shut down in a
timely fashion.

By [@&#8203;Geal](https://togithub.com/Geal) in
[https://github.com/apollographql/router/pull/3969](https://togithub.com/apollographql/router/pull/3969)

##### Fix hang and high CPU usage when compressing small responses ([PR
#&#8203;3961](https://togithub.com/apollographql/router/pull/3961))

When returning small responses (less than 10 bytes) and compressing them
using gzip, the router could go into an infinite loop

By [@&#8203;Geal](https://togithub.com/Geal) in
[https://github.com/apollographql/router/pull/3961](https://togithub.com/apollographql/router/pull/3961)

#### 📃 Configuration

##### Add `enabled` field for telemetry exporters ([PR
#&#8203;3952](https://togithub.com/apollographql/router/pull/3952))

Telemetry configuration now supports `enabled` on all exporters. This
allows exporters to be disabled without removing them from the
configuration and in addition allows for a more streamlined default
configuration.

```diff
telemetry:
  tracing: 
    datadog:
+      enabled: true
    jaeger:
+      enabled: true
    otlp:
+      enabled: true
    zipkin:
+      enabled: true
```

Existing configurations will be migrated to the new format automatically
on startup. However, you should update your configuration to use the new
format as soon as possible.

By [@&#8203;BrynCooke](https://togithub.com/BrynCooke) in
[https://github.com/apollographql/router/pull/3952](https://togithub.com/apollographql/router/pull/3952)

#### 🛠 Maintenance

##### Create a replacement self-signed server certificate: 10 years
lifespan ([Issue
#&#8203;3998](https://togithub.com/apollographql/router/issues/3998))

This certificate is only used for testing, so 10 years lifespan is
acceptable.

By [@&#8203;garypen](https://togithub.com/garypen) in
[https://github.com/apollographql/router/pull/4009](https://togithub.com/apollographql/router/pull/4009)

#### 📚 Documentation

##### Updated documentation for deploying router ([PR
#&#8203;3943](https://togithub.com/apollographql/router/pull/3943))

Updated documentation for containerized router deployments, with guides
and examples for [deploying on
Kubernetes](https://www.apollographql.com/docs/router/containerization/kubernetes)
and [running on
Docker](https://www.apollographql.com/docs/router/containerization/docker).

By [@&#8203;shorgi](https://togithub.com/shorgi) in
[https://github.com/apollographql/router/pull/3943](https://togithub.com/apollographql/router/pull/3943)

##### Document guidance for request and response buffering ([Issue
#&#8203;3838](https://togithub.com/apollographql/router/issues/3838))

Provides specific guidance on request and response buffering within the
router.

By [@&#8203;garypen](https://togithub.com/garypen) in
[https://github.com/apollographql/router/pull/3970](https://togithub.com/apollographql/router/pull/3970)

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Enabled.

♻ **Rebasing**: Whenever PR is behind base branch, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/apollographql/rover).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy4xOS4yIiwidXBkYXRlZEluVmVyIjoiMzcuMTkuMiIsInRhcmdldEJyYW5jaCI6Im1haW4ifQ==-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Expose the ability to set topologyspreadConstraints on the Deployment
3 participants