Skip to content

Commit

Permalink
fix(k8s): fix kubernetes workload rollout status check (#5794)
Browse files Browse the repository at this point in the history
<!--  Thanks for sending a pull request! Here are some tips for you:

1. If this is your first pull request, please read our contributor
guidelines in the
https://github.com/garden-io/garden/blob/main/CONTRIBUTING.md file.
2. Please label this pull request according to what type of issue you
are addressing (see "What type of PR is this?" below)
3. Ensure you have added or run the appropriate tests for your PR.
4. If the PR is unfinished, add `WIP:` at the beginning of the title or
use the GitHub Draft PR feature.
5. Please add at least two reviewers to the PR. Currently active
maintainers are: @edvald, @thsig, @eysi09, @stefreak, @TimBeyer, and
@vvagaytsev.
-->

**What this PR does / why we need it**:
For checking the rollout status of workloads we were using the
`spec.replicas` and not the `status.replicas`. This means that during a
rollout we would not catch the case where there are currently more
`replicas` than `updated` replicas. If we compare the `spec.replicas` to
`updated` it is possible that both have the same number, even though
`updated` has not passed it's readiness probe yet, which is the trigger
for the old pods to be terminated. Only once `status.replicas` and
`status.updated` match are we truly done. This led to garden not waiting
for deployment rollouts in some cases, which could lead to further
errors trying to run execs or tests in non-running pods or pods being
canceled mid-execution, due to them being terminated as old pods.

This is also implemented in this way in kubectl see
[here](https://github.com/kubernetes/kubernetes/blob/331ced5606fe32d8beb77c974b28dc12f7795012/staging/src/k8s.io/kubectl/pkg/polymorphichelpers/rollout_status.go#L83).

This is an old bug, but it only surfaced recently since #4846 where we
removed `compareDeployedResources` from the status checks for kubernetes
type actions.

**Which issue(s) this PR fixes**:

Fixes #

**Special notes for your reviewer**:
  • Loading branch information
twelvemo authored Mar 1, 2024
1 parent c83f815 commit 445d25c
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions core/src/plugins/kubernetes/status/workload.ts
Original file line number Diff line number Diff line change
Expand Up @@ -186,8 +186,8 @@ async function getRolloutStatus(workload: Workload) {

out.state = "ready"

// See `https://github.com/kubernetes/kubernetes/blob/master/pkg/kubectl/rollout_status.go` for a reference
// for this logic.
// See here as a reference for this logic:
// `https://github.com/kubernetes/kubernetes/blob/331ced5606fe32d8beb77c974b28dc12f7795012/staging/src/k8s.io/kubectl/pkg/polymorphichelpers/rollout_status.go#L75
if (workload.metadata.generation! > workload.status!.observedGeneration!) {
out.lastMessage = `Waiting for spec update to be observed...`
out.state = "deploying"
Expand Down Expand Up @@ -234,7 +234,7 @@ async function getRolloutStatus(workload: Workload) {

const desired = deploymentSpec.replicas === undefined ? 1 : deploymentSpec.replicas
const updated = status.updatedReplicas || 0
const replicas = deploymentSpec.replicas || 0
const replicas = status.replicas || 0
const available = status.availableReplicas || 0

if (updated < desired) {
Expand Down

0 comments on commit 445d25c

Please sign in to comment.