fix: do not require pod readiness when switching desired service selector on abort (#3338)

* do not switch service selectors back when using alb due to race between two controllers with pod readiness gates

Signed-off-by: Zach Aller <zachaller@users.noreply.github.com>

* update tests for alb

Signed-off-by: Zach Aller <zachaller@users.noreply.github.com>

* lets not check for readiness instead

Signed-off-by: Zach Aller <zachaller@users.noreply.github.com>

* clean up notes

Signed-off-by: Zach Aller <zachaller@users.noreply.github.com>

* fix /

Signed-off-by: Zach Aller <zachaller@users.noreply.github.com>

---------

Signed-off-by: Zach Aller <zachaller@users.noreply.github.com>
zachaller committed Feb 5, 2024
1 parent 4dc0a4e commit 041e68f
Showing 1 changed file with 5 additions and 3 deletions.
8 changes: 5 additions & 3 deletions rollout/trafficrouting.go
@@ -179,13 +179,15 @@ func (c *rolloutContext) reconcileTrafficRouting() error {
 	desiredWeight = c.calculateDesiredWeightOnAbortOrStableRollback()
 	if (c.rollout.Spec.Strategy.Canary.DynamicStableScale && desiredWeight == 0) || !c.rollout.Spec.Strategy.Canary.DynamicStableScale {
 		// If we are using dynamic stable scale we need to also make sure that desiredWeight=0 aka we are completely
-		// done with aborting before resetting the canary service selectors back to stable
-		err = c.ensureSVCTargets(c.rollout.Spec.Strategy.Canary.CanaryService, c.stableRS, true)
+		// done with aborting before resetting the canary service selectors back to stable. For non-dynamic scale we do not check for availability because we are
+		// fully aborted and stable pods will be there, if we check for availability it causes issues with ALB readiness gates if all stable pods
+		// have the desired readiness gate on them during an abort we get stuck in a loop because all the stable go unready and rollouts won't be able
+		// to switch the desired services because there is no ready pods which causes pods to get stuck progressing forever waiting for readiness.
+		err = c.ensureSVCTargets(c.rollout.Spec.Strategy.Canary.CanaryService, c.stableRS, false)
 		if err != nil {
 			return err
 		}
 	}

 	err := reconciler.RemoveManagedRoutes()
 	if err != nil {
 		return err
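
The deadlock described in the new comment can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the argo-rollouts implementation: `replicaSet`, `service`, `switchSelector`, and `simulateAbort` are hypothetical names, and the readiness-gate behaviour is reduced to "pods count as available only once the Service selects them". It also assumes that the last argument of `ensureSVCTargets` gates the selector switch on ReplicaSet availability, as the comment in the diff suggests.

```go
package main

import "fmt"

// replicaSet is a hypothetical stand-in for the stable ReplicaSet.
type replicaSet struct {
	hash      string // models the pod-template-hash label
	desired   int
	available int
}

// service is a hypothetical stand-in for the canary Service.
type service struct {
	selectorHash string // which ReplicaSet hash the Service currently selects
}

// switchSelector points svc at target. When checkAvailability is true the
// switch is skipped until every desired pod in target is available
// (assumed to mirror the boolean passed to ensureSVCTargets).
func switchSelector(svc *service, target *replicaSet, checkAvailability bool) bool {
	if svc.selectorHash == target.hash {
		return false // already pointing at the target
	}
	if checkAvailability && target.available < target.desired {
		return false // delay the switch: target is not fully available
	}
	svc.selectorHash = target.hash
	return true
}

// simulateAbort runs a fixed number of reconcile passes after an abort and
// reports whether the canary Service ends up selecting the stable ReplicaSet.
func simulateAbort(checkAvailability bool, passes int) bool {
	stable := &replicaSet{hash: "stable", desired: 3, available: 0}
	canarySvc := &service{selectorHash: "canary"}

	for i := 0; i < passes; i++ {
		switchSelector(canarySvc, stable, checkAvailability)
		// Simplified readiness gate (assumption): the stable pods only
		// become available after the Service selects them.
		if canarySvc.selectorHash == stable.hash {
			stable.available = stable.desired
		}
	}
	return canarySvc.selectorHash == stable.hash
}

func main() {
	fmt.Println("switch gated on availability:", simulateAbort(true, 5))
	fmt.Println("switch not gated:", simulateAbort(false, 5))
}
```

In this model the gated variant never converges, because availability depends on the very switch that is being delayed, while the ungated variant switches on the first pass. That corresponds to changing the argument from `true` to `false` in the diff above.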