Add an option to proceed to the next rollout step based on a percentage of available replicas #3991

Open
Idokah opened this issue Dec 8, 2024 · 0 comments
Labels
enhancement New feature or request

Comments


Idokah commented Dec 8, 2024

Summary

Add an option to allow Argo Rollouts to proceed to the next step in a rollout even if the desired replica count has not been fully achieved. Instead, allow progression based on a predefined percentage of the new ReplicaSet’s replicas being available.

Use Cases

In certain deployment scenarios, strict adherence to achieving 100% of the desired replica count can cause rollouts to stall indefinitely. For example:

  • Dynamic Workloads: Some applications dynamically scale or remove pods as part of their operation. In these cases, the number of AvailableReplicas may fluctuate during the rollout process, even though the application is healthy and able to progress.
  • Non-Critical Tolerance: Applications that do not require every single replica to be ready before moving to the next step might benefit from progressing with a threshold percentage instead of waiting indefinitely.

Currently, the rollout process performs a strict check via:

func allDesiredAreAvailable(rs *appsv1.ReplicaSet, desired int32) bool {
    return rs != nil && desired == *rs.Spec.Replicas && desired == rs.Status.AvailableReplicas
}

This check passes only when both the ReplicaSet's spec.replicas and its status.availableReplicas exactly equal the desired count. In the scenarios above, the rollout can therefore get stuck indefinitely if the replicas are never all available at the same time.

Proposed Solution

Introduce an optional parameter (e.g., progressThreshold) that defines the percentage of the desired replicas that must be available before proceeding to the next step. For example:

strategy:
  canary:
    steps:
      - setWeight: 50
    progressThreshold: 90

With this configuration, the rollout would proceed once at least 90% of the desired replicas are available (for example, 9 of 10), which gives flexibility to applications that tolerate slight deviations.
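
As a rough illustration of what a threshold-based check could look like, here is a minimal Go sketch. It is not the Argo Rollouts implementation: it uses plain integers instead of *appsv1.ReplicaSet so it runs standalone, the names minAvailable and enoughAvailable are hypothetical, and rounding the required count up is an assumption, since the proposal does not specify rounding.

```go
package main

import "fmt"

// minAvailable returns the smallest available-replica count that satisfies
// thresholdPercent of desired. Rounding up is an assumption of this sketch.
func minAvailable(desired, thresholdPercent int32) int32 {
	// Integer ceiling of desired*thresholdPercent/100, computed in int64
	// to avoid overflow for large replica counts.
	return int32((int64(desired)*int64(thresholdPercent) + 99) / 100)
}

// enoughAvailable is a hypothetical relaxed variant of allDesiredAreAvailable.
// specReplicas stands in for rs.Spec.Replicas and available for
// rs.Status.AvailableReplicas.
func enoughAvailable(specReplicas *int32, available, desired, thresholdPercent int32) bool {
	// Keep the original requirement that the ReplicaSet is scaled to the
	// desired size; only the availability comparison is relaxed.
	if specReplicas == nil || *specReplicas != desired {
		return false
	}
	return available >= minAvailable(desired, thresholdPercent)
}

func main() {
	desired := int32(10)
	spec := desired

	fmt.Println(enoughAvailable(&spec, 9, desired, 90))  // true: 9 of 10 meets 90%
	fmt.Println(enoughAvailable(&spec, 8, desired, 90))  // false: 8 of 10 is below 90%
	fmt.Println(enoughAvailable(&spec, 9, desired, 100)) // false: 100% needs all 10
}
```

Rounding up means small ReplicaSets stay strict: with 3 desired replicas and a 90% threshold, all 3 must still be available.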

Benefits

  • Flexibility: Allows rollouts to progress in dynamic or non-critical environments where exact replica availability cannot always be achieved.
  • Avoid Stalling: Prevents rollouts from getting stuck indefinitely in scenarios where waiting for 100% of replicas is impractical or unnecessary.
  • Backward Compatible: Existing behavior remains unchanged unless the progressThreshold parameter is explicitly defined.
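
One way to keep the existing behavior as the default is to model the option as an optional field and treat "unset" as 100%. The sketch below assumes this; the type canaryOptions, the field ProgressThreshold, and the helper effectiveThreshold are hypothetical names for illustration, and clamping out-of-range values is likewise an assumption (a real implementation might reject them during validation instead).

```go
package main

import "fmt"

// canaryOptions is a hypothetical stand-in for the canary strategy spec.
type canaryOptions struct {
	// ProgressThreshold is nil when progressThreshold is not set in the manifest.
	ProgressThreshold *int32
}

// effectiveThreshold resolves the percentage to use for the availability check.
// An unset field yields 100, which corresponds to today's strict behavior.
func effectiveThreshold(opts canaryOptions) int32 {
	if opts.ProgressThreshold == nil {
		return 100
	}
	t := *opts.ProgressThreshold
	// Clamp to [0, 100]; this is an assumption of the sketch.
	if t < 0 {
		return 0
	}
	if t > 100 {
		return 100
	}
	return t
}

func main() {
	ninety := int32(90)

	fmt.Println(effectiveThreshold(canaryOptions{}))                           // 100
	fmt.Println(effectiveThreshold(canaryOptions{ProgressThreshold: &ninety})) // 90
}
```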

Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.

@Idokah Idokah added the enhancement New feature or request label Dec 8, 2024