Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strategy For Pod Life Time #205

Closed
seanmalloy opened this issue Dec 12, 2019 · 15 comments
Closed

Strategy For Pod Life Time #205

seanmalloy opened this issue Dec 12, 2019 · 15 comments
Assignees
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@seanmalloy
Copy link
Member

seanmalloy commented Dec 12, 2019

This is an enhancement proposal to add a new descheduler strategy for evicting old pods. I would like to be able to evict pods that were created more than X hours ago.

Here are my initial thoughts on the API for a new PodLifeTime strategy. In this example the descheduler would evict all pods that were created more than 24 hours ago.

kind: "DeschedulerPolicy"
apiVersion: "descheduler/v1alpha1"
strategies:
  "PodLifeTime":
     enabled: true
     params:
        maxPodLifeTimeHours: 24
@seanmalloy
Copy link
Member Author

Any feedback on this proposal would be greatly appreciated.

CC: @sanbornick @vinny-sabatini

@seanmalloy seanmalloy changed the title Policy For Pod Life Time Strategy For Pod Life Time Dec 20, 2019
@seanmalloy
Copy link
Member Author

Here is a slightly different proposal, specify the max pod life time in minutes instead of hours.

kind: "DeschedulerPolicy"
apiVersion: "descheduler/v1alpha1"
strategies:
  "PodLifeTime":
     enabled: true
     params:
        maxPodLifeTimeMinutes: 1440   # 24 hours

@mcandre
Copy link

mcandre commented Dec 20, 2019

Don't allow users to omit units: Accept a Go-compatible time duration string.

https://golang.org/pkg/time/#ParseDuration

@seanmalloy
Copy link
Member Author

@ravisantoshgudimetla @aveshagarwal @damemi do you have any thoughts on this proposal? Does this sound reasonable?

@damemi
Copy link
Contributor

damemi commented Jan 19, 2020

Not sure if this applies here, but I think Kubernetes convention prefers having the units in the field name (https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#units), which also makes it easier for non-go clients to parse the API.

If that's the case then I agree with the smaller units (ie Minutes instead of Hours, or maybe even Seconds)

Edit: actually that section specifically says durations should be handled with the units in the field name:

Duration fields must be represented as integer fields with units being part of the field name (e.g. leaseDurationSeconds). We don't use Duration in the API since that would require clients to implement go-compatible parsing.

For that reason I would vote for podLifetimeSeconds as the field name

@seanmalloy
Copy link
Member Author

/kind feature

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Feb 2, 2020
@seanmalloy
Copy link
Member Author

Based on the feedback from @damemi here is the updated proposal.

kind: "DeschedulerPolicy"
apiVersion: "descheduler/v1alpha1"
strategies:
  "PodLifeTime":
     enabled: true
     params:
        maxPodLifeTimeSeconds: 86400   # 24 hours

@seanmalloy
Copy link
Member Author

/assign @seanmalloy

@seanmalloy
Copy link
Member Author

@alculquicondor during the SIG scheduling meeting you mentioned there was a k8s API related to pod evictions that might be to relevant to this feature. This was something different than pod disruption budgets. If you find the documentation that would be super useful. Thanks!

@alculquicondor
Copy link

Other than disruption budget, there is a "safe-to-evict" label. However, it's namespaced under cluster-autoscaler.

https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-types-of-pods-can-prevent-ca-from-removing-a-node

I'm not sure if the descheduler should be respecting that. But I wonder what does this label offer that is not possible through disruption budget. @MaciekPytel?

In a totally different note, shouldn't the strategy have a way to filter, like with a Label Selector? Unless no other strategies support this currently?

@MaciekPytel
Copy link

MaciekPytel commented Feb 28, 2020

This annotation takes two values:

  • "safe-to-evict: true" bypasses various checks that would normally prevent CA from deleting a pod (ex. CA will not delete a pod that is not managed by a controller, because it doesn't know if it will be restarted; safe-to-evict: true bypasses this check). This is very CA-specific and not related to PDB at all.
  • "safe-to-evict: false" prevents Cluster Autoscaler from ever restarting the pod. The original reasons (as discussed in Add annotation to stop the autoscaler from moving a specific pod kubernetes/autoscaler#510) were:
    1. Prevent autoscaler from restarting a pod while still allowing other components to do it.
    • This makes more sense than it seems at a first glance. There aren't all that many reasons for a pod to be restarted and in many (most?) cases it's reasonable to expect vast majority of restarts to be caused by autoscaler. If you assume restarting a pod carries certain cost (ex. lost computation, dropping some requests) it makes sense to opt out of autoscaling, but allow drain in other, less common scenarios (ex. manual drain, node upgrade).
    1. At the time jobs didn't respect PDB (no idea what's the current status, presumably it's fixed).

Overall, this label is very specific to autoscaler, so I'm not sure if it makes much sense to respect it in other components. Also - do we even expect users to use descheduler together with CA? They seem to conflict in what they try to achieve - CA explicitly ignores pod scores (I guess that's how we call priorities now?) and tries to pack pods as tightly as possible whether they like it or not, while IIUC the goal of descheduler is to reschedule pods based on scores (so, spread them out).

That being said I wouldn't be surprised if at some point there is a FR to add similar annotation for descheduler.

@seanmalloy
Copy link
Member Author

In a totally different note, shouldn't the strategy have a way to filter, like with a Label Selector? Unless no other strategies support this currently?

PR #202 has been opened to add a label selector feature to descheduler.

@seanmalloy
Copy link
Member Author

That being said I wouldn't be surprised if at some point there is a FR to add similar annotation for descheduler.

The descheduler annotation descheduler.alpha.kubernetes.io/evict works very similar to this. It is not exactly the same, but pretty close.

@seanmalloy
Copy link
Member Author

/close

Pull request #274 was merged into the master branch.

@k8s-ci-robot
Copy link
Contributor

@seanmalloy: Closing this issue.

In response to this:

/close

Pull request #274 was merged into the master branch.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature.
Projects
None yet
Development

No branches or pull requests

6 participants