diff --git a/modules/nodes-descheduler-about.adoc b/modules/nodes-descheduler-about.adoc
index 63b5f2d77b34..a8b247cf7236 100644
--- a/modules/nodes-descheduler-about.adoc
+++ b/modules/nodes-descheduler-about.adoc
@@ -13,6 +13,7 @@ You can benefit from descheduling running Pods in situations such as the followi
 * Pod and node affinity requirements, such as taints or labels, have changed and the original scheduling decisions are no longer appropriate for certain nodes.
 * Node failure requires Pods to be moved.
 * New nodes are added to clusters.
+* Pods have been restarted too many times.
 
 [IMPORTANT]
 ====
diff --git a/modules/nodes-descheduler-configuring-strategies.adoc b/modules/nodes-descheduler-configuring-strategies.adoc
index e2c4d35f8d78..c2c709e4691f 100644
--- a/modules/nodes-descheduler-configuring-strategies.adoc
+++ b/modules/nodes-descheduler-configuring-strategies.adoc
@@ -32,24 +32,31 @@ spec:
   strategies:
     - name: "LowNodeUtilization" <1>
       params:
-        - name: "cputhreshold"
+        - name: "CPUThreshold"
           value: "10"
-        - name: "memorythreshold"
+        - name: "MemoryThreshold"
           value: "20"
-        - name: "podsthreshold"
+        - name: "PodsThreshold"
           value: "30"
-        - name: "memorytargetthreshold"
+        - name: "MemoryTargetThreshold"
           value: "40"
-        - name: "cputargetthreshold"
+        - name: "CPUTargetThreshold"
           value: "50"
-        - name: "podstargetthreshold"
+        - name: "PodsTargetThreshold"
           value: "60"
-        - name: "nodes"
+        - name: "NumberOfNodes"
           value: "3"
     - name: "RemoveDuplicates" <2>
+    - name: "RemovePodsHavingTooManyRestarts" <3>
+      params:
+        - name: "PodRestartThreshold"
+          value: "10"
+        - name: "IncludingInitContainers"
+          value: "false"
 ----
-<1> The `LowNodeUtilization` strategy provides additional parameters, such as `cputhreshold` and `memorythreshold`, that you can optionally configure.
+<1> The `LowNodeUtilization` strategy provides additional parameters, such as `CPUThreshold` and `MemoryThreshold`, that you can optionally configure.
 <2> The `RemoveDuplicates`, `RemovePodsViolatingInterPodAntiAffinity`, `RemovePodsViolatingNodeAffinity`, and `RemovePodsViolatingNodeTaints` strategies do not have any additional parameters to configure.
+<3> The `RemovePodsHavingTooManyRestarts` strategy requires the `PodRestartThreshold` parameter to be set. It also provides the optional `IncludingInitContainers` parameter.
+
 You can enable multiple strategies and the order that the strategies are specified in is not important.
diff --git a/modules/nodes-descheduler-strategies.adoc b/modules/nodes-descheduler-strategies.adoc
index d9242af4467b..ec112cf3f581 100644
--- a/modules/nodes-descheduler-strategies.adoc
+++ b/modules/nodes-descheduler-strategies.adoc
@@ -14,7 +14,7 @@ The underutilization of nodes is determined by several configurable threshold pa
 +
 You can also set a target threshold for CPU, memory, and number of Pods. If a node's usage is above the configured target thresholds for all parameters, then the node's Pods might be considered for eviction.
 +
-Additionally, you can use the `nodes` parameter to set the strategy to activate only when the number of underutilized nodes is above the configured value. This can be helpful in large clusters where a few nodes might be underutilized frequently or for a short period of time.
+Additionally, you can use the `NumberOfNodes` parameter to set the strategy to activate only when the number of underutilized nodes is above the configured value. This can be helpful in large clusters where a few nodes might be underutilized frequently or for a short period of time.
 
 Duplicate Pods::
 The `RemoveDuplicates` strategy ensures that there is only one Pod associated with a ReplicaSet, ReplicationController, Deployment, or Job running on same node. If there are more, then those duplicate Pods are evicted for better spreading of Pods in a cluster.
@@ -35,3 +35,10 @@ Violation of node taints::
 The `RemovePodsViolatingNodeTaints` strategy ensures that Pods violating `NoSchedule` taints on nodes are removed.
 +
 This situation could occur if a Pod is set to tolerate a taint `key=value:NoSchedule` and is running on a tainted node. If the node's taint is updated or removed, the taint is no longer satisfied by the Pod's tolerations and the Pod is evicted.
+
+Too many restarts::
+The `RemovePodsHavingTooManyRestarts` strategy ensures that Pods that have been restarted too many times are removed from nodes.
++
+This situation could occur if a Pod is scheduled on a node that is unable to start it. For example, if the node is having network issues and is unable to mount a networked persistent volume, then the Pod should be evicted so that it can be scheduled on another node. Another example is if the Pod is crashlooping.
++
+This strategy has two configurable parameters: `PodRestartThreshold` and `IncludingInitContainers`. If a Pod is restarted more than the configured `PodRestartThreshold` value, then the Pod is evicted. You can use the `IncludingInitContainers` parameter to specify whether restarts for Init Containers should be calculated into the `PodRestartThreshold` value.
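
For reference, a minimal `strategies` stanza that enables only the new strategy could look like the following sketch. It is based on the strategy and parameter names shown in the diff above; the threshold values are illustrative, and the surrounding `KubeDescheduler` fields (`apiVersion`, `kind`, `metadata`) are omitted.

[source,yaml]
----
spec:
  strategies:
    - name: "RemovePodsHavingTooManyRestarts"
      params:
        # Evict a Pod after it has restarted more than 10 times (illustrative value).
        - name: "PodRestartThreshold"
          value: "10"
        # Optional: also count Init Container restarts toward the threshold.
        - name: "IncludingInitContainers"
          value: "true"
----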