Add a parameter for making the thresholds of the LowNodeUtilization strategy relative to average values #473
Conversation
Hi @AmoVanB. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test`. Once the patch is verified, the new status will be reflected by the `ok-to-test` label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
// Logging specifies the options of logging.
// Refer [Logs Options](https://github.com/kubernetes/component-base/blob/master/logs/options.go) for more information.
Logging componentbaseconfig.LoggingConfiguration `json:"logging,omitempty"`
I was getting a complaint from update-generated-conversions.sh
without that. Not sure if it's really needed though.
} else {
	lowResourceThreshold = map[v1.ResourceName]*resource.Quantity{
		v1.ResourceCPU:    resource.NewMilliQuantity(int64(float64(strategyConfig.Thresholds[v1.ResourceCPU])*float64(nodeCapacity.Cpu().MilliValue())*0.01), resource.DecimalSI),
		v1.ResourceMemory: resource.NewQuantity(int64(float64(strategyConfig.Thresholds[v1.ResourceMemory])*float64(nodeCapacity.Memory().Value())*0.01), resource.BinarySI),
Is `resource.BinarySI` correct here? I took it from the existing code, but I'm surprised that it's different from the `resource.DecimalSI` used for CPU and pods. I'm not sure about the purpose/meaning of that parameter though.
Yes, they're different because for CPU, 5 cores = 5 * 1000m, but for memory, 5 GiB = 5 * 1024 MiB.
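For context, here is a stdlib-only sketch of the arithmetic in the snippet above (the helper name is hypothetical; the real code uses `resource.Quantity` from `k8s.io/apimachinery`). The percentage threshold is turned into an absolute per-node quantity; `resource.Format` (`DecimalSI` vs `BinarySI`) only controls how that quantity is rendered as a string (1000-based suffixes like `m`/`k` vs 1024-based `Ki`/`Mi`), not its numeric value.

```go
package main

import "fmt"

// absoluteThreshold is a hypothetical helper mirroring the snippet's math:
// thresholdPercent% of a node's capacity, truncated to an integer quantity.
func absoluteThreshold(thresholdPercent int, capacity int64) int64 {
	return int64(float64(thresholdPercent) * float64(capacity) * 0.01)
}

func main() {
	cpuMilli := absoluteThreshold(20, 4000)  // 20% of 4 cores (4000m) -> 800m
	memBytes := absoluteThreshold(20, 8<<30) // 20% of 8 GiB, in bytes
	fmt.Println(cpuMilli, memBytes)
}
```

The format argument would only matter later, when the resulting quantity is serialized back to a string.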
@@ -103,7 +104,7 @@ func TestLowNodeUtilization(t *testing.T) {
	},
	n2NodeName: {
		Items: []v1.Pod{
-			*test.BuildTestPod("p9", 400, 0, n1NodeName, test.SetRSOwnerRef),
+			*test.BuildTestPod("p9", 400, 0, n2NodeName, test.SetRSOwnerRef),
Here and in the 3 following changes, I thought this was a typo, as the pod is actually supposed to be on node 2. I'm not sure whether that is important or even correct.
Please put these changes into a separate PR.
Already fixed on master by this commit.
	continue
}

// 2. for each topologySpreadConstraint in that namespace
klog.V(1).InfoS("Processing "+strconv.Itoa(len(namespaceTopologySpreadConstraints))+" constraints", "namespace", namespace.Name)
Just added some logs which I think are very helpful for simple debugging.
I don't think these debug logs should be at this level; you can set it to level 3.
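To illustrate the reviewer's point about verbosity levels, here is a stdlib-only mimic of klog-style `V`-level gating (the names `V`, `InfoS`, and `verbosity` imitate klog, but this is not klog's implementation): a message logged at `V(3)` is only emitted when the process runs with verbosity of at least 3, so moving a log line from `V(1)` to `V(3)` hides it from default runs while keeping it available for debugging.

```go
package main

import "fmt"

// verbosity stands in for the value of klog's -v command-line flag.
var verbosity = 1

type verbose bool

// V reports whether messages at the given level should be emitted.
func V(level int) verbose { return verbose(verbosity >= level) }

// InfoS prints a structured message only when the level is enabled.
func (v verbose) InfoS(msg string, kv ...interface{}) {
	if v {
		fmt.Println(append([]interface{}{msg}, kv...)...)
	}
}

func main() {
	V(1).InfoS("Processing namespace", "namespace", "default") // emitted at default -v=1
	V(3).InfoS("Processing constraints", "count", 2)           // suppressed unless -v>=3
}
```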
946d17b to 326d175
Hi @AmoVanB. Thanks for your PR.

If we run `lowNodeUtilization` periodically in a dynamic cluster, we may need to change `Thresholds` or `TargetThresholds` once the total utilization of the cluster changes. So it makes sense to make `Thresholds` and `TargetThresholds` change with the average resource usage of the cluster.

/ok-to-test
/kind feature
|`thresholds`|map(string:int)|
|`targetThresholds`|map(string:int)|
I think we should consider changing the names of these thresholds to `low` and `high` to make them easier to understand; that's what @ingvagabund suggests here.
Yeah, we were discussing renaming the params a few times back, with respect to bumping the version of the strategy config as well, e.g. reorganizing the data type so that the same strategy can be configured multiple times with different configurations.
Though, we can still add new params, deprecate the current ones, and remove them after 3 releases.
Some suggestions before a proper review of this PR. I like the idea overall.
-	targetThresholds := strategy.Params.NodeResourceUtilizationThresholds.TargetThresholds
-	if err := validateStrategyConfig(thresholds, targetThresholds); err != nil {
+	strategyConfig := strategy.Params.NodeResourceUtilizationThresholds
+	if err := validateStrategyConfig(strategyConfig); err != nil {
Please put the changes generalizing `thresholds` and `targetThresholds` into `strategyConfig.thresholds` and `strategyConfig.targetThresholds` into a separate commit. It will be easier to follow the changes.
@@ -114,9 +115,10 @@ func RemovePodsViolatingTopologySpreadConstraint(
		(len(excludedNamespaces) > 0 && excludedNamespaces.Has(namespace.Name)) {
		continue
	}
+	klog.V(1).InfoS("Processing namespace", "namespace", namespace.Name)
Please put the changes to TSC into their own PR, as they're not related to the deviation feature.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull request has been approved by: AmoVanB. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
@AmoVanB: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
We should get #434 merged first though.
I'm not working on this anymore. I tried to rebase, but there are a couple of non-obvious conflicts, and I unfortunately won't have time to work on rebasing and fixing them. Hopefully someone can take over! :)
Thanks @AmoVanB, I will take over from here ;) @ingvagabund, shall I open another PR with the required changes?
@matthieu-eck by all means. Thank you for continuing the work @AmoVanB started!!!
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
@AmoVanB: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
…sigs#473 feat(LowNodeUtilization): useDeviationThresholds, redo of kubernetes-sigs#473
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten
…sigs#473 [751]: normalize Percentage in nodeutilization and clean the tests
feat: Add DeviationThreshold Parameter for LowNodeUtilization (Previous attempt - #473)
Closing in favor of #751
@ingvagabund: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Reason for this PR

Currently, the thresholds that can be set for the `LowNodeUtilization` strategy only make sure that nodes aren't over- or underutilized relative to thresholds expressed as fractions of their capacities. However, it is sometimes desirable to make sure all nodes are "similarly" utilized to keep a cluster properly balanced. While this behavior can be implemented with the current thresholds, the values needed to achieve it would depend on the number of nodes and pods and on their memory/CPU requests, which is not very practical to configure, especially for dynamic clusters.

Feature of this PR
This PR enables the `descheduler` to achieve such a goal. We define an additional `useDeviationThresholds` boolean parameter (the name is maybe not the best; it can change). If `false`, the `descheduler` behaves as it does now. If `true`, the thresholds are interpreted as percentage deviations from the average utilization across all nodes.

For example, considering only the memory metric: if the nodes have utilizations of [10%, 20%, 30%, 20%] and both `threshold` and `targetThreshold` are 5%, the average node utilization is (10% + 20% + 30% + 20%) / 4 = 20%. The first node has a utilization of 10%, which is lower than 20% (average) - 5% (threshold); the node is hence considered underutilized. The third node has a utilization of 30%, which is greater than 20% (average) + 5% (targetThreshold); the node is hence considered overutilized. The other two nodes have utilizations within the [20% - 5%, 20% + 5%] window and are hence considered appropriately utilized.

Once nodes are labeled as under- or overutilized, the strategy behaves exactly as before.
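The classification rule above can be sketched as follows (a stdlib-only illustration with hypothetical names, not the PR's actual code):

```go
package main

import "fmt"

// classifyByDeviation applies the deviation rule described above: a node is
// underutilized when its utilization falls below average-threshold, and
// overutilized when it exceeds average+targetThreshold. It returns the
// indices of the nodes in each category.
func classifyByDeviation(utilizations []float64, threshold, targetThreshold float64) (under, over []int) {
	var sum float64
	for _, u := range utilizations {
		sum += u
	}
	avg := sum / float64(len(utilizations))
	for i, u := range utilizations {
		switch {
		case u < avg-threshold:
			under = append(under, i) // candidate destination for evicted pods
		case u > avg+targetThreshold:
			over = append(over, i) // candidate source of evictions
		}
	}
	return under, over
}

func main() {
	// The example from the description: average is 20%, window is [15%, 25%].
	under, over := classifyByDeviation([]float64{10, 20, 30, 20}, 5, 5)
	fmt.Println(under, over) // → [0] [2]: node 0 underutilized, node 2 overutilized
}
```

With `useDeviationThresholds: false`, the same thresholds would instead be compared against fixed fractions of each node's capacity, as the strategy does today.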
Code style comment

Since the only change compared to the original strategy is the way nodes are categorized as over- and underutilized, I thought it best to include it in the existing `LowNodeUtilization` strategy rather than create a new strategy. Another option would be to create two distinct strategies that share methods. I went for the first option as it was the easiest to implement, but I would definitely not be against the second.

Test
We had this need/problem in our 200+ pod cluster. We deployed the `descheduler` from this PR with success in our test clusters. It passes our tests (which make sure that the correct pods are evicted when the cluster is unbalanced) and has worked as expected, without any detected issues so far.

Vision
In general, the proposed feature goes in the direction of a more general intention we have for the `descheduler`: take the state of the cluster, smartly compute a new schedule for all the pods (not one by one but all at once, to leverage the complete knowledge we have) according to a given strategy (ours: balanced utilization), and then implement the transition from the current state to the desired state. This PR is our first (small) step towards that goal, but is there such a plan for the `descheduler`? Overall, that would turn the `descheduler` into a "cluster scheduler", in contrast to the current "per-pod scheduler" of Kubernetes. I know there is a plan to eventually incorporate scheduling into the tool rather than relying on the existing scheduler, but I'm not sure if you're looking into such optimization-driven strategies.