
Add new podEvictor statistics #648

Closed

Conversation

pravarag
Contributor

Fixes #503

This change adds new metrics to improve the podEvictor statistics. Currently only one metric is calculated, in evictions.go.

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Oct 17, 2021
@k8s-ci-robot
Contributor

Hi @pravarag. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Oct 17, 2021
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Oct 17, 2021
@a7i
Contributor

a7i commented Oct 19, 2021

🥇

It would be nice to have more insights around these metrics. What are your (and others') thoughts on including the following info?

  1. Include namespace

I think this should be easy to do, and it is something most users would care about. The PodsEvicted metric already does this (see the sketch after this list).

  2. Include all ownerReferences[].name

I don't have a use-case around this, just an idea.

  3. Inject custom labels into the metrics.

This one is tricky, but my company has standards around labels (e.g. project=, team=). It would be nice to define those in the DeschedulerPolicy somehow and inject them as "custom" labels in the metrics. I realize this is outside the scope of this PR, so I can take this on after this PR is merged.

  4. This conflicts with option 3, but another option would be to include the recommended labels.

I personally prefer option 3 (i.e. custom labels), but these are just ideas.
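
For option 1, a minimal sketch of what the counter definition could look like with a namespace label added (the metric name, help text, and label set here are illustrative only, assuming the k8s.io/component-base/metrics helpers this repository already uses):

    // Illustrative sketch: "namespace" is added next to "result" so skip counts
    // can be broken down per namespace, as PodsEvicted already allows.
    TotalPodsSkipped = metrics.NewCounterVec(
        &metrics.CounterOpts{
            Subsystem:      DeschedulerSubsystem,
            Name:           "total_pods_skipped",
            Help:           "Number of pods the descheduler skipped",
            StabilityLevel: metrics.ALPHA,
        }, []string{"result", "namespace"},
    )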

@pravarag
Contributor Author

Thanks @a7i for sharing the above comments. I have a few doubts, and since we don't have anything definite on paper for this change, I'll post them here for now:

> Inject custom labels into the metrics.

Wouldn't this be more user-specific? Not everyone will have the same custom labels. My understanding could be wrong here, but if you have any examples they would be helpful 🙂. Or do the custom labels refer to the labels here?

@a7i
Contributor

a7i commented Oct 26, 2021

> Thanks @a7i for sharing the above comments. I have a few doubts, and since we don't have anything definite on paper for this change, I'll post them here for now:
>
> Inject custom labels into the metrics.
>
> Wouldn't this be more user-specific? Not everyone will have the same custom labels. My understanding could be wrong here, but if you have any examples they would be helpful 🙂. Or do the custom labels refer to the labels here?

Sorry if I was not clear. This proposal will be done outside of this PR. Once this PR is merged, I will create a Feature with my proposal/ideas around it.

@pravarag
Contributor Author

@a7i thanks for the information shared above. Please feel free to suggest any changes I can implement as part of this PR as well.
@damemi @ingvagabund kindly review and let me know if I can update or make any further changes :)

@a7i
Contributor

a7i commented Nov 15, 2021

> @a7i thanks for the information shared above. Please feel free to suggest any changes I can implement as part of this PR as well. @damemi @ingvagabund kindly review and let me know if I can update or make any further changes :)

I would really like to see namespace included in the metrics. The PodsEvicted metric already does this.

Contributor

@damemi damemi left a comment

/ok-to-test
@pravarag could you please move the PR out of draft if it's ready for review and remove the WIP from the title?

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 15, 2021
@pravarag pravarag marked this pull request as ready for review November 16, 2021 02:50
@pravarag pravarag changed the title [WIP] add new podEvictor statistics Add new podEvictor statistics Nov 17, 2021
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 17, 2021
StabilityLevel: metrics.ALPHA,
}, []string{"result"})

TotalPodsSkipped = metrics.NewCounterVec(
Contributor

is there anywhere you're updating TotalPodsSkipped?

Contributor Author

I was not able to figure out where I should update TotalPodsSkipped; perhaps below this line and this line? Any suggestions here would be helpful.

Contributor Author

Updated.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 29, 2021
@@ -99,6 +99,7 @@ func (pe *PodEvictor) TotalEvicted() int {
for _, count := range pe.nodepodCount {
total += count
}
metrics.TotalPodsEvicted.With(map[string]string{"result": "total pods evicted so far"}).Inc()
Contributor

Why is the metric being incremented here?

Contributor Author

The reason for incrementing the metric here is to capture the total pods evicted as the count goes up on line 100. I wanted to keep it aligned with the evicted pod count going up in evictions.go so that it might be easier to record the total count. Let me know your thoughts; otherwise I'll make some changes.

Contributor

I see. It seems odd that it's being incremented in a method that is only supposed to get total evicted count.

// TotalEvicted gives a number of pods evicted through all nodes
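
Keeping TotalEvicted free of metric side effects, as suggested above, might look roughly like this (a sketch only; the fields are those already present in PodEvictor, and the counter would instead be incremented where an eviction actually succeeds):

    // Sketch: a read-only accessor with no metric updates.
    func (pe *PodEvictor) TotalEvicted() int {
        total := 0
        for _, count := range pe.nodepodCount {
            total += count
        }
        return total
    }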

Contributor Author

@pravarag pravarag Dec 13, 2021

Updated this as well (moved it to this line) and included namespace as a label on all the metrics.

@@ -112,14 +113,16 @@ func (pe *PodEvictor) EvictPod(ctx context.Context, pod *v1.Pod, node *v1.Node,
}
if pe.maxPodsToEvictPerNode > 0 && pe.nodepodCount[node]+1 > pe.maxPodsToEvictPerNode {
metrics.PodsEvicted.With(map[string]string{"result": "maximum number reached", "strategy": strategy, "namespace": pod.Namespace}).Inc()
return false, fmt.Errorf("Maximum number %v of evicted pods per %q node reached", pe.maxPodsToEvictPerNode, node.Name)
metrics.PodsEvictedSuccess.With(map[string]string{"strategy": strategy}).Inc()
Contributor

Why is PodsEvictedSuccess being incremented when maxPodsToEvictPerNode has been reached? Should this be TotalPodsSkipped?

Contributor Author

@pravarag pravarag Dec 9, 2021

Thanks for identifying this, I'll move PodsEvictedSuccess to a different and more appropriate place. I might have misinterpreted its usage here.

Edit: @a7i I've moved PodsEvictedSuccess to line 142. Let me know if this looks correct. Also, do you have any suggestions on ways I can test this?

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 11, 2021
@pravarag pravarag force-pushed the podevictor-statistics branch 2 times, most recently from 55ef3c4 to 8b5bb6a Compare December 12, 2021 14:36
@pravarag
Contributor Author

@a7i @damemi I have one doubt regarding the use of the TotalPodsSkipped metric. For its use cases, I could think of two possible situations where we might want to skip a pod and then increment the respective metric:

  1. In these two cases, when we have reached the maximum number of pods per node or namespace: L120-125
  2. After this line, if there is any error while trying to evict a pod.

Any suggestions on where else I can use the TotalPodsSkipped metric would be helpful.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 20, 2022
@damemi
Contributor

damemi commented Jan 20, 2022

@pravarag I think this looks good, just needs a rebase
/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: damemi, pravarag

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 20, 2022
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 21, 2022
@pravarag
Contributor Author

Thanks @damemi , I've rebased the branch as suggested.

Contributor

@ingvagabund ingvagabund left a comment

I wonder what the benefit of the new TotalPodsSkipped, PodsEvictedSuccess, and PodsEvictedFailed metrics is, besides simpler expressions when querying the metrics? Currently they are a subset of what is already provided by PodsEvicted.

When running the descheduler with metrics enabled, the PodsEvicted metric gets continuously incremented over time as pods are regularly evicted. In order to provide the number of pods evicted/skipped/... in a single run, the pod evictor needs to be signaled that a new run/descheduling cycle has started.

if pe.metricsEnabled {
metrics.PodsEvicted.With(map[string]string{"result": "maximum number of pods per node reached", "strategy": strategy, "namespace": pod.Namespace, "node": node.Name}).Inc()
}
metrics.PodsEvicted.With(map[string]string{"result": "maximum number of pods per node reached", "strategy": strategy, "namespace": pod.Namespace, "node": node.Name}).Inc()
Contributor

Would you mind putting all the places where your new metrics are populated under the same condition? The PodEvictor was recently extended with the pe.metricsEnabled condition to populate the metrics only when the metrics server is running.
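
A minimal sketch of that guard around one of the new increments (label values taken from this PR; it assumes the same pe.metricsEnabled flag already used for PodsEvicted):

    // Sketch: populate the new counter only when the metrics server is running,
    // mirroring the existing pe.metricsEnabled guard on PodsEvicted.
    if pe.metricsEnabled {
        metrics.TotalPodsSkipped.With(map[string]string{
            "result":    "maximum number of pods per node reached",
            "namespace": pod.Namespace,
        }).Inc()
    }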

&metrics.CounterOpts{
Subsystem: DeschedulerSubsystem,
Name: "total_pods_skipped",
Help: "Total pods skipped for a single run",
Contributor

I wonder what a single run means in this context? Is it meant as a single descheduling cycle? The TotalPodsSkipped metric is used only inside the PodEvictor, where there's currently no way to distinguish when a single run finished and a new one started.

Contributor

Maybe we could introduce a copy of these metrics named something like PodsSkippedLastRun, for example. It would require some refactoring of the PodEvictor: maybe pass it a timestamp at the start of a new run, and if that timestamp differs from the last one it remembers, reset the counters for those metrics. But I think knowing the totals vs. individual runs would be helpful.

I suppose you could get the total by just summing the individual runs, in which case we wouldn't need the Total.. metrics anymore. Either way, I think this is a good start and the refactors could come as a follow-up. Wdyt?
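
A very rough sketch of that idea, assuming a hypothetical PodsSkippedLastRun counter and a lastRun field on PodEvictor (neither exists in this PR); the reset uses the CounterVec Reset method provided by the underlying Prometheus client library:

    // Hypothetical sketch: reset the per-run counter whenever a new descheduling
    // cycle starts, identified by a timestamp passed in by the caller.
    func (pe *PodEvictor) startRun(ts time.Time) {
        if !ts.Equal(pe.lastRun) {
            pe.lastRun = ts
            metrics.PodsSkippedLastRun.Reset()
        }
    }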

Contributor Author

By a single run I meant a single descheduling cycle (somewhat mentioned in this comment: #503 (comment)). So I just wanted to check whether it's still a good idea to introduce this metric or not, @damemi @ingvagabund.

@@ -184,6 +184,7 @@ func evictPod(ctx context.Context, client clientset.Interface, pod *v1.Pod, poli
err := client.PolicyV1beta1().Evictions(eviction.Namespace).Evict(ctx, eviction)

if apierrors.IsTooManyRequests(err) {
metrics.TotalPodsSkipped.With(map[string]string{"result": "total pods skipped so far", "namespace": pod.Namespace}).Inc()
Contributor

Setting the result to total pods skipped so far is redundant here given the metric's name is TotalPodsSkipped. I wonder if it would make more sense to set the result to pod skipped due to TooManyRequests error?
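
For example, the label in the hunk above could instead describe why the pod was skipped (a sketch of this suggestion, not final wording):

    // Sketch: make the result label describe the skip reason rather than
    // restate the metric name.
    metrics.TotalPodsSkipped.With(map[string]string{
        "result":    "pod skipped due to TooManyRequests error",
        "namespace": pod.Namespace,
    }).Inc()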

return false, nil
}

metrics.PodsEvictedSuccess.With(map[string]string{"strategy": strategy, "namespace": pod.Namespace}).Inc()
Contributor

PodsEvictedSuccess is a special case of PodsEvicted (with result label set to success and node label ignored).

metrics.PodsEvicted.With(map[string]string{"result": "error", "strategy": strategy, "namespace": pod.Namespace, "node": node.Name}).Inc()
}
metrics.PodsEvicted.With(map[string]string{"result": "error", "strategy": strategy, "namespace": pod.Namespace, "node": node.Name}).Inc()
metrics.PodsEvictedFailed.With(map[string]string{"strategy": strategy, "namespace": pod.Namespace}).Inc()
Contributor

PodsEvictedFailed is a special case of PodsEvicted (with the result label set to error and the node label ignored). I wonder what the benefit of creating this metric is compared to PodsEvicted?

Contributor Author

So this was again aimed at capturing failed-eviction metrics for a single run. Are you suggesting we rely on the PodsEvicted metric alone instead of PodsEvictedFailed?

metrics.PodsEvicted.With(map[string]string{"result": "maximum number of pods per namespace reached", "strategy": strategy, "namespace": pod.Namespace, "node": node.Name}).Inc()
}
metrics.PodsEvicted.With(map[string]string{"result": "maximum number of pods per namespace reached", "strategy": strategy, "namespace": pod.Namespace, "node": node.Name}).Inc()
metrics.TotalPodsSkipped.With(map[string]string{"result": "total pods skipped so far", "namespace": pod.Namespace}).Inc()
Contributor

TotalPodsSkipped called here is a special case of PodsEvicted (with result label set to maximum number of pods per namespace reached and node label ignored).

return false, fmt.Errorf("Maximum number %v of evicted pods per %q namespace reached", *pe.maxPodsToEvictPerNamespace, pod.Namespace)
}

err := evictPod(ctx, pe.client, pod, pe.policyGroupVersion)
// increment TotalPodsEvicted
metrics.TotalPodsEvicted.With(map[string]string{"result": "total pods evicted so far", "namespace": pod.Namespace}).Inc()
Contributor

At this point it is unknown if a pod was evicted or not. Thus, incrementing the TotalPodsEvicted does not reflect the actual pod eviction. The metric currently captures how many times the evictPod function was invoked.
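
Incrementing only after the eviction call reports success would tie the counter to actual evictions rather than attempts; a rough sketch (the result value shown is illustrative, and the error path is elided):

    // Sketch: count an eviction only once evictPod returns without error.
    err := evictPod(ctx, pe.client, pod, pe.policyGroupVersion)
    if err == nil && pe.metricsEnabled {
        metrics.TotalPodsEvicted.With(map[string]string{"result": "pod evicted", "namespace": pod.Namespace}).Inc()
    }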

Contributor Author

I see, I'll update this part. Thanks!

@ingvagabund
Contributor

ingvagabund commented Jan 21, 2022

Within each descheduling cycle a new pod evictor instance is created, which eliminates any way of telling the pod evictor that "a run has ended". The PodsEvicted metric is shared across cycles, so there's no distinction of which increments belong to which descheduling cycle. On the other hand, when the descheduling cycle is set to e.g. 1h, there's no need for such a distinction, as one can either see the steps in a graph or use a delta{...[1h]} expression to see the increments per cycle.

@jklaw90
Contributor

jklaw90 commented Feb 2, 2022

Hey @pravarag, I was looking into similar metrics around evictions. Are you still planning on finishing this PR, or do you need any help?

@pravarag
Contributor Author

pravarag commented Feb 2, 2022

Hey @jklaw90 , I'm working on addressing the latest review comments for this PR. If you have any suggestions/comments, please feel free to share on this PR and I'll definitely check those 🙂 .

@pravarag
Contributor Author

pravarag commented Feb 2, 2022

> Within each descheduling cycle a new pod evictor instance is created, which eliminates any way of telling the pod evictor that "a run has ended". The PodsEvicted metric is shared across cycles, so there's no distinction of which increments belong to which descheduling cycle. On the other hand, when the descheduling cycle is set to e.g. 1h, there's no need for such a distinction, as one can either see the steps in a graph or use a delta{...[1h]} expression to see the increments per cycle.

@ingvagabund @damemi thanks for the detailed review. A few questions I have now:

  1. Do we still want to capture metrics for a single run?
  2. Are there any other metrics that will need to be added or removed as part of this PR?

@pravarag
Contributor Author

@damemi @ingvagabund are we planning to merge these changes as part of release 1.24? If so, I would like your suggestions on moving this forward with all the changes required :)

@damemi
Contributor

damemi commented Feb 23, 2022

@pravarag I would like to merge this, and sorry it's taken so long.

I think the only thing still being discussed was the fact that this is reporting cumulative pod evictions rather than single-run as implied. @ingvagabund is that all that was left to sort out? I wonder if maybe a different metric type, like a histogram, could solve this and provide easier access to single-run metrics

@pravarag
Contributor Author

> @pravarag I would like to merge this, and sorry it's taken so long.
>
> I think the only thing still being discussed was the fact that this is reporting cumulative pod evictions rather than single-run as implied. @ingvagabund is that all that was left to sort out? I wonder if maybe a different metric type, like a histogram, could solve this and provide easier access to single-run metrics

I can give it a try by implementing a histogram-type metric for a single run.

@ingvagabund
Contributor

I wonder if we still need to capture the per-single-run metrics, given we can use a delta{...[1h]} query in the Prometheus UI. I wonder if there's a metric type which would allow capturing sequences in time. A histogram is another cumulative metric type, so it will not help here.

@pravarag I am sorry. I don't think there's anything else left to do for the moment with regard to capturing metrics for a single run; rather, it's a matter of going back to the drawing board and identifying new metrics which we might add to the code.

@pravarag
Contributor Author

Thanks @ingvagabund @damemi for your reviews on this. Since, as mentioned above, we may not need to capture metrics for a single run, I guess I can close this PR then? I've already put some effort into it, so I don't want to leave it unfinished; this has been a good learning experience for me as well. Kindly let me know how we can rethink implementing newer metrics as part of this issue; I'm definitely open to a discussion around it, be it on this PR itself or back on the original issue :)

@damemi
Contributor

damemi commented Apr 5, 2022

@pravarag sorry about this. It sounds like we've come back around to not needing single-run metrics. I think per-strategy metrics could be a good option, though. What do you think?

@pravarag
Contributor Author

pravarag commented Apr 7, 2022

> @pravarag sorry about this. It sounds like we've come back around to not needing single-run metrics. I think per-strategy metrics could be a good option, though. What do you think?

I think that's a good idea @damemi @ingvagabund, and I can work on implementing those. But I wanted to check how to approach it: do I need to open a proposal or a new discussion around it, maybe a new issue? Or are we good to continue in issue #503 itself?

@pravarag
Contributor Author

On the other hand, I was thinking I could just close this PR and open a fresh PR with the new (per-strategy) changes 🤔, and we can continue in that same issue.

@ingvagabund
Contributor

The new descheduling framework will have more options for introducing new metrics.

@pravarag
Contributor Author

@ingvagabund by options, do you mean the newer metrics will be added as part of the new descheduling framework? If so, then we can probably close this PR and the issue related to it?

@ingvagabund
Contributor

Opening a new PR sounds reasonable so we can start fresh. The issue can stay open though.

@pravarag
Contributor Author

Closing this PR based on the above discussion; I will open a fresh one with the newer changes.
