fix(pod autosharding): transition from labelselector to fieldselector #2347

@pkoutsovasilis pkoutsovasilis commented Mar 21, 2024

What this PR does / why we need it:

This PR makes a small change to pod autosharding mode: it replaces the LabelSelector with a FieldSelector. Currently, in this mode, we detect the StatefulSet that owns the pod, extract its labels, and build a LabelSelector from them, which is then passed to NewFilteredListWatchFromClient. However, the labels of that StatefulSet can change for any reason, e.g. an operator may inject a hash of the whole StatefulSet into its labels to decide during reconciliation whether it has changed. Because the pods are not restarted in that case (k8s design), no more events arrive and the shards are not updated properly. Instead of relying on a LabelSelector, this PR uses a FieldSelector that targets the owner StatefulSet by name, so updates are always received. This is consistent with the existing if ss.Name != statefulSetName check in AddFunc and UpdateFunc.
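The failure mode described above can be illustrated with a small, standard-library-only sketch. It is not the kube-state-metrics code and does not use client-go; the `statefulSet` type and the `matchesLabels` and `matchesName` helpers are hypothetical stand-ins that only mimic equality-based label selection and a metadata.name field selector.

```go
package main

import "fmt"

// statefulSet is a hypothetical, pared-down stand-in for the real
// StatefulSet type; only the fields needed for this sketch are modelled.
type statefulSet struct {
	Name   string
	Labels map[string]string
}

// matchesLabels reports whether every key/value pair in selector is present
// in the object's labels (equality-based label selection).
func matchesLabels(ss statefulSet, selector map[string]string) bool {
	for k, v := range selector {
		if got, ok := ss.Labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// matchesName mimics a field selector on metadata.name.
func matchesName(ss statefulSet, name string) bool {
	return ss.Name == name
}

func main() {
	ss := statefulSet{
		Name:   "kube-state-metrics",
		Labels: map[string]string{"app": "kube-state-metrics", "hash": "abc123"},
	}

	// Selector built once at start-up from the labels observed at that time.
	labelSelector := map[string]string{"app": "kube-state-metrics", "hash": "abc123"}

	// An operator rewrites a label; the running pods are not restarted,
	// so the selector above is never rebuilt.
	ss.Labels["hash"] = "def456"

	fmt.Println("label selector still matches:", matchesLabels(ss, labelSelector))
	fmt.Println("name selector still matches:", matchesName(ss, "kube-state-metrics"))
}
```

After the label change, the first check reports false while the name-based check still reports true, which is why a watch filtered by name keeps receiving events.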

How does this change affect the cardinality of KSM: (increases, decreases or does not change cardinality)

  • does not change cardinality

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #2355

linux-foundation-easycla bot commented Mar 21, 2024

CLA Signed

  • ✅login: pkoutsovasilis / (097ae84)

The committers listed above are authorized under a signed CLA.

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Mar 21, 2024
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If kube-state-metrics contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot
Contributor

Welcome @pkoutsovasilis!

It looks like this is your first PR to kubernetes/kube-state-metrics 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/kube-state-metrics has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Mar 21, 2024
@pkoutsovasilis pkoutsovasilis changed the title [pod autosharding] transition from labelselector to fieldselector feat: [pod autosharding] transition from labelselector to fieldselector Mar 21, 2024
@pkoutsovasilis
Contributor Author

pkoutsovasilis commented Mar 26, 2024

@dgrisonnet could you please triage this PR?

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Mar 29, 2024
@CatherineF-dev
Contributor

CatherineF-dev commented Apr 18, 2024

/lgtm

  1. metadata.name is more stable than labels, though I am not sure why labels were used in the first place.

  2. Most k/k code uses metadata.name instead of labels: https://github.com/search?q=repo%3Akubernetes%2Fkubernetes+OneTermEqualSelector%28%22metadata.name%22&type=code&p=1

  3. Tests were run to verify that it works.
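For readers unfamiliar with field selectors, the sketch below shows roughly how a metadata.name selector appears on a Kubernetes list/watch request, as the `fieldSelector` query parameter. It uses only the Go standard library; the `statefulSetWatchPath` helper and the `monitoring` namespace are assumptions made for illustration, not part of this PR or of client-go.

```go
package main

import (
	"fmt"
	"net/url"
)

// statefulSetWatchPath is a hypothetical helper that builds the path and
// query string of a watch request for one StatefulSet selected by name.
func statefulSetWatchPath(namespace, name string) string {
	q := url.Values{}
	// Field selector restricting the watch to objects with this exact name.
	q.Set("fieldSelector", "metadata.name="+name)
	q.Set("watch", "true")
	return "/apis/apps/v1/namespaces/" + url.PathEscape(namespace) +
		"/statefulsets?" + q.Encode()
}

func main() {
	// The namespace and name here are illustrative values only.
	fmt.Println(statefulSetWatchPath("monitoring", "kube-state-metrics"))
}
```

Because the object's name cannot change for the lifetime of the StatefulSet, a request filtered this way keeps matching regardless of how its labels are edited.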

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 18, 2024
@pkoutsovasilis pkoutsovasilis changed the title feat: [pod autosharding] transition from labelselector to fieldselector fix(pod autosharding): transition from labelselector to fieldselector Apr 22, 2024
@pkoutsovasilis
Contributor Author

@CatherineF-dev @dgrisonnet just checking: does this PR need anything else? Could it make it into the next release of kube-state-metrics?

@LaikaN57 LaikaN57 mentioned this pull request May 1, 2024
@CatherineF-dev
Contributor

cc @dgrisonnet to approve

@diranged

I'm not sure what the status is here, but this is a really huge pain point for us right now. Any time we ship a release that updates labels on any deployments, statefulsets, etc., our kube-state-metrics pods get into a bad state and start sending invalid data, which invariably leads to our ops teams getting paged with incorrect alerts about pods being in bad states.

@CatherineF-dev
Contributor

I'm asking for an approval for this PR. I only have LGTM permission so far.

@rexagod
Member

rexagod commented Jun 4, 2024

/approve

@CatherineF-dev feel free to send a PR to add yourself to approvers.

Sorry, I misspoke. We can still do this, but it needs to go through the other approvers internally first, and it is really not something for me to pass judgement on individually.

Nonetheless, thank you for all the reviews!

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: CatherineF-dev, pkoutsovasilis, rexagod

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 4, 2024
@k8s-ci-robot k8s-ci-robot merged commit f28abc9 into kubernetes:main Jun 4, 2024
17 of 18 checks passed

Successfully merging this pull request may close these issues.

kube-state-metrics with autosharding stops updating shards when the labels of the statefulset are updated
5 participants