CRD metrics are extremely brittle #1847

Closed
logicalhan opened this issue Oct 4, 2022 · 2 comments · Fixed by #1850
Labels: help wanted, kind/bug

Comments

@logicalhan
Member

What happened:

CRD metrics are currently extremely brittle because of how they are implemented. The name of the CRD is interpolated into the name of the metric, which makes it impossible (or at least very annoying) to aggregate a CRD's metrics across different versions.

What you expected to happen:

The name, version, and group of the CRD should each be a value in a metric label.
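
To make the contrast concrete, here is a minimal sketch of the two naming schemes. It is illustrative only: the `kube_` and `kube_crd_` prefixes, the label names, and the `myteam.io/v1 Foo` resource are assumptions chosen for the example, not necessarily what kube-state-metrics emits.

```python
# Illustrative sketch: prefixes, label names, and the example GVK are assumptions.

def interpolated_name(group, version, kind, metric):
    """Behaviour described in this issue: the GVK is baked into the metric name."""
    prefix = "_".join([group.replace(".", "_"), version, kind])
    return f"kube_{prefix}_{metric}"


def labelled_series(group, version, kind, metric):
    """Proposed behaviour: a stable metric name, with the GVK carried as labels."""
    labels = {"group": group, "version": version, "kind": kind}
    rendered = ",".join(f'{key}="{value}"' for key, value in labels.items())
    return f"kube_crd_{metric}{{{rendered}}}"


if __name__ == "__main__":
    for version in ("v1", "v2"):
        print(interpolated_name("myteam.io", version, "Foo", "uptime"))
        print(labelled_series("myteam.io", version, "Foo", "uptime"))
```

With the first scheme, every API version produces a differently named metric, so a query has to enumerate the names. With the second, the metric name stays fixed and only the `version` label changes, so a single query can aggregate or filter by label.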

How to reproduce it (as minimally and precisely as possible):

https://github.com/kubernetes/kube-state-metrics/blob/master/docs/customresourcestate-metrics.md

@logicalhan added the kind/bug label on Oct 4, 2022
@dgrisonnet
Member

I agree with that sentiment. Prefixing the metrics with the GVK means that it is impossible today to build alerts or dashboards that work across multiple versions of a CRD. Having to update everything that depends on the metrics whenever an API is updated is tedious and risky. A further UX improvement would be to match all available versions of an API when none is specified, so that users don't necessarily have to update their ksm config after every API update.
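
The version-matching suggestion above could look roughly like the following sketch. The function name `versions_to_watch` and its parameters are hypothetical and do not correspond to any existing ksm config option; the sketch only illustrates the "match everything when no version is given" rule.

```python
# Hypothetical helper: not part of kube-state-metrics, names are assumptions.

def versions_to_watch(served_versions, configured_version=None):
    """Return the API versions to collect metrics for.

    If the config names no version, fall back to every served version,
    so a newly added API version needs no config change.
    """
    if configured_version is None:
        return list(served_versions)
    if configured_version not in served_versions:
        raise ValueError(f"version {configured_version!r} is not served")
    return [configured_version]


if __name__ == "__main__":
    print(versions_to_watch(["v1alpha1", "v1"]))
    print(versions_to_watch(["v1alpha1", "v1"], "v1"))
```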

Today, we have a way to override the prefix of the metric: https://github.com/kubernetes/kube-state-metrics/blob/master/docs/customresourcestate-metrics.md#naming. As far as I know, though, this would still not add the GVK labels, so metrics from different versions of the API would be impossible to tell apart.
In terms of usability, I also don't think that defaulting to a GVK prefix is the right choice, for the reason mentioned in this issue, so I would be in favor of changing the existing implementation as suggested above.
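
A small sketch of why an overridden prefix without GVK labels is not enough: two versions of the same resource end up under one series identity. The metric name `myteam_foo_uptime`, the labels, and the sample values are made up for illustration.

```python
# Illustrative sketch: metric name, labels, and sample values are assumptions.

def series_key(name, labels):
    """A series is identified by its metric name plus its full label set."""
    return (name, tuple(sorted(labels.items())))


samples = [("v1", 3.0), ("v2", 5.0)]  # (API version, observed value)

without_gvk_labels = {}
with_gvk_labels = {}
for version, value in samples:
    # Overridden prefix only: both versions map to the same key and collide.
    without_gvk_labels[series_key("myteam_foo_uptime", {"name": "foo"})] = value
    # Same metric name plus a version label: the series stay distinct.
    with_gvk_labels[
        series_key("myteam_foo_uptime", {"name": "foo", "version": version})
    ] = value

if __name__ == "__main__":
    print(len(without_gvk_labels), len(with_gvk_labels))
```

Keeping the name stable and moving the version into a label gives both properties at once: one name to query, and distinct series per version.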

@dgrisonnet added the help wanted label on Oct 6, 2022
@rexagod
Member

rexagod commented Oct 7, 2022

/assign
