Split PodMonitor into separate objects #2193

Closed
wants to merge 1 commit into from

Nalum commented Dec 8, 2021

This relates to issues #2192 and #2150

This splits the initial PodMonitor manifest into multiple objects, because matching breaks when selector.matchExpressions[*].values has more than one value. Let me know if there is more to this.

Please let me know if you'd prefer this broken out into multiple files. I'll be able to test this change out tomorrow.
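
As a rough illustration of the split described above (a sketch, not the exact manifest from this PR): instead of one PodMonitor whose In expression lists every controller, each controller gets its own PodMonitor with a single-element values list. The object names, the flux-system namespace, and the http-prom port name are assumptions made for illustration, and only two of the controllers are shown.

```yaml
# Sketch only: names, namespace, and port are assumed, not taken from the PR diff.
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: kustomize-controller          # hypothetical object name
  namespace: flux-system              # assumed namespace
spec:
  namespaceSelector:
    matchNames:
      - flux-system
  selector:
    matchExpressions:
      - key: app
        operator: In
        values:
          - kustomize-controller      # exactly one value per object
  podMetricsEndpoints:
    - port: http-prom                 # assumed metrics port name
---
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: image-reflector-controller    # hypothetical object name
  namespace: flux-system              # assumed namespace
spec:
  namespaceSelector:
    matchNames:
      - flux-system
  selector:
    matchExpressions:
      - key: app
        operator: In
        values:
          - image-reflector-controller
  podMetricsEndpoints:
    - port: http-prom                 # assumed metrics port name
```

The remaining controllers would follow the same pattern, one object each.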

Signed-off-by: Luke Mallon (Nalum) <luke.mallon@weave.works>

kingdonb commented Dec 8, 2021

> it doesn't like when selector.matchExpressions[*].values has more than 1 value

This seems like a bug, and I'm pretty sure this worked before. I went looking through release notes:
https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack#from-22x-to-23x

and didn't find anything that jumped out at me as explaining this. However, we do have this release from the guide configured without any version number:

https://github.com/fluxcd/flux2/blob/main/manifests/monitoring/kube-prometheus-stack/release.yaml

This explains how it broke! (The stack would be upgraded from one version to the next without any commit or input from our side. Whatever change broke it would have been silently applied on the cluster. In fact, on my 5-day-old cluster I see at least 9 upgrades, which have overflowed the Helm release revision history, so potentially breaking upgrades may have landed very recently.)

I suspect if we reverted kube-prometheus-stack to an earlier version, we would find it still works there.

It's up to you whether we want to figure out where that initial problem came from. To me, if this selector can't match multiple values in the array, that's a bug and it should be addressed. I don't want to say "not our bug" (but for the time being, we have to do something, as our guide is broken!)

selector:
  matchExpressions:
    - key: app
      operator: In
      values:
        - kustomize-controller

I tested your change and it resolves the issue for now 👍


kingdonb commented Dec 8, 2021

I definitely see 6 controllers with the original config on this version: (Edit: this note must have been wrong, since it contradicts the version ordering and the other findings here. I went back and re-tested 21.0.5 and found that it definitely suffers from the 20.0.0+ issue.)

XXX $ ...
XXX kube-prometheus-stack	21.0.5

and I did not see the 6 controllers (only image-reflector-controller, which happens to be the last array element) on this version:

23.1.2

It should take me a couple of minutes to bisect and find out exactly what version broke it!


Nalum commented Dec 8, 2021

Good work 👍

I feel it's probably better to pin the version and then monitor for it to be fixed upstream. I can spend a bit of time on that tomorrow and look for/create an issue for it on the upstream project.


kingdonb commented Dec 8, 2021

It looks like the problem was introduced in the 20.0.0 chart release.

Everything is fine on 19.2.3 and on 19.3.0. The next release is 20.0.0, and it and all subsequent releases have this issue.

Just to confirm: I re-tested 23.1.2 (released just two hours ago) after establishing all of the above, and it also has this issue.

I'll submit a (fluxcd/website) docs update to pin the version at 19.3.0 (edited from 19.2.3) for the time being, since I agree that is the best way for now!
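
For readers following along, pinning the chart could look roughly like the HelmRelease below. This is a sketch under assumptions rather than the actual docs change: the apiVersion, the monitoring namespace, the prometheus-community HelmRepository name, and the intervals are all assumed. The version 19.3.0 is used because this thread reports it as working (19.2.3 was also reported fine).

```yaml
# Sketch only: apiVersion, names, namespace, and intervals are assumptions.
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: kube-prometheus-stack
  namespace: monitoring               # assumed namespace
spec:
  interval: 5m                        # assumed reconcile interval
  chart:
    spec:
      chart: kube-prometheus-stack
      version: "19.3.0"               # exact pin; without it the newest chart is picked up automatically
      sourceRef:
        kind: HelmRepository
        name: prometheus-community    # assumed repository object name
      interval: 60m                   # assumed chart check interval
```

An exact version stops the silent upgrades described earlier in the thread. A range such as "19.x" would still allow patch and minor updates within the 19 series.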

kingdonb pushed a commit to kingdonb/flux2 that referenced this pull request Dec 8, 2021
Something in kube-prometheus-stack 20.0.0 has broken our example.
See fluxcd#2193 for more information.

Signed-off-by: Kingdon Barrett <kingdon@weave.works>

kingdonb commented Dec 8, 2021

@Nalum Nalum closed this Dec 9, 2021
@Nalum Nalum deleted the podMonitor branch December 9, 2021 08:50

grafjo commented Dec 12, 2021

@kingdonb this issue is fixed in release kube-prometheus-stack-23.2.0, and the dashboards have data again.

I opened PR #2208.

grafjo pushed a commit to grafjo/flux2 that referenced this pull request Dec 12, 2021
this release contains the prometheus operator in version 0.52.1

see fluxcd#2192
fluxcd#2193 for issues

Signed-off-by: Johannes Graf <graf@synyx.de>
souleb pushed a commit to souleb/flux2 that referenced this pull request Jul 10, 2023
Something in kube-prometheus-stack 20.0.0 has broken our example.
See fluxcd#2193 for more information.

Signed-off-by: Kingdon Barrett <kingdon@weave.works>
souleb pushed a commit to souleb/flux2 that referenced this pull request Jul 10, 2023
this release contains the prometheus operator in version 0.52.1

see fluxcd#2192
fluxcd#2193 for issues

Signed-off-by: Johannes Graf <graf@synyx.de>