oci source for prometheus-community charts doesn't work from flux #3313
Comments
I can confirm the issue on a fresh cluster.
There is no such version as 41.7.4: https://github.com/prometheus-community/helm-charts/pkgs/container/charts%2Fkube-prometheus-stack
Indeed, they only publish new releases to the OCI registry, and it looks like the process failed on the commit for 41.7.4. As I also use the HelmRepository for the other charts that don't have a new release yet (the exporters), I decided, for now, to revert to the official repository (and disable verification), but I confirm that reverting to 41.7.3 would also fix the deployment. Note: I say "official repository" because it's the one still documented in their pages and README. Extra notes:
It looks like some changes that went in since the failed release may have mitigated the 403 error. At least, there has been a test in the prometheus-community repo since the failed build, with a different chart tag, and it passed (3 days ago), so I suspect the next tag will succeed. I have gone through the issue reports you mentioned, @sereinity, and that's the conclusion I came to.

Maybe off-topic, maybe not: is there a way to publish a new GHA job that then runs for all tags, including old ones that haven't run the job yet? (I'm trying to think of ways to publish all back versions of the charts once we are sure we have the job right.) I think we'd want to see all of the charts reliably published as OCI going forward, with a method to publish those that were skipped due to failures or that came before the update. Is this a blocker before we document this and make it the preferred distribution?

I'd probably settle for a way to publish the missing chart in OCI, hopefully without removing or recreating any tags, but ideally we'd be publishing old versions backwards from the date when OCI first became available. This might be a challenge. The benefit would be that people can switch their repos at any time without worrying about whether they are on a late enough version of a given chart, or whether they need to upgrade Prom first. Ideally it's all of the chart versions, scripted back to a given threshold date, or just literally all of them, as far back as will work (with a version of Helm that had OCI support, I guess?)

Maybe it's worth adding a GHA job with a workflow dispatch trigger that publishes only the OCI chart for a given tag on demand, so we can patch up issues like this one as they come along (hopefully not very often). I don't know how much value there is in going back and republishing all the old charts, but I wouldn't rule it out, as many people will probably see that as a blocker to adoption (until/unless the old HTTP-based chart repo is axed and no longer supported for new chart versions).
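To make the backfill idea a bit more concrete, here is a rough Python sketch of the "which versions are still missing from OCI" step. It is not the maintainers' automation. The function names, the optional `oldest` threshold, and the sample version lists are all made up for illustration, and it assumes plain MAJOR.MINOR.PATCH version strings. Fetching the two lists (from the HTTP repo index and from the registry) is left out.

```python
def parse_version(version):
    """Turn "41.7.4" into (41, 7, 4).

    Assumption: plain MAJOR.MINOR.PATCH strings, no pre-release suffix.
    """
    return tuple(int(part) for part in version.split("."))


def missing_from_oci(index_versions, oci_tags, oldest=None):
    """Return chart versions present in the HTTP index but absent from OCI.

    `oldest` is a hypothetical threshold: versions below it are skipped,
    matching the "scripted back to a given threshold" idea in the thread.
    """
    published = set(oci_tags)
    floor = parse_version(oldest) if oldest else None
    missing = [
        v
        for v in set(index_versions)
        if v not in published
        and (floor is None or parse_version(v) >= floor)
    ]
    # Oldest first, so a backfill job would push in release order.
    return sorted(missing, key=parse_version)


if __name__ == "__main__":
    # Illustrative lists only, not the real state of the registry.
    index = ["41.7.2", "41.7.3", "41.7.4"]
    tags = ["41.7.3"]
    print(missing_from_oci(index, tags))
```

The result of a comparison like this could feed either an on-demand job for a single tag or a scheduled job that works through the whole list.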
Yes, this was a temporary error in the new CI for https://github.com/prometheus-community/helm-charts. It should be fixed now, but yes, it makes sense that a package version was missing. I'll manually push chart version 41.7.4 today. Note that I still need to push the previous chart history to OCI; currently it holds only new chart versions (minus the ones missing due to the recent CI errors, which are now fixed).
@scottrigby the exporters seem to have no OCI charts, can you push all charts from that repo to GHCR please?
@stefanprodan ok yes will do 👍
Current versions of all Prometheus community charts are pushed to ghcr as signed, public packages ✅ I still need to push the package history. There are thousands of past versions, and when pushing locally there is an OIDC prompt, so I think I may put my automation into a GitHub action that runs on a cron schedule to backfill OCI.
Thank you @scottrigby, I'm really happy that we came to this solution. I'm closing the issue, as it is no longer related to this repository and the main issue is gone.
Describe the bug
Since #3303 was merged, flux can't handle charts coming from prometheus-community.
Here is the status of the charts from this registry:
flux -n monitoring get sources chart
NAME                                        REVISION  SUSPENDED  READY  MESSAGE
elastic-prometheus-elasticsearch-exporter   4.14.0    False      False  chart pull error: failed to download chart for remote reference: failed to authorize: failed to fetch anonymous token: unexpected status: 403 Forbidden
monitoring-kube-prometheus-stack            41.7.3    False      False  chart verification error: failed to verify oci://ghcr.io/prometheus-community/charts/kube-prometheus-stack:41.7.4: GET https://ghcr.io/v2/prometheus-community/charts/kube-prometheus-stack/manifests/41.7.4: MANIFEST_UNKNOWN: manifest unknown
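For anyone reading the second error: the `GET` URL in it is just the `oci://` chart reference rewritten into the registry's manifest endpoint. A minimal Python sketch of that mapping follows. The function name is hypothetical, and it only handles tagged references of the shape shown in the message above.

```python
from urllib.parse import urlsplit


def manifest_url(chart_ref):
    """Map a tagged oci:// chart reference to its registry manifest URL.

    Only the `oci://host/repository:tag` shape seen in the error message
    is handled; digest references are out of scope for this sketch.
    """
    parts = urlsplit(chart_ref)
    if parts.scheme != "oci" or not parts.netloc:
        raise ValueError(f"not an oci:// reference: {chart_ref!r}")
    # The tag follows the last ':' in the path,
    # e.g. ".../kube-prometheus-stack:41.7.4".
    repository, sep, tag = parts.path.lstrip("/").rpartition(":")
    if not sep or not repository or not tag:
        raise ValueError(f"reference has no tag: {chart_ref!r}")
    return f"https://{parts.netloc}/v2/{repository}/manifests/{tag}"


if __name__ == "__main__":
    ref = "oci://ghcr.io/prometheus-community/charts/kube-prometheus-stack:41.7.4"
    print(manifest_url(ref))
```

A `MANIFEST_UNKNOWN` response on that URL means the registry has no manifest under that tag, which fits a chart version that was never pushed.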
I tried to suspend, reconcile, and resume those resources (and the associated helmreleases) without any success.
I haven't yet tried from a fresh installation of flux; I will probably do so within the day or week, as I need to provision a fresh cluster.
Steps to reproduce
Expected behavior
The new chart is pulled, verified and applied
Screenshots and recordings
No response
OS / Distro
Arch Linux
Flux version
v0.36.0
Flux check
► checking prerequisites
✔ Kubernetes 1.24.6-gke.1500 >=1.20.6-0
► checking controllers
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.26.0
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v0.30.0
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v0.28.0
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v0.31.0
► checking crds
✔ alerts.notification.toolkit.fluxcd.io/v1beta1
✔ buckets.source.toolkit.fluxcd.io/v1beta2
✔ gitrepositories.source.toolkit.fluxcd.io/v1beta2
✔ helmcharts.source.toolkit.fluxcd.io/v1beta2
✔ helmreleases.helm.toolkit.fluxcd.io/v2beta1
✔ helmrepositories.source.toolkit.fluxcd.io/v1beta2
✔ kustomizations.kustomize.toolkit.fluxcd.io/v1beta2
✔ ocirepositories.source.toolkit.fluxcd.io/v1beta2
✔ providers.notification.toolkit.fluxcd.io/v1beta1
✔ receivers.notification.toolkit.fluxcd.io/v1beta1
✔ all checks passed
Git provider
No response
Container Registry provider
No response
Additional context
No response