Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate reconcile error messages and certificate log spam #623

Closed
andreasgerstmayr opened this issue Sep 19, 2023 · 2 comments · Fixed by #679
Closed

Investigate reconcile error messages and certificate log spam #623

andreasgerstmayr opened this issue Sep 19, 2023 · 2 comments · Fixed by #679

Comments

@andreasgerstmayr
Copy link
Collaborator

Starting with the example CR from the README:

apiVersion: tempo.grafana.com/v1alpha1
kind: TempoStack
metadata:
  name: simplest
spec:
  storage:
    secret:
      name: minio-test
      type: s3
  storageSize: 1Gi
  resources:
    total:
      limits:
        memory: 2Gi
        cpu: 2000m
  template:
    queryFrontend:
      jaegerQuery:
        enabled: true

If I e.g. increase only the CPU limit, I see the following logs:

{"level":"info","ts":"2023-09-13T12:39:34.375236946Z","logger":"certrotation-reconcile","msg":"Checking if TempoStack certificates expired","controller":"tempostack","controllerGroup":"tempo.grafana.com","controllerKind":"TempoStack","TempoStack":{"name":"simplest","namespace":"default"},"namespace":"default","name":"simplest","reconcileID":"de402bec-bebd-48ac-aa5c-b0a340d58799","tempo":{"name":"simplest","namespace":"default"},"name":"default/simplest","interval":"12h0m0s"}
{"level":"info","ts":"2023-09-13T12:39:34.37562641Z","logger":"certrotation-reconcile","msg":"Skipping cert rotation, all TempoStack certificates still valid","controller":"tempostack","controllerGroup":"tempo.grafana.com","controllerKind":"TempoStack","TempoStack":{"name":"simplest","namespace":"default"},"namespace":"default","name":"simplest","reconcileID":"de402bec-bebd-48ac-aa5c-b0a340d58799","tempo":{"name":"simplest","namespace":"default"},"name":"default/simplest"}
{"level":"info","ts":"2023-09-13T12:39:34.375676783Z","logger":"tempostack-reconcile","msg":"certificate manifests built","controller":"tempostack","controllerGroup":"tempo.grafana.com","controllerKind":"TempoStack","TempoStack":{"name":"simplest","namespace":"default"},"namespace":"default","name":"simplest","reconcileID":"e75ecdf5-b60d-40dc-8d33-95fbc164cd73","tempo":{"name":"simplest","namespace":"default"},"tempostacks":"default/simplest","event":"createOrRotateCerts","count":8}
{"level":"info","ts":"2023-09-13T12:39:34.509259308Z","logger":"certrotation-reconcile","msg":"Checking if TempoStack certificates expired","controller":"tempostack","controllerGroup":"tempo.grafana.com","controllerKind":"TempoStack","TempoStack":{"name":"simplest","namespace":"default"},"namespace":"default","name":"simplest","reconcileID":"cad4f8e7-8a76-4465-8982-76163cce2d45","tempo":{"name":"simplest","namespace":"default"},"name":"default/simplest","interval":"12h0m0s"}
{"level":"info","ts":"2023-09-13T12:39:34.509647717Z","logger":"tempostack-reconcile","msg":"certificate manifests built","controller":"tempostack","controllerGroup":"tempo.grafana.com","controllerKind":"TempoStack","TempoStack":{"name":"simplest","namespace":"default"},"namespace":"default","name":"simplest","reconcileID":"1c63f09b-c2f2-42c3-810b-e542fa38b73d","tempo":{"name":"simplest","namespace":"default"},"tempostacks":"default/simplest","event":"createOrRotateCerts","count":8}
{"level":"info","ts":"2023-09-13T12:39:34.509890166Z","logger":"certrotation-reconcile","msg":"Skipping cert rotation, all TempoStack certificates still valid","controller":"tempostack","controllerGroup":"tempo.grafana.com","controllerKind":"TempoStack","TempoStack":{"name":"simplest","namespace":"default"},"namespace":"default","name":"simplest","reconcileID":"cad4f8e7-8a76-4465-8982-76163cce2d45","tempo":{"name":"simplest","namespace":"default"},"name":"default/simplest"}
{"level":"error","ts":"2023-09-13T12:39:34.587606545Z","logger":"tempostack-reconcile","msg":"failed to configure resource","controller":"tempostack","controllerGroup":"tempo.grafana.com","controllerKind":"TempoStack","TempoStack":{"name":"simplest","namespace":"default"},"namespace":"default","name":"simplest","reconcileID":"1c63f09b-c2f2-42c3-810b-e542fa38b73d","tempo":{"name":"simplest","namespace":"default"},"object_name":"tempo-simplest-querier","object_kind":"&TypeMeta{Kind:,APIVersion:apps/v1,}","error":"Operation cannot be fulfilled on deployments.apps \"tempo-simplest-querier\": the object has been modified; please apply your changes to the latest version and try again","stacktrace":"github.com/grafana/tempo-operator/controllers/tempo.(*TempoStackReconciler).createOrUpdate\n\t/workspace/controllers/tempo/tempostack_create_or_update.go:169\ngh.neting.cc/grafana/tempo-operator/controllers/tempo.(*TempoStackReconciler).Reconcile\n\t/workspace/controllers/tempo/tempostack_controller.go:115\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:119\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:316\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:227"}
{"level":"error","ts":"2023-09-13T12:39:34.605115755Z","msg":"Reconciler error","controller":"tempostack","controllerGroup":"tempo.grafana.com","controllerKind":"TempoStack","TempoStack":{"name":"simplest","namespace":"default"},"namespace":"default","name":"simplest","reconcileID":"1c63f09b-c2f2-42c3-810b-e542fa38b73d","error":"failed to create objects for TempoStack default/simplest: Operation cannot be fulfilled on deployments.apps \"tempo-simplest-querier\": the object has been modified; please apply your changes to the latest version and try again","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:227"}
{"level":"info","ts":"2023-09-13T12:39:34.605208147Z","logger":"certrotation-reconcile","msg":"Checking if TempoStack certificates expired","controller":"tempostack","controllerGroup":"tempo.grafana.com","controllerKind":"TempoStack","TempoStack":{"name":"simplest","namespace":"default"},"namespace":"default","name":"simplest","reconcileID":"b15c65ba-a7a6-4198-b009-eb373c94af58","tempo":{"name":"simplest","namespace":"default"},"name":"default/simplest","interval":"12h0m0s"}
{"level":"info","ts":"2023-09-13T12:39:34.605567402Z","logger":"certrotation-reconcile","msg":"Skipping cert rotation, all TempoStack certificates still valid","controller":"tempostack","controllerGroup":"tempo.grafana.com","controllerKind":"TempoStack","TempoStack":{"name":"simplest","namespace":"default"},"namespace":"default","name":"simplest","reconcileID":"b15c65ba-a7a6-4198-b009-eb373c94af58","tempo":{"name":"simplest","namespace":"default"},"name":"default/simplest"}
{"level":"info","ts":"2023-09-13T12:39:34.605574512Z","logger":"tempostack-reconcile","msg":"certificate manifests built","controller":"tempostack","controllerGroup":"tempo.grafana.com","controllerKind":"TempoStack","TempoStack":{"name":"simplest","namespace":"default"},"namespace":"default","name":"simplest","reconcileID":"33b46c81-7dbe-4f01-b49b-ee5943b98195","tempo":{"name":"simplest","namespace":"default"},"tempostacks":"default/simplest","event":"createOrRotateCerts","count":8}
{"level":"info","ts":"2023-09-13T12:39:34.653764281Z","logger":"certrotation-reconcile","msg":"Checking if TempoStack certificates expired","controller":"tempostack","controllerGroup":"tempo.grafana.com","controllerKind":"TempoStack","TempoStack":{"name":"simplest","namespace":"default"},"namespace":"default","name":"simplest","reconcileID":"37875325-1802-4a13-aa61-2ae2a8e4bc7b","tempo":{"name":"simplest","namespace":"default"},"name":"default/simplest","interval":"12h0m0s"}
{"level":"info","ts":"2023-09-13T12:39:34.654103274Z","logger":"certrotation-reconcile","msg":"Skipping cert rotation, all TempoStack certificates still valid","controller":"tempostack","controllerGroup":"tempo.grafana.com","controllerKind":"TempoStack","TempoStack":{"name":"simplest","namespace":"default"},"namespace":"default","name":"simplest","reconcileID":"37875325-1802-4a13-aa61-2ae2a8e4bc7b","tempo":{"name":"simplest","namespace":"default"},"name":"default/simplest"}
{"level":"info","ts":"2023-09-13T12:39:34.654102576Z","logger":"tempostack-reconcile","msg":"certificate manifests built","controller":"tempostack","controllerGroup":"tempo.grafana.com","controllerKind":"TempoStack","TempoStack":{"name":"simplest","namespace":"default"},"namespace":"default","name":"simplest","reconcileID":"8e69c763-4524-42f7-a8d1-2d239610fced","tempo":{"name":"simplest","namespace":"default"},"tempostacks":"default/simplest","event":"createOrRotateCerts","count":8}
{"level":"info","ts":"2023-09-13T12:39:34.677580717Z","logger":"tempostack-reconcile","msg":"certificate manifests built","controller":"tempostack","controllerGroup":"tempo.grafana.com","controllerKind":"TempoStack","TempoStack":{"name":"simplest","namespace":"default"},"namespace":"default","name":"simplest","reconcileID":"be34ab58-cb08-4a8f-b950-6d6ef3b132c9","tempo":{"name":"simplest","namespace":"default"},"tempostacks":"default/simplest","event":"createOrRotateCerts","count":8}
@andreasgerstmayr
Copy link
Collaborator Author

After further investigation, it looks like the amount of reconciliations are expected.

For example, when enabling JaegerQuery (.spec.template.queryFrontend.jaegerQuery.enabled):

  • Editing the TempoStack CR triggers the first reconciliation.
  • The first reconciliation modifies the tempo-query-frontend deployment, which triggers the second reconciliation.
  • Now the deployment is progressing, triggering multiple further reconciliations. We do update the TempoStack status field with the status of its components, therefore we need to observe status changes of the deployments.

@andreasgerstmayr
Copy link
Collaborator Author

{"level":"error","ts":"2023-11-10T16:37:28.979667934Z","logger":"tempostack-reconcile","msg":"failed to configure resource","controller":"tempostack","controllerGroup":"tempo.grafana.com","controllerKind":"TempoStack","TempoStack":{"name":"simplest","namespace":"default"},"namespace":"default","name":"simplest","reconcileID":"5bbafb6d-196e-4d28-81f7-c02d76b56806","tempo":{"name":"simplest","namespace":"default"},"object_name":"tempo-simplest-query-frontend","object_kind":"&TypeMeta{Kind:,APIVersion:apps/v1,}","error":"Operation cannot be fulfilled on deployments.apps \"tempo-simplest-query-frontend\": the object has been modified; please apply your changes to the latest version and try again","stacktrace":"github.com/grafana/tempo-operator/controllers/tempo.(*TempoStackReconciler).createOrUpdate\n\t/workspace/controllers/tempo/tempostack_create_or_update.go:186\ngh.neting.cc/grafana/tempo-operator/controllers/tempo.(*TempoStackReconciler).Reconcile\n\t/workspace/controllers/tempo/tempostack_controller.go:115\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:119\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:316\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:227"}

is afaics caused by a stale cache in controller-runtime. The code runs Get() followed by a Update(): https://github.com/kubernetes-sigs/controller-runtime/blob/v0.16.3/pkg/controller/controllerutil/controllerutil.go#L198-L222 - either the cache was stale or some other component modified the deployment at the same time.

andreasgerstmayr added a commit to andreasgerstmayr/tempo-operator that referenced this issue Nov 10, 2023
Checking/skipping/building of certificate manifests is run for every
reconciliation and is filling up the logs unnecessarily. For debugging
purposes, the log level can be increased.

Resolves: grafana#623
Signed-off-by: Andreas Gerstmayr <agerstmayr@redhat.com>
andreasgerstmayr added a commit that referenced this issue Nov 13, 2023
Checking/skipping/building of certificate manifests is run for every
reconciliation and is filling up the logs unnecessarily. For debugging
purposes, the log level can be increased.

Resolves: #623

Signed-off-by: Andreas Gerstmayr <agerstmayr@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant