Application controller excessive reconciling #9014

Closed
paveq opened this issue Apr 6, 2022 · 6 comments
Labels
bug Something isn't working

Comments

paveq commented Apr 6, 2022

Describe the bug

When a Kubernetes resource contained within an Argo CD application changes rapidly in the target cluster and namespace (due to a bug in, e.g., some other operator), it puts heavy load on the application controller.

To Reproduce

Have a resource updating rapidly in the application's target namespace.

Expected behavior

The application controller should detect that this is happening and back off refreshing the application exponentially, potentially removing its watches for some period of time.

Even though the root cause of the rapid updates is certainly a bug somewhere else, it should not cause Argo CD to fail or to spend an excessive amount of CPU constantly reconciling the application.

Version

2.3.3

Logs

In this case we have an argo-events Sensor object updating rapidly for an unknown reason, but we've seen this happen with other CRs and operators too.

time="2022-04-06T10:44:45Z" level=info msg="Refreshing app status (controller refresh requested), level (1)" application=argo-events-ci
time="2022-04-06T10:44:45Z" level=info msg="Comparing app state (cluster: https://kubernetes.default.svc, namespace: argo-events)" application=argo-events-ci
time="2022-04-06T10:44:45Z" level=info msg="getRepoObjs stats" application=argo-events-ci build_options_ms=0 helm_ms=0 plugins_ms=0 repo_ms=0 time_ms=5 unmarshal_ms=5 version_ms=0
time="2022-04-06T10:44:45Z" level=info msg="Skipping auto-sync: most recent sync already to 453b2e55216b2de40fd95d008f96f24a654a3691" application=argo-events-ci
time="2022-04-06T10:44:45Z" level=info msg="No status changes. Skipping patch" application=argo-events-ci
time="2022-04-06T10:44:45Z" level=info msg="Reconciliation completed" application=argo-events-ci dedup_ms=0 dest-name= dest-namespace=argo-events dest-server="https://kubernetes.default.svc" diff_ms=2 fields.level=1 git_ms=6 health_ms=0 live_ms=1 settings_ms=0 sync_ms=0 time_ms=19
time="2022-04-06T10:44:46Z" level=info msg="Refreshing app status (controller refresh requested), level (1)" application=argo-events-ci
time="2022-04-06T10:44:46Z" level=info msg="Comparing app state (cluster: https://kubernetes.default.svc, namespace: argo-events)" application=argo-events-ci
time="2022-04-06T10:44:46Z" level=info msg="getRepoObjs stats" application=argo-events-ci build_options_ms=0 helm_ms=0 plugins_ms=0 repo_ms=0 time_ms=6 unmarshal_ms=6 version_ms=0
time="2022-04-06T10:44:46Z" level=info msg="Skipping auto-sync: most recent sync already to 453b2e55216b2de40fd95d008f96f24a654a3691" application=argo-events-ci
time="2022-04-06T10:44:46Z" level=info msg="No status changes. Skipping patch" application=argo-events-ci
time="2022-04-06T10:44:46Z" level=info msg="Reconciliation completed" application=argo-events-ci dedup_ms=0 dest-name= dest-namespace=argo-events dest-server="https://kubernetes.default.svc" diff_ms=2 fields.level=1 git_ms=6 health_ms=0 live_ms=1 settings_ms=0 sync_ms=0 time_ms=20
time="2022-04-06T10:44:46Z" level=info msg="Refreshing app status (controller refresh requested), level (1)" application=argo-events-ci
time="2022-04-06T10:44:46Z" level=info msg="Comparing app state (cluster: https://kubernetes.default.svc, namespace: argo-events)" application=argo-events-ci
time="2022-04-06T10:44:46Z" level=info msg="getRepoObjs stats" application=argo-events-ci build_options_ms=0 helm_ms=0 plugins_ms=0 repo_ms=0 time_ms=5 unmarshal_ms=5 version_ms=0
time="2022-04-06T10:44:46Z" level=info msg="Skipping auto-sync: most recent sync already to 453b2e55216b2de40fd95d008f96f24a654a3691" application=argo-events-ci
time="2022-04-06T10:44:46Z" level=info msg="No status changes. Skipping patch" application=argo-events-ci
time="2022-04-06T10:44:46Z" level=info msg="Reconciliation completed" application=argo-events-ci dedup_ms=0 dest-name= dest-namespace=argo-events dest-server="https://kubernetes.default.svc" diff_ms=2 fields.level=1 git_ms=5 health_ms=0 live_ms=1 settings_ms=0 sync_ms=0 time_ms=20
paveq added the bug label Apr 6, 2022
paveq changed the title from "Application controller excessive reconcile activity" to "Application controller excessive reconciling" Apr 6, 2022
rishabh625 (Contributor) commented

Can you check in the logs what exactly is triggering the excess reconciling?

A log line for this was added in the PR below and released in v2.3.0, but I can't see it in the logs you included:
https://github.com/argoproj/argo-cd/pull/8192/files

mikutas (Contributor) commented Oct 4, 2022

In our environment, ExternalSecret/ClusterExternalSecret resources cause this issue.

ku524 commented Nov 25, 2022

@mikutas Could you explain your case and solution, please?
I've been experiencing the same issue recently, and coincidentally, we are also using external-secrets.

mikutas (Contributor) commented Nov 29, 2022

@ku524
An ExternalSecret's status is updated whenever it is synced. A ClusterExternalSecret's status is also updated whenever each of its ExternalSecrets is synced (the more destinations a CES has, the more frequent the updates). I think those status updates are what trigger the refreshes:

ctrl.requestAppRefresh(app.QualifiedName(), &level, nil)

This happens because, by default, the status field is ignored only for CustomResourceDefinition resources.

https://argo-cd.readthedocs.io/en/stable/user-guide/diffing/#system-level-configuration

By default status field is ignored during diffing for CustomResourceDefinition resource. The behavior can be extended to all resources using all value or disabled using none.
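
For reference, a minimal sketch of that system-level setting in the argocd-cm ConfigMap (assuming the usual argocd install namespace; the crd/all/none values are from the docs linked above):

apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd  # assumes the default install namespace
data:
  resource.compareoptions: |
    # 'crd' (default): ignore status only on CustomResourceDefinitions
    # 'all': ignore status on all resources
    # 'none': never ignore status
    ignoreResourceStatusField: all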

iamKunal added a commit to iamKunal/home-k8s-gitops-mirror that referenced this issue Mar 16, 2023
fix: check if reconcilation loop cpu usage is due to es

Ref argoproj/argo-cd#9014
iamKunal commented

I faced a similar issue recently on my local cluster.

In my case I was using External Secrets too, but I was also using it to sync Argo's username and password (using ES's templating features):
https://github.com/iamKunal/home-k8s-gitops-mirror/blob/d2d9f1f8e6752cc487a21e0d00be6be3330963a1/apps/argocd/templates/argocd-password-external-secret.yaml#L17-L18

Since Argo CD creates the server.secretKey on startup, I changed the creationPolicy to Merge so that there would be no conflicts. But this led to the generated Secret not being owned by the ExternalSecret.

This led to a weird loop wherein an ExternalSecret update caused a Secret update, which triggered a reconciliation, which caused an ExternalSecret update again.
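
For illustration, a minimal sketch of the kind of ExternalSecret target block involved (the resource names and secret store are hypothetical; creationPolicy: Merge is what produced the un-owned Secret described above):

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: argocd-credentials  # hypothetical name
  namespace: argocd
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: my-secret-store  # hypothetical store
    kind: SecretStore
  target:
    name: argocd-secret
    # Merge writes keys into the existing Secret instead of owning it;
    # combined with Argo CD regenerating server.secretKey, this is what
    # fed the update/reconcile loop described above.
    creationPolicy: Merge
  data:
    - secretKey: admin.password
      remoteRef:
        key: argocd-admin-password  # hypothetical remote key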

I had to disable the secret generation by Argo and disable the merging, but as others mentioned, server.secretKey had to be set to an empty string. After an Argo CD restart it seems to be working fine once the secretKey is regenerated, but it does lead to a weird scenario with two keys in the secret having different values:

[screenshot: the Secret showing the two keys with different values]

Everything seems to be working fine now even after a couple of external secret refreshes.

On a separate note, it would be good to have #10393

For reference, these are the changes: iamKunal/home-k8s-gitops-mirror@da2f3e0

agaudreault (Member) commented

This behavior can be configured via ignoreResourceUpdates to resolve the issue.
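
For example, a minimal sketch in the argocd-cm ConfigMap, targeting the ExternalSecret status churn discussed above (the group/kind key and JSON pointer follow the ignoreResourceUpdates docs; adjust them to whichever resource is noisy in your cluster):

apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
data:
  # Depending on your Argo CD version this feature may need to be
  # enabled explicitly.
  resource.ignoreResourceUpdatesEnabled: "true"
  # Don't treat status-only changes on ExternalSecrets as updates
  # that warrant an application refresh.
  resource.customizations.ignoreResourceUpdates.external-secrets.io_ExternalSecret: |
    jsonPointers:
      - /status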
