Issue with spec.syncPolicy.automated.selfHeal Option Ignoring Timeouts (probably) #19289
I can't reproduce this issue. Can you tell me how to reproduce it in detail?
Here is my guess: configure a HorizontalPodAutoscaler so that it changes the number of replicas automatically. If Argo CD does not react to these automatic changes in the cluster, then there is no problem. A little later I will set these timeouts to 86400 in my cluster and post the logs.
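For reference, a minimal HorizontalPodAutoscaler of the kind described (the name and target here are illustrative, not taken from the original report) might look like:

```yaml
# Illustrative HPA that changes replica counts automatically;
# "example-app" is an assumed name, not from this issue.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-app
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
```

An HPA like this will periodically rewrite `spec.replicas` on the target Deployment, which is exactly the kind of in-cluster change that can trigger Argo CD's self-heal.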
argocd-application-controller-0
Previously I installed using Helm. Now I changed the timeouts in the values file and ran `helm upgrade argocd .`. The controller restarted with the new settings, but it still does something endlessly and consumes as much CPU time as it can.
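With the community Helm chart (argo-helm's argo-cd chart), a timeout override of the kind described is typically set through `configs.cm`; the exact values below are an assumption for illustration, not the reporter's actual file:

```yaml
# values.yaml override for the argo-cd Helm chart (assumed layout).
# The 86400s value matches the timeout mentioned earlier in this thread.
configs:
  cm:
    timeout.reconciliation: 86400s
```

After `helm upgrade`, the setting lands in the `argocd-cm` ConfigMap and the application controller must restart to pick it up.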
There are many such lines, and they appear in batches with gaps of a couple of minutes; I think the CPU can't go any faster because it is busy with something. With the settings above I expect the controller to be idle, or at least for the CPU load to be lower, but nothing has changed.
It could be that the controller has been accumulating tasks in a queue for months and is now trying to complete them all, i.e. a queue of tens or hundreds of thousands of tasks that it will keep working through. Right now it continues to do what I described above, and the message log keeps growing without stopping.
This didn't change anything either. How do I debug this? The controller keeps logging `Refreshing app status (controller refresh requested)`.
I believe there is a problem with the spec.syncPolicy.automated.selfHeal option in Argo CD. The current implementation seems to ignore the configured timeouts and triggers checks more frequently than expected, leading to unnecessary resource consumption.

Details:
I have configured the following timeouts:
My expectation is that the controller will not attempt to change or check the cluster state within these timeout periods. Specifically, I expect to see no synchronization attempts in the controller logs for at least five minutes. However, this is not happening. Instead, the controller constantly attempts to check and synchronize the state, as shown in the logs below:
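The original configuration block is not shown in this thread. A setting of the kind described would use Argo CD's documented `timeout.reconciliation` key in the `argocd-cm` ConfigMap; the five-minute value below is an assumption chosen to match the stated expectation, not the reporter's actual configuration:

```yaml
# Assumed example, not the reporter's actual config:
# tells the controller to refresh apps at most every 5 minutes.
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
  labels:
    app.kubernetes.io/part-of: argocd
data:
  timeout.reconciliation: 300s
```

Note that this timeout governs the periodic refresh interval; it does not by itself suppress event-driven refreshes, which is the behavior questioned below.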
Hypothesis:
It appears that the controller might be responding to events in the cluster and ignoring the configured timeouts. One possible trigger for these frequent checks could be automatic changes in the number of replicas managed by a HorizontalPodAutoscaler, which, in this case, seems unnecessary.
Request:
I would like the ability to disable all checks and events except for the regular timeout checks. The controller should respect the configured timeouts and not attempt to check or synchronize the cluster state during these intervals unless explicitly required.
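Separately from the timeout question, Argo CD's documented diffing customization can stop HPA-driven replica changes from being treated as drift at all. A sketch, assuming an Application named `example-app` (the name and the Deployment kind are illustrative):

```yaml
# Illustrative Application fragment: ignore replica-count drift
# on Deployments so HPA scaling does not trigger self-heal.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: example-app
spec:
  # source/destination/project omitted for brevity
  ignoreDifferences:
    - group: apps
      kind: Deployment
      jsonPointers:
        - /spec/replicas
  syncPolicy:
    automated:
      selfHeal: true
    syncOptions:
      - RespectIgnoreDifferences=true
```

The `RespectIgnoreDifferences=true` sync option is needed so that the ignored fields are also left alone during automated sync, not just hidden from the diff view.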
Thank you for looking into this issue.