Describe the bug
It is pretty common to implement init logic as Kubernetes Job resources and to use hooks so that this logic runs before the main application starts.
An example in Helm would be to annotate the init Job with Helm hooks (i.e. "helm.sh/hook": pre-install and "helm.sh/hook-delete-policy": hook-succeeded,hook-failed) so we, for example, run migrations before the Deployment resource creates the actual application.
ArgoCD has a long-standing bug where if a Job has ttlSecondsAfterFinished set to 0 or a low value, the Job gets deleted before ArgoCD can mark the hook phase as completed, and it gets stuck in the hook phase and cannot progress further.
The infinite loop happens in this part of the code.
When the Job resource gets deleted by the TTL-after-finished controller because its TTL has expired, the syncTask for the hook no longer has a liveObject, so it cannot call the getOperationPhase function here to get the updated status.
The bug happens in the gitops-engine rather than core ArgoCD.
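The stuck state described above can be modeled with a minimal Go sketch. This is not the gitops-engine code: only the names syncTask, liveObj and getOperationPhase echo the description, while the Phase type, the liveObject struct, refreshPhase and waitForHook are hypothetical helpers made up for illustration. It assumes the phase is only recomputed when a live object is present, which is the behaviour reported in this issue.

```go
package main

import "fmt"

// Phase is a simplified stand-in for the operation phase of a hook.
type Phase string

const (
	Running   Phase = "Running"
	Succeeded Phase = "Succeeded"
)

// liveObject is a hypothetical, minimal view of the Job as seen in the cluster.
type liveObject struct {
	succeeded bool
}

// syncTask is a simplified model of a hook task. liveObj is nil once the
// Job has been deleted from the cluster after its TTL expired.
type syncTask struct {
	name    string
	liveObj *liveObject
	phase   Phase
}

// getOperationPhase derives the phase from the live object (simplified).
func getOperationPhase(obj *liveObject) Phase {
	if obj.succeeded {
		return Succeeded
	}
	return Running
}

// refreshPhase models the reported behaviour: without a live object the
// phase is never recomputed, so the task keeps its last known phase.
func refreshPhase(t *syncTask) {
	if t.liveObj == nil {
		return
	}
	t.phase = getOperationPhase(t.liveObj)
}

// waitForHook polls at most maxPolls times and reports whether the hook
// reached the Succeeded phase. The real sync has no such upper bound.
func waitForHook(t *syncTask, maxPolls int) bool {
	for i := 0; i < maxPolls; i++ {
		refreshPhase(t)
		if t.phase == Succeeded {
			return true
		}
	}
	return false
}

func main() {
	// Job finished and is still present in the cluster.
	observed := &syncTask{name: "hello-world-job", liveObj: &liveObject{succeeded: true}, phase: Running}
	// Job finished but was already deleted, so there is no live object.
	deleted := &syncTask{name: "hello-world-job", liveObj: nil, phase: Running}

	fmt.Println(waitForHook(observed, 3)) // true
	fmt.Println(waitForHook(deleted, 3))  // false
}
```

In this model the second task never leaves the Running phase no matter how often it is polled, which corresponds to the sync waiting indefinitely for the hook to complete.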
This issue has been mentioned in a couple of places:
waiting for completion of hook batch/Job/argocd-redis-secret-init
argo-helm#2887
To Reproduce
Helm chart used for testing can be found here.
The chart has the following resources:
Run make start-local.
Open the UI, then open the created Application and notice that it is constantly syncing with the message waiting for completion of hook batch/Job/hello-world-job
Expected behavior
PreSync hook completes successfully and the Sync progresses to Healthy.
Screenshots
Version
argocd commit 730363f
gitops-engine commit 0371401803996f84bcd70a5f6bb2f0ecc7d7b5d2
Logs