---
title: Neat-enhancement-idea
authors:
- "@alexmt"
sponsors:
- TBD
reviewers:
- "@jessesuen"
- TBD
approvers:
- "@jessesuen"
- TBD
creation-date: 2020-04-19
last-updated: 2020-04-19
---
Support manual approval for pruning and deleting Kubernetes resources during application syncing/deletion.
Introduce Kubernetes resource-level annotations that require manual user approval using Argo CD UI/CLI/API before the resource is pruned or deleted. The annotations should be respected while Argo CD attempts to synchronize or delete the application.
We’ve seen cases where Argo CD deleted Kubernetes resources due to a bug or misconfiguration. Examples include corrupted data in Redis, user errors (1, 2) and bug in the automation on top of Argo CD. These examples don’t mean Argo CD is not reliable; however, there are cases where misbehavior is catastrophic, and erroneous deletion is not acceptable. Examples include the app-of-apps pattern where Argo CD is used to manage itself, or namespaces in production clusters.
The goals of a proposal ares:
Developers should be able to add an annotation to resources that require manual approval before deletion. The annotation should be respected by Argo CD when it attempts to delete the application.
Developers should be able to add an annotation to resources that require manual approval before pruning. The annotation should be respected by Argo CD when it attempts to prune extra resources while syncing the application.
We've made our best effort to implement corrected behavior, and as of now, we are not aware of any bugs that cause erroneous deletion. The goal of this proposal is to provide a safety net for cases where deletion is not acceptable.
It is proposed to introduce two new sync options for Argo CD applications: Prune=confirm
and Delete=confirm
. Options would
protect resources from accidental deletion during cascading application deletion as well as during sync operations.
Argo CD already supports argocd.argoproj.io/sync-options: Prune=false
sync option that prevents resource deletion while syncing
the application. This, however, is not ideal since it prevents implementing fully automated workflows that include resource deletion.
In order to improve the situation, we propose to introduce confirm
option for Prune sync option. When confirm
option is set, Argo CD should pause the sync operation
before deleting any app resources and wait for the user to confirm the deletion. The confirmation can be done in a very friendly way using Argo CD UI, CLI or API.
- Sync Operation status. I suggest not to introduce new sync operation states to avoid disturbing the existing automation around syncing (CI pipelines, scripts etc).
If Argo CD is waiting for the operation state should remain
Progressing
. Once the user confirms the deletion, the operation should resume. - Sync Waves. The sync wave shuold be "paused" while Argo CD is waiting for the user to confirm the deletion. No difference from waiting for the resource to became healthy.
Similarly to Prune
sync option we need to introduce confirm
value for Delete
sync option: argocd.argoproj.io/sync-options: Delete=confirm
. The confirm
option
should pause the sync operation before deleting any app resources and wait for the user to confirm the deletion. The confirmation can be done in a very friendly way
using Argo CD UI, CLI or API.
Since we know Argo CD is often used to implement fully automated developer workflows that include resource deletion, the deletion approval process should be as painless as possible. This way, platform administrators can instruct end users to apply the new prune/delete option to resources that require special care without significantly disturbing the developer experience.
In both cases where Argo CD requires manual approval, the user should be able to approve the deletion using Argo CD UI, CLI, or API. The approval process should be as simple as possible and should not require the user to understand the internals of Argo CD.
A new field requiresDeletionApproval
should be added to the status.resources
list items. The field should be set to true
when the resource deletion approval is required.
- health:
status: Healthy
kind: Service
name: guestbook-ui
namespace: default
status: OutOfSync
version: v1
requiresPruning: true
requiresDeletionApproval: true # new field that indicates that deletion approval is required
The Argo CD UI, CLI should visualize the requiresDeletionApproval
field so that the user can easily discover which resources require manual approval.
The Argo CD UI, CLI should bundle the Approve Deletion
resource action
that would allow the user to approve the deletion. The action should patch the resource with the argocd.argoproj.io/deletion-approved: true
annotation.
Once annotation is applied the Argo CD should proceed with the deletion.
The main reason to use the action is that we can reuse existing RBAC to control who can approve the deletion.
The Argo CD UI should provide a convinient way to approve resources that require manual approval. The existing user interface will provide a button that allows end user
execute the Approve Deletion
action and approve resources one by one. In addition to the single resource approval, the UI should provide a way to approve all resources
that require manual approval. The new button should execute the Approve Deletion
action for all resources that require manual approval.
Argo CD CLI would no need changes since existing argocd app actions run
command allows to execute an action against multiple resources.
The default Argo CD notification catalog should include a trigger and notification template that notifies the user when deletion approval is required. The notification template should include a list of resources that require approval.
The user should be able to approve resource deletion without using the UI or CLI by manually adding the argocd.argoproj.io/deletion-approved: true
annotation to the resource.
Add a list of detailed use cases this enhancement intends to take care of.
As a developer, I would like to mark resources that require manual pruning approval so I can prevent the accidental deletion of critical resources.
As a developer, I would like to mark resources that require manual deletion approval so I can prevent the accidental deletion of critical resources.
The resource approval would require a mechanism to control who can approve the deletion. The proposal to use resource-level actions solves this problem and allows us to reuse the existing RBAC model.
None.
In case of rollback to the previous version the sync option would be ignored and the resources would be deleted as before.
The proposal would require end users to learn about the new behavior and adjust their workflows. It includes a set of enhancements aimed at minimizing the impact on end users.
None.