
Prevent unintentional removal of all clusters if CRD is removed #5117

Closed
naemono opened this issue Dec 5, 2021 · 3 comments · Fixed by #7811
Labels: discuss (We need to figure this out), >feature (Adds or discusses adding a feature to the product)

Comments

naemono (Contributor) commented Dec 5, 2021

Proposal

We should prevent users from accidentally purging all Elasticsearch clusters (or any Elastic custom resource) by deleting the CustomResourceDefinition itself: deleting a CRD cascades and removes every custom resource of that type, along with the workloads the operator manages for them, as the scenario below demonstrates. This can happen when a user assumes during maintenance that the operator can be removed and reinstalled without side effects, or during a Helm upgrade where the user wants to completely remove the ECK operator and reinstall it, not understanding the consequences.

Example of the issue to prevent:

❯ kind create cluster --config config/samples/elasticsearch/custom/kind-cluster.yaml --wait 10m --retain
Creating cluster "kind" ...

❯ helm install elastic-operator elastic/eck-operator -n elastic-system --create-namespace -f ./eck-operator-values.yaml
NAME: elastic-operator
LAST DEPLOYED: Sun Dec  5 14:56:39 2021
NAMESPACE: elastic-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
1. Inspect the operator logs by running the following command:
   kubectl logs -n elastic-system sts/elastic-operator

❯ cat ./eck-operator-values.yaml
webhook:
  enabled: true
managedNamespaces:
- elastic

❯ kubectl apply -f config/samples/elasticsearch/custom/elasticsearch-small-1-member.yaml -n elastic

❯ cat config/samples/elasticsearch/custom/elasticsearch-small-1-member.yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: testing
spec:
  version: 7.15.1
  nodeSets:
  - name: masters
    count: 1
    config:
      node.roles: ["master", "data"]
      node.store.allow_mmap: false

❯ kubectl get es -n elastic
NAME      HEALTH   NODES   VERSION   PHASE   AGE
testing   green    1       7.15.1    Ready   96s

❯ kubectl delete crd elasticsearches.elasticsearch.k8s.elastic.co
customresourcedefinition.apiextensions.k8s.io "elasticsearches.elasticsearch.k8s.elastic.co" deleted

❯ kubectl get pod -n elastic
NAME                   READY   STATUS        RESTARTS   AGE
testing-es-masters-0   1/1     Terminating   0          2m3s

Possible prevention path

We could likely add a ValidatingWebhookConfiguration that routes all CRD deletion requests through a validating webhook, such as:

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: elastic-crd-deletion-prevention.k8s.elastic.co
webhooks:
- name: elastic-crd-deletion-prevention.k8s.elastic.co
  admissionReviewVersions:
  - v1
  sideEffects: None
  failurePolicy: Fail
  rules:
  - apiGroups:
    - apiextensions.k8s.io
    apiVersions:
    - v1
    operations:
    - DELETE
    resources:
    - customresourcedefinitions
  clientConfig:
    service:
      namespace: webhooks
      name: eck-webhook-server
      path: /crd

Where the webhook would:

  1. Ignore all non-Elastic CRDs.
  2. Query for any instances of the CRD type in the cluster and, if any exist, refuse the request with a detailed explanation of the reasoning (a sketch of such a handler follows this list).
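For illustration, here is a minimal sketch of what the handler behind the /crd path could look like, using controller-runtime's admission package. The crdDeletionValidator type and its wiring are hypothetical, not actual ECK code, but the two steps above map directly onto it:

package crdwebhook

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"strings"

	apiextensionsv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/webhook/admission"
)

// crdDeletionValidator denies DELETE requests for Elastic CRDs that still
// have live instances in the cluster. (Hypothetical sketch, not ECK code.)
type crdDeletionValidator struct {
	client client.Client
}

func (v *crdDeletionValidator) Handle(ctx context.Context, req admission.Request) admission.Response {
	// Step 1: ignore all non-Elastic CRDs.
	if !strings.HasSuffix(req.Name, ".k8s.elastic.co") {
		return admission.Allowed("not an Elastic CRD")
	}

	// For DELETE operations, the object being removed arrives in OldObject.
	var crd apiextensionsv1.CustomResourceDefinition
	if err := json.Unmarshal(req.OldObject.Raw, &crd); err != nil {
		return admission.Errored(http.StatusBadRequest, err)
	}

	// Pick the storage version so the list request targets a served version.
	version := crd.Spec.Versions[0].Name
	for _, ver := range crd.Spec.Versions {
		if ver.Storage {
			version = ver.Name
		}
	}

	// Step 2: query for any instances of the CRD type across all namespaces.
	list := &unstructured.UnstructuredList{}
	list.SetGroupVersionKind(schema.GroupVersionKind{
		Group:   crd.Spec.Group,
		Version: version,
		Kind:    crd.Spec.Names.ListKind,
	})
	if err := v.client.List(ctx, list); err != nil {
		return admission.Errored(http.StatusInternalServerError, err)
	}

	// Refuse the request if any instances exist, explaining the reasoning.
	if n := len(list.Items); n > 0 {
		return admission.Denied(fmt.Sprintf(
			"deleting CRD %s would cascade-delete %d existing resource(s); delete them first",
			req.Name, n))
	}
	return admission.Allowed("no instances of this CRD exist")
}

The handler could then be registered on the operator's existing webhook server, e.g. mgr.GetWebhookServer().Register("/crd", &webhook.Admission{Handler: &crdDeletionValidator{client: mgr.GetClient()}}), which is what the clientConfig path in the configuration above assumes.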
naemono added the discuss label on Dec 5, 2021
botelastic bot added the triage label on Dec 5, 2021
mtparet commented Dec 6, 2021

🙏 I experienced this too (#4734), caused by k3s-io/helm-controller#110.

thbkrkr added the >feature label on Dec 6, 2021
botelastic bot removed the triage label on Dec 6, 2021
jpuskar commented Mar 5, 2023

I ran into this too when migrating from direct Helm installs to Argo CD, because the migration involves removing the Helm chart. It bit me pretty hard, and this behavior is inconsistent with every other operator Helm chart I've used.

What I expected was to be able to remove the Helm release and immediately reapply it without consequences, because the state would be stored in the custom resources.

The current behavior is extremely dangerous.

jtele2 commented Nov 15, 2023

This also bit me hard with flux. This needs to be fixed ASAP.
