From 9321fedbea0fe4832e3ea2648be0b7ec08294a72 Mon Sep 17 00:00:00 2001
From: Rajendra Indukuri <82365588+rajendraindukuri@users.noreply.github.com>
Date: Tue, 25 Jun 2024 10:52:33 +0530
Subject: [PATCH] Added doc for resource limits for CSM Operator (#1146)

---
 content/docs/deployment/csmoperator/_index.md | 25 ++++++++++++++++---
 .../deployment/csmoperator/release/_index.md  |  1 +
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/content/docs/deployment/csmoperator/_index.md b/content/docs/deployment/csmoperator/_index.md
index f82b884e83..2d7741bd23 100644
--- a/content/docs/deployment/csmoperator/_index.md
+++ b/content/docs/deployment/csmoperator/_index.md
@@ -40,6 +40,7 @@ Dell CSM Operator can be installed manually or via Operator Hub.
 Once installed you will be able to deploy [drivers](drivers) and [modules](modules) from the Operator.
 ### OpenShift Installation via Operator Hub
+>NOTE: You cannot update the resource requests and limits when deploying the operator through Operator Hub.
 `dell-csm-operator` can be installed via Operator Hub on upstream Kubernetes clusters & Red Hat OpenShift Clusters.
@@ -61,6 +62,7 @@ Both editions have the same codebase and are supported by Dell Technologies, the
 * The `Community` can be installed on any Kubernetes distributions.
 ### Manual Installation on a cluster without OLM
+>NOTE: You can update the resource requests and limits when deploying the operator manually without OLM.
 1. Install volume snapshot CRDs. For detailed snapshot setup procedure, [click here](../../snapshots/#volume-snapshot-feature).
 2. Clone and checkout the required csm-operator version using
@@ -69,7 +71,17 @@ git clone -b v1.6.0 https://github.com/dell/csm-operator.git
 ```
 3. `cd csm-operator`
 4. _(Optional)_ If using a local Docker image, edit the `deploy/operator.yaml` file and set the image name for the CSM Operator Deployment.
-5. 
_(Optional)_ If **CSM Replication** is planned for use and will be deployed using two clusters in an environment where the DNS is not configured, and cluster API endpoints are FQDNs, in order to resolve queries to remote API endpoints, it is necessary to edit the `deploy/operator.yaml` file and add the `hostAliases` field and associated `:` mappings to the CSM Operator Controller Manager Deployment under `spec.template.spec`. More information on host aliases can be found, [here](https://kubernetes.io/docs/tasks/network/customize-hosts-file-for-pods/).
+5. _(Optional)_ The Dell CSM Operator might need more resources in larger environments (>1000 Pods). You can modify the default resource requests and limits in the files `deploy/operator.yaml` and `config/manager/manager.yaml` to increase the CPU and memory values. More information on setting resource requests and limits can be found [here](https://sdk.operatorframework.io/docs/best-practices/managing-resources/). The current default values are:
+    ```yaml
+    resources:
+      limits:
+        cpu: 200m
+        memory: 512Mi
+      requests:
+        cpu: 100m
+        memory: 192Mi
+    ```
+6. _(Optional)_ If **CSM Replication** is planned for use and will be deployed using two clusters in an environment where the DNS is not configured, and cluster API endpoints are FQDNs, in order to resolve queries to remote API endpoints, it is necessary to edit the `deploy/operator.yaml` file and add the `hostAliases` field and associated `:` mappings to the CSM Operator Controller Manager Deployment under `spec.template.spec`. More information on host aliases can be found [here](https://kubernetes.io/docs/tasks/network/customize-hosts-file-for-pods/).
     ```yaml
     # example config
     apiVersion: apps/v1
@@ -84,13 +96,20 @@ git clone -b v1.6.0 https://github.com/dell/csm-operator.git
             - "remote.FQDN"
             ip: "255.255.255.1"
     ```
-6. Run `bash scripts/install.sh` to install the operator.
+7. Run `bash scripts/install.sh` to install the operator.
 >NOTE: Dell CSM Operator will be installed in the `dell-csm-operator` namespace.
+>NOTE: To update the resource requests and limits configuration after the operator is installed, follow the steps below:
+
+  * Uninstall the operator by following the steps [here](https://dell.github.io/csm-docs/v3/deployment/csmoperator/#uninstall).
+
+  * Update the resource configuration as described in step 5, then reinstall the operator using step 7 above.
+
+
 {{< imgproc install.JPG Resize "2500x" >}}{{< /imgproc >}}
-6. Run the command to validate the installation.
+8. Run the command to validate the installation.
     ```bash
     kubectl get pods -n dell-csm-operator
     ```
diff --git a/content/docs/deployment/csmoperator/release/_index.md b/content/docs/deployment/csmoperator/release/_index.md
index e0708ad2bf..e4ba9c884f 100644
--- a/content/docs/deployment/csmoperator/release/_index.md
+++ b/content/docs/deployment/csmoperator/release/_index.md
@@ -31,3 +31,4 @@ Description: >
 | Issue | Workaround |
 |-------|------------|
 | When CSM Operator creates a deployment that includes secrets (e.g., application-mobility, observability, cert-manager, velero), these secrets are not deleted on uninstall and will be left behind. For example, the `karavi-topology-tls`, `otel-collector-tls`, and `cert-manager-webhook-ca` secrets will not be deleted. | This should not cause any issues on the system, but all secrets present on the cluster can be found with `kubectl get secrets -A`, and any unwanted secrets can be deleted with `kubectl delete secret -n `|
+| In some environments, installing drivers with the CSM Operator can fail with an 'OOM Killed' error. This happens when the default resource requests and limits configured in the CSM Operator do not meet the resource requirements of the environment. 
An OOM error occurs when a process in the container tries to consume more memory than the limit specified in the resource configuration. | Before deploying the CSM Operator, adjust the memory and CPU requests and limits in the files [config/manager/manager.yaml](https://github.com/dell/csm-operator/blob/main/config/manager/manager.yaml#L100), [deploy/operator.yaml](https://github.com/dell/csm-operator/blob/main/deploy/operator.yaml#L1330) to match the requirements of your environment. A container that exceeds its memory limit can be terminated (OOM killed), and a container that reaches its CPU limit is throttled. The CSM Operator does not currently support updating this configuration dynamically; if it is already installed, it must be redeployed for the changes to take effect. Steps to manually update the resource configuration and then redeploy the CSM Operator are available [here](https://dell.github.io/csm-docs/docs/deployment/csmoperator/#installation). |
\ No newline at end of file
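
Review note, not part of the patch: the known-issue entry describes pods being 'OOM Killed' when the default limits are too low. The following is a minimal sketch of how one might look for that condition before raising the limits. It assumes `kubectl` access to the cluster and the `dell-csm-operator` namespace named in the docs; the `NAMESPACE` variable and the `find_oom_killed` function are hypothetical names introduced only for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: list pods whose containers were last terminated with reason OOMKilled.
# NAMESPACE and find_oom_killed are illustrative names, not part of CSM Operator.
NAMESPACE="${NAMESPACE:-dell-csm-operator}"

find_oom_killed() {
  # Print "<pod-name><TAB><last termination reasons>" for every pod, then keep
  # only the lines that mention OOMKilled. "|| true" avoids a non-zero exit
  # status when nothing matches.
  kubectl get pods -n "$NAMESPACE" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{range .status.containerStatuses[*]}{.lastState.terminated.reason}{" "}{end}{"\n"}{end}' \
    | grep -w 'OOMKilled' || true
}
```

Calling `find_oom_killed` prints one line per affected pod; empty output suggests no container in the namespace has a recorded OOM termination, in which case the default requests and limits may already be sufficient.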