update troubleshooting and patch items along with the new release
Signed-off-by: testadmin <testadmin@redhat.com>
testadmin committed Jun 26, 2023
1 parent fa39a85 commit 50c210a
Showing 2 changed files with 143 additions and 34 deletions.
66 changes: 64 additions & 2 deletions docs/patching_subscription_image.md
@@ -1,6 +1,68 @@
# Patching ACM hub and managed clusters with another subscription container image

## Patching hub cluster
## Patching hub cluster and managed clusters together (ACM >= 2.8)

The following steps patch the subscription image. The examples use this hotfix image:

`quay.io/xiangjingli/multicloud-operators-subscription@sha256:51f12144c277e33b34c18295468a7f375a2261eafc124b1f427253d3924c4867`

- On the hub, get the namespace and name of the MCH resource
```
% oc get mch -A
NAMESPACE NAME STATUS AGE
open-cluster-management multiclusterhub Running 16h
```

- Create a ConfigMap to reference the images provided in the hotfix
```
$ oc apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: redhat-acm-hotfix-mintls12
  namespace: open-cluster-management # this is the MCH namespace
  labels:
    operator.multicluster.openshift.io/hotfix: redhat-acm-hotfix-mintls12
data:
  manifest.json: |-
    [
      {
        "image-remote": "quay.io/xiangjingli",
        "image-key": "multicluster_operators_subscription",
        "image-name": "multicloud-operators-subscription",
        "image-digest": "sha256:51f12144c277e33b34c18295468a7f375a2261eafc124b1f427253d3924c4867"
      }
    ]
EOF
```

- Activate the hotfix by annotating the MCH resource so that it uses the image overrides specified in the ConfigMap
```
$ oc -n open-cluster-management annotate mch multiclusterhub --overwrite mch-imageOverridesCM=redhat-acm-hotfix-mintls12
```

- The following hub subscription pods are expected to restart and run with the new hotfix image
```
% oc get pods -n open-cluster-management |grep subscription
multicluster-operators-hub-subscription-5cfdf4bb84-xcc9z 1/1 Running 0 50m
multicluster-operators-standalone-subscription-5467dcdbcc-2w8l2 1/1 Running 0 50m
multicluster-operators-subscription-report-57b776ccf9-ktvph 1/1 Running 0 50m
```

- (Optional) Restart the RHACM operator on the hub
If the hub subscription pods have not restarted after a while, restart the RHACM operator pods to ensure that the operator picks up the hotfix configuration
```
$ oc -n open-cluster-management scale deployment multiclusterhub-operator --replicas=0
$ oc -n open-cluster-management scale deployment multiclusterhub-operator --replicas=1
```

- On each managed cluster, make sure the following application-manager pod has restarted and is running with the new hotfix image. This may take a while
```
% oc get pods -n open-cluster-management-agent-addon |grep application-manager
application-manager-bd4f7c5db-zvsvx 1/1 Running 0 50m
```
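
The verification steps above can be partly automated. The following is a minimal sketch, not part of the official procedure: `check_images` is a hypothetical helper that reads `pod<TAB>image` lines and flags pods whose image reference does not contain the expected digest. It assumes the pod spec references the image by digest, as the hotfix ConfigMap above does.

```shell
# Digest taken from the hotfix ConfigMap above.
EXPECTED_DIGEST="sha256:51f12144c277e33b34c18295468a7f375a2261eafc124b1f427253d3924c4867"

# check_images (hypothetical helper): reads "pod<TAB>image" lines on stdin,
# prints OK/STALE per pod, and returns non-zero if any pod is stale.
check_images() {
  expected="$1"
  status=0
  while IFS="$(printf '\t')" read -r pod image; do
    case "$image" in
      *"$expected"*) printf 'OK %s\n' "$pod" ;;
      *) printf 'STALE %s %s\n' "$pod" "$image"; status=1 ;;
    esac
  done
  return "$status"
}

# Usage on a live hub (requires oc and a logged-in session):
# oc get pods -n open-cluster-management \
#   -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].image}{"\n"}{end}' \
#   | grep subscription | check_images "$EXPECTED_DIGEST"
```

The same check can be pointed at the `open-cluster-management-agent-addon` namespace on a managed cluster by changing the namespace and the `grep` pattern to `application-manager`.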

## Patching hub cluster (ACM <= 2.4)

In the `open-cluster-management` namespace on the ACM hub cluster, edit the `advanced-cluster-management.v2.3.0` CSV (or the 2.3.2 CSV).

@@ -10,7 +72,7 @@ oc edit csv advanced-cluster-management.v2.3.0 -n open-cluster-management

Look for containers **multicluster-operators-standalone-subscription** and **multicluster-operators-hub-subscription** and update their images to `quay.io/open-cluster-management/multicluster-operators-subscription:TAG` (it is recommended you note the current **SHA** tag if you want to revert the change). Replace `TAG` with the actual image tag (use `latest` to get the latest upstream version). This will recreate `multicluster-operators-standalone-subscription-xxxxxxx` and `multicluster-operators-hub-subscription-xxxxxxx` pods in `open-cluster-management` namespace. Check that the new pods are running with the new container image.

## Patching managed clusters
## Patching managed clusters (ACM <= 2.4)

If you are patching `local-cluster` managed cluster, which is the ACM hub cluster itself, run this command.

111 changes: 79 additions & 32 deletions docs/troubleshooting_guidence.md
@@ -50,50 +50,47 @@ I0207 22:59:38.422603 1 mcmhub_controller.go:726] subscription-hub-reconci
I0207 22:59:38.422618 1 mcmhub_controller.go:518] subscription-hub-reconciler/secondsub/second-level-sub "msg"="exit Hub Reconciling
...
```
### Set up log level for the hub subscription pod

- Open the ACM csv, append the log level to 1, save the csv
### Set up log level for the hub subscription pod (ACM >=2.7)
- On the hub, pause the MCH operator
```
% oc annotate mch -n open-cluster-management multiclusterhub mch-pause=true --overwrite=true
```

- Edit the hub subscription pod, set the log level to 1, and save the pod
```
% oc edit csv -n open-cluster-management advanced-cluster-management.v2.5.0
% oc edit pods -n open-cluster-management multicluster-operators-hub-subscription-5cfdf4bb84-xcc9z
- name: multicluster-operators-hub-subscription
containers:
- command:
- /usr/local/bin/multicluster-operators-subscription
- --sync-interval=60
- --v=1
containers:
- command:
- /usr/local/bin/multicluster-operators-subscription
- --sync-interval=60
- --v=1
```

- Make sure the hub subscription pod is restarted to run.
- Make sure the hub subscription pod is restarted and running.
- Check the hub subscription pod log for more details

### Set up memory limit for the hub subscription pod

- Open the ACM csv, search the `multicluster-operators-hub-subscription` container, update the memory limit, save the csv

### Set up memory limit for the hub subscription pod (ACM >=2.7)
- On the hub, pause the MCH operator
```
% oc annotate mch -n open-cluster-management multiclusterhub mch-pause=true --overwrite=true
```
% oc edit csv -n open-cluster-management advanced-cluster-management.v2.5.0

- name: multicluster-operators-hub-subscription
spec:
replicas: 1
selector:
matchLabels:
app: multicluster-operators-hub-subscription
......
- Edit the hub subscription pod, update the memory limit, and save the pod
```
% oc edit pods -n open-cluster-management multicluster-operators-hub-subscription-5cfdf4bb84-xcc9z
resources:
limits:
cpu: 750m
memory: 2Gi ================> this is the hub subscription pod memory limit, update it to 4Gi for example.
requests:
cpu: 150m
memory: 128Mi
resources:
limits:
cpu: 750m
memory: 2Gi ================> this is the hub subscription pod memory limit, update it to 4Gi for example.
requests:
cpu: 150m
memory: 128Mi
```
- verify the hub subscription pod should be restarted with the new memory limit. It could take a while for OLM to be reconciled to do so.

- Verify that the hub subscription pod has restarted and is running with the new memory limit.
```
% oc get pods -n open-cluster-management |grep hub-sub
multicluster-operators-hub-subscription-58858c488f-c52zt 1/1 Running 2 (28h ago) 27d
@@ -229,7 +226,7 @@ search for Deployment. Set spec.replicas to 0:
% oc get pods -n open-cluster-management-agent-addon |grep klusterlet-addon-appmgr
klusterlet-addon-appmgr-794d76bcbf-tbsn5 1/1 Running 0 14s
```
### Set up memory limit for the managed subscription pod (ACM >= 2.5)
### Set up memory limit for the managed subscription pod (ACM 2.5 and 2.6)

- On the hub cluster, pause mch reconcile
```
@@ -254,6 +251,56 @@ search for Deployment. Set spec.replicas to 0
% oc get pods -n open-cluster-management-agent-addon |grep application-manager
```

### Set up memory limit for the managed subscription pod (ACM >= 2.7)
- Enable addondeploymentconfigs to be used in the application-manager addon on all managed clusters
```
apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ClusterManagementAddOn
metadata:
  name: application-manager
spec:
  addOnMeta:
    description: Processes events and other requests to managed resources.
    displayName: Application Manager
  supportedConfigs:
  - group: addon.open-cluster-management.io
    resource: addondeploymentconfigs
```

- Specify the memory request and memory limit in an AddOnDeploymentConfig created in the managed cluster namespace, e.g. `cluster1`
```
apiVersion: addon.open-cluster-management.io/v1alpha1
kind: AddOnDeploymentConfig
metadata:
  name: deploy-config
  namespace: cluster1
spec:
  customizedVariables:
  - name: RequestMemory
    value: 512Mi
  - name: LimitsMemory
    value: 4Gi
```

- Link the AddOnDeploymentConfig CR to the application-manager ManagedClusterAddOn in the same managed cluster namespace, e.g. `cluster1`
```
apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ManagedClusterAddOn
metadata:
  name: application-manager
  namespace: cluster1
spec:
  installNamespace: open-cluster-management-agent-addon
  configs:
  - group: addon.open-cluster-management.io
    resource: addondeploymentconfigs
    namespace: cluster1
    name: deploy-config
```

As a result, the new memory limit and memory request are applied to the application-manager pod on `cluster1`.
Because the AddOnDeploymentConfig is created per managed cluster namespace, the application-manager pod on each managed cluster can be given a different memory limit.
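
To give several managed clusters different limits, the AddOnDeploymentConfig manifest can be generated per cluster. The following is a minimal sketch under stated assumptions: `render_deploy_config` is a hypothetical helper, and the name `deploy-config` and the example values are the ones used in the steps above.

```shell
# render_deploy_config (hypothetical helper): prints an AddOnDeploymentConfig
# for one managed cluster namespace with the given memory request and limit.
render_deploy_config() {
  cluster="$1"
  request="$2"
  limit="$3"
  cat <<EOF
apiVersion: addon.open-cluster-management.io/v1alpha1
kind: AddOnDeploymentConfig
metadata:
  name: deploy-config
  namespace: ${cluster}
spec:
  customizedVariables:
  - name: RequestMemory
    value: ${request}
  - name: LimitsMemory
    value: ${limit}
EOF
}

# Usage on a live hub (requires oc and a logged-in session):
# render_deploy_config cluster1 512Mi 4Gi | oc apply -f -
# render_deploy_config cluster2 256Mi 2Gi | oc apply -f -
```

Each generated config still has to be linked from the application-manager ManagedClusterAddOn in the same namespace, as in the last step above.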

### Set up new image for the managed subscription pod (ACM >= 2.5)

Since ACM 2.5, there is no klusterlet-addon-operator any more. The app addon pod (application-manager) running on the managed cluster is deployed by the hub subscription pod.