What happened?
We found that when we specify bad values for the field spec.nodeAffinityLabels, we cannot later modify the field to correct the StatefulSet; the StatefulSet stays stuck with the incorrect spec.nodeAffinityLabels. The only way to recover is to manually edit the StatefulSet spec and delete the invalid node affinity settings.
The root cause of this issue is similar to the previous one: #324
In the previous issue, it was confirmed that blocking StatefulSet reconciliation while an update is in progress is by design.
We still want to report this incident to help document the potential bad consequences it can cause. In this case, it caused a deadlock that required a restart or a manual StatefulSet correction to resolve.
Did you expect to see something different?
The node affinity config on the pods should be updated or removed when users update or remove the invalid nodeAffinityLabels settings in the CR.
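For reference, nodeAffinityLabels is rendered into the StatefulSet's podTemplateSpec as standard Kubernetes node affinity; a rough sketch of the expected shape is below (the exact structure is up to the operator, and the label key/value here are placeholders):

```yaml
# Approximate shape of the affinity derived from spec.nodeAffinityLabels;
# the key and value below are placeholders, not taken from this report.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
                - us-east-1a
```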
sync-by-unitobot changed the title from "Unable to update nodeAffinityLabels once setted with a bad value" to "K8SSAND-1520 ⁃ Unable to update nodeAffinityLabels once setted with a bad value" on May 19, 2022
This is different from #324 in that there is no scaling involved. This is the expected behavior and has been for some time; however, I do consider it a bug and think it should be changed.
Any changes to the podTemplateSpec property of the underlying StatefulSet(s) will not be applied unless all Cassandra pods are in the ready state. Chicken, meet egg :)
The workaround that I typically recommend is to set stopped: true in your CassandraDatacenter spec. This will scale the StatefulSets down to zero pods. Then, if you apply a change that involves an update to the podTemplateSpec, it will be applied because there are no pods that are not ready. Lastly, after applying the changes, set stopped: false to scale the StatefulSets back up. Note that this will not result in any data loss; PVCs aren't touched by this process.
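As a minimal sketch of that workaround (a CassandraDatacenter spec fragment; the datacenter name and label below are placeholders, not taken from this report):

```yaml
# Hypothetical CassandraDatacenter fragment, not a complete manifest.
apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
  name: dc1                    # placeholder name
spec:
  # 1. Apply with stopped: true  -> the operator scales the StatefulSets to zero.
  # 2. Correct nodeAffinityLabels (or any other podTemplateSpec-affecting field) and apply.
  # 3. Apply with stopped: false -> the operator scales the StatefulSets back up.
  stopped: true
  nodeAffinityLabels:
    kubernetes.io/os: linux    # corrected labels go here
```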
Got it, thanks for the confirmation! We are also happy to contribute if you have plans to fix it. :)
hoyhbx changed the title from "K8SSAND-1520 ⁃ Unable to update nodeAffinityLabels once setted with a bad value" to "K8SSAND-1520 ⁃ Unable to update Cassandra cluster once it gets into unhealthy state" on Apr 15, 2023
sync-by-unitobot changed the title from "K8SSAND-1520 ⁃ Unable to update Cassandra cluster once it gets into unhealthy state" to "Unable to update Cassandra cluster once it gets into unhealthy state" on Oct 11, 2024
How to reproduce it (as minimally and precisely as possible):
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.7.1/cert-manager.yaml
kubectl apply -f init.yaml
kubectl apply --force-conflicts --server-side -k 'github.com/k8ssandra/cass-operator/config/deployments/cluster?ref=v1.10.3'
kubectl apply -f cr1.yaml
cr1.yaml:
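The cr1.yaml from the report is not reproduced here; the following is only a hypothetical sketch of a CassandraDatacenter whose nodeAffinityLabels cannot be satisfied by any node (all names, versions, and label values are placeholders):

```yaml
# Hypothetical example only; not the actual cr1.yaml from this report.
apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
  name: cassandra-datacenter   # placeholder name
spec:
  clusterName: test-cluster    # placeholder
  serverType: cassandra
  serverVersion: "3.11.11"     # placeholder
  size: 3
  storageConfig:
    cassandraDataVolumeClaimSpec:
      storageClassName: server-storage
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi
  nodeAffinityLabels:
    no-such-label: does-not-exist   # no node carries this label, so pods stay Pending
```

Applying a spec like this should leave the pods Pending, after which later corrections to nodeAffinityLabels are no longer propagated, matching the behavior described above.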
Environment
* Cass Operator version: docker.io/k8ssandra/cass-operator@sha256:fb9d9822fceda0057a1de39b690a5cfe570980a93e3782948482ccf68c3683bc
* Kubernetes version information: v1.21.1
* Kubernetes cluster kind: kind v0.11.1 go1.16.4 linux/amd64
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  # Changing the name to server-storage is the only change we have made compared to upstream
  name: server-storage
provisioner: rancher.io/local-path
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete