allow-updatespec once can be removed prematurely #699

Closed
burmanm opened this issue Sep 6, 2024 · 0 comments · Fixed by #698
Labels: bug, ready-for-review

burmanm commented Sep 6, 2024

What happened?

In a three-rack system with one node on each rack, where every rack requires an update, the update annotation can sometimes be removed before all the StatefulSets have been updated. For example, in this log r1 is updated at 15:55:48, and at 15:56:55 we notice that r2 also required an update, but the annotation had already been removed before r2 was processed.

2024-09-05T15:52:51.101Z	INFO	update is blocked, but statefulset needs an update. Marking datacenter as requiring update.	{"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"dc1","namespace":"test-upgrade-operator"}, "namespace": "test-upgrade-operator", "name": "dc1", "reconcileID": "f923057c-694c-410f-bd99-ff77f0eaf243", "namespace": "test-upgrade-operator", "datacenterName": "my-super-dc", "clusterName": "cluster1", "rackName": "r1"}
2024-09-05T15:52:56.631Z	INFO	update is blocked, but statefulset needs an update. Marking datacenter as requiring update.	{"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"dc1","namespace":"test-upgrade-operator"}, "namespace": "test-upgrade-operator", "name": "dc1", "reconcileID": "3ef08510-5548-413f-8c7e-95e841c9474d", "namespace": "test-upgrade-operator", "datacenterName": "my-super-dc", "clusterName": "cluster1", "rackName": "r1"}
2024-09-05T15:55:48.095Z	INFO	statefulset needs an update	{"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"dc1","namespace":"test-upgrade-operator"}, "namespace": "test-upgrade-operator", "name": "dc1", "reconcileID": "2ce2ae60-7928-4c90-96b4-84e5fa1ca6a0", "namespace": "test-upgrade-operator", "datacenterName": "my-super-dc", "clusterName": "cluster1", "rackName": "r1"}
2024-09-05T15:56:55.263Z	INFO	update is blocked, but statefulset needs an update. Marking datacenter as requiring update.	{"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"dc1","namespace":"test-upgrade-operator"}, "namespace": "test-upgrade-operator", "name": "dc1", "reconcileID": "5db48d2d-cfa9-4662-be7d-92662768cd93", "namespace": "test-upgrade-operator", "datacenterName": "my-super-dc", "clusterName": "cluster1", "rackName": "r2"}
2024-09-05T15:57:02.976Z	INFO	update is blocked, but statefulset needs an update. Marking datacenter as requiring update.	{"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"dc1","namespace":"test-upgrade-operator"}, "namespace": "test-upgrade-operator", "name": "dc1", "reconcileID": "b09c4e3d-3066-4c3a-8109-d3a0d592eeb1", "namespace": "test-upgrade-operator", "datacenterName": "my-super-dc", "clusterName": "cluster1", "rackName": "r2"}

As a result, the datacenter is left in a state where it still requires an update: r1 is updated while r2 and r3 are not. We shouldn't reach the end of reconciliation while other racks still require an update.

What did you expect to happen?

All racks should be updated before we remove the annotation.
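
To illustrate the expected behavior, here is a minimal sketch, not the operator's actual code. The `rack` and `result` types and the `reconcileOnce` function are hypothetical, and the sketch assumes that a reconcile pass updates at most one rack and then requeues. The point is only that the "once" permission is released when a pass finds no rack left to update, never right after the first rack is handled.

```go
package main

import "fmt"

// rack is a hypothetical, simplified view of one rack's StatefulSet.
type rack struct {
	Name        string
	NeedsUpdate bool // desired StatefulSet spec differs from the live one
}

// result is a hypothetical summary of a single reconcile pass.
type result struct {
	UpdatedRack      string // rack updated in this pass, "" if none
	RemoveAnnotation bool   // true only when no rack needs an update anymore
}

// reconcileOnce models one reconcile pass (an assumption for illustration).
// allowOnce stands in for the "allow update once" annotation being present.
func reconcileOnce(racks []rack, allowOnce bool) result {
	if !allowOnce {
		// Updates are blocked: change nothing and remove nothing.
		return result{}
	}
	for i := range racks {
		if racks[i].NeedsUpdate {
			// Update this rack and return early (requeue), keeping the
			// annotation so the remaining racks can still be updated.
			racks[i].NeedsUpdate = false
			return result{UpdatedRack: racks[i].Name}
		}
	}
	// Every rack is up to date: only now is it safe to drop the annotation.
	return result{RemoveAnnotation: true}
}

func main() {
	racks := []rack{
		{Name: "r1", NeedsUpdate: true},
		{Name: "r2", NeedsUpdate: true},
		{Name: "r3", NeedsUpdate: true},
	}
	for pass := 1; ; pass++ {
		res := reconcileOnce(racks, true)
		fmt.Printf("pass %d: updated=%q removeAnnotation=%v\n",
			pass, res.UpdatedRack, res.RemoveAnnotation)
		if res.RemoveAnnotation {
			break
		}
	}
}
```

With three racks needing an update, the sketch keeps the annotation through the passes that update r1, r2 and r3, and releases it only on the following pass, when nothing is left to do.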

How can we reproduce it (as minimally and precisely as possible)?

The operator_upgrade test triggers this behavior in some cases.

cass-operator version

1.22.1

Kubernetes version

1.30

Method of installation

No response

Anything else we need to know?

No response

┆Issue is synchronized with this Jira Story by Unito
┆Fix Versions: 2024-10,2024-11
┆Issue Number: CASS-63

@burmanm burmanm added the bug Something isn't working label Sep 6, 2024
@burmanm burmanm self-assigned this Sep 6, 2024
@adejanovski adejanovski added assess Issues in the state 'assess' ready-for-review Issues in the state 'ready-for-review' and removed assess Issues in the state 'assess' labels Sep 6, 2024