[controller] Ensure k8s statefulset statuses are fresh #271
Conversation
In debugging #268, I discovered that in every case where we updated more than one statefulset, it was because the observed generation on the set was one less than the set's actual generation. This means the Kubernetes statefulset controller had not yet processed the update, and that the information on `Status` that we check (ready replicas, current replicas, etc.) was not up to date. Also, out of paranoia, I changed the update behavior to ensure that we keep the old object meta fields around when we send the `Update` to Kubernetes. If we don't, we could potentially overwrite a set even if it had changed since we last processed it. This PR ensures that we stop processing the cluster if any of the set statuses are not fresh, and that we use the familiar conflict-handling methods on set updates. Closes #268.
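As a rough illustration of the two changes described above (a minimal sketch only; the helper names `statefulSetsFresh` and `updatePreservingMeta` are illustrative and not necessarily what this PR adds):

```go
package controller

import (
	appsv1 "k8s.io/api/apps/v1"
)

// statefulSetsFresh reports whether the Kubernetes statefulset controller has
// observed the latest generation of every set. If ObservedGeneration lags
// Generation, status fields such as ReadyReplicas and CurrentReplicas may be
// stale and the cluster should not be processed further on this pass.
func statefulSetsFresh(sets []*appsv1.StatefulSet) bool {
	for _, sts := range sets {
		if sts.Status.ObservedGeneration < sts.Generation {
			return false
		}
	}
	return true
}

// updatePreservingMeta copies the existing object's metadata (including its
// resourceVersion) onto the desired update, so that if the set changed since
// we last read it, the API server rejects the update with a conflict instead
// of silently overwriting the newer object.
func updatePreservingMeta(existing, desired *appsv1.StatefulSet) *appsv1.StatefulSet {
	updated := desired.DeepCopy()
	updated.ObjectMeta = existing.ObjectMeta
	return updated
}
```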
pkg/controller/controller.go
Outdated
@@ -73,6 +73,7 @@ const (

var (
	errOrphanedPod = errors.New("pod does not belong to an m3db cluster")
	errNoSetLabel  = errors.New("pod does not have a parent statefulset error")
Unused. Will remove.
pkg/controller/controller.go
Outdated
// If any of the statefulsets aren't ready, wait until they are as we'll get
// another event (ready == bootstrapped)
for _, sts := range childrenSets {
	c.logger.Info("processing set",
Still need to decide whether these logs are too noisy. If we had them, however, debugging this issue would have been much easier.
Can you simply make these debug logs and then make it easy to turn on debug logging? (might already be easy by just changing the config file?)
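For reference, a minimal sketch of what a debug-level version of that log could look like, assuming a `*zap.Logger` (which the structured fields in the hunk above suggest); the `logSetStatus` helper and its field names are illustrative, not part of this PR:

```go
package controller

import (
	"go.uber.org/zap"

	appsv1 "k8s.io/api/apps/v1"
)

// logSetStatus emits per-statefulset details at debug level, so the output is
// hidden by default but can be enabled via the operator's logging config when
// diagnosing issues like #268.
func logSetStatus(logger *zap.Logger, sts *appsv1.StatefulSet) {
	logger.Debug("processing set",
		zap.String("name", sts.Name),
		zap.Int64("generation", sts.Generation),
		zap.Int64("observedGeneration", sts.Status.ObservedGeneration),
		zap.Int32("readyReplicas", sts.Status.ReadyReplicas),
	)
}
```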
Codecov Report
@@            Coverage Diff             @@
##           master     #271      +/-   ##
==========================================
+ Coverage   75.62%   76.10%   +0.47%
==========================================
  Files          32       32
  Lines        2343     2381      +38
==========================================
+ Hits         1772     1812      +40
+ Misses        427      426       -1
+ Partials      144      143       -1

Continue to review full report at Codecov.
🙏