[controller] Ensure k8s statefulset statuses are fresh #271
Conversation
In debugging #268, I discovered that in every case where we updated more than one statefulset, it was because the observed generation on the set was one less than the set's actual generation. This means the Kubernetes statefulset controller had not yet processed the update, and that the information on `Status` that we check (ready replicas, current replicas, etc.) was not up to date. Also, out of paranoia, I changed the update behavior to ensure that we keep the old object meta fields around when we send the `Update` to Kubernetes. If we don't, we could potentially overwrite a set even if it had changed since we last processed it. This PR ensures that we stop processing the cluster if any of the set statuses are not fresh, and that we use the familiar conflict-handling methods on set updates. Closes #268.
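As a rough illustration of the two changes described above (a minimal sketch only; the helper names `statefulSetsFresh` and `updatePreservingMeta` are illustrative and not necessarily what this PR adds):

```go
package controller

import (
	appsv1 "k8s.io/api/apps/v1"
)

// statefulSetsFresh reports whether the Kubernetes statefulset controller has
// observed the latest generation of every set. If ObservedGeneration lags
// Generation, status fields such as ReadyReplicas and CurrentReplicas may be
// stale and the cluster should not be processed further on this pass.
func statefulSetsFresh(sets []*appsv1.StatefulSet) bool {
	for _, sts := range sets {
		if sts.Status.ObservedGeneration < sts.Generation {
			return false
		}
	}
	return true
}

// updatePreservingMeta copies the existing object's metadata (including its
// resourceVersion) onto the desired update, so that if the set changed since
// we last read it, the API server rejects the update with a conflict instead
// of silently overwriting the newer object.
func updatePreservingMeta(existing, desired *appsv1.StatefulSet) *appsv1.StatefulSet {
	updated := desired.DeepCopy()
	updated.ObjectMeta = existing.ObjectMeta
	return updated
}
```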
pkg/controller/controller.go
Outdated
@@ -73,6 +73,7 @@ const (

var (
	errOrphanedPod = errors.New("pod does not belong to an m3db cluster")
	errNoSetLabel  = errors.New("pod does not have a parent statefulset error")
Unused. Will remove.
pkg/controller/controller.go
Outdated
// If any of the statefulsets aren't ready, wait until they are as we'll get
// another event (ready == bootstrapped)
for _, sts := range childrenSets {
	c.logger.Info("processing set",
Still need to decide whether these logs are too noisy. If we had them, however, debugging this issue would have been much easier.
Can you simply make these debug logs and then make it easy to turn on debug logging? (might already be easy by just changing the config file?)
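For reference, a minimal sketch of what a debug-level version of that log could look like, assuming a `*zap.Logger` (which the structured fields in the hunk above suggest); the `logSetStatus` helper and its field names are illustrative, not part of this PR:

```go
package controller

import (
	"go.uber.org/zap"

	appsv1 "k8s.io/api/apps/v1"
)

// logSetStatus emits per-statefulset details at debug level, so the output is
// hidden by default but can be enabled via the operator's logging config when
// diagnosing issues like #268.
func logSetStatus(logger *zap.Logger, sts *appsv1.StatefulSet) {
	logger.Debug("processing set",
		zap.String("name", sts.Name),
		zap.Int64("generation", sts.Generation),
		zap.Int64("observedGeneration", sts.Status.ObservedGeneration),
		zap.Int32("readyReplicas", sts.Status.ReadyReplicas),
	)
}
```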
Codecov Report
@@            Coverage Diff             @@
##           master     #271      +/-   ##
==========================================
+ Coverage   75.62%   76.10%   +0.47%
==========================================
  Files          32       32
  Lines        2343     2381      +38
==========================================
+ Hits         1772     1812      +40
+ Misses        427      426       -1
+ Partials      144      143       -1

Continue to review full report at Codecov.
🙏