[controller] Only list objects created by operator #222
Continuously listing every pod and statefulset in large clusters can slow down the operator control loops significantly. Because every pod and statefulset created by an M3DBCluster carries predetermined labels, we can list only the objects with those labels from the Kubernetes API.
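To illustrate the idea, here is a minimal Go sketch of label-based filtering over an in-memory list. `Object`, `listWithSelector`, and the label key `example.com/app` are hypothetical stand-ins, not the operator's real types or labels. Against a real API server the selector would be passed in the list options, so the filtering happens server-side rather than in the operator.

```go
package main

// Object is a hypothetical stand-in for a Kubernetes object's metadata.
type Object struct {
	Name   string
	Labels map[string]string
}

// matches reports whether obj carries every key/value pair in selector.
func matches(obj Object, selector map[string]string) bool {
	for k, v := range selector {
		if got, ok := obj.Labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// listWithSelector returns only the objects whose labels satisfy selector,
// mimicking a label-filtered list call.
func listWithSelector(all []Object, selector map[string]string) []Object {
	var out []Object
	for _, o := range all {
		if matches(o, selector) {
			out = append(out, o)
		}
	}
	return out
}
```

Calling `listWithSelector(objs, map[string]string{"example.com/app": "m3db"})` would skip every object that lacks that (assumed) label, which is the reduction in work this change is after.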
Codecov Report
@@            Coverage Diff            @@
##           master    #222     +/-    ##
=========================================
+ Coverage   75.50%   75.53%   +0.02%
=========================================
  Files          30       30
  Lines        2111     2113       +2
=========================================
+ Hits         1594     1596       +2
  Misses        391      391
  Partials      126      126
@@ -53,7 +57,7 @@ import (

 const (
 	// Informers will resync on this interval
-	_informerSyncDuration = time.Minute
+	_informerSyncDuration = 5 * time.Minute
This is a fairly large increase. Would it be better to take a half measure, say 2.5 minutes?
How often do we think we would see a lost delivery?
I think it's pretty rare for us to see a lost delivery. We originally had this at a higher value but reduced it to 1min because we thought we were missing events, but it turns out we were overwhelming the controller.
For reference, the default value in kubebuilder is something like 6h and I've never seen issues with code using that.
	filteredInformerFactory kubeinformers.SharedInformerFactory
	// The kube informer factory is still required to list nodes, as they don't
	// have our label.
	kubeInformerFactory kubeinformers.SharedInformerFactory
Do we need a shared informer factory for the node lister?
I checked where `nodeLister` is used and it's only with `.Get(..)` (i.e. lookup a specific host) so would it be possible to just use the kube client for that rather than an informer factory which can be affected by the same issue we're solving here?
If we use the informer then we'll be notified of updates, and our `Get(...)` calls will just read from the cache if it's fresh. If we were to use the client directly it would put slightly more load on the API server.
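A minimal sketch of that trade-off, using stdlib-only stand-ins rather than the real client-go types: `nodeCache` plays the role of the informer-backed lister (kept fresh by update notifications, reads are local), while `apiClient` plays the role of a direct client where every `Get` is a round trip. All names here are hypothetical.

```go
package main

import "sync"

// Node is a hypothetical, minimal stand-in for a Kubernetes node.
type Node struct {
	Name string
	Zone string
}

// nodeCache mimics the read side of an informer: update notifications
// populate a local store, and Get never contacts the API server.
type nodeCache struct {
	mu    sync.RWMutex
	nodes map[string]Node
}

func newNodeCache() *nodeCache {
	return &nodeCache{nodes: make(map[string]Node)}
}

// onUpdate is what an update notification handler would call.
func (c *nodeCache) onUpdate(n Node) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.nodes[n.Name] = n
}

// Get reads the latest known state from the local store.
func (c *nodeCache) Get(name string) (Node, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	n, ok := c.nodes[name]
	return n, ok
}

// apiClient mimics a direct client: calls counts simulated round trips.
type apiClient struct {
	calls int
	nodes map[string]Node
}

// Get simulates one request to the API server per lookup.
func (a *apiClient) Get(name string) (Node, bool) {
	a.calls++
	n, ok := a.nodes[name]
	return n, ok
}
```

In this sketch, repeated lookups through `nodeCache` cost nothing beyond the initial notifications, whereas `apiClient.calls` grows by one per lookup.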
LGTM other than comments made without a whole lot of knowledge, so feel free to ignore and merge if they look like benign comments