-
Notifications
You must be signed in to change notification settings - Fork 373
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wait for Antrea client to be ready before starting watches #1042
Conversation
Thanks for your PR. The following commands are available:
These commands can only be run by members of the vmware-tanzu organization. |
/test-all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question about error handling, otherwise LGTM
klog.Errorf("Unable to get Antrea client: %v", err) | ||
return err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the right way to handle the error here? We will just log one error message and return the error (which will be ignored by the caller). I would argue that this is more of a "fatal" error, that should cause the agent to crash, so that it doesn't go unnoticed. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing it out, I just took another look, wait.PollImmediateUntil
only returns error when stopCh is closed, which is part of process exiting, then perhaps the log should just say it no longer waits for it and stop the goroutine gracefully. We shouldn't log errors in other controllers too.
https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/tools/cache/shared_informer.go#L266
https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/daemon/daemon_controller.go#L288
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I think my once concern is that if the ConfigMap doesn't become available for a while, e.g. because of a controller issue, we will not get a single message in the logs about it, unless I am missing something. Maybe a log message with an exponential backoff would be nice.
/test-all |
@antoninbas I understand your concern now, I have added an info log to indicate that the controller is waiting for antrea client ready per 2 seconds, I didn't use exponential backoff for the whole check because it still needs to be aggressive since it's important to receive network policy as early as possible. Please see if it addresses your concern. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Fixes #1035