Watches always fail to start once when agent restarts #1035
Labels
kind/bug
Categorizes issue or PR as related to a bug.
priority/important-longterm
Important over the long term, but may not be staffed and/or may need multiple releases to complete.
Describe the bug
I don't know if this is a bug, but when force restarting an Antrea Agent on a Node (by deleting the previous Pod), I always see these logs on restart:
Notice the warnings.
The call to this function is failing:
https://github.com/vmware-tanzu/antrea/blob/5be570b203cde54f42eaeb8f52a4353f846ff214/pkg/agent/client.go#L72-L76
I believe this is because the apiserver code uses the ConfigMap lister before the cache has synced (not 100% sure): https://github.com/kubernetes/apiserver/blob/7b7ecfc9c50835ea75f5dfe2abd93036cf9628cf/pkg/server/dynamiccertificates/configmap_cafile_content.go#L141-L146
We could ignore the error in RunOnce, as is done here: https://github.com/kubernetes/apiserver/blob/7b7ecfc9c50835ea75f5dfe2abd93036cf9628cf/pkg/server/dynamiccertificates/configmap_cafile_content.go#L189-L195
But the watches would still fail. @tnqn do you have an idea on how we can avoid these warnings?
To Reproduce
Deploy Antrea on a cluster. Delete an Antrea Agent Pod on a Node. Wait for the Agent to restart and look at the logs.
Expected
No warning since the antrea-ca ConfigMap already exists.
Actual behavior
Warnings about failure to retrieve certificate, and consequently warnings about failure to start the watches.
Versions:
Antrea v0.9.0-dev