Recreate resources when a member cluster rejoins the ClusterSet #5410
Conversation
Force-pushed from 4c79c34 to 01bdabc
/test-multicluster-e2e
/test-multicluster-e2e
Force-pushed from 01bdabc to dad1f08
/test-multicluster-e2e
Question - do we need to recreate any resource when the ClusterSet is recreated in a leader?
multicluster/controllers/multicluster/member/gateway_controller.go (outdated review thread, resolved)
requests[i] = reconcile.Request{
    NamespacedName: podNamespacedName,
}
r.podLabelToRecreate.Insert(podNamespacedName.String())
The cost seems high to process every Pod one by one. Do we have a more efficient way?
I think we need to process every Pod when the ClusterSet is recreated. Even if we have some cached label data, we can't tell whether there were any Pod label update events during the time window before the ClusterSet was recreated.
For Pod event handling, we tuned the number of concurrent goroutines in PR #5099 for better performance.
Force-pushed from 591ff0a to 7d43bd5
There are no resources to recreate in the leader itself. When the member cluster rejoins the ClusterSet, it will create the corresponding resources in the leader cluster. The AntreaClusterNetworkPolicy type of ResourceExports is created manually by the user in the leader cluster, only for ACNP replication; it is not managed by the Antrea MC controller.
multicluster/controllers/multicluster/member/gateway_controller.go (outdated review thread, resolved)
Force-pushed from f5dcbc7 to 48f87d8
multicluster/controllers/multicluster/member/gateway_controller.go (outdated review thread, resolved)
Force-pushed from 48f87d8 to 834325c
/test-multicluster-e2e
Force-pushed from 834325c to 64e92d2
/test-multicluster-e2e
Force-pushed from 64e92d2 to 4d34495
Force-pushed from e79306d to bbe0c0c
@@ -176,7 +192,9 @@ func (r *GatewayReconciler) createResourceExport(ctx context.Context, req ctrl.R
 // SetupWithManager sets up the controller with the Manager.
 func (r *GatewayReconciler) SetupWithManager(mgr ctrl.Manager) error {
 	return ctrl.NewControllerManagedBy(mgr).
-		For(&mcsv1alpha1.Gateway{}).
+		For(&mcv1alpha1.Gateway{}).
+		Watches(&source.Kind{Type: &mcv1alpha2.ClusterSet{}}, handler.EnqueueRequestsFromMapFunc(r.clusterSetMapFunc),
Again, probably let ClusterSet controller post an event to other Reconcilers.
I thought about that before, but for the reconciler process I feel we should rely on event mapping; the channel approach suits a one-time case. For reconciliation we map multiple objects to one reconcile process, so a channel is not suitable, because we would need to handle each request's retry ourselves on error. It's better to rely on the controller-runtime framework to handle the requests and retries.
I did not get what the difference is. Even if you trigger from the ClusterSet controller, you can post an event for each affected object. The key question is why we should depend on a remote object and another implicit relation when you have all the state in the local process.
I got your point that the state is all in the local process. But in multi-cluster we are not handling events for each object directly; we rely on the controller framework to reconcile. It is triggered with an event filter in SetupWithManager(mgr ctrl.Manager); the framework generates a []reconcile.Request{} for each object and calls Reconcile(ctx context.Context, req ctrl.Request) accordingly.
I didn't get how to post an event for each affected object. Do you mean generating []reconcile.Request{} for each object? If so, it means we would need to call Reconcile(ctx context.Context, req ctrl.Request) ourselves, which per my understanding would duplicate the controller-runtime framework and is not doable.
Ok. Could we stop the controller when commonArea is stopped, and start it when commonArea is started?
Yeah, I think it's probably doable, considering the importer part is started after the common area is ready. I will check how to refine this, but it would be a big change to the current design; I doubt it can be done in this PR.
Ok, we can think about a follow-up PR in the next release.
multicluster/controllers/multicluster/member/gateway_controller.go (outdated review thread, resolved)
multicluster/controllers/multicluster/member/gateway_controller.go (outdated review thread, resolved)
Force-pushed from bbe0c0c to 540506c
 var commonArea commonarea.RemoteCommonArea
-commonArea, r.localClusterID, err = r.commonAreaGetter.GetRemoteCommonAreaAndLocalID()
+commonArea, r.localClusterID, _ = r.commonAreaGetter.GetRemoteCommonAreaAndLocalID()
There can be a case where the commonArea is about to be stopped after cleanup is done (the cleanup may be in progress or have failed). Do we have a way to check that?
I don't think that will be the case. GetRemoteCommonAreaAndLocalID() gets the commonArea via r.commonAreaGetter from the ClusterSet controller, and the commonArea is protected by commonAreaLock. If Reconcile in the ClusterSet controller is processing a ClusterSet change, it will lock the commonArea, so GetRemoteCommonAreaAndLocalID() won't be able to get one before it acquires the read lock.
I meant the cleanup retry case (before the retry happens), but maybe that is not critical to cover.
What now worries me more is that there can be a race condition between these controllers and the memberAnnounce deletion in the ClusterSet controller: they could, for example, create more ResourceExports even after memberAnnounce is deleted. I do not have a simple idea to solve that, and if you do not either, let us add a comment here and address the issue later.
I think that would be possible, since we only lock the commonArea in GetRemoteCommonAreaAndLocalID(). Other controllers don't lock it once they have obtained the commonArea and will continue the reconcile process. I will check whether there is a way to resolve it; I have added a comment for now.
Force-pushed from b8f88c3 to af6b18d
/test-multicluster-e2e
Nits.
multicluster/controllers/multicluster/member/gateway_controller.go (3 outdated review threads, resolved)
Force-pushed from af6b18d to 45d5e3b
Force-pushed from 45d5e3b to b44c8e7
/test-multicluster-e2e
Force-pushed from b44c8e7 to a9ddbb9
/test-multicluster-e2e
// The Gateway will be removed when a ClusterSet is deleted, so here we can set
// the activeGateway to empty directly.
r.activeGateway = ""
I don't see how the Gateway will be removed when a ClusterSet is deleted, especially when the activeGateway is reset here.
I think updating activeGateway should be done in Reconcile, and the map func should only trigger its processing.
Comment added.
if len(clusterSet.Status.Conditions) > 0 && clusterSet.Status.Conditions[0].Status == corev1.ConditionTrue {
    svcExports := &k8smcsv1alpha1.ServiceExportList{}
    r.Client.List(ctx, svcExports)
    existingSvcExports := sets.Set[string]{}
I didn't get the purpose of existingSvcExports.
Stale variable; removed.
r.installedSvcs = cache.NewIndexer(svcInfoKeyFunc, cache.Indexers{})
r.installedEps = cache.NewIndexer(epInfoKeyFunc, cache.Indexers{})
Ditto: this handling will lead to no ServiceExports ever being deleted.
EndpointExports are perhaps affected the same way.
Comment added.
1. Add ClusterSet event mapping to several member cluster controllers to ensure that when a ClusterSet CR is recreated in a member cluster, the corresponding ResourceExports will be created again in the leader cluster.
2. Skip reconciling resources when there is no ClusterSet CR in the member cluster.

Signed-off-by: Lan Luo <luola@vmware.com>
Force-pushed from a9ddbb9 to af2a761
/test-multicluster-e2e
/skip-all
This change is on top of #5351; please review the top commit. Thanks.