Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recreate resources when a member cluster rejoins the ClusterSet #5410

Merged
merged 1 commit into from
Oct 27, 2023

Conversation

luolanzone
Copy link
Contributor

@luolanzone luolanzone commented Aug 21, 2023

  1. Add ClusterSet event mapping to several member cluster controllers
    to ensure that when a ClusterSet CR is recreated in a member cluster,
    the corresponding ResourceExports will be created again in the leader
    cluster.
  2. Skip reconciling resources when there is no ClusterSet CR in the
    member cluster.

This change is on top of #5351, please review the top commit. Thanks.

@luolanzone luolanzone added the area/multi-cluster Issues or PRs related to multi cluster. label Aug 21, 2023
@luolanzone luolanzone force-pushed the mc-clusterset-rejion branch 3 times, most recently from 4c79c34 to 01bdabc Compare August 21, 2023 10:15
@luolanzone
Copy link
Contributor Author

/test-multicluster-e2e

@luolanzone luolanzone added this to the Antrea v1.14 release milestone Aug 23, 2023
@luolanzone
Copy link
Contributor Author

/test-multicluster-e2e

@luolanzone luolanzone requested review from jianjuns and Dyanngg August 24, 2023 01:45
@luolanzone luolanzone force-pushed the mc-clusterset-rejion branch from 01bdabc to dad1f08 Compare August 24, 2023 02:53
@luolanzone
Copy link
Contributor Author

/test-multicluster-e2e

@jianjuns jianjuns changed the title Recreate resources when a member cluster rejoin the ClusterSet Recreate resources when a member cluster rejoins the ClusterSet Aug 25, 2023
Copy link
Contributor

@jianjuns jianjuns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question - do we need to recreate any resource when the ClusterSet is recreated in a leader?

requests[i] = reconcile.Request{
NamespacedName: podNamespacedName,
}
r.podLabelToRecreate.Insert(podNamespacedName.String())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cost seems high to process every Pod one by one. Do we have a more efficient way?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to process every Pod when the ClusterSet is recreated. Even we have some cached label data, we can't tell if there is any Pod label update events during the time windows before the ClusterSet is recreated.
For Pod event handling, we tuned the concurrent go-routine number via this PR #5099 for a better performance.

@luolanzone
Copy link
Contributor Author

Question - do we need to recreate any resource when the ClusterSet is recreated in a leader?

There is no resources we can recreate in the leader itself. When the member cluster is rejoining the ClusterSet, it will create corresponding resources in the leader cluster. The AntreaClusterNetworkPolicy type of ResourceExports are created manually by the user in the leader cluster only for ACNP replication, it's not managed by Antrea MC controller.

@luolanzone luolanzone force-pushed the mc-clusterset-rejion branch 3 times, most recently from f5dcbc7 to 48f87d8 Compare September 21, 2023 15:37
@luolanzone
Copy link
Contributor Author

/test-multicluster-e2e

@luolanzone luolanzone force-pushed the mc-clusterset-rejion branch from 834325c to 64e92d2 Compare October 8, 2023 09:53
@luolanzone
Copy link
Contributor Author

/test-multicluster-e2e

@luolanzone luolanzone force-pushed the mc-clusterset-rejion branch from 64e92d2 to 4d34495 Compare October 10, 2023 02:52
@luolanzone luolanzone force-pushed the mc-clusterset-rejion branch 2 times, most recently from e79306d to bbe0c0c Compare October 23, 2023 03:06
@@ -176,7 +192,9 @@ func (r *GatewayReconciler) createResourceExport(ctx context.Context, req ctrl.R
// SetupWithManager sets up the controller with the Manager.
func (r *GatewayReconciler) SetupWithManager(mgr ctrl.Manager) error {
return ctrl.NewControllerManagedBy(mgr).
For(&mcsv1alpha1.Gateway{}).
For(&mcv1alpha1.Gateway{}).
Watches(&source.Kind{Type: &mcv1alpha2.ClusterSet{}}, handler.EnqueueRequestsFromMapFunc(r.clusterSetMapFunc),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, probably let ClusterSet controller post an event to other Reconcilers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it before, but for reconciler process, I feel we'd rely on event mapping, the channel way is suitable for a one time case. But for reconcile, we are mapping multiple objects to one reconcile process, it's not suitable to use channel way since we need to handle every request's retry if there is any error. It's better to rely on controller runtime framework to handle the requests and retry.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not get what is the difference. Even you trigger from ClusterSet controller, you can post an event for each affected object. The key is why we should depend on a remote object and another implicit relation, when you have all state in the local process.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got your point about the state are all in the local process. But in multi-cluster, we are not handling events for each objects directly, we rely on the controller framework to do reconcile. It will be triggered with event filter in SetupWithManager(mgr ctrl.Manager), the framework will generate []reconcile.Request{} for each object and call Reconcile(ctx context.Context, req ctrl.Request) correspondingly.
I didn't get how to post an event for each affected object. Do you mean to generate []reconcile.Request{} for each object? If so, it means we need to call Reconcile(ctx context.Context, req ctrl.Request) by ourselves per my understanding. It seems repeat the controller-runtime framework and not doable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Could we stop the controller when commonArea is stopped, and start it when commonArea is started?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think it's probably doable considering importer part is started after common area is ready. I will check how to refine this, but it will be a big change for current design, I doubt it can be done in this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, we can think about a follow-up PR in the next release.

@luolanzone luolanzone force-pushed the mc-clusterset-rejion branch from bbe0c0c to 540506c Compare October 24, 2023 02:48
@@ -176,7 +192,9 @@ func (r *GatewayReconciler) createResourceExport(ctx context.Context, req ctrl.R
// SetupWithManager sets up the controller with the Manager.
func (r *GatewayReconciler) SetupWithManager(mgr ctrl.Manager) error {
return ctrl.NewControllerManagedBy(mgr).
For(&mcsv1alpha1.Gateway{}).
For(&mcv1alpha1.Gateway{}).
Watches(&source.Kind{Type: &mcv1alpha2.ClusterSet{}}, handler.EnqueueRequestsFromMapFunc(r.clusterSetMapFunc),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Could we stop the controller when commonArea is stopped, and start it when commonArea is started?

var commonArea commonarea.RemoteCommonArea
commonArea, r.localClusterID, err = r.commonAreaGetter.GetRemoteCommonAreaAndLocalID()
commonArea, r.localClusterID, _ = r.commonAreaGetter.GetRemoteCommonAreaAndLocalID()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There can be a case the commonArea is to be stopped after cleanup is done (which may be in progress or failed). Do we have a way to check that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that will be a case, GetRemoteCommonAreaAndLocalID() is getting commonArea via r.commonAreaGetter from ClusterSet controller, the commonArea is protected by commonAreaLock. If Reconcile in ClusterSet controller is processing the ClusterSet change, it will lock the commonArea, then GetRemoteCommonAreaAndLocalID() won't be able to get one before it gets the read lock.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant the cleanup retry case (before retry happens), but maybe that is not critical to cover.

I now feel a worse thing is there can be a race condition between these controllers and memberAnnounce deletion in ClusterSet controller, so they can for example create more ResourceExports even after memberAnnounce is deleted. I do not have a simple idea to solve that, and if you do not either, probably let us add a comment here and address the issue later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would be possible since we only lock the commonArea on GetRemoteCommonAreaAndLocalID(). Other controllers doesn't lock it once they get the commonArea, and will continue the reconcile process. I will check if there is a way to resolve it, added a comment first.

@luolanzone luolanzone force-pushed the mc-clusterset-rejion branch 2 times, most recently from b8f88c3 to af6b18d Compare October 26, 2023 08:48
@luolanzone
Copy link
Contributor Author

/test-multicluster-e2e

Copy link
Contributor

@jianjuns jianjuns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nits.

@luolanzone luolanzone force-pushed the mc-clusterset-rejion branch from af6b18d to 45d5e3b Compare October 27, 2023 04:54
jianjuns
jianjuns previously approved these changes Oct 27, 2023
@luolanzone
Copy link
Contributor Author

/test-multicluster-e2e

@luolanzone luolanzone force-pushed the mc-clusterset-rejion branch from b44c8e7 to a9ddbb9 Compare October 27, 2023 07:57
@luolanzone
Copy link
Contributor Author

/test-multicluster-e2e

Comment on lines 410 to 412
// The Gateway will be removed when a ClusterSet is deleted, so here we can set
// the activeGateway to empty directly.
r.activeGateway = ""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how the gateway will be removed when a ClusterSet is deleted, especially when the activeGateway is reset here.
I think updating activeGateway should be done in Reconcile and the map func should only trigger its process.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment added.

if len(clusterSet.Status.Conditions) > 0 && clusterSet.Status.Conditions[0].Status == corev1.ConditionTrue {
svcExports := &k8smcsv1alpha1.ServiceExportList{}
r.Client.List(ctx, svcExports)
existingSvcExports := sets.Set[string]{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

didn't get the purpose of existingSvcExports

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stale variable, removed

Comment on lines +561 to +560
r.installedSvcs = cache.NewIndexer(svcInfoKeyFunc, cache.Indexers{})
r.installedEps = cache.NewIndexer(epInfoKeyFunc, cache.Indexers{})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto, this handling will lead to no serviceExports being deleted eventually.
EndpointExport is perhaps the same.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment added.

1. Add ClusterSet event mapping to several member cluster controllers
to ensure that when a ClusterSet CR is recreated in a member cluster,
the corresponding ResourceExports will be created again in the leader
cluster.
2. Skip reconciling resources when there is no ClusterSet CR in the
member cluster.

Signed-off-by: Lan Luo <luola@vmware.com>
@luolanzone luolanzone force-pushed the mc-clusterset-rejion branch from a9ddbb9 to af2a761 Compare October 27, 2023 09:33
@luolanzone
Copy link
Contributor Author

/test-multicluster-e2e

@tnqn
Copy link
Member

tnqn commented Oct 27, 2023

/skip-all

@tnqn tnqn merged commit 55e731d into antrea-io:main Oct 27, 2023
43 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/multi-cluster Issues or PRs related to multi cluster.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants