-
Notifications
You must be signed in to change notification settings - Fork 579
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add externally managed predicate #2383
Add externally managed predicate #2383
Conversation
@alexander-demichev There are things that needs to be addressed to get functional parity with the original impl for external infra and being able to run machines gracefully in this environment. Specifically: Drop default ssh name assumption https://github.com/kubernetes-sigs/cluster-api-provider-aws/pull/2124/files#diff-4d2ea189a70ea097ffeeb50f78cc676a224386d4fe17b1bec9c83a4731d9c379R215 Drop security groups assumption https://github.com/kubernetes-sigs/cluster-api-provider-aws/pull/2124/files#diff-4d2ea189a70ea097ffeeb50f78cc676a224386d4fe17b1bec9c83a4731d9c379R364 Drop LB assumption https://github.com/kubernetes-sigs/cluster-api-provider-aws/pull/2124/files#diff-4d2ea189a70ea097ffeeb50f78cc676a224386d4fe17b1bec9c83a4731d9c379R176 |
@@ -343,6 +343,17 @@ func (m *MachineScope) IsEKSManaged() bool { | |||
return m.InfraCluster.InfraCluster().GetObjectKind().GroupVersionKind().Kind == "AWSManagedControlPlane" | |||
} | |||
|
|||
func (m *MachineScope) IsExternallyManaged() bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alexander-demichev can you please include "fix #2356" in the PR description? Other than this nit https://github.com/kubernetes-sigs/cluster-api-provider-aws/pull/2383/files#r629435596 this lgtm overall. |
@@ -173,7 +173,7 @@ func (s *Service) CreateInstance(scope *scope.MachineScope, userData []byte) (*i | |||
} | |||
input.SubnetID = subnetID | |||
|
|||
if !scope.IsEKSManaged() && s.scope.Network().APIServerELB.DNSName == "" { | |||
if !scope.IsExternallyManaged() && !scope.IsEKSManaged() && s.scope.Network().APIServerELB.DNSName == "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was under the impression that it being externally managed means there should be no reconciliation at all? So do we not need something like
if scope.IsExternallyManaged() {
// Should not reconcile
return nil, nil
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is related to instance creation, we are skipping logic that is relying on infrastructure created by the managed cluster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that if we are using external but still provide the APIServer DNSName then we still work as expected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this logic is only used to check that a load balancer was created.
controllers/awscluster_controller.go
Outdated
err := r.Get(ctx, key, awsCluster) | ||
if err != nil { | ||
log.V(4).Error(err, "Failed to get AWS cluster") | ||
panic(fmt.Sprintf("Failed to get AWS cluster %T", err)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: let's inline the error with the if here.
Also why %T rather than %v?
Thanks for this. Looks ok from a cursory glance with the comments from @enxebre , but will take a deeper look |
@@ -173,7 +173,7 @@ func (s *Service) CreateInstance(scope *scope.MachineScope, userData []byte) (*i | |||
} | |||
input.SubnetID = subnetID | |||
|
|||
if !scope.IsEKSManaged() && s.scope.Network().APIServerELB.DNSName == "" { | |||
if !scope.IsExternallyManaged() && !scope.IsEKSManaged() && s.scope.Network().APIServerELB.DNSName == "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that if we are using external but still provide the APIServer DNSName then we still work as expected?
if scope.IsExternallyManaged() { | ||
return nil, nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to have the same effect as the changes on lines 191 right?
Also, do we not still need security groups? Or are they just fetched explicitly from the AWSCluster in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to have the same effect as the changes on lines 191 right?
I think this is called in more places, so we possibly want to drop the other one and keep this logic here within the func.
Also, do we not still need security groups? Or are they just fetched explicitly from the AWSCluster in this case?
I expect machines with externallyManaged infra to set their desired security groups explicitly through the API contract https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/main/api/v1alpha4/awsmachine_types.go#L100
and don't do any infra naming convention assumptions which would be driven by the not running infra controller. In fact without this change this would fail
return nil, awserrors.NewFailedDependency(fmt.Sprintf("%s security group not available", sg)) |
controllers/awscluster_controller.go
Outdated
|
||
if err := r.Get(ctx, key, awsCluster); err != nil { | ||
log.V(4).Error(err, "Failed to get AWS cluster") | ||
panic(fmt.Sprintf("Failed to get AWS cluster %v", err)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think panicking here is probably not the way to go, as this could happen for example when the API is unavailable or if the Cluster resource has been created before the AWSCluster resource.
IMO, the solution here when there is an error is to log the error (as you've done) and fail closed (ie return nil). It means we miss the potential event from the Cluster object, but it's better than panicking IMO.
AFAIK the events from the cluster mapping are a bonus anyway, so should be safe to fail closed here
Should we also ensure this in this PR?
|
pkg/cloud/services/ec2/instances.go
Outdated
ids, err := s.GetCoreSecurityGroups(scope) | ||
if err != nil { | ||
return nil, err | ||
if !scope.IsExternallyManaged() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are discriminating here https://github.com/kubernetes-sigs/cluster-api-provider-aws/pull/2383/files#r630312878 seems we can drop this.
@@ -104,6 +105,13 @@ func (r *AWSCluster) ValidateUpdate(old runtime.Object) error { | |||
) | |||
} | |||
|
|||
if annotations.IsExternallyManaged(oldC) && !annotations.IsExternallyManaged(r) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to check the other way around as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that the user can add this annotation and disable infrastructure management, am I wrong?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going from managed to externally managed should be allowed. The other way around is not allowed.
/lgtm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
Thanks for working on this @alexander-demichev
Shall we add documentation for externally managed clusters? A separate issue to track it is fine. |
+1 @sedefsavas I think we need to make some updates to the developers notes about predicates in general. This isn't just a problem for this predicate but also things like the paused predicate. I don't think we ever really reached a conclusion on this in the thread |
Created #2412 to track docs. |
@randomvariable @sedefsavas @alexander-demichev @JoelSpeed any objection to proceed with this? |
I don't see the documentation as a blocker for this, we should prioritise updating that before we make this update for other providers though IMO. Happy to proceed from my perspective. |
/lgtm |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: sedefsavas The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What type of PR is this?
/kind feature
What this PR does / why we need it:
Add externally managed predicate. Clusters marked with
"cluster.x-k8s.io/managed-by"
annotation should be skipped from reconciliation.kubernetes-sigs/cluster-api#4135
kubernetes-sigs/cluster-api#4135
Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Fixes #
Special notes for your reviewer:
Checklist:
Release note: