WIP: Do not allow overriding the Name tag so that the AWSMachine controller can reliably find its EC2 instance #4630
Thanks @mjlshen! This is awesome, but I don't think it handles all the edge cases where the providerID might still not be reliable, e.g. AWS API eventual consistency [1] or a process crash right after the API call. So the tags fallback must be trustworthy. The name of the replicas under a scalable resource is an implementation detail, and letting it be overridden is a bug. I think we should consider reverting the original change that enabled the override or, if there's still justification for the network use case which drove that change, differentiating so that the override is not allowed for instances.
This reverts commit e0dbf3e. This revert allows the findInstance function to reliably find the corresponding EC2 instance for a given AWSMachine again. Previously, when we allowed users to specify a Name tag, findInstance could not guarantee a unique match, since all created EC2 instances would have the same Name tag.
Users should be able to set the If CAPA cannot find the instance correctly because it is tagged with |
Ah the I think this means we need to introduce a new tag to uniquely link an EC2 instance to an AWS machine CR |
True, that's definitely a problem since we only reconcile a single object and can't reason about others (such as "3 AWSMachine objects with same name tag"). Can you point me to the code which sets the EC2 instance name if no |
Sorry for the delayed response - yes! cluster-api-provider-aws/pkg/cloud/services/ec2/instances.go Lines 123 to 131 in b003e69
Where cluster-api-provider-aws/api/v1beta2/tags.go Lines 243 to 264 in abd444c
Name tag now has precedence over the AWSMachine .metadata.name in v1beta2 now)
It would be good to get @AverageMarcus's input on this as well.
It's been quite a while since I've been deeply involved in CAPA code so I'm gonna keep this high level... My general thought is that the users should be able to configure any of the tags they like, including Some background on the initial introduction of the configurable
The driving force behind the initial change for us (Giant Swarm) is we wanted to allow for some complex network configurations where different machine pools could have different subnets and wanted to have control over how those subnets were named so they'd be identifiable. I forget what exactly the problem was with them all having the same name but if I remember correctly it was related to configuring transit gateway attachments and not being able to pick the correct subnet.
Just in case my comment got lost on the bug report: #4629 (comment). I agree with @AverageMarcus.
Yeah, a new tag makes sense to me too, given that we want to be able to override a |
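The new-tag idea discussed above could look roughly like the following. This is only a sketch under stated assumptions: the tag key `example.invalid/awsmachine`, the helpers `machineTagValue` and `buildTags`, and the namespace/name value format are all hypothetical and are not existing CAPA identifiers. The point is that users keep full control of the Name tag while the controller owns a separate tag that stays unique per AWSMachine.

```go
package main

import "fmt"

// machineTagKey is a hypothetical, controller-owned tag key.
// It is NOT an existing CAPA tag; it only illustrates the idea.
const machineTagKey = "example.invalid/awsmachine"

// machineTagValue derives a value from the AWSMachine's namespace and name,
// which is unique within one management cluster.
func machineTagValue(namespace, name string) string {
	return namespace + "/" + name
}

// buildTags copies the user-supplied tags and then writes the
// controller-owned tag last, so a user entry with the same key
// cannot replace it.
func buildTags(userTags map[string]string, namespace, name string) map[string]string {
	tags := make(map[string]string, len(userTags)+1)
	for k, v := range userTags {
		tags[k] = v
	}
	tags[machineTagKey] = machineTagValue(namespace, name)
	return tags
}

func main() {
	// Two AWSMachines that share a user-chosen Name tag.
	a := buildTags(map[string]string{"Name": "worker"}, "default", "pool-abc12")
	b := buildTags(map[string]string{"Name": "worker"}, "default", "pool-def34")

	fmt.Println("same Name tag:", a["Name"] == b["Name"])
	fmt.Println("distinct machine tag:", a[machineTagKey] != b[machineTagKey])
}
```

A lookup that filters on this tag (presumably together with the cluster ownership tag) would no longer depend on anything a user can override.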
PR needs rebase.
The triage bot applied /lifecycle stale, then /lifecycle rotten, then /close.
@k8s-triage-robot: Closed this PR.
/kind bug
What this PR does / why we need it:
This commit handles an edge case where an error occurs after EC2 instance creation and before the output of a successful EC2 instance creation is returned. This causes the AWSMachine controller to attempt to find the EC2 instance by a set of filters instead of being able to explicitly find an EC2 instance by ID. Currently, the logic for finding an EC2 instance is not guaranteed to find the correct instance because one of the filters is the Name tag, which can be overwritten by the user in the v1beta2 API.

In essence, this PR reverts #3991 because one of the assumptions in the backing issue, #3989:
Was true for subnets, but is not true for EC2 instances.
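To make the failure mode concrete, here is a minimal, self-contained sketch of a lookup that prefers the instance ID and falls back to tag filters when no ID was recorded. It is not CAPA's actual findInstance: the `Instance` type, the in-memory slice standing in for the EC2 API, and the choice to return `ErrAmbiguous` on multiple matches are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Instance is a hypothetical, simplified stand-in for an EC2 instance.
type Instance struct {
	ID   string
	Tags map[string]string
}

// ErrAmbiguous is returned when the tag filters match more than one instance.
var ErrAmbiguous = errors.New("tag filters matched more than one instance")

// hasTags reports whether inst carries every key/value pair in filters.
func hasTags(inst Instance, filters map[string]string) bool {
	for k, v := range filters {
		if inst.Tags[k] != v {
			return false
		}
	}
	return true
}

// findInstance looks up by ID when one was recorded. Otherwise it falls
// back to tag filters. It returns (nil, nil) when nothing matches.
func findInstance(all []Instance, id string, filters map[string]string) (*Instance, error) {
	if id != "" {
		for i := range all {
			if all[i].ID == id {
				return &all[i], nil
			}
		}
		return nil, nil
	}
	var matches []*Instance
	for i := range all {
		if hasTags(all[i], filters) {
			matches = append(matches, &all[i])
		}
	}
	switch len(matches) {
	case 0:
		return nil, nil
	case 1:
		return matches[0], nil
	default:
		return nil, ErrAmbiguous
	}
}

func main() {
	// Two instances whose user-supplied Name tag is identical.
	all := []Instance{
		{ID: "i-aaa", Tags: map[string]string{"Name": "worker"}},
		{ID: "i-bbb", Tags: map[string]string{"Name": "worker"}},
	}

	// No ID recorded (e.g. a crash right after creation): the tag
	// fallback cannot tell the two instances apart.
	_, err := findInstance(all, "", map[string]string{"Name": "worker"})
	fmt.Println("fallback error:", err)

	// With the ID recorded, the lookup is unambiguous.
	inst, _ := findInstance(all, "i-bbb", nil)
	fmt.Println("found by ID:", inst.ID)
}
```

With a user-controlled Name tag, the fallback path can only be as reliable as the uniqueness of the tags it filters on.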
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #4629
Special notes for your reviewer:
I'm not sure if this is the right way to solve this problem, two other possibilities I thought of are:
- [...] Name tag for subnets, but not for EC2 instances
- [...] findInstances more idempotent when overriding the Name tag
  cluster-api-provider-aws/pkg/cloud/services/ec2/instances.go, lines 46 to 53 in abd444c
- [...] findInstances can filter on that new tag instead of a Name tag that a user can overwrite, resulting in a unique EC2 instance match.
Checklist:
Release note: