Use AWS API for EKS authentication and authorization #206

Merged
Nuru merged 8 commits into main from v4-rc
Mar 11, 2024

Conversation

Contributor

@Nuru Nuru commented Mar 4, 2024

Major Breaking Changes

Warning

This release has major breaking changes and requires significant manual intervention
to upgrade existing clusters. Read the migration document
for more details.

what

  • Use the AWS API to manage EKS access controls instead of the aws-auth ConfigMap
  • Remove support for creating an extra security group, deprecated in v2
  • Add IPv6 service CIDR output
  • Update test framework to Go 1.21, Kubernetes 1.29, etc.
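
For readers unfamiliar with the EKS access entry API that replaces the aws-auth ConfigMap, here is a minimal sketch written directly against the Terraform AWS provider. It is not the module's implementation. The variable names `cluster_name` and `admin_role_arn` and the `devops` group are hypothetical, and the policy ARN assumes the standard `aws` partition.

```hcl
# Hypothetical inputs, named for illustration only.
variable "cluster_name" {
  type = string
}

variable "admin_role_arn" {
  type = string
}

# Registers the IAM role as a principal known to the cluster.
# Custom Kubernetes groups are allowed here, but names starting with
# "system:" (such as "system:masters") are rejected by the EKS API.
resource "aws_eks_access_entry" "admin" {
  cluster_name      = var.cluster_name
  principal_arn     = var.admin_role_arn
  kubernetes_groups = ["devops"]
  type              = "STANDARD"
}

# Grants that principal cluster-wide admin rights through an EKS access
# policy, which takes the place of membership in "system:masters".
resource "aws_eks_access_policy_association" "admin" {
  cluster_name  = var.cluster_name
  principal_arn = aws_eks_access_entry.admin.principal_arn
  policy_arn    = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy"

  access_scope {
    type = "cluster"
  }
}
```

Because these are ordinary AWS API resources, Terraform can manage them without a Kubernetes provider connection to the cluster, which is what the aws-auth ConfigMap approach required.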

why

  • Remove a large number of bugs, hacks, and flaky behaviors
  • Encourage separation of concerns (use another module to create a security group)
  • Requested and authored by @colinh6
  • Stay current

references

@Nuru Nuru requested a review from a team as a code owner March 4, 2024 10:52
Contributor Author

Nuru commented Mar 4, 2024

/terratest

Contributor Author

Nuru commented Mar 6, 2024

/terratest

Contributor

z0rc commented Mar 6, 2024

I'm working on migrating my test cluster. I haven't reached the plan stage yet; I'm just changing code to use the new module variables. I'd appreciate it if there were examples for the new access_* variables.

With module version 3.0.0, I have a somewhat typical configuration:

  map_additional_iam_roles = [
    {
      rolearn  = replace(data.aws_iam_role.administrator_access.arn, "${data.aws_iam_role.administrator_access.path}/", "")
      username = "devops"
      groups   = ["system:masters", "devops"]
    },
    {
      rolearn  = data.aws_iam_role.gitlab_ci.arn
      username = "gitlab-ci"
      groups   = ["system:masters", "ci"]
    },
    {
      rolearn  = aws_iam_role.karpenter_node.arn
      username = "system:node:{{EC2PrivateDNSName}}"
      groups   = ["system:bootstrappers", "system:nodes"]
    },
    {
      rolearn  = aws_iam_role.fargate.arn
      username = "system:node:{{SessionName}}"
      groups   = ["system:bootstrappers", "system:nodes", "system:node-proxier"]
    },
  ]

And it isn't clear what the new access_* values should look like without checking the module's code.

AFAIU I should:

  • use access_entry_map for data.aws_iam_role.administrator_access.arn and data.aws_iam_role.gitlab_ci.arn, as they are known at plan phase
  • add data.aws_iam_session_context.current.issuer_arn to access_entry_map
  • use access_entries_for_nodes for aws_iam_role.karpenter_node.arn, as it's created as part of the root module and isn't known at plan phase
  • drop the association for aws_iam_role.fargate.arn, as it should be managed automatically

Is this right?
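
The plan in the list above could be sketched roughly as follows. This is an illustration, not the example from the migration document. The input shapes (`access_entry_map` keyed by principal ARN, the `ClusterAdmin` policy abbreviation, the `EC2_LINUX` key) are assumptions based on the v4 variable descriptions and should be checked against the module's README. The session-context entry is left out, and the `system:masters` group is replaced by a policy association because EKS access entries do not accept groups that start with `system:`.

```hcl
module "eks_cluster" {
  source = "cloudposse/eks-cluster/aws"
  # Version pin and all other inputs omitted for brevity.

  # Keys must be known at plan time, so only pre-existing roles go here.
  # Assumption: the full role ARN is used as the key; whether the path
  # stripping needed for aws-auth still applies should be verified.
  access_entry_map = {
    (data.aws_iam_role.administrator_access.arn) = {
      kubernetes_groups = ["devops"]
      access_policy_associations = {
        ClusterAdmin = {} # assumed abbreviation for AmazonEKSClusterAdminPolicy
      }
    }
    (data.aws_iam_role.gitlab_ci.arn) = {
      kubernetes_groups = ["ci"]
      access_policy_associations = {
        ClusterAdmin = {}
      }
    }
  }

  # Node roles created in the same root module are not known at plan
  # time, so they are passed as list values rather than map keys.
  access_entries_for_nodes = {
    EC2_LINUX = [aws_iam_role.karpenter_node.arn]
  }

  # No entry for aws_iam_role.fargate: per the list above, that
  # association is expected to be managed automatically.
}
```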

Also, there are some abbreviations to simplify things at https://github.com/cloudposse/terraform-aws-eks-cluster/pull/206/files#diff-7ca243b22dbf3bdfd94ff409bd87a336e7f6f61601041b57d470ae3b2f11e71fR6-R14, which aren't documented.

Contributor Author

Nuru commented Mar 6, 2024

@z0rc Thank you for the feedback.

Answers to your questions are supposed to be in the migration document and in the README, in the form of documentation of the input variables.

Particularly regarding the abbreviations, there is the migration doc and the variable description. There is also a brief example, as always, in examples/complete.

I grant that there probably should be more in the README, but are you saying you didn't see the documentation, or that you read it but it did not satisfy you? (Your answer will guide my improvements.)

P.S. I'm not sure where data.aws_iam_session_context.current.issuer_arn came from, but in general we do not recommend implicit configuration like that. Otherwise I think you have it right.

I updated the migration doc with an example transformation, using your configuration as a starting point.

Contributor Author

Nuru commented Mar 6, 2024

/terratest


@hans-d hans-d left a comment

Not too much involved in EKS, so leaving the more specific reviews for others.

Would suggest to add more "BREAKING CHANGE" to commit and PR titles, so it cannot be overlooked.

.github/renovate.json (outdated, resolved)
@Nuru Nuru requested review from a team as code owners March 7, 2024 03:30
@Nuru Nuru requested a review from joe-niland March 7, 2024 03:30
Contributor Author

Nuru commented Mar 7, 2024

Not too much involved in EKS, so leaving the more specific reviews for others.

Would suggest to add more "BREAKING CHANGE" to commit and PR titles, so it cannot be overlooked.

We're going to rely on the major version number bump to do the heavy lifting of "BREAKING CHANGE". The mitigating factors are that the new module will simply error out in a number of ways if you blindly try to update, and in general none of the changes should cause you to lose data. The worst consequence I can foresee from blindly plowing ahead is that a security group or rule might change in a way that cuts off network access.

Contributor Author

Nuru commented Mar 7, 2024

/terratest

Contributor

z0rc commented Mar 7, 2024

@Nuru thanks!

With the provided example I was able to understand the access entry relationships and the migration path. I believe I was a bit overwhelmed by the amount of migration documentation. Ultimately I'd be able to do the migration myself given more time.

P.S. I'm not sure where data.aws_iam_session_context.current.issuer_arn came from

I was reading the example at https://github.com/cloudposse/terraform-aws-eks-cluster/blob/v4-rc/examples/complete/main.tf#L39-L48, plus https://github.com/cloudposse/terraform-aws-eks-cluster/blob/v4-rc/docs/migration-v3-v4.md#error-creating-eks-access-entry, and wrongly assumed that I must add this entry now.

Overall I'm very impressed with how much thought was put into the migration process. Much appreciated!

Contributor

z0rc commented Mar 7, 2024

I migrated my test cluster to access_config.authentication_mode = "API" successfully.
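
For reference, the setting mentioned above might look like this in a module call. This is a sketch that assumes `access_config` is an object input with an `authentication_mode` attribute, as the dotted name suggests. The module label `eks_cluster` is hypothetical, and the other inputs are omitted.

```hcl
module "eks_cluster" {
  source = "cloudposse/eks-cluster/aws"
  # Version pin and all other inputs omitted for brevity.

  access_config = {
    # "API" makes access entries the only source of cluster access,
    # so the aws-auth ConfigMap is no longer consulted.
    authentication_mode = "API"
  }
}
```

EKS only allows the authentication mode to move forward, from `CONFIG_MAP` to `API_AND_CONFIG_MAP` to `API`, so it is worth confirming that every required principal has an access entry before switching a production cluster.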

versions.tf (outdated, resolved)
@mergify mergify bot added triage Needs triage needs-cloudposse Needs Cloud Posse assistance labels Mar 9, 2024

mergify bot commented Mar 9, 2024

Important

Cloud Posse Engineering Team Review Required

This pull request modifies files that require Cloud Posse's review. Please be patient, and a core maintainer will review your changes.

To expedite this process, reach out to us on Slack in the #pr-reviews channel.


mergify bot commented Mar 9, 2024

This pull request now has conflicts. Could you fix it @Nuru? 🙏

@mergify mergify bot added the conflict This PR has conflicts label Mar 9, 2024
@Nuru Nuru requested a review from hans-d March 9, 2024 11:40
@mergify mergify bot added needs-test Needs testing and removed conflict This PR has conflicts labels Mar 9, 2024
Contributor Author

Nuru commented Mar 9, 2024

/terratest

@Nuru Nuru added no-release Do not create a new release (wait for additional code changes) major Breaking changes (or first stable release) labels Mar 9, 2024
@Nuru Nuru requested a review from z0rc March 10, 2024 00:23
Contributor Author

Nuru commented Mar 11, 2024

/terratest

@Nuru Nuru merged commit ff27afa into main Mar 11, 2024
11 checks passed
@Nuru Nuru deleted the v4-rc branch March 11, 2024 15:57
@mergify mergify bot removed needs-cloudposse Needs Cloud Posse assistance triage Needs triage labels Mar 11, 2024
Labels
major Breaking changes (or first stable release) needs-test Needs testing no-release Do not create a new release (wait for additional code changes)
3 participants