Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
feat(eks): enable ipv6 for eks cluster (#25819)
## Description

This change enables IPv6 for EKS clusters.

## Reasoning

* IPv6-based EKS clusters will enable service owners to minimize or even eliminate the perils of IPv4 CIDR micromanagement
* IPv6 will enable very-large-scale EKS clusters
* My working group (Amazon SDO/ECST) recently attempted to enable IPv6 using L1 Cfn EKS constructs, but failed after discovering a CDKv2 issue which results in a master-less EKS cluster. Rather than investing in fixing this interaction, we agreed to contribute to aws-eks (this PR)

## Design

* This change treats IPv4 as the default networking configuration
* A new enum `IpFamily` is introduced to direct users to specify `IP_V4` or `IP_V6`
* ~~This change adds a new Sam layer dependency~~ Dependency removed after validating that it was no longer necessary

## Testing

I consulted with some team members about how best to approach testing this change, and I concluded that I should duplicate the eks-cluster test definition. I decided that this was a better approach than redefining the existing cluster test to use IPv6 for a few reasons:

1. EKS still requires IPv4 under the hood
2. IPv6 CIDR and subnet association isn't exactly straightforward. My example in eks-cluster-ipv6 is the simplest one I could come up with
3. There are additional permissions and routing configuration necessary to get the cluster tests to succeed. The differences were sufficient to motivate splitting out the test, in my opinion.

I ran into several issues running the test suite, primarily related to out-of-memory conditions which no amount of RAM appeared to help. `NODE_OPTIONS--max-old-space-size=8192` did not improve this issue, nor did increasing it to 12GB.

Edit: This ended up being a simple fix, but annoying to dig out. The fix is `export NODE_OPTIONS=--max-old-space-size=8192`. Setting this up in my .rc file did not stick, either. macOS Ventura for those keeping score at home.
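As a rough illustration of the Design section above (not the code from this PR), the "IPv4 unless told otherwise" behaviour can be sketched in a few lines. The string values of the enum, the `ClusterNetworkProps` interface, and the `resolveIpFamily` helper are all assumptions made up for this sketch; only the enum name `IpFamily` and its members `IP_V4` and `IP_V6` come from the description.

```typescript
// Illustrative stand-in for the new enum; the string values are an assumption.
enum IpFamily {
  IP_V4 = 'ipv4',
  IP_V6 = 'ipv6',
}

// Hypothetical, trimmed-down props type used only in this sketch.
interface ClusterNetworkProps {
  ipFamily?: IpFamily;
}

// Hypothetical helper: resolves the effective IP family, defaulting to IPv4
// when the caller does not specify one.
function resolveIpFamily(props: ClusterNetworkProps = {}): IpFamily {
  return props.ipFamily ?? IpFamily.IP_V4;
}

console.log(resolveIpFamily());                             // ipv4
console.log(resolveIpFamily({ ipFamily: IpFamily.IP_V6 })); // ipv6
```

Keeping the property optional means existing stacks that never mention `ipFamily` would keep their current IPv4 behaviour.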
The bulk of my testing was performed using a sample stack definition (below), but I was unable to run the manual testing described in `aws-eks/test/MANUAL_TEST.md` due to no access to the underlying node instances.

Edit: I can run the MANUAL_TESTS now if that's deemed necessary.

Updated: This sample stack creates an IPv6-enabled cluster with an example nginx service running.

Sample:

```ts
import {
  App, Duration, Fn, Stack,
  aws_ec2 as ec2,
  aws_eks as eks,
  aws_iam as iam,
} from 'aws-cdk-lib';
import { getClusterVersionConfig } from './integ-tests-kubernetes-version';

const app = new App();
const env = { region: 'us-east-1', account: '' };
const stack = new Stack(app, 'my-v6-test-stack-1', { env });

const vpc = new ec2.Vpc(stack, 'Vpc', { maxAzs: 3, natGateways: 1, restrictDefaultSecurityGroup: false });
const ipv6cidr = new ec2.CfnVPCCidrBlock(stack, 'CIDR6', {
  vpcId: vpc.vpcId,
  amazonProvidedIpv6CidrBlock: true,
});

let subnetcount = 0;
const subnets = [...vpc.publicSubnets, ...vpc.privateSubnets];
for (const subnet of subnets) {
  // Wait for the ipv6 cidr to complete
  subnet.node.addDependency(ipv6cidr);
  _associate_subnet_with_v6_cidr(subnetcount, subnet);
  subnetcount++;
}

const roles = _create_roles();

const cluster = new eks.Cluster(stack, 'Cluster', {
  ...getClusterVersionConfig(stack),
  vpc: vpc,
  clusterName: 'some-eks-cluster',
  defaultCapacity: 0,
  endpointAccess: eks.EndpointAccess.PUBLIC_AND_PRIVATE,
  ipFamily: eks.IpFamily.IP_V6,
  mastersRole: roles.masters,
  securityGroup: _create_eks_security_group(),
  vpcSubnets: [{ subnets: subnets }],
});

// add an extra nodegroup
cluster.addNodegroupCapacity('some-node-group', {
  instanceTypes: [new ec2.InstanceType('m5.large')],
  minSize: 1,
  nodeRole: roles.nodes,
});

cluster.kubectlSecurityGroup?.addEgressRule(
  ec2.Peer.anyIpv6(), ec2.Port.allTraffic(),
);

// deploy an nginx ingress in a namespace
const nginxNamespace = cluster.addManifest('nginx-namespace', {
  apiVersion: 'v1',
  kind: 'Namespace',
  metadata: {
    name: 'nginx',
  },
});

const nginxIngress = cluster.addHelmChart('nginx-ingress', {
  chart: 'nginx-ingress',
  repository: 'https://helm.nginx.com/stable',
  namespace: 'nginx',
  wait: true,
  createNamespace: false,
  timeout: Duration.minutes(5),
});

// make sure namespace is deployed before the chart
nginxIngress.node.addDependency(nginxNamespace);

// Give the subnet at position `count` its own /64 slice of the VPC's IPv6 block
function _associate_subnet_with_v6_cidr(count: number, subnet: ec2.ISubnet) {
  const cfnSubnet = subnet.node.defaultChild as ec2.CfnSubnet;
  cfnSubnet.ipv6CidrBlock = Fn.select(count, Fn.cidr(Fn.select(0, vpc.vpcIpv6CidrBlocks), 256, (128 - 64).toString()));
  cfnSubnet.assignIpv6AddressOnCreation = true;
}

export function _create_eks_security_group(): ec2.SecurityGroup {
  const sg = new ec2.SecurityGroup(stack, 'eks-sg', {
    allowAllIpv6Outbound: true,
    allowAllOutbound: true,
    vpc,
  });
  sg.addIngressRule(
    ec2.Peer.ipv4('10.0.0.0/8'), ec2.Port.allTraffic(),
  );
  sg.addIngressRule(
    ec2.Peer.ipv6(Fn.select(0, vpc.vpcIpv6CidrBlocks)), ec2.Port.allTraffic(),
  );
  return sg;
}

export namespace Kubernetes {
  export interface RoleDescriptors {
    masters: iam.Role,
    nodes: iam.Role,
  }
}

function _create_roles(): Kubernetes.RoleDescriptors {
  const clusterAdminStatement = new iam.PolicyDocument({
    statements: [new iam.PolicyStatement({
      actions: [
        'eks:*',
        'iam:ListRoles',
      ],
      resources: ['*'],
    })],
  });

  const eksClusterAdminRole = new iam.Role(stack, 'AdminRole', {
    roleName: 'some-eks-master-admin',
    assumedBy: new iam.AccountRootPrincipal(),
    inlinePolicies: { clusterAdminStatement },
  });

  const assumeAnyRolePolicy = new iam.PolicyDocument({
    statements: [new iam.PolicyStatement({
      actions: [
        'sts:AssumeRole',
      ],
      resources: ['*'],
    })],
  });

  // Lets the nodes assign and release IPv6 addresses on their network interfaces
  const ipv6Management = new iam.PolicyDocument({
    statements: [new iam.PolicyStatement({
      resources: ['arn:aws:ec2:*:*:network-interface/*'],
      actions: [
        'ec2:AssignIpv6Addresses',
        'ec2:UnassignIpv6Addresses',
      ],
    })],
  });

  const eksClusterNodeGroupRole = new iam.Role(stack, 'NodeGroupRole', {
    roleName: 'some-node-group-role',
    assumedBy: new iam.ServicePrincipal('ec2.amazonaws.com'),
    managedPolicies: [
      iam.ManagedPolicy.fromAwsManagedPolicyName('AmazonEKSWorkerNodePolicy'),
      iam.ManagedPolicy.fromAwsManagedPolicyName('AmazonEC2ContainerRegistryReadOnly'),
      iam.ManagedPolicy.fromAwsManagedPolicyName('AmazonEKS_CNI_Policy'),
      iam.ManagedPolicy.fromAwsManagedPolicyName('AmazonSSMManagedInstanceCore'),
      iam.ManagedPolicy.fromAwsManagedPolicyName('CloudWatchAgentServerPolicy'),
    ],
    inlinePolicies: {
      assumeAnyRolePolicy,
      ipv6Management,
    },
  });

  return { masters: eksClusterAdminRole, nodes: eksClusterNodeGroupRole };
}
```

## Issues

Edit: Fixed

Integration tests, specifically the new one I contributed, failed with an issue in describing a Fargate profile:

```
2023-06-01T16:24:30.127Z 6f9b8583-8440-4f13-a48f-28e09a261d40 INFO {
  "describeFargateProfile": {
    "clusterName": "Cluster9EE0221C-f458e6dc5f544e9b9db928f6686c14d5",
    "fargateProfileName": "ClusterfargateprofiledefaultEF-1628f1c3e6ea41ebb3b0c224de5698b4"
  }
}
---------------------------
2023-06-01T16:24:30.138Z 6f9b8583-8440-4f13-a48f-28e09a261d40 INFO {
  "describeFargateProfileError": {}
}
---------------------------
2023-06-01T16:24:30.139Z 6f9b8583-8440-4f13-a48f-28e09a261d40 ERROR Invoke Error {
  "errorType": "TypeError",
  "errorMessage": "getEksClient(...).describeFargateProfile is not a function",
  "stack": [
    "TypeError: getEksClient(...).describeFargateProfile is not a function",
    "    at Object.describeFargateProfile (/var/task/index.js:27:51)",
    "    at FargateProfileResourceHandler.queryStatus (/var/task/fargate.js:83:67)",
    "    at FargateProfileResourceHandler.isUpdateComplete (/var/task/fargate.js:49:35)",
    "    at FargateProfileResourceHandler.isCreateComplete (/var/task/fargate.js:46:21)",
    "    at FargateProfileResourceHandler.isComplete (/var/task/common.js:31:40)",
    "    at Runtime.isComplete [as handler] (/var/task/index.js:50:21)",
    "    at Runtime.handleOnceNonStreaming (/var/runtime/Runtime.js:74:25)"
  ]
}
```

I am uncertain if this is an existing issue, one introduced by this change, or something related to my local build. Again, I had abundant issues related to building aws-cdk and the test suites depending on Jupiter's position in the sky.

## Collaborators

Most of the work in this change was performed by @wlami and @jagu-sayan (thank you!)

Fixes #18423

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
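A note on the `_associate_subnet_with_v6_cidr` helper in the sample stack above: the `Fn.cidr(..., 256, (128 - 64).toString())` call asks CloudFormation to carve the VPC's IPv6 block into 256 consecutive /64 ranges, and `Fn.select(count, ...)` picks one per subnet. The sketch below reproduces that arithmetic in plain TypeScript so the result can be seen with concrete numbers. It assumes the VPC block is a /56 written in the form `a:b:c:d::/56` (my understanding is that Amazon-provided blocks are /56); the function name `nthSlash64` and the documentation prefix `2001:db8:...` are made up for illustration.

```typescript
// Hypothetical helper: returns the index-th /64 inside a /56 IPv6 block.
// Assumes the block is written with a trailing "::", e.g. "2001:db8:1234:ab00::/56".
function nthSlash64(vpcCidr: string, index: number): string {
  const [address, prefixLength] = vpcCidr.split('/');
  if (prefixLength !== '56') {
    throw new Error('this sketch only handles a /56 block');
  }
  if (!Number.isInteger(index) || index < 0 || index > 255) {
    throw new RangeError('a /56 holds exactly 256 /64 subnets (index 0..255)');
  }

  // Only the first four 16-bit groups (bits 0..63) matter for a /64 prefix.
  const groups = address
    .split('::')[0]
    .split(':')
    .map((group) => parseInt(group, 16))
    .slice(0, 4);
  if (groups.some((group) => Number.isNaN(group))) {
    throw new Error(`cannot parse IPv6 prefix: ${address}`);
  }
  while (groups.length < 4) {
    groups.push(0); // groups compressed away by "::" are zero
  }

  // Bits 56..63 are the low byte of the fourth group; that byte is the subnet index.
  groups[3] = (groups[3] & 0xff00) | index;

  return groups.map((group) => group.toString(16)).join(':') + '::/64';
}

console.log(nthSlash64('2001:db8:1234:ab00::/56', 0));   // 2001:db8:1234:ab00::/64
console.log(nthSlash64('2001:db8:1234:ab00::/56', 1));   // 2001:db8:1234:ab01::/64
console.log(nthSlash64('2001:db8:1234:ab00::/56', 255)); // 2001:db8:1234:abff::/64
```

Because the sample numbers public and private subnets with a single running counter, each subnet gets a distinct /64 as long as the VPC has no more than 256 subnets.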
Showing 468 changed files with 19,186 additions and 3,325 deletions.
Binary file renamed
BIN
+14.9 MB
...a0709bf456e34b06ea7c96111eecb2fddd054.zip → ...4617be249fd84bed5bdbb6ffbb581650aea83.zip
Binary file not shown.
Binary file modified
BIN
+0 Bytes
(100%)
...er.js.snapshot/asset.75dfa9114a30421542432dcb55212f010f591a9e3bd203ae98ba3f9bedf5bb31.zip
Binary file not shown.
File renamed without changes.
2 changes: 1 addition & 1 deletion
...ae4e4e1bca8d4aa818c442e1878ddf/cluster.js → ...f5494eef7df73b7957fe9c4ef93e17/cluster.js
Large diffs are not rendered by default.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
Binary file modified
BIN
+0 Bytes
(100%)
...er.js.snapshot/asset.c475180f5b1bbabac165414da13a9b843b111cd3b6d5fae9c954c006640c4064.zip
Binary file not shown.
2 changes: 1 addition & 1 deletion
...troller.js.snapshot/awscdkclusteralbcontrollerDefaultTestDeployAssert78AE94CA.assets.json
2 changes: 1 addition & 1 deletion
...ws-cdk-testing/framework-integ/test/aws-eks/test/integ.alb-controller.js.snapshot/cdk.out
Original file line number | Diff line number | Diff line change |
---|---|---|
| | `@@ -1 +1 @@` |
1 | | `-{"version":"31.0.0"}` |
| 1 | `+{"version":"32.0.0"}` |
2 changes: 1 addition & 1 deletion
...cdk-testing/framework-integ/test/aws-eks/test/integ.alb-controller.js.snapshot/integ.json
Binary file renamed
BIN
+14.9 MB
...a0709bf456e34b06ea7c96111eecb2fddd054.zip → ...4617be249fd84bed5bdbb6ffbb581650aea83.zip
Binary file not shown.
Binary file modified
BIN
+0 Bytes
(100%)
...ng.js.snapshot/asset.75dfa9114a30421542432dcb55212f010f591a9e3bd203ae98ba3f9bedf5bb31.zip
Binary file not shown.
File renamed without changes.
2 changes: 1 addition & 1 deletion
...ae4e4e1bca8d4aa818c442e1878ddf/cluster.js → ...f5494eef7df73b7957fe9c4ef93e17/cluster.js
Large diffs are not rendered by default.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
Binary file modified
BIN
+0 Bytes
(100%)
...ng.js.snapshot/asset.c475180f5b1bbabac165414da13a9b843b111cd3b6d5fae9c954c006640c4064.zip
Binary file not shown.