feat(eks): enable ipv6 for eks cluster #25819
Conversation
The pull request linter has failed. See the aws-cdk-automation comment below for failure reasons. If you believe this pull request should receive an exemption, please comment and provide a justification. A comment requesting an exemption should contain the text `Exemption Request`. Additionally, if clarification is needed, add `Clarification Request` to a comment.
Working on ironing out the build failures. I neglected to enable the IPv6 CIDR in the inbound rules for the cluster security group, and the subnets weren't quite wired up right. There's an outstanding issue with describeFargateProfile occurring on the line shown in the OP that I'm trying to figure out. This sample successfully maps a user to the masters RBAC, and a node group is eligible to join.

```ts
import {
App, Fn, Stack,
aws_ec2 as ec2,
aws_eks as eks,
aws_iam as iam,
} from 'aws-cdk-lib';
import { getClusterVersionConfig } from './integ-tests-kubernetes-version';
const myAccount = '';
const app = new App();
const env = { region: 'us-east-1', account: myAccount };
const stack = new Stack(app, 'my-v6-test-stack-1', { env });
const vpc = new ec2.Vpc(stack, 'Vpc', { maxAzs: 3, natGateways: 1, restrictDefaultSecurityGroup: false });
const ipv6cidr = new ec2.CfnVPCCidrBlock(stack, 'CIDR6', {
vpcId: vpc.vpcId,
amazonProvidedIpv6CidrBlock: true,
});
let subnetcount = 0;
let subnets = [...vpc.publicSubnets, ...vpc.privateSubnets];
for (const subnet of subnets) {
// Wait for the ipv6 cidr to complete
subnet.node.addDependency(ipv6cidr);
_associate_subnet_with_v6_cidr(subnetcount, subnet);
subnetcount++;
}
const cluster = new eks.Cluster(stack, 'Cluster', {
...getClusterVersionConfig(stack),
vpc: vpc,
clusterName: 'some-eks-cluster',
defaultCapacity: 1,
endpointAccess: eks.EndpointAccess.PUBLIC_AND_PRIVATE,
ipFamily: eks.IpFamily.IP_V6,
securityGroup: _create_eks_security_group(),
vpcSubnets: [{ subnets: subnets }],
});
const importedRole = iam.Role.fromRoleName(stack, 'imported-role', 'some-role');
cluster.awsAuth.addMastersRole(importedRole);
function _associate_subnet_with_v6_cidr(count: number, subnet: ec2.ISubnet) {
const cfnSubnet = subnet.node.defaultChild as ec2.CfnSubnet;
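// Fn.cidr carves the VPC's Amazon-provided IPv6 block into 256 /64 slices (128 - 64 subnet bits) and selects the Nth slice for this subnet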
cfnSubnet.ipv6CidrBlock = Fn.select(count, Fn.cidr(Fn.select(0, vpc.vpcIpv6CidrBlocks), 256, (128 - 64).toString()));
cfnSubnet.assignIpv6AddressOnCreation = true;
}
export function _create_eks_security_group(): ec2.SecurityGroup {
let sg = new ec2.SecurityGroup(stack, 'eks-sg', {
allowAllIpv6Outbound: true,
allowAllOutbound: true,
vpc,
});
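// Allow all inbound traffic from the 10.0.0.0/8 range and from the VPC's IPv6 CIDR (the IPv6 ingress rule was the missing piece noted above)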
sg.addIngressRule(
ec2.Peer.ipv4('10.0.0.0/8'), ec2.Port.allTraffic(),
);
sg.addIngressRule(
ec2.Peer.ipv6(Fn.select(0, vpc.vpcIpv6CidrBlocks)), ec2.Port.allTraffic(),
);
return sg;
}
```
✅ Updated pull request passes all PRLinter validations. Dismissing previous PRLinter review.
All tests are passing on my machine. With any luck the build will succeed here. There are a couple of deficiencies / quality-of-life changes that this PR highlights:
Outstanding work for this PR: The testing surface area for this change is potentially very large given we're completely redefining the networking. I'm happy to add additional tests, but I'd like to have some direction on those tests before I go make them.
@@ -365,6 +367,23 @@ removing these ingress/egress rules in order to restrict access to the default s

### @aws-cdk/aws-kms:aliasNameRef
Is this supposed to be here?
Good question. I added everything in my workspace following this change. I had a merge from aws-cdk:main mid-change that I suspect this came from. Let me confirm that this is just duplicative from that merge.
It doesn't look like this was something from that merge as far as I can tell. I don't know where this came from.
Removing this change and rebuilding brings it right back.
```ts
/**
 * EKS cluster IP family.
 */
export enum IpFamily {
```
There is already an `AddressFamily` in `aws-ec2`.
```ts
export enum AddressFamily {
  IP_V4 = 'IPv4',
  IP_V6 = 'IPv6',
}
```
While similar, it's not the same as the definition this was modeled after (`ipFamily` is looking for lowercase values of these enums). Although largely redundant, I think keeping the enum name consistent with the property name is more intuitive.
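For reference, here is a minimal sketch of what the new `IpFamily` enum could look like, assuming the lowercase string values that the EKS `ipFamily` property expects (member names and doc comments are illustrative, not copied from the PR):

```ts
/**
 * EKS cluster IP family.
 */
export enum IpFamily {
  /** Assign IPv4 addresses to pods and services. */
  IP_V4 = 'ipv4',
  /** Assign IPv6 addresses to pods and services. */
  IP_V6 = 'ipv6',
}
```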
> For more details visit [Configuring the Amazon VPC CNI plugin for Kubernetes to use IAM roles for service accounts](https://docs.aws.amazon.com/eks/latest/userguide/cni-iam-role.html#cni-iam-role-create-role)

```ts
const ipv6Management = new iam.PolicyDocument({
```
Could we do this automatically for the user if the cluster was created with `ipFamily === 'ipv6'`?
Possibly. I think the real fix here is just getting the default CNI managed policy to add these permissions. I'm struggling to think of why such a change would be rejected.
I'm taking a look at making a change to be able to do this. It'll increase the scope of the change, but unless we modify the default node group's role permissions to include ipv6, this will be a headache for everyone because default node groups will never succeed out of the box with this change.
A cheap(er) way to do it would be to just directly modify the nodeGroup role to always include ipv6, but I think having the ability to inspect the cluster ipFamily through CDK has value, too.
Pull request has been modified.
This update adds the supplemental IPv6 networking permissions to the default node group role if the Cluster ipFamily is IP_V6. This required plumbing the IpFamily throughout the cluster construct. Edit: This change enabled me to revert some of the customization in the cluster-ipv6 test and restore it to look like the ipv4 cluster test it was cloned from.
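For context, a rough sketch of what granting those supplemental IPv6 permissions to the default node group role could look like. The action names follow the IPv6 CNI policy example in the AWS documentation linked above; the `props.ipFamily` check and the `nodeRole` variable are assumptions about the surrounding construct code, not the exact implementation in this PR:

```ts
import { aws_iam as iam } from 'aws-cdk-lib';

// Hypothetical wiring: when the cluster's ipFamily is IP_V6, extend the default
// node group role with the IPv6 address-management permissions the VPC CNI needs.
if (props.ipFamily === IpFamily.IP_V6) {
  nodeRole.addToPrincipalPolicy(new iam.PolicyStatement({
    actions: ['ec2:AssignIpv6Addresses', 'ec2:UnassignIpv6Addresses'],
    resources: ['arn:aws:ec2:*:*:network-interface/*'],
  }));
}
```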
Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository
Description
This change enables IPv6 for EKS clusters.
Reasoning
Design
`IpFamily` is introduced to direct users to specify `IP_V4` or `IP_V6`.
~~This change adds a new Sam layer dependency~~ Dependency removed after validation that it was no longer necessary.
Testing
I consulted with some team members about how to best approach testing this change, and I concluded that I should duplicate the eks-cluster test definition. I decided that this was a better approach than redefining the existing cluster test to use IPv6 for a couple of reasons:
I ran into several issues running the test suite, primarily related to out-of-memory conditions that no amount of RAM appeared to help. Setting NODE_OPTIONS to `--max-old-space-size=8192` did not improve this issue, nor did increasing it to 12GB. Edit: This ended up being a simple fix, but annoying to dig out. The fix is `export NODE_OPTIONS=--max-old-space-size=8192`. Setting this up in my .rc file did not stick, either. macOS Ventura for those keeping score at home.
The bulk of my testing was performed using a sample stack definition (below), but I was unable to run the manual testing described in `aws-eks/test/MANUAL_TEST.md` due to no access to the underlying node instances. Edit: I can run the MANUAL_TESTS now if that's deemed necessary.
Updated: This sample stack creates an IPv6-enabled cluster with an example nginx service running.
Sample:
Issues
Edit: Fixed
Integration tests, specifically the new one I contributed, failed with an issue in describing a Fargate profile:
I am uncertain whether this is an existing issue, one introduced by this change, or something related to my local build. Again, I had abundant issues related to building aws-cdk and the test suites depending on Jupiter's position in the sky.
Collaborators
Most of the work in this change was performed by @wlami and @jagu-sayan (thank you!)
Fixes #18423
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license