Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨ Introduce edge subnets to support AWS Local Zones #4882

Merged
merged 3 commits into from
Apr 23, 2024

Conversation

mtulio
Copy link
Contributor

@mtulio mtulio commented Mar 19, 2024

What type of PR is this?

/kind feature
/kind api-change

What this PR does / why we need it:

This PR implements support of managed subnets and carrier gateway for AWS Local Zones. Feature request #4874 .

There API is changed to introduce the following fields:

  • SubnetSpec.ZoneType: representation of subnet's zone type
  • SubnetSpec.ParentZoneName: representation of subnet's parent zone name (an availability zone in the Region which the edge zone is tied)

The subnets in AWS Local Zones are not eligible to create core components for the cluster, like NAT Gateway, Control Plane nodes, and Network Load Balancers, keeping compatibility with existing flow when edge subnets are added.

To create subnets in edge zones, the subnet must be added for each zone you want to create the subnet in NetworkSpec.Subnets. For example to create subnets in Local Zone us-east-1-nyc-1a, set:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
kind: AWSCluster
metadata:
  name: aws-cluster-localzone
spec:
  region: us-east-1
  networkSpec:
    vpc:
      cidrBlock: "10.0.0.0/20"
    subnets:
    # regular zones (availability zones. Eg us-east-1a, us-east-1b...)
    [...]
    # Subnets in Local Zones of New York location (public and private)
    - availabilityZone: us-east-1-nyc-1a
      cidrBlock: "10.0.128.0/25"
      id: "cluster-subnet-private-us-east-1-nyc-1a"
      isPublic: false
    - availabilityZone: us-east-1-nyc-1a
      cidrBlock: "10.0.128.128/25"
      id: "cluster-subnet-public-us-east-1-nyc-1a"
      isPublic: true

Which issue(s) this PR fixes: Relates to (not fixes completely) #4874

Special notes for your reviewer:

Checklist:

  • squashed commits
  • includes documentation
  • includes emojis
  • adds unit tests
  • adds or updates e2e tests

Release note:

Add support to provision subnets on AWS Local Zone infrastructure.

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Mar 19, 2024
@k8s-ci-robot k8s-ci-robot added needs-priority needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 19, 2024
@k8s-ci-robot
Copy link
Contributor

Hi @mtulio. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Mar 19, 2024
@mtulio mtulio force-pushed the CORS-2899-edge-zones branch from 5fd5cf9 to a7ace3c Compare March 19, 2024 12:39
@mtulio mtulio changed the title ✨ Support of managing subnets on AWS Local Zones WIP | ✨ Support of managing subnets on AWS Local Zones Mar 19, 2024
@mtulio mtulio changed the title WIP | ✨ Support of managing subnets on AWS Local Zones ✨ WIP | Support of managing subnets on AWS Local Zones Mar 19, 2024
@mtulio
Copy link
Contributor Author

mtulio commented Mar 19, 2024

/assign @vincepri @nrb
cc @r4f4 @patrickdillon

@mtulio
Copy link
Contributor Author

mtulio commented Mar 19, 2024

Hey @vincepri @nrb would you mind stamp ok-to-test? I am in the final phase of this PR to wrap up the Local Zones proposal. OpenShift CI is progressing/provisioning with this version (openshift/installer#8173).

@nrb
Copy link
Contributor

nrb commented Mar 20, 2024

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 20, 2024
@mtulio mtulio force-pushed the CORS-2899-edge-zones branch from a7ace3c to b0c9f0f Compare March 21, 2024 17:29
Copy link
Contributor

@r4f4 r4f4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM so far. Left a few nitpick comments, feel free to ignore.

api/v1beta2/network_types.go Outdated Show resolved Hide resolved
pkg/cloud/services/network/natgateways.go Outdated Show resolved Hide resolved
@mtulio mtulio force-pushed the CORS-2899-edge-zones branch 2 times, most recently from 12ceddc to 846e806 Compare March 28, 2024 02:44
@mtulio
Copy link
Contributor Author

mtulio commented Mar 28, 2024

LGTM so far. Left a few nitpick comments, feel free to ignore.

Thanks @r4f4 . Feedback addressed. 👍🏽

I also removed some references of Wavelength Zones which will be addressed as "edge zone" in another PR.

@mtulio mtulio force-pushed the CORS-2899-edge-zones branch from 846e806 to 9ee7479 Compare March 28, 2024 17:19
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 28, 2024
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 19, 2024
@mtulio
Copy link
Contributor Author

mtulio commented Apr 19, 2024

Just needs updated files.
generated files are out of date, run make generate

interesting, I ran it before latest change, I ran now resulting in no changes. Let me check if it was something cached.

@mtulio you need to rebase on top of current master. #4897 bumped the version of controller-gen which is changing the results.

Thanks @r4f4 . Rebased from current master, verify is now passing. Triggering e2e's with rebased version:

/test pull-cluster-api-provider-aws-e2e
/test pull-cluster-api-provider-aws-e2e-eks

@mtulio
Copy link
Contributor Author

mtulio commented Apr 19, 2024

Timeout.
/test pull-cluster-api-provider-aws-e2e

@mtulio
Copy link
Contributor Author

mtulio commented Apr 19, 2024

/test pull-cluster-api-provider-aws-e2e

@mtulio
Copy link
Contributor Author

mtulio commented Apr 20, 2024

Just needs updated files.
generated files are out of date, run make generate

interesting, I ran it before latest change, I ran now resulting in no changes. Let me check if it was something cached.

@mtulio you need to rebase on top of current master. #4897 bumped the version of controller-gen which is changing the results.

Thanks @r4f4 . Rebased from current master, verify is now passing. Triggering e2e's with rebased version:

/test pull-cluster-api-provider-aws-e2e /test pull-cluster-api-provider-aws-e2e-eks

@nrb @richardcase okay, e2e* are passing after rebase. 🚀

@richardcase
Copy link
Member

@mtulio - would you be able to squash your commits? It doesn't have to be squashed into 1 if you want to tell a story by the separate commits.

mtulio added 2 commits April 22, 2024 10:49
Introducing the mechanism to query the zone information
from the subnet's AvailabilityZone, saving the ZoneType and
the ParentZoneName in the SubnetSpec, both for managed and unmanaged.

The ZoneType is used to group the zones from regular and the edge zones.
Regular zones are with type 'availability-zone', and the edge zones are
types 'local-zone' and 'wavelength-zone'.

The following statements are valid for edge subnets:
- private subnets supports egress traffic only using NAT Gateway in the
  region.
- IPv6 subnets is not supported in edge zones
- subnet tags (kubernetes.io/role/*) for load balancer are not set in
  edge subnets. Edge subnets should not be elected by CCM to create
  service load balancers. Use ALB ingress instead.

✨ edge subnets/test: unit for subnets in Local Zones

Added unit tests to validate scenarios suing managed and unmanaged
subnets in AWS Local Zones, alongside new describe availability zones
API calls introduced in the subnet reconciliation loop.

✨ edge subnets/unit: fixes unit tests to describe zone calls

The edge subnets feature introduce a new AWS API call to describe zones,
DescribeAvailabilityZonesWithContext, to lookup zone attributes based in
the zone names in the reconciliator, and the create subnets.

The two new calls is required to support unmanaged subnets (BYO VPC),
where the method createSubnet() is not called.

There are some unit tests calling the create subnet flow, this change
add the mock calls for those calls.
✨ edge subnets/routes: supporting custom routes for Local Zones

Isolate the route table lookup into dedicated methods for private and
public subnets to allow more complex requirements for edge zones, as
well introduce unit tests for each scenario to cover edge cases.

There is no change for private and public subnets for regular
zones (standard flow), and the routes will be assigned accordainly
the existing flow: private subnets uses nat gateways per public zone,
and internet gateway for public zones's tables.

For private and public subnets in edge zones, the following changes is
introduced according to each rule:

General:

- IPv6 subnets is not be supported in AWS Local Zones,
  zone, consequently no ip6 routes will be created
- nat gateways is not supported, default gateway's route for private
  subnets will use nat gateways from the zones in the Region
(availability-zone's zone type)
- one route table by zone's role by zone (standard flow)

Private tables for Local Zones:
- default route's gateways is assigned using nat gateway created in
  the region (availability-zones).

Public tables for Local Zones:
- default route's gateway is assigned using internet gateway

The changes in the standard flow (without edge subnets' support) was
isolated in the PR kubernetes-sigs#4900

✨ edge subnets/nat-gw: support private routing in Local Zones

Introduce the support to lookup a nat gateway for edge zones when
creating private subnets.

Currently CAPA requires a NAT Gateway in the public subnet for each zone
which requires private subnets to define default nat gateway in the
private route table for each zone.

NAT Gateway resource isn't globally supported by Local Zones, thus
private subnets in Local Zones are created with default route gateway
using a nat gateway selected in the Region (regular availability zones)
based in the Parent Zone* for the edge subnet.

*each edge zone is "tied" to a zone named "Parent Zone", a zone type
availability-zone (regular zones) in the region.
@mtulio mtulio force-pushed the CORS-2899-edge-zones branch from 5103d91 to e417b92 Compare April 22, 2024 13:50
@mtulio
Copy link
Contributor Author

mtulio commented Apr 22, 2024

@mtulio - would you be able to squash your commits? It doesn't have to be squashed into 1 if you want to tell a story by the separate commits.

Hey @richardcase , sure! Thanks for the feedback. I just squashed into 3 commits. Let me know if it looks better. Thanks!

@nrb
Copy link
Contributor

nrb commented Apr 22, 2024

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 22, 2024
Copy link
Member

@richardcase richardcase left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Just the one comment around adding some CEL validation (or even a validation to the webhook).

// private route table to egress traffic to the internet.
//
// +optional
ParentZoneName *string `json:"parentZoneName,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to have CEL to enforce the cross field validation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or use validation in the webhook.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@richardcase I am thiking the best strategy here. I think we need to prevent the ParentZoneName when not local-zone for ZoneType, so something like this:

 // SubnetSpec configures an AWS Subnet.
+// +kubebuilder:validation:XValidation:rule="has(self.zoneType) && self.zoneType == 'local-zone' ? (has(self.parentZoneName) && size(self.parentZoneName) > 0)
 : !has(self.parentZoneName)",message="parentZoneName must be set if zoneType is 'local-zone'"
 type SubnetSpec struct {
        // ID defines a unique identifier to reference this resource.
....

Do you think it makes sense? is it enough?

Copy link
Contributor Author

@mtulio mtulio Apr 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or use validation in the webhook.

yeah, when thinking in the follow up PRs (Wavelength), it seems to be better to create a validation in the webhook. Let me do it instead of CEL.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Webhook added

This change introduce support of required network components to deploy
subnets on AWS Local Zones infrastructure.

The SubnetSpec API is introducing the field ZoneType and ParentZoneName
to handle the zone information for the subnet, discovered when
reconciling the subnet.

✨ edge subnets/API/gen: introduce edge subnets for Local Zones

Generate API changes to suppoer edge subnets for Local Zones.

✨ edge subnets/API/test: added unit to Local Zones

Testing new methods and workflow added to the API to
SubnetSpec (zone information).

✨ edge subnets/docs: added guide subnets on Local and Wavelength zones

Create a dedicated document, "topic", with instructions to deploy
network infrastructure (subnets, gateways and route tables) in "edge
zones" - Local Zone and Wavelength Zone infrastructure.
@mtulio mtulio force-pushed the CORS-2899-edge-zones branch from e417b92 to 2011294 Compare April 22, 2024 16:24
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 22, 2024
@mtulio
Copy link
Contributor Author

mtulio commented Apr 22, 2024

Hey @nrb @richardcase validation added in the subnet's webhook to prevent setting empty ParentZoneName when ZoneType is local-zone, addressing Richard's comment #4882 (comment). Let me know what do you think. Thanks

@nrb nrb added this to the v2.5.0 milestone Apr 22, 2024
Comment on lines +251 to +255
if subnet.ZoneType != nil && subnet.IsEdge() {
if subnet.ParentZoneName == nil {
allErrs = append(allErrs, field.Invalid(field.NewPath("subnets"), r.Spec.NetworkSpec.Subnets, "ParentZoneName must be set when ZoneType is 'local-zone'."))
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note for the reviewer:

// IsEdge returns the true when the subnet is created in the edge zone,
// Local Zones.

Copy link
Member

@damdo damdo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 23, 2024
@richardcase
Copy link
Member

We should follow-up on a CAPA approach to CEL vs validation webhooks. But lets not let that block this.

So from my side:

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: richardcase

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 23, 2024
@k8s-ci-robot k8s-ci-robot merged commit a64bed0 into kubernetes-sigs:main Apr 23, 2024
19 checks passed
@mtulio mtulio deleted the CORS-2899-edge-zones branch April 23, 2024 16:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants