
Improve validation of PodCIDR and ServiceClusterIPRange #15623

Merged 1 commit into kubernetes:master on Jul 19, 2023

Conversation

johngmyers
Member

Fixes #15034

Removes the requirement (on non-GCE) that the ServiceClusterIPRange be within NonMasqueradeCIDR, as kube-proxy reroutes the ServiceClusterIPRange before masquerading.

Prohibits overlap between the PodCIDR and the ServiceClusterIPRange

Prohibits overlap between either the PodCIDR or the ServiceClusterIPRange and any subnet CIDR or IPv6CIDR (except that IPv6 doesn't have a PodCIDR, so it can't overlap).

Adds validation of any podCIDR and requires it to be within any NonMasqueradeCIDR.
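
As a rough sketch of what these rules amount to (illustrative only, not the actual kops validation code; the helper names and CIDR values below are just examples), the checks are prefix containment and overlap tests:

package main

import (
	"fmt"
	"net/netip"
)

// containsPrefix reports whether outer wholly contains inner.
func containsPrefix(outer, inner netip.Prefix) bool {
	return outer.Bits() <= inner.Bits() && outer.Contains(inner.Masked().Addr())
}

// overlaps reports whether two CIDR blocks share any addresses; aligned
// blocks overlap exactly when one contains the other's network address.
func overlaps(a, b netip.Prefix) bool {
	a, b = a.Masked(), b.Masked()
	return a.Contains(b.Addr()) || b.Contains(a.Addr())
}

func main() {
	nonMasquerade := netip.MustParsePrefix("100.64.0.0/10")
	podCIDR := netip.MustParsePrefix("100.96.0.0/11")
	serviceRange := netip.MustParsePrefix("100.64.0.0/13")
	subnet := netip.MustParsePrefix("10.10.32.0/19")

	fmt.Println(containsPrefix(nonMasquerade, podCIDR)) // PodCIDR must lie within NonMasqueradeCIDR: true
	fmt.Println(overlaps(podCIDR, serviceRange))        // PodCIDR must not overlap ServiceClusterIPRange: false
	fmt.Println(overlaps(subnet, podCIDR))              // subnet CIDRs must not overlap the PodCIDR: false
	fmt.Println(overlaps(subnet, serviceRange))         // ...nor the ServiceClusterIPRange: false
}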

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. area/api labels Jul 12, 2023
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jul 12, 2023
@johngmyers
Member Author

/retest

@johngmyers
Member Author

/cc @justinsb

@@ -460,6 +461,7 @@ func Test_Validate_AdditionalPolicies(t *testing.T) {
Networking: kops.NetworkingSpec{
NetworkCIDR: "10.10.0.0/16",
NonMasqueradeCIDR: "100.64.0.0/10",
PodCIDR: "100.96.0.0/11",
Member

Shouldn't PodCIDR and ServiceClusterIPRange both be part of NonMasqueradeCIDR?

Member Author

ServiceClusterIPRange doesn't have to be part of NonMasqueradeCIDR. That is because only things within the cluster network can talk to ServiceClusterIPRange and kube-proxy reroutes ServiceClusterIPRange before masquerading. GCE currently doesn't require ServiceClusterIPRange to be in NonMasqueradeCIDR; this PR drops that validation for all of the other cloud providers as well.

Member

In this case, is the NonMasqueradeCIDR still used for anything? It was dropped by kubelet when the Docker shim was removed.

Member Author

It's used by ContainerdBuilder and AWSCloudControllerManagerOptionsBuilder. It is also used as an exclusion for the egress proxy.

Member Author

The use by AWSCloudControllerManagerOptionsBuilder looks to me like it's a bug. It should be using the PodCIDR instead.

Member
@hakman Jul 19, 2023

Yup, same thought. Not sure about the other places, but I suspect we may get rid of it there as well.
The PR looks OK as is. We can merge it and look at NonMasqueradeCIDR later. Your choice.

Member Author

We can handle AWSCloudControllerManagerOptionsBuilder as a separate PR. That one we might want to backport.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 19, 2023
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hakman

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 19, 2023
@hakman
Member

hakman commented Jul 19, 2023

/retest

@k8s-ci-robot k8s-ci-robot merged commit d5c2458 into kubernetes:master Jul 19, 2023
@k8s-ci-robot k8s-ci-robot added this to the v1.28 milestone Jul 19, 2023
@johngmyers johngmyers deleted the service-ip-range branch July 19, 2023 14:02
@minkimipt
Contributor

We are running a number of production clusters on AWS that were set up with kops 1.18 or even earlier. We've been upgrading kops and k8s over the years and are currently on kops 1.27 and k8s 1.25. We are expanding our infrastructure into other regions and aim to keep our older clusters upgraded too. We recently kicked off the creation of new clusters, which we are doing directly with kops 1.28. We are following the same subnetting scheme that we have applied to all of our clusters, but we've hit some problems in kops 1.28, which leave us in doubt about how to continue. Here's the particular problem we are facing when trying to create a new cluster:

Error: completed cluster failed validation: spec.networking.serviceClusterIPRange: Forbidden: serviceClusterIPRange "10.22.128.0/17" must not overlap podCIDR "10.22.128.0/17"

Those subnets do indeed overlap, but that wasn't an issue until kops 1.28, and we've been able to track the change down to this PR, which introduced the serviceClusterIPRange check, i.e.:

			if subnet.Overlap(podCIDR, serviceClusterIPRange) {
				allErrs = append(allErrs, field.Forbidden(fldPath.Child("serviceClusterIPRange"), fmt.Sprintf("serviceClusterIPRange %q must not overlap podCIDR %q", serviceClusterIPRange, podCIDR)))
			}

Since our clusters used to work well while those subnets were overlapping, could we make this check optional? Is there any risk in adding a command-line parameter that disables this check? Will we hit any issues in kops 1.28 and later and k8s 1.28 if this check is not done?

@johngmyers may I ask you to elaborate about the reason of this change?
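
The check quoted above boils down to a prefix overlap test. A minimal standalone sketch with the standard library, using the two ranges from the error message (illustrative only, not the kops helper itself):

package main

import (
	"fmt"
	"net/netip"
)

func main() {
	// The two ranges from the error message; identical /17 blocks trivially overlap.
	podCIDR := netip.MustParsePrefix("10.22.128.0/17")
	serviceRange := netip.MustParsePrefix("10.22.128.0/17")

	// Aligned CIDR blocks overlap exactly when one contains the other's network address.
	overlap := podCIDR.Contains(serviceRange.Addr()) || serviceRange.Contains(podCIDR.Addr())
	fmt.Println(overlap) // true, so kops 1.28's validation rejects this configuration
}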

@justinsb
Member

justinsb commented Feb 10, 2024

@minkimipt thanks for reporting and sorry about the issue. IIRC, when we merged this we believed the overlapping configuration was not supported / caused other problems. However, as this is a regression and existing clusters at least appear to be working fine, I think we should reduce this to a warning / disable it entirely while we track down exactly what goes wrong when these CIDRs overlap and, if it is indeed a problem, figure out how to move the CIDRs to be non-overlapping. I propose we track this as #16340.

@ebdekock

@minkimipt thanks for reporting and sorry about the issue. IIRC, when we merged this we believed the overlapping configuration was not supported / caused other problems. However, as this is a regression and existing clusters at least appear to be working fine, I think we should reduce this to a warning / disable it entirely while we track down exactly what goes wrong when these CIDRs overlap and, if it is indeed a problem, figure out how to move the CIDRs to be non-overlapping. I propose we track this as #16340.

@justinsb @johngmyers we are running into the same issue, except with the subnet range. We are trying to update a cluster where the subnet and service IP range overlap, and we are hitting this validation, which prevents the update:

https://github.com/kubernetes/kops/pull/15623/files#diff-ae412ac68b83570fd50e9d5d63873f060eb8f503f5e44be78d710076294c3285R623-R625

I have the same questions here:

Since our clusters used to work well while those subnets were overlapping, could we make this check optional? Is there any risk in adding a command-line parameter that disables this check? Will we hit any issues in kops 1.28 and later and k8s 1.28 if this check is not done?

I see the pod overlap validation has been reverted in a1bba9d.
Is the subnet case different, or would the same be possible there?

@minkimipt
Contributor

@ebdekock we were able to upgrade kops to 1.28.5 where that validation was disabled in a cluster where we had those networks overlapping. Just to confirm that it's not a problem anymore.

@ebdekock

@ebdekock we were able to upgrade kops to 1.28.5 where that validation was disabled in a cluster where we had those networks overlapping. Just to confirm that it's not a problem anymore.

I was waiting for the 1.28.5 release, as I thought it would also fix my issue, but our overlap is with the subnet CIDR and not the PodCIDR.

@johngmyers
Member Author

The issue is that if the podCIDR and serviceClusterIPRange overlap, then a pod could be assigned the same IP as a service. Traffic intended for the pod would then be routed to the wrong place.

Similarly, if the subnet range overlaps with the serviceClusterIPRange, then a node could be assigned the same IP as a service, causing routing problems for host-network pods.

I would say the "appear to be working fine" is merely that problems are rarely detected, despite being there. I would not be in favor of weakening the validations and would suggest the clusters' networking configurations be fixed.
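
Concretely, with the overlapping /17 ranges from the earlier error message, both the pod IPAM and the service ClusterIP allocator consider the same address valid (10.22.200.5 is a hypothetical example address, chosen only to illustrate the collision):

package main

import (
	"fmt"
	"net/netip"
)

func main() {
	podCIDR := netip.MustParsePrefix("10.22.128.0/17")
	serviceRange := netip.MustParsePrefix("10.22.128.0/17")

	// Hypothetical address that both ranges contain: nothing stops a pod and a
	// ClusterIP service from being allocated the same IP, and kube-proxy's
	// service rules would then capture traffic that was meant for the pod.
	addr := netip.MustParseAddr("10.22.200.5")
	fmt.Println(podCIDR.Contains(addr), serviceRange.Contains(addr)) // true true
}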

@ebdekock

I would say the "appear to be working fine" is merely that problems are rarely detected, despite being there. I would not be in favor of weakening the validations and would suggest the clusters' networking configurations be fixed.

Yup, that makes total sense and we agree that fixing the underlying issue is the best way forward. We would like to get the cluster updated and need to prioritize getting the networking fixed. Would it be possible to get this validation behind a temporary flag (that's off by default) to give us time to correct the issue?

@ebdekock

I would say the "appear to be working fine" is merely that problems are rarely detected, despite being there. I would not be in favor of weakening the validations and would suggest the clusters' networking configurations be fixed.

We have fixed the networking like you suggested and no longer need this, thanks!
