
Doc: KIND complex network scenarios #1337

Closed
wants to merge 8 commits

Conversation

@aojea (Contributor) commented Feb 17, 2020

There have been several PRs and demands to implement this in KIND.
However, I think that KIND can serve better as a building block for complex scenarios that can be easily scripted, avoiding adding complexity to the project.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 17, 2020
@aojea (Contributor, Author) commented Feb 17, 2020

/assign @BenTheElder
/cc @howardjohn @qinqon @neiljerram

@k8s-ci-robot (Contributor)

@aojea: GitHub didn't allow me to request PR reviews from the following users: neiljerram, howardjohn, qinqon.

Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/assign @BenTheElder
/cc @howardjohn @qinqon @neiljerram

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@nelljerram left a comment

Just a couple of typos that I spotted.


## Multiple clusters

As we explained before, all KIND clusters are sahring the same docker network, that means that all the cluster nodes have direct connectivity.


typo "sahring"


As we explained before, all KIND clusters are sahring the same docker network, that means that all the cluster nodes have direct connectivity.

If we want to spawn multiple cluster and provide Pod to Pod connectivity between different clusters, first we have to configure the cluster networking parameters to avoid address overlapping.


"multiple clusters"

@tao12345666333 (Member) left a comment

Great!

inet 172.17.0.3/16 brd 172.17.255.255 scope global eth0
{{< /codeFromInline >}}

That means that Pods will be able to reach other dockers containers that does not belong to any KIND cluster, however, the docker container will not be able to answer to the Pod IP address until we intall the correspoding routes.

Member:

reach other dockers containers

maybe should change to:

reach other docker containers

Contributor (Author):

I had the same doubt, but with the number of "containers", pods, ... that we have in these virtualized environments, I think it may be good to be explicit about this.

@tao12345666333 (Member) left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 17, 2020
@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: aojea, tao12345666333
To complete the pull request process, please assign bentheelder
You can assign the PR to them by writing /assign @bentheelder in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@nelljerram

@aojea Would this be a good time to talk more about your comment at #939 (comment) ? I understand that you and @BenTheElder had concerns about my proposal at the time, but I am not sure you are right that my objective can be easily achieved in other existing ways.

@aojea (Contributor, Author) commented Feb 17, 2020

@aojea Would this be a good time to talk more about your comment at #939 (comment) ? I understand that you and @BenTheElder had concerns about my proposal at the time, but I am not sure you are right that my objective can be easily achieved in other existing ways.

/hold

@neiljerram my understanding is that you want to automate in KIND:

  • attaching new networks to the KIND nodes
  • adding custom static routes to the KIND nodes
  • adding loopback interfaces with custom IPs to the KIND nodes

If that's correct, I can document how to do it; it seems easy to automate with a script, or in the same way that the kubeadm folks are doing with kinder https://github.com/kubernetes/kubeadm/tree/master/kinder#usage

IMHO that seems a very intrusive and specific change, targeting multi-homed node environments that are not very common in cloud environments ... bear in mind that the main goal of KIND is testing Kubernetes.
Personally, what I'm more afraid of is having more dependencies on libnetwork; it is really opinionated about things like IPv6 or DNS behavior, and I don't know what those docker network connect commands will change ...
However, @BenTheElder may have another opinion ...

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 17, 2020
@nelljerram

nelljerram commented Feb 17, 2020

Thanks @aojea for your interest in this.

At the time of that PR, I was modelling a multi-homed infrastructure, with two independent planes of connectivity between any two nodes. Obviously the idea is that if one of the connectivity planes fails in some way, we still have connectivity between all the nodes over the other plane.

My reason for thinking that this needs integration in KIND is as follows.

  • For a setup like this to be resilient, it is important that connections to or from a node do not use an interface-specific address as their source or destination IP. Because then the connection cannot continue if the connectivity fails adjacent to that interface-specific address. So instead we want to set up a "loopback address" on each node and arrange for all outgoing connections to use that address.

  • That includes the connections that are established for the functioning of the Kubernetes control plane itself. For example each node's kubelet connecting to the API server should have src IP = loopback address of kubelet node.

  • Therefore, I think, the provisioning of the loopback address, and the routing to loopback addresses on other nodes, must be done as part of the KIND cluster setup.

WDYT? Am I still missing other possible approaches here?

@aojea (Contributor, Author) commented Feb 18, 2020

/hold cancel

Thanks @aojea for your interest in this.

At the time of that PR, I was modelling a multi-homed infrastructure, with two independent planes of connectivity between any two nodes. Obviously the idea is that if one of the connectivity planes fails in some way, we still have connectivity between all the nodes over the other plane.

My reason for thinking that this needs integration in KIND is as follows.

  • For a setup like this to be resilient, it is important that connections to or from a node do not use an interface-specific address as their source or destination IP. Because then the connection cannot continue if the connectivity fails adjacent to that interface-specific address. So instead we want to set up a "loopback address" on each node and arrange for all outgoing connections to use that address.
  • That includes the connections that are established for the functioning of the Kubernetes control plane itself. For example each node's kubelet connecting to the API server should have src IP = loopback address of kubelet node.
  • Therefore, I think, the provisioning of the loopback address, and the routing to loopback addresses on other nodes, must be done as part of the KIND cluster setup.

WDYT? Am I still missing other possible approaches here?

Yeah, I totally understand your point from the network engineering perspective, but that setup needs a routing protocol to work and do the failover. I know that Calico and kube-router give that possibility, allowing you to peer with the leaf switches, but as I've said before this is a very specific scenario for bare-metal environments, where you don't have an IaaS handling the infrastructure.

For "cloud-native" environments, the IaaS + cloud-controller-manager and Kubernetes + controller loops handle the "resilience" of the environment, i.e. the VMs only need one interface because the network is "virtual" and the IaaS handles it, for the Kubernetes workloads the controller loops handle the pods and services, restarts containers that fail, replaces containers, kills containers that don’t respond to your user-defined health check, and doesn’t advertise them to clients until they are ready to serve. Basically everything is cattle .... or should be 😉

Specifically in KIND, the network is a Linux bridge and everything is software on the same host; if one interface or bridge fails we'll have a bigger problem 😄

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 18, 2020
@aojea (Contributor, Author) commented Feb 18, 2020

/retest

@nelljerram

@aojea Many thanks. So if I understand correctly, I think your position can be summarised as:

  • You agree that KIND-level support would be needed, for someone to correctly model that topology using KIND.

  • But you don't want to add the complexity for that to KIND, because you don't think it's an important enough use case for the Kubernetes community.

Is that right?

@aojea (Contributor, Author) commented Feb 18, 2020

  • You agree that KIND-level support would be needed, for someone to correctly model that topology using KIND.

My point is that I don't see the need to implement it in KIND, because you can do it just after the cluster creation. This is an example with bash; using python, go, ... you can easily build much more complex topologies and parametrize them:

LOOPBACK_PREFIX="1.1.1."
MY_BRIDGE="my_net2"
MY_ROUTE=10.0.0.0/24
MY_GW=172.16.17.1
# Create 2nd network
docker network create ${MY_BRIDGE}
# Create kubernetes cluster
kind create cluster
# Configure nodes to use the second network
i=0
for n in $(kind get nodes); do
  i=$((i+1))
  # Connect the node to the second network
  docker network connect ${MY_BRIDGE} ${n}
  # Configure a loopback address (a different one per node)
  docker exec ${n} ip addr add ${LOOPBACK_PREFIX}${i}/32 dev lo
  # Add static routes
  docker exec ${n} ip route add ${MY_ROUTE} via ${MY_GW}
done
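
To verify the result, the extra network, loopback address, and route should be visible from any node; a quick sketch (node names depend on your cluster, kind-control-plane is the default):

docker exec kind-control-plane ip addr show dev lo
docker exec kind-control-plane ip route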
  • But you don't want to add the complexity for that to KIND, because you don't think it's an important enough use case for the Kubernetes community.

It's not just that. KIND is gating kubernetes and is used as CI in a large number of Kubernetes ecosystem projects; I'm afraid the risk of introducing this change could affect the stability of the project, and hence all those CIs. You can't imagine the number of hours that @BenTheElder mainly, @amwat, I, and others have spent debugging flakiness and optimizing KIND.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 18, 2020
@k8s-ci-robot (Contributor)

New changes are detected. LGTM label has been removed.

@nelljerram

@aojea

my point is that I don't see the need to implement it in KIND because you can do it just after the cluster creation,

I described in my previous comment why this is not good enough: I need the Kubernetes control plane connections to be using loopback addresses, and IIRC those are set up during cluster creation. Do you think I've got something wrong there?

@aojea (Contributor, Author) commented Feb 19, 2020

@aojea

my point is that I don't see the need to implement it in KIND because you can do it just after the cluster creation,

I described in my previous comment why this is not good enough: I need the Kubernetes control plane connections to be using loopback addresses, and IIRC those are set up during cluster creation. Do you think I've got something wrong there?

Ok, now I got it, sorry for the confusion, I wasn't understanding your point ...

It can be done after the cluster setup; it is a bit tricky, though.

When creating the cluster, add the loopback IP address you are going to use for the control-plane to the certificate SANs (the apiserver binds to "all interfaces" by default):

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
# add the loopback to apiServer cert SANS
kubeadmConfigPatchesJSON6902:
- group: kubeadm.k8s.io
  kind: ClusterConfiguration
  patch: |
    - op: add
      path: /apiServer/certSANs/-
      value: my-loopback
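
For reference, a patch like the one above has to be supplied when the cluster is created; a minimal sketch, assuming the config is saved as kind-config.yaml:

kind create cluster --config kind-config.yaml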

After the cluster has been created, modify the kube-apiserver --advertise-address flag in /etc/kubernetes/manifests/kube-apiserver.yaml (it is a static pod manifest; once you write the file, the pod restarts with the new config):

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-apiserver
    tier: control-plane
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    - --advertise-address=172.17.0.4
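
The same edit can be scripted from the host instead of editing the file inside the node; a minimal sketch, assuming the default node name kind-control-plane and a hypothetical loopback address 1.1.1.1:

# Rewrite the advertise-address flag in the static pod manifest;
# the kubelet notices the change and restarts the kube-apiserver pod.
docker exec kind-control-plane sed -i \
  's/--advertise-address=.*/--advertise-address=1.1.1.1/' \
  /etc/kubernetes/manifests/kube-apiserver.yaml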

and then change the kubelet node-ip flag on all the nodes:

root@kind-worker:/# more /var/lib/kubelet/kubeadm-flags.env 
KUBELET_KUBEADM_ARGS="--container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock --fail-swap-on=false --node-ip=172.17.0.4"

and restart them with systemctl restart kubelet to use the new config.
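
To avoid doing that by hand on every node, the node-ip change can also be scripted from the host; a sketch, assuming --node-ip is already present in kubeadm-flags.env (as in the excerpt above) and reusing the hypothetical 1.1.1. loopback prefix from the earlier script:

i=0
for n in $(kind get nodes); do
  i=$((i+1))
  # Point the kubelet at the node's loopback address
  docker exec ${n} sed -i "s/--node-ip=[^\" ]*/--node-ip=1.1.1.${i}/" /var/lib/kubelet/kubeadm-flags.env
  # Restart the kubelet to pick up the new flags
  docker exec ${n} systemctl restart kubelet
done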

@aojea (Contributor, Author) commented Feb 22, 2020

---
# Using KIND to emulate complex network scenarios [Linux Only]

KIND runs Kubernetes clusters in Docker, and leverages Docker networking for all the network features: portmapping, IPv6, container connectivity, ...

Member:

...connectivity, etc.

Member:

portmapping -> port mapping

valid_lft forever preferred_lft forever
{{< /codeFromInline >}}

Docker also creates iptables NAT rules on the docker host that masquerade the traffic from the containers connected to docker0 bridge to connect to the outside world.

Member:

the docker host -> the Docker host


## Multiple clusters

As we explained before, all KIND clusters are sharing the same docker network, that means that all the cluster nodes have direct connectivity.

Member:

docker network -> Docker network


{{< /codeFromInline >}}

Then we just need to install the routes obtained from clusterA in each node of clusterB and viceversa:

Member:

viceversa -> vice versa


### Example: Multiple network interfaces and Multi-Home Nodes

There can be scenarios that require multiple interfaces in the KIND nodes to test multi-homing, VLANS, CNI plugins, ...

Member:

...CNI plugins, etc.

inet 172.17.0.3/16 brd 172.17.255.255 scope global eth0
{{< /codeFromInline >}}

That means that Pods will be able to reach other Docker containers that does not belong to any KIND cluster, however, the Docker container will not be able to answer to the Pod IP address until we install the correspoding routes.

Member:

correspoding -> corresponding

- --advertise-address=172.17.0.4
```

and then change in all the nodes the kubelet `node-ip` flag:

Member:

and then change the node-ip flag for the kubelets on all the nodes:

KUBELET_KUBEADM_ARGS="--container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock --fail-swap-on=false --node-ip=172.17.0.4"
```

and restart them `systemctl restart kubelet` to use the new config

Member:

Finally restart the kubelets to use the new configuration with systemctl restart kubelet.

Member:

Important to note here is that calling kubeadm init / join again on the node will override /var/lib/kubelet/kubeadm-flags.env. An alternative is to use /etc/default/kubelet:
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/kubelet-integration/#the-kubelet-drop-in-file-for-systemd

Contributor (Author):

Let me add it as a note; due to the ephemeral nature of the nodes I don't expect people to issue those commands ... but 🤷‍♂️


It's important to note that calling `kubeadm init / join` again on the node will override `/var/lib/kubelet/kubeadm-flags.env`. An [alternative is to use /etc/default/kubelet](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/kubelet-integration/#the-kubelet-drop-in-file-for-systemd).S

Member:

trailing S after the .

@aojea (Contributor, Author) commented Feb 29, 2020

/retest
/assign @BenTheElder

inet 172.17.0.3/16 brd 172.17.255.255 scope global eth0
{{< /codeFromInline >}}

That means that Pods will be able to reach other Docker containers that does not belong to any KIND cluster, however, the Docker container will not be able to answer to the Pod IP address until we install the corresponding routes.

Since you are referring to multiple containers, use do instead of does

@BenTheElder (Member)

Sorry for the immense delay. I'd been hoping to get #148 done faster. I'd still like to hold off on detailing networking internals until after I'm done taking a swing at changing them :D

@aojea (Contributor, Author) commented Apr 28, 2020

/hold
this needs to be updated to match the current status; we have custom bridges now and cluster restart 😄

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 28, 2020
@aojea aojea marked this pull request as draft June 16, 2020 20:29
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 16, 2020
ip route add 10.110.2.0/24 via 172.17.0.2

$kubectl --context kind-clusterB get nodes -o=jsonpath='{range .items[*]}{"ip route add "}{.spec.podCIDR}{" via "}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}'
ip route add 10.120.0.0/24 via 172.17.0.7


Is this supposed to be 220? Also why are there three results here when each cluster has two nodes?

Contributor (Author):

Heh, good catch on both things; it is 220 and the config should have 3 nodes.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 27, 2020
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Nov 26, 2020
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot (Contributor)

@fejta-bot: Closed this PR.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
