Fix CNI issue related to picking up wrong CNI #10985
Conversation
/ok-to-test
kvm2 Driver Times for Minikube (PR 10985): 49.8s, 48.6s, 48.5s
docker Driver Times for Minikube (PR 10985): 18.0s, 17.8s, 18.0s
thank you for this PR, this seems to be fixing the issue
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: medyagh, prezha. The full list of commands accepted by this bot can be found here; the pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
this only solves the problem for the docker runtime. From kubelet:
I still think the fix provided in #10384 is more generic and solves the problem for all CNIs and container runtimes.
fixes #10984
and also:
helps #7538
fixes #8055
fixes #8480
fixes #8949
fixes #9825
fixes #10044
fixes #10969
(and maybe others...)
examples
given in the original issue description of #10984, and also in the descriptions of the other issues listed above
problem
dns resolution doesn't work in multinode clusters with the kindnet cni, for pods running on nodes where CoreDNS is not present
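for illustration, a minimal reproduction sketch (hypothetical commands; pod and node names are illustrative, and the exact failure output may vary):

```shell
# start a two-node cluster with the kindnet cni
minikube start --nodes=2 --cni=kindnet

# pin a test pod to the node that does not host CoreDNS (minikube-m02 here)
kubectl run busybox --image=busybox:1.28 --restart=Never \
  --overrides='{"apiVersion":"v1","spec":{"nodeName":"minikube-m02"}}' -- sleep 3600

# dns lookups from that pod time out: the pod (10.244.x.x) cannot reach
# CoreDNS, which came up on the 10.88.0.x podman bridge network instead
kubectl exec busybox -- nslookup kubernetes.default
```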
explanation
as also discussed in #8480, #10384, containers/podman#2370 => cri-o/cri-o#2121, cri-o/cri-o#2411, and specifically in k8s documentation:
Network Plugin Requirements:
and also because of:
kubeadm "Init workflow" - step 8:
and so, yes, because eg `87-podman-bridge.conflist` already exists in `/etc/cni/net.d`, it gets picked up by kubelet, and during `kubeadm init phase addon` CoreDNS is brought up immediately within the 'wrong' (ie, 10.88.0.x) network
note: other pods (created after the cluster is up and the requested cni addon is deployed) will comply and get their network from the cni and the defined Pod CIDR range of 10.244.0.0/24, but will not be able to reach CoreDNS (residing in another network) on the other node (usually the master node, but it can 'migrate' to a minion node after a restart)
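for reference, the default podman bridge config that gets picked up looks roughly like this (abridged, from memory of a typical podman install; exact contents vary by version) - note the 10.88.0.0/16 subnet that CoreDNS ends up in:

```json
{
  "cniVersion": "0.4.0",
  "name": "podman",
  "plugins": [
    {
      "type": "bridge",
      "bridge": "cni-podman0",
      "isGateway": true,
      "ipMasq": true,
      "ipam": {
        "type": "host-local",
        "ranges": [[{ "subnet": "10.88.0.0/16", "gateway": "10.88.0.1" }]]
      }
    }
  ]
}
```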
proposal
the idea with this pr is to test an alternative to emptying/restoring `/etc/cni/net.d` before/after `kubeadm init`: use a separate (empty) directory for the cni configuration, thus forcing CoreDNS to wait for the cni to initialise first and only then get the 'right' ip for itself (btw, while waiting, the CoreDNS deployment will be rescaled to 1, so it'll start with one instance only)
this can be achieved by adapting the cni deployment, just by changing the `volumes` / `name: cni-cfg` `hostPath` / `path` from the default `/etc/cni/net.d` to eg `/etc/cni/net.mk` (configurable via `cni.CustomCNIConfDir` and/or the `--extra-config=kubelet.cni-conf-dir` minikube flag), and supplying this custom path also to kubelet via the `--cni-conf-dir` flag (see the sketch below)
note: `type: DirectoryOrCreate` should also be added to the cni deployment for this hostPath, if not already there (as is the case with kindnet; some other cnis already have it)
also, this pr contains a couple of other smaller fixes related to the TestMultiNode/serial/DeployApp2Nodes dns test so that it works as expected
multinode-demo
note: i've also updated the multinode-demo deployment yaml so that it actually creates pods on different nodes (as intended), and updated the example on the website to reflect it
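a common way to achieve that spread (a sketch only; the mechanism actually used in the updated yaml may differ) is pod anti-affinity keyed on hostname:

```yaml
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: hello           # hypothetical label for the demo deployment
        topologyKey: kubernetes.io/hostname
```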