
ClusterIP services not accessible when using flannel CNI from host machines in Kubernetes #1243

Closed
nonsense opened this issue Jan 13, 2020 · 47 comments

@nonsense

nonsense commented Jan 13, 2020

I am trying to access a Kubernetes service through its ClusterIP, from a pod that is attached to its host's network and has access to DNS, with:

  hostNetwork: true
  dnsPolicy: ClusterFirstWithHostNet

However, the host machine has no IP routes set up for the service CIDR, for example:

➜  ~ k get services
NAME             TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
kubernetes       ClusterIP   100.64.0.1      <none>        443/TCP    25m
redis-headless   ClusterIP   None            <none>        6379/TCP   19m
redis-master     ClusterIP   100.64.63.204   <none>        6379/TCP   19m
➜  ~ k get pods -o wide
NAME                       READY   STATUS      RESTARTS   AGE   IP              NODE                                             NOMINATED NODE   READINESS GATES
redis-master-0             1/1     Running     0          18m   100.96.1.3      ip-172-20-39-241.eu-central-1.compute.internal   <none>           <none>
root@ip-172-20-39-241:/home/admin# ip route
default via 172.20.32.1 dev eth0
10.32.0.0/12 dev weave proto kernel scope link src 10.46.0.0
100.96.0.0/24 via 100.96.0.0 dev flannel.11 onlink
100.96.1.0/24 dev cni0 proto kernel scope link src 100.96.1.1
100.96.2.0/24 via 100.96.2.0 dev flannel.11 onlink
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
172.20.32.0/19 dev eth0 proto kernel scope link src 172.20.39.241

Expected Behavior

I expect to be able to reach services running on Kubernetes from the host machines, but I can only access headless services, i.e. those that resolve to a pod IP.

The pod CIDR has IP routes set up, but the service CIDR doesn't.

Current Behavior

Services can't be accessed through their ClusterIPs from the host network.

Possible Solution

If I manually add an IP route to 100.64.0.0/16 via 100.96.1.1, ClusterIPs become accessible, but this route is not there by default.
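For reference, the manual workaround on a node is a single route command. This is only a sketch using the CIDR and cni0 address from the output above; both differ per cluster, and the route does not survive a reboot:

# route the service CIDR via the node's cni0 bridge address
ip route add 100.64.0.0/16 via 100.96.1.1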

Your Environment

  • Flannel version: v0.11.0
  • kops version: Version 1.17.0-alpha.1 (git-501baf7e5)
  • Backend used (e.g. vxlan or udp): vxlan
  • Kubernetes version (if used):
  • Operating System and version:
  • Link to your project (optional):
@nonsense changed the title from "ClusterIP services not accessible when using flannel from host machines" to "ClusterIP services not accessible when using flannel from host machines in Kubernetes" on Jan 13, 2020
@nonsense changed the title from "ClusterIP services not accessible when using flannel from host machines in Kubernetes" to "ClusterIP services not accessible when using flannel CNI from host machines in Kubernetes" on Jan 13, 2020
@choryuidentify

Exactly the same as my experience. My setup is Kubernetes 1.17.2 + Flannel.
When hostNetwork: true is set, this behavior appears.

@nonsense
Author

nonsense commented Feb 6, 2020

Our workaround is to manually add the route to DNS through a DaemonSet as soon as there is at least one pod running on all workers (so that the cni0 interface appears).
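Roughly, such a DaemonSet can be applied from a shell heredoc. This is only a minimal sketch of the idea, not the actual manifest linked later in the thread; the service CIDR (100.64.0.0/16), image, and names are assumptions to adjust for your cluster:

kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: service-cidr-route
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: service-cidr-route
  template:
    metadata:
      labels:
        app: service-cidr-route
    spec:
      hostNetwork: true
      tolerations:
        - operator: Exists
      containers:
        - name: add-route
          image: busybox:1.36
          securityContext:
            capabilities:
              add: ["NET_ADMIN"]
          command:
            - sh
            - -c
            - |
              # wait for the cni0 bridge, then add the service-CIDR route on the host
              until ip route add 100.64.0.0/16 dev cni0 2>/dev/null; do
                ip route | grep -q '^100.64.0.0/16' && break
                echo 'waiting for cni0...'
                sleep 10
              done
              sleep 2147483647
EOF

Because the pod runs with hostNetwork: true and NET_ADMIN, the route lands in the host's routing table; it still depends on cni0 existing, which is why at least one regular pod has to be scheduled on the node first.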

@MansM

MansM commented Feb 15, 2020

issue on kubernetes/kubernetes:
kubernetes/kubernetes#87852

Our workaround is to manually add the route to DNS through a DaemonSet as soon as there is at least one pod running on all workers (so that the cni0 interface appears).

@nonsense have an example?

@MansM

MansM commented Feb 15, 2020

using @mikebryant 's workaround did the trick for me now:
#1245 (comment)

@rdxmb

rdxmb commented Feb 21, 2020

I just changed to host-gw and only then realized that the problem was much bigger than I supposed: there is a major routing problem with Kubernetes 1.17 and flannel with vxlan, which affects ClusterIPs, NodePorts and even LoadBalancer IPs managed by MetalLB.

Changing to host-gw fixes all of them. I wonder why this is not fixed or at least documented in a very prominent way.

Here is my report of the response time of a minio service (in seconds) before and after the change. The checks run on the nodes themselves.

[screenshots: minio service response times before and after switching to host-gw]

@rdxmb

rdxmb commented Feb 21, 2020

In a second datacenter, the response time was even more than a minute. I had to increase the monitoring timeout to get these values.

[screenshots: response times in the second datacenter]

@nonsense
Author

issue on kubernetes/kubernetes:
kubernetes/kubernetes#87852

Our workaround is to manually add the route to DNS through a DaemonSet as soon as there is at least one pod running on all workers (so that the cni0 interface appears).

@nonsense have an example?

Yes, here it is: https://github.com/ipfs/testground/blob/master/infra/k8s/sidecar.yaml#L23

Note that this won't work unless you have one pod on every host (i.e. another DaemonSet), so that cni0 exists. I know this is a hack, but I don't have a better solution.

In our case the first pod we expect on every host is s3fs - https://github.com/ipfs/testground/blob/master/infra/k8s/kops-weave/s3bucket.yml

@MansM

MansM commented Feb 21, 2020

@nonsense I fixed it by changing the backend of flannel to host-gw instead of vxlan:

kubectl edit cm -n kube-system kube-flannel-cfg
  • replace vxlan with host-gw
  • save
  • not sure if needed, but I did it anyway: kubectl delete pods -l app=flannel -n kube-system

Maybe this works for you as well. (A scripted version of these steps is sketched below.)
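For anyone who prefers to script those steps, a rough equivalent is below. It assumes the default kube-flannel-cfg layout, where net-conf.json contains "Type": "vxlan"; check your ConfigMap before piping anything back in:

# switch the flannel backend from vxlan to host-gw, then restart the flannel pods
kubectl -n kube-system get cm kube-flannel-cfg -o yaml \
  | sed 's/"Type": "vxlan"/"Type": "host-gw"/' \
  | kubectl apply -f -
kubectl -n kube-system delete pods -l app=flannel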

@MansM

MansM commented Mar 9, 2020

I'm setting up a new cluster with flannel and am not able to get any communication to work. I tried the host-gw change:

kubectl edit cm -n kube-system kube-flannel-cfg
  • replace vxlan with host-gw
  • save
  • not sure if needed, but I did it anyway: kubectl delete pods -l app=flannel -n kube-system

maybe this works for you as well

but the issue persists. Would there be additional changes required? This is just a basic cluster setup and flannel configuration, all from scratch.

If you have issues with all network traffic, and not just with reaching services from pods with hostNetwork: true, you have some other issue.

@archever

archever commented Mar 9, 2020

The same problem here. Adding a route to cni0 fixed it for me:

ip r add 10.96.0.0/16 dev cni0

@tobiasvdp

The host-gw option is only possible on infrastructures that support layer 2 interaction; most cloud providers don't.

@davesargrad

Hi. It turns out that host-gw fixed my problem as well: #1268. To me this is a critical bug somewhere in the vxlan-based pipeline.

@Capitrium

I had similar issues after upgrading our cluster from 1.16.x to 1.17.x (specifically uswitch/kiam#378). Using host-gw is not an option for me as our cluster runs on AWS, but I was able to fix it by reverting kube-proxy back to 1.16.8.

I also can't reproduce this issue on our dev cluster after replacing kube-proxy with kube-router running in service-proxy mode (tested with v1.0.0-rc1).

Could this issue be caused by changes in kube-proxy?

@mariusgrigoriu

Just curious, how many folks running into this issue are using hyperkube?

@tkislan

tkislan commented Mar 31, 2020

I tried reverting from 1.17.3 to 1.16.8, but I was still experiencing the same problem.
The only way to fix this is to have a DaemonSet running that calls ip r add 10.96.0.0/12 dev cni0 on every node to fix the routing. After that, it starts to route correctly.

@LuckySB

LuckySB commented Apr 11, 2020

Tried this on a node and in a pod with hostNetwork: true (pod network 10.244.2.0/24); CoreDNS is running on another node with pod network 10.244.1.0/24.

Without ip route add 10.96.0.0/16, the IP packet is sent to the CoreDNS pod with source IP 10.244.2.0:

IP 10.244.2.0.31782 > 10.244.1.3.domain: 38996+ [1au] A? kubernetes.default. (59)

and tcpdump does not show this packet on the other side of the vxlan tunnel (tcpdump -ni flannel.1).

With the route

10.96.0.0/16 dev cni0 scope link

the source IP changes to the address of cni0, not the flannel.1 interface:

4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
    link/ether 0a:af:85:2e:82:f5 brd ff:ff:ff:ff:ff:ff
    inet 10.244.2.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever
5: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default qlen 1000
    link/ether ee:39:df:66:22:f3 brd ff:ff:ff:ff:ff:ff
    inet 10.244.2.1/24 scope global cni0
       valid_lft forever preferred_lft forever

and access to the service network works fine.

Well, direct access to the DNS pod works: dig @10.244.1.8 kubernetes.default.svc.cluster.local succeeds, and tcpdump shows the UDP request with source address 10.244.2.0, but access to the cluster IP 10.96.0.10 does not!

I tried removing the iptables rule created by kube-proxy:

iptables -t nat -D POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING

and I get an answer from CoreDNS:


;; ANSWER SECTION:
kubernetes.default.svc.cluster.local. 30 IN A   10.96.0.1

It also works with:
iptables -t nat -I POSTROUTING 1 -o eth0 -j ACCEPT
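As a read-only sanity check before deleting rules like that, the kube-proxy chain in question can simply be listed (the chain names here are the standard kube-proxy ones):

# show the jump from POSTROUTING and the contents of kube-proxy's chain
iptables -t nat -S POSTROUTING
iptables -t nat -L KUBE-POSTROUTING -n -v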

@Gacko
Contributor

Gacko commented May 6, 2020

Sorry for being late to the party... I just installed a clean v1.17 cluster, and there are no duplicate iptables rules in there, so it seems like they only occur after upgrading. Anyway, the issue persists. I'll continue investigating...

@zhangguanzhang
Contributor

See this: kubernetes/kubernetes#88986 (comment)

@kubealex
Contributor

kubealex commented Jun 6, 2020

Just a side note, the issue doesn't happen on the node where the pods balanced by the service are deployed:

NAME                                        READY   STATUS      RESTARTS   AGE   IP           NODE                   NOMINATED NODE   READINESS GATES
ingress-nginx-admission-create-fppsm        0/1     Completed   0          26m   10.244.2.2   k8s-worker-0.k8s.lab   <none>           <none>
ingress-nginx-admission-patch-xnfcw         0/1     Completed   0          26m   10.244.2.3   k8s-worker-0.k8s.lab   <none>           <none>
ingress-nginx-controller-69fb496d7d-2k594   1/1     Running     0          26m   10.244.2.6   k8s-worker-0.k8s.lab   <none>           <none>
[kube@k8s-worker-0 ~]$ curl 10.100.76.252
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.17.10</center>
</body>
</html>
[kube@k8s-master-0 ~]$ curl 10.100.76.252
^C

@malikbenkirane

malikbenkirane commented Jun 25, 2020

Our workaround is to manually add the route to DNS through a DaemonSet as soon as there is at least one pod running on all workers (so that the cni0 interface appears).

@nonsense could you please provide another example manifest for this

Yes, here it is: https://github.com/ipfs/testground/blob/master/infra/k8s/sidecar.yaml#L23

That link now returns a 404.

@nonsense
Author

nonsense commented Jun 25, 2020

@malikbenkirane change ipfs/testground to testground/infra - repo moved - https://github.com/testground/infra/blob/master/k8s/sidecar.yaml

@malikbenkirane

@malikbenkirane change ipfs/testground to testground/infra - repo moved - https://github.com/testground/infra/blob/master/k8s/sidecar.yaml

Thanks, I like the idea. Though I've found that using Calico rather than flannel works for me. I just set --flannel-backend=none and followed the Calico k3s steps, changing the pod CIDR accordingly.

@mohideen

I had the same issue on an HA cluster provisioned by kubeadm with RHEL7 nodes. Both options (turning off tx-checksum-ip-generic / switching from vxlan to host-gw) worked. I settled on the host-gw option.

This did not affect a RHEL8 cluster provisioned by kubeadm (though that was not an HA cluster).
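For completeness, the checksum-offload workaround mentioned above is usually applied per node with ethtool. The interface name is an assumption (flannel.1 is the usual vxlan device; this thread also shows flannel.11 on kops), and the setting does not persist across reboots or re-creation of the interface:

# disable TX checksum offload on flannel's vxlan device (run on every node)
ethtool -K flannel.1 tx-checksum-ip-generic off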

@Gacko
Contributor

Gacko commented Jul 23, 2020

I guess this can be closed since the related issues have been fixed in Kubernetes.

@rdxmb

rdxmb commented Jul 27, 2020

@Gacko could you link the issue/PR for that, please?

@rafzei

rafzei commented Jul 30, 2020

@rdxmb this one: #92035 and changelog

@rdxmb

rdxmb commented Jul 31, 2020

@rafzei thanks 👍

@muthu31kumar

+1

@immanuelfodor

I've bumped into the same issue with an RKE v1.19.3 Kubernetes cluster running on CentOS 8 with firewalld completely disabled. The CNI plugin is Canal, which uses both Flannel and Calico. Only pods running with hostNetwork: true and ClusterFirstWithHostNet were affected: they couldn't get DNS resolution on nodes that weren't running a CoreDNS pod. As I had 3 nodes and my CoreDNS replica count was set to 2 by the autoscaler, only pods on the 3rd node were affected.

As RKE doesn't support manual CoreDNS autoscaling parameters (open issue here: rancher/rke#2247), my solution was to explicitly set the Flannel backend to host-gw instead of the implicit vxlan in the RKE cluster.yml file. See the docs here: https://rancher.com/docs/rke/latest/en/config-options/add-ons/network-plugins/#canal-network-plug-in-options

After that, I did an rke up to apply the changes, but it did not have any effect at first, so I also needed to reboot all nodes to fix the issue. Now all pods with hostNetwork: true and ClusterFirstWithHostNet on all nodes are working fine.

 network:
   plugin: canal
-  options: {}
+  options:
+    # workaround to get hostnetworked pods DNS resolution working on nodes that don't have a CoreDNS replica running
+    # do the rke up then reboot all nodes to apply
+    # @see: https://github.com/coreos/flannel/issues/1243#issuecomment-589542796
+    # @see: https://rancher.com/docs/rke/latest/en/config-options/add-ons/network-plugins/
+    canal_flannel_backend_type: host-gw
   mtu: 0
   node_selector: {}
   update_strategy: null

@Hitendraverma

I am also getting an intermittent issue while running StatefulSets in Kubernetes with hostNetwork:
hostNetwork: true
dnsPolicy: ClusterFirstWithHostNet

I resolved it by doing the following:

  • Migrated from kube-dns to CoreDNS.
  • Added Envoy as a permanent fix for the intermittent DNS lookup issue.

Also, you can temporarily fix this by running your DNS pod on the same node as your application pod.
Schedule the DNS pod with a node selector, or mark your other nodes SchedulingDisabled.
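For illustration, pinning CoreDNS next to the application pod could look like the following; the node names are placeholders and the deployment name assumes a standard CoreDNS install in kube-system:

# pin the CoreDNS deployment to one node via a nodeSelector...
kubectl -n kube-system patch deployment coredns \
  -p '{"spec":{"template":{"spec":{"nodeSelector":{"kubernetes.io/hostname":"worker-1"}}}}}'
# ...or stop the other nodes from receiving new pods instead
kubectl cordon worker-2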

@legoguy1000

I just upgraded flannel from v0.17.0 to v0.20.1 on our bare-metal Kubernetes cluster (v1.23.13, running on physical servers) and am having this issue. My pods with hostNetwork: true can't connect to any other service via ClusterIPs. I fixed it by adding the static route via the cni0 interface as suggested in #1243 (comment).

@stale

stale bot commented May 5, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
