Can't access anything on the 10.43.x.x range #1247
This was happening to me while trying to "auto-install" a Helm Chart via
The helm-install pod was scheduled to a Raspberry Pi 4 node. Deleting the pod caused it to be rescheduled onto an x86_64 node, where it ran fine (this is a mixed CPU architecture cluster).
If you run iptables-save, it tells you that you need to run iptables-legacy-save to see the rest of the rules. That's where I am seeing all of the Kubernetes rules listed.
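A self-contained illustration of that split-backend symptom. The rule line below is a made-up sample in the shape kube-proxy emits; on a real node you would pipe the live `iptables-save` / `iptables-legacy-save` output instead of the sample variable:

```shell
# Sample kube-proxy style rule; counting '-A KUBE-...' lines per backend
# tells you which iptables backend actually holds the cluster's rules.
sample='-A KUBE-SERVICES -d 10.43.0.10/32 -p udp -m udp --dport 53 -j KUBE-SVC-DNS'
printf '%s\n' "$sample" | grep -c '^-A KUBE'   # prints 1

# On a node, compare:
#   iptables-save        | grep -c '^-A KUBE'
#   iptables-legacy-save | grep -c '^-A KUBE'
```

If the legacy backend holds the KUBE chains while the system tooling defaults to nft (or vice versa), traffic to the 10.43.x.x service range silently fails.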
After being plagued by the issue for several days, I decided to try removing iptables, and all of a sudden my services are working and I am able to get DNS resolution on everything. The issue linked above, #977, talks about some iptables conflicts and the placement of the REJECT rule. I haven't dug into the correct rule order on Raspbian yet, but this was a quick fix for me to get everything up and running.
Hi there, is there any news here? I am experiencing this issue and have been trying to solve it for several weeks, on and off, with no success. I see things like
in the Traefik log or
in the coredns log.
I have tried
I run |
I got the same issue with Ubuntu 20! All 10.43.x.x IPs seem not to be working. Any ideas / solutions on this? EDIT:
K3s (I'm not sure which component exactly) regenerates its iptables rules every 30 seconds. Currently I have the issue with Traefik for some ingresses, but not all. I'm still digging to find out how the rules are generated so I can diagnose it.
The kubelet is responsible for most of the forwarding rules. The remainder are handled by the network policy controller, although its tables will likely be empty if you don't have any policies restricting communication in your cluster. Do you perhaps have a host-based firewall (ufw, firewalld, etc.) enabled?
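Two quick checks for the host firewalls mentioned above. Both commands are standard; either tool may be absent depending on distro, hence the fallback messages:

```shell
# Is firewalld or ufw active? Either one can silently drop traffic
# to the 10.43.x.x service range.
systemctl is-active firewalld 2>/dev/null || echo "firewalld: not active or not installed"
command -v ufw >/dev/null 2>&1 && ufw status || echo "ufw: not installed (or needs root)"
```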
In my case, there was no problem at all: everything worked as expected, but I just did not know it.
Ah yeah. Kubernetes network policy would definitely block it, by design. |
I found that the problem on my Arch ARM and regular Arch machines is related to #1812. It doesn't look like the
But it's nft, not legacy:
A quick workaround is to change the links and then restart your cluster:
You can test whether it's working by doing this
If it's broken you'll get
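The commands themselves did not survive in the thread, but from the surrounding context (#1812, the nft-vs-legacy discussion) the workaround is pointing the iptables symlinks at the legacy binaries. A hedged reconstruction, demonstrated in a scratch directory so it is safe to run as-is; on a real node the directory would be /usr/bin, you would run as root, and you would restart k3s afterwards:

```shell
# Scratch-dir stand-in for /usr/bin; create fake binaries so ln has targets.
BIN=$(mktemp -d)
touch "$BIN/iptables-legacy" "$BIN/ip6tables-legacy"

# The actual swap: point iptables/ip6tables at the legacy backend.
ln -sf "$BIN/iptables-legacy"  "$BIN/iptables"
ln -sf "$BIN/ip6tables-legacy" "$BIN/ip6tables"

readlink "$BIN/iptables"    # now resolves to .../iptables-legacy
# On the real node, finish with: systemctl restart k3s
```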
|
I just cleared out the cluster and ran k3s-uninstall.sh on all nodes. Then I did the following and rebooted to make sure there were no legacy rules left.
I also ensured the original symlinks were restored:
Then I brought the cluster back up. I'm using this Ansible role and example, except with one worker node, servicelb disabled, and traefik disabled: https://github.com/PyratLabs/ansible-role-k3s/blob/main/documentation/quickstart-ha-cluster.md Once it's back up I'm still getting
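One way to make sure no legacy rules survive the teardown (a hypothetical sketch, guarded so it is a no-op where iptables-legacy is not installed; run as root on each node after k3s-uninstall.sh, before reinstalling):

```shell
# Flush and delete any leftover legacy rules and chains in the usual tables.
for t in filter nat mangle; do
  if command -v iptables-legacy >/dev/null 2>&1; then
    iptables-legacy  -t "$t" -F; iptables-legacy  -t "$t" -X
    ip6tables-legacy -t "$t" -F; ip6tables-legacy -t "$t" -X
  fi
done
```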
I tested as described before, and google.com can't be resolved by the dnsutils pod. Then I went to each node and changed the symlinks for iptables and ip6tables to point to iptables-legacy and ip6tables-legacy, ran k3s-uninstall.sh on each node of the cluster, rebuilt the cluster again with Ansible, and then tested again. Now it resolves properly.
The rules check is here; can you compare the output on your systems? `(iptables-legacy-save || true; ip6tables-legacy-save || true) 2>/dev/null | grep '^-'`
I reset the cluster, updated all nodes, cleared all iptables rules, and re-installed iptables. I downloaded the
|
I'm curious what was initially there that led it to detect legacy iptables, though.
Okay, I think this might be this issue then? After re-installing iptables it changes the links, but the detect script is still saying nft.
To test this I just spun up the cluster again, and I can't resolve anything from a pod. It is creating legacy rules, though.
|
I found another package for Arch. After installing it I have this.
Spinning up the cluster now to see if this resolves things. |
Now after spinning up the cluster I get this
Which is good, I think. But something else is going on now. If I open a shell into a busybox container in the default namespace, I get an nslookup timeout. I can't ping the DNS server that nslookup is using, which is the IP of the kube-dns service.
Most distros have an update-alternatives script that you are supposed to use for this sort of thing, as opposed to symlinking things manually. You might check whether Arch has a similar tool that you're intended to use.
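For reference, the Debian/Ubuntu form of that advice (these alternatives groups exist on Debian Buster and later; Arch does not ship update-alternatives, which is the point of the comment above). The fallback echo only keeps the snippet from aborting when run unprivileged or where the group is absent:

```shell
# Select the legacy backend the supported way instead of hand-symlinking.
update-alternatives --set iptables  /usr/sbin/iptables-legacy  2>/dev/null \
  || echo "iptables: needs root, or no such alternatives group"
update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy 2>/dev/null \
  || echo "ip6tables: needs root, or no such alternatives group"
```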
I was only changing symlinks to test. When I removed iptables and installed iptables-nft, it removed all the old executables and symlinks, so everything is as intended on all nodes now. I'm going through these steps now, so hopefully that will shed some light on the problem: https://rancher.com/docs/rancher/v2.x/en/troubleshooting/dns/
I've re-imaged a few times now and tried both the iptables and iptables-nft packages. I'm pretty certain at this point that something is wrong with the iptables rules that k3s is adding, because from inside a pod I can resolve DNS if I set the server in nslookup to 1.1.1.1 or 8.8.8.8; I just can't communicate with the coredns service on 10.43.0.10. I don't know how it started working before, but it's certainly not working now, no matter what I try.
So in your current state, should your system be using iptables or nftables? Which one is k3s adding rules to? |
It doesn't work in either scenario, nft rules or legacy.
|
I was able to get this working, which uses standard k8s and iptables: https://github.com/raspbernetes/k8s-cluster-installation
I have exactly the same issue after a fresh installation of k3s on a fresh CentOS 8 VM (VirtualBox). Is k3s even supposed to work with CentOS 8?
On my CentOS 8 machine, the package |
Same issue on NixOS, |
Same issue, on RHEL 8.4 (aws ami) without iptables, k3s v1.20.2+k3s1 |
Oh my. I thought this didn't apply because I have no firewall. But the
Shouldn't disable-cloud-controller set to true disable both of the services below, if they are enabled? systemctl disable nm-cloud-setup.service nm-cloud-setup.timer
Network Manager's interference with container virtual interfaces is a separate issue from firewalld/ufw blocking traffic, so you need to ensure both are disabled.
@brandond, if we stop these services before the RKE2 install, would it still require a reboot? If not, then as part of our infra automation we will stop them, if they exist and are active, before the RKE2 install: systemctl disable nm-cloud-setup.service nm-cloud-setup.timer
IMO this issue has turned into a FAQ on network configuration. The comments either are, or should be, written into the documentation.
It is covered in the documentation; I linked it above.
This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 180 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions. |
Version:
v1.0.1
Describe the bug
The internal connections on the 10.43.x.x range don't seem to work. Legacy iptables is enabled; the system is Debian-based.
To Reproduce
Use the playbook from the contributions in the repo, update the version, and install only the server (master), not a node.
Expected behavior
K3S cluster installs and starts up
Actual behavior
Pods fail and time out on 10.43.x.x IPs.
Additional context
Error: Get https://10.43.0.1:443/api/v1/namespaces/kube-system/secrets?labelSelector=OWNER%!D(MISSING)TILLER: dial tcp 10.43.0.1:443: i/o timeout
panic: Get https://10.43.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication: dial tcp 10.43.0.1:443: i/o timeout
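The timeouts above can be probed directly from a node; 10.43.0.1 is the default in-cluster apiserver address on k3s's 10.43.0.0/16 service CIDR. A minimal hedged check (the 5-second timeout and fallback echo just keep it non-blocking):

```shell
# Probe the in-cluster apiserver VIP. On an affected cluster this hangs
# and times out rather than returning an HTTP response quickly.
curl -sk --connect-timeout 5 https://10.43.0.1:443/version || echo "unreachable"
```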