resolvConf value ignored if systemd-resolved active - override value exhibits race condition between kubelet and systemd-networkd #2111

Closed
hickeng opened this issue Apr 22, 2020 · 9 comments · Fixed by kubernetes/kubernetes#90394
Labels: area/UX, kind/bug, priority/important-longterm

hickeng commented Apr 22, 2020

What keywords did you search in kubeadm issues before filing this one?

resolvConf, resolv-conf, resolved, dns

Is this a BUG REPORT or FEATURE REQUEST?

BUG REPORT

Versions

kubeadm version (use kubeadm version):

kubeadm version: &version.Info{Major:"1", Minor:"17+", GitVersion:"v1.17.4-2+a00aae1e6a4a69", GitCommit:"a00aae1e6a4a698595445ec86aab1502a495c1ce", GitTreeState:"clean", BuildDate:"2020-04-21T14:37:28Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"17+", GitVersion:"v1.17.4-2+a00aae1e6a4a69", GitCommit:"a00aae1e6a4a698595445ec86aab1502a495c1ce", GitTreeState:"clean", BuildDate:"2020-04-21T14:38:23Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"17+", GitVersion:"v1.17.4-2+a00aae1e6a4a69", GitCommit:"a00aae1e6a4a698595445ec86aab1502a495c1ce", GitTreeState:"clean", BuildDate:"2020-04-21T14:36:02Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration:
n/a
  • OS (e.g. from /etc/os-release):
VMware Photon OS 3.0
  • Kernel (e.g. uname -a):
Linux 42066f5b2159256bef8e84ca8ff4e219 4.19.112-1.ph3-esx #1-photon SMP Fri Mar 27 09:35:09 UTC 2020 x86_64 GNU/Linux
  • Others:

What happened?

When systemd-resolved is enabled, kubeadm ignores the value specified in resolvConf in favour of the systemd-managed file /run/systemd/resolve/resolv.conf.
The specific value used in this case was:

resolvConf: /run/systemd/resolve/stub-resolv.conf

This is a problem for two reasons:

  1. I want to use the stub resolver in this instance and need a way of specifying it.
  2. Direct use of /run/systemd/resolve/resolv.conf introduces a race between systemd and the kubelet. We have observed intermittent instances of containers being created with an /etc/resolv.conf (inside the container) that contains only the leading comment block and no DNS entries. The hypothesis is that the kubelet is racing with systemd regenerating the file.

In an environment with DHCP-configured DNS, running systemctl restart systemd-networkd in a separate shell produces the output below from an inotifywait loop watching /run/systemd/resolve. The file is regenerated in multiple steps (7 in this case), and all but the last are missing the DNS servers.

root@42066f5b2159256bef8e84ca8ff4e219 [ /run/systemd/resolve ]# while inotifywait -e modify -e create -e close_write  /run/systemd/resolve; do cat resolv.conf;done
Setting up watches.
Watches established.
/run/systemd/resolve/ CREATE .#resolv.confJJJx5C
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients directly to
# all known uplink DNS servers. This file lists all configured search domains.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

# No DNS servers known.
Setting up watches.
Watches established.
/run/systemd/resolve/ CREATE .#resolv.confFjPqMZ
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients directly to
# all known uplink DNS servers. This file lists all configured search domains.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

# No DNS servers known.
Setting up watches.
Watches established.
/run/systemd/resolve/ CREATE .#resolv.confPJxJG8
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients directly to
# all known uplink DNS servers. This file lists all configured search domains.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

Setting up watches.
Watches established.
/run/systemd/resolve/ CREATE .#resolv.confTlGXGk
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients directly to
# all known uplink DNS servers. This file lists all configured search domains.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

Setting up watches.
Watches established.
/run/systemd/resolve/ CREATE .#resolv.confpLAKTU
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients directly to
# all known uplink DNS servers. This file lists all configured search domains.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

Setting up watches.
Watches established.
/run/systemd/resolve/ CREATE .#resolv.confvZGw3m
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients directly to
# all known uplink DNS servers. This file lists all configured search domains.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

Setting up watches.
Watches established.
/run/systemd/resolve/ CREATE .#resolv.confPXyK0J
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients directly to
# all known uplink DNS servers. This file lists all configured search domains.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

nameserver 10.195.12.31
nameserver 10.172.40.1
Setting up watches.
Watches established.
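
To make the difference between the intermediate and final states in the output above concrete, here is a minimal sketch that extracts the nameserver entries from resolv.conf text. The function name and the two shortened snapshots are illustrative assumptions, not part of kubeadm or the kubelet.

```python
from typing import List


def nameservers(resolv_conf_text: str) -> List[str]:
    """Return the addresses from 'nameserver' lines, ignoring comments."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        # resolv.conf comments start with '#' or ';'
        if not line or line.startswith(("#", ";")):
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
    return servers


# Shortened copies of two states seen in the inotifywait output (assumption:
# the header comment block is abbreviated to a single line here).
INTERMEDIATE = """\
# This file is managed by man:systemd-resolved(8). Do not edit.

# No DNS servers known.
"""

FINAL = """\
# This file is managed by man:systemd-resolved(8). Do not edit.

nameserver 10.195.12.31
nameserver 10.172.40.1
"""

if __name__ == "__main__":
    print(nameservers(INTERMEDIATE))  # []
    print(nameservers(FINAL))         # ['10.195.12.31', '10.172.40.1']
```

A container whose resolv.conf is derived from one of the intermediate states would end up with the empty list.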

What you expected to happen?

kubeadm honours the explicit value when present in the config.
kubeadm documents the race with systemd-networkd, or chooses a different means of supplying DNS.

How to reproduce it (as minimally and precisely as possible)?

On a system with systemd-resolved enabled, specify resolvConf in kubeadm.yaml:

kind: KubeletConfiguration
metadata:
  name: kubeadm-kubelet
resolvConf: /run/systemd/resolve/stub-resolv.conf

The generated /var/lib/kubelet/kubeadm-flags.env file contains:
--resolv-conf=/run/systemd/resolve/resolv.conf
instead of:
--resolv-conf=/run/systemd/resolve/stub-resolv.conf

Anything else we need to know?

https://github.com/kubernetes/kubernetes/blob/8d8aa39598534325ad77120c120a22b3a990b5ea/cmd/kubeadm/app/phases/kubelet/flags.go#L113

This behaviour was added in kubernetes/kubernetes#64665

neolit123 (Member) commented Apr 22, 2020

> The generated /var/lib/kubelet/kubeadm-flags.env file contains:
> --resolv-conf=/run/systemd/resolve/resolv.conf
> instead of:
> --resolv-conf=/run/systemd/resolve/stub-resolv.conf

the kubelet is removing deprecated flags soon, so we need to stop adding --resolv-conf in /var/lib/kubelet/kubeadm-flags.env completely.

xref #949

neolit123 added the area/UX and kind/bug labels Apr 22, 2020
neolit123 added this to the v1.19 milestone Apr 22, 2020
neolit123 added the priority/important-longterm label Apr 22, 2020

SataQiu (Member) commented Apr 22, 2020

/assign

rosti commented Apr 22, 2020

> The generated /var/lib/kubelet/kubeadm-flags.env file contains:
> --resolv-conf=/run/systemd/resolve/resolv.conf
> instead of:
> --resolv-conf=/run/systemd/resolve/stub-resolv.conf
>
> the kubelet is removing deprecated flags soon, so we need to stop adding --resolv-conf in /var/lib/kubelet/kubeadm-flags.env completely.
>
> xref #949

Indeed. The original logic, however, still needs to be preserved: set resolvConf in the config to /run/systemd/resolve/resolv.conf, but only if users have not already set it.

However, if no stub override is provided, the original race between the kubelet and systemd-resolved would still remain.
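
The precedence rule described above (an explicit user value wins, otherwise fall back to the systemd-resolved file) can be sketched as follows. This is only an illustration in Python, not kubeadm's Go implementation; the function name and the `resolved_active` parameter are hypothetical.

```python
from typing import Optional

# Path taken from this issue; used as the default when systemd-resolved is active.
SYSTEMD_RESOLVED_CONF = "/run/systemd/resolve/resolv.conf"


def effective_resolv_conf(user_value: Optional[str], resolved_active: bool) -> Optional[str]:
    """Choose the resolvConf value, never overriding an explicit user setting."""
    if user_value:
        # The user set resolvConf explicitly: honour it as-is.
        return user_value
    if resolved_active:
        # No user value and systemd-resolved is running: use its generated file.
        return SYSTEMD_RESOLVED_CONF
    # Nothing to set; leave the kubelet's own default in place.
    return None


if __name__ == "__main__":
    stub = "/run/systemd/resolve/stub-resolv.conf"
    print(effective_resolv_conf(stub, resolved_active=True))
    print(effective_resolv_conf(None, resolved_active=True))
    print(effective_resolv_conf(None, resolved_active=False))
```

With this ordering, the stub-resolv.conf value from the bug report would be kept rather than replaced.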

rosti commented Apr 22, 2020

So it seems that we are missing these from the systemd service file for the kubelet:

After=network-online.target
Wants=network-online.target

rosti commented Apr 23, 2020

@hickeng about the race between kubelet and systemd-resolved, PhotonOS 3.0 uses systemd v239. Digging through the source code of that version of systemd, I came to the conclusion that it's unlikely for a partial /run/systemd/resolve/resolv.conf to be visible.
They seem to compose the file, flush it out to a temporary file, and then rename that over the old one.
What I am thinking here is, that the systemd-resolved is unable to detect DNS servers for some reason (probably connected to the systemd-networkd restart).

Relevant systemd source code can be found here:
https://github.com/systemd/systemd/blob/de7436b02badc82200dc127ff190b8155769b8e7/src/resolve/resolved-resolv-conf.c#L305
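
The write-to-temporary-then-rename pattern described above is why readers should never see a half-written file, and it is consistent with the CREATE events for `.#resolv.conf`-prefixed names in the inotifywait output. A minimal Python sketch of the same pattern follows; it is an illustration of the technique, not a translation of the systemd code, and the function name and `.#` prefix choice are assumptions.

```python
import os
import tempfile


def write_atomically(path: str, content: str) -> None:
    """Replace `path` with `content` so readers see the old or the new file, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live in the same directory (same filesystem)
    # for the final rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".#")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(content)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace maps to rename(2) on POSIX, which swaps the file atomically.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "resolv.conf")
        write_atomically(target, "# No DNS servers known.\n")
        write_atomically(target, "nameserver 10.195.12.31\n")
        with open(target) as f:
            print(f.read(), end="")
```

Note that atomicity only guarantees each version is complete. A complete file that lists no DNS servers is still a valid intermediate state.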

hickeng (Author) commented May 11, 2020

> @hickeng about the race between kubelet and systemd-resolved, PhotonOS 3.0 uses systemd v239. Digging through the source code of that version of systemd, I came to the conclusion that it's unlikely for a partial /run/systemd/resolve/resolv.conf to be visible.

Completely agree - this isn’t a partially written file being accessed, it’s a file with contents that are not what we expect.

> What I am thinking here is, that the systemd-resolved is unable to detect DNS servers for some reason (probably connected to the systemd-networkd restart).

I don’t think it’s that it cannot detect DNS, but that it clears the DNS servers and reapplies them. Quite possibly this window doesn’t exist with a static IP/DNS config.
It is likely waiting until the DHCP lease is renewed before adding the DHCP-derived DNS entries. This is completely sane for systemd in isolation.

The problem is that the kubelet can create a container in that time frame, and that container gets an empty resolv.conf which is not fixed up when systemd updates the file.

While this isn’t easy to fix, I think it’s still an issue, particularly because a network restart is likely to induce restarts of workloads that use DNS and networking, meaning a probable surge of newly created containers during the window of interest.
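
To illustrate the window described above, here is a hypothetical guard that polls resolv.conf content until it lists at least one nameserver or a timeout expires. This is not something the kubelet does; the function name, parameters, and defaults are assumptions made for the sketch.

```python
import time
from typing import Callable, List


def wait_for_nameservers(
    read_text: Callable[[], str],
    timeout: float = 10.0,
    interval: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll `read_text` until it yields nameserver entries or `timeout` seconds pass.

    Returns the nameservers found, or an empty list on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        servers = []
        for line in read_text().splitlines():
            fields = line.split()
            if len(fields) >= 2 and fields[0] == "nameserver":
                servers.append(fields[1])
        if servers or time.monotonic() >= deadline:
            return servers
        sleep(interval)


if __name__ == "__main__":
    # Simulated sequence of file states, mirroring the steps seen in the report.
    states = iter([
        "# No DNS servers known.\n",
        "# header only\n",
        "nameserver 10.195.12.31\nnameserver 10.172.40.1\n",
    ])
    print(wait_for_nameservers(lambda: next(states), interval=0.0))
```

In real use `read_text` would read the configured resolv.conf path. The reader is injected here so the sketch can be exercised without touching the host.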

> Relevant systemd source code can be found here:
>
> https://github.com/systemd/systemd/blob/de7436b02badc82200dc127ff190b8155769b8e7/src/resolve/resolved-resolv-conf.c#L305

fungaren commented Sep 19, 2023

The code here seems not correct:

https://github.com/kubernetes/kubernetes/blob/v1.28.2/pkg/kubelet/network/dns/dns.go#L225-L274

If the resolv.conf is empty, an empty array and no error are returned.

Then, if the dnsPolicy is default (HostDNS), 127.0.0.1 will be used as the DNS server:

https://github.com/kubernetes/kubernetes/blob/v1.28.2/pkg/kubelet/network/dns/dns.go#L316-L317

Then, containerd will copy the /etc/resolv.conf from the host:

https://github.com/containerd/containerd/blob/v1.7.6/pkg/cri/server/sandbox_run_linux.go#L282-L287

Finally, that will cause a forwarding loop in CoreDNS.

neolit123 (Member) commented Sep 19, 2023

> The code here seems not correct:

kubelet bugs must be reported in kubernetes/kubernetes.
sig node and sig network own that code.

fungaren commented

Okay, I created: kubernetes/kubernetes#120748
