Feature request: support for Docker DNS proxy, or equivalent #370

Closed
Georgiy-Tugai opened this issue Jan 21, 2021 · 4 comments · Fixed by #375

@Georgiy-Tugai
Contributor

Georgiy-Tugai commented Jan 21, 2021

NB: I'm not sure if this is something specific to my setup.

For some reason, after updating my Gentoo system, DNS resolution from inside Docker containers managed by dfw in nftables mode ceased to work. It seems that Docker is attempting to create iptables forwarding rules for its DNS proxy, and failing.

time="2021-01-18T18:41:48+01:00" level=error msg="set up rule failed, [-t nat -I DOCKER_OUTPUT -d 127.0.0.11 -p udp --dport 53 -j DNAT --to-destination 127.0.0.11:53982]"
time="2021-01-18T18:41:48+01:00" level=error msg="set up rule failed, [-t nat -I DOCKER_POSTROUTING -s 127.0.0.11 -p udp --sport 53982 -j SNAT --to-source :53]"
time="2021-01-18T18:41:48+01:00" level=error msg="set up rule failed, [-t nat -I DOCKER_OUTPUT -d 127.0.0.11 -p tcp --dport 53 -j DNAT --to-destination 127.0.0.11:32987]"
time="2021-01-18T18:41:48+01:00" level=error msg="set up rule failed, [-t nat -I DOCKER_POSTROUTING -s 127.0.0.11 -p tcp --sport 32987 -j SNAT --to-source :53]"

I would like to request a dfw option for creating equivalent rules on docker's behalf. As the ports are randomized, this is tricky, but not impossible; a netstat -lp call from inside the container shows that dockerd is listening on two ports, one TCP and one UDP.

Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.11:32987        0.0.0.0:*               LISTEN      3624/dockerd
udp        0      0 127.0.0.11:53982        0.0.0.0:*                           3624/dockerd
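As a rough illustration of the idea, the listener ports could be pulled out of the netstat output and turned into the same rule arguments that Docker logged above. This is only a sketch: `dns_proxy_ports` and `equivalent_rules` are hypothetical helper names, not part of dfw, and the rule strings simply mirror the failing rules from the log.

```python
import re

# Matches listener lines from `netstat -lp` bound to Docker's DNS address, e.g.
# "tcp  0  0 127.0.0.11:32987  0.0.0.0:*  LISTEN  3624/dockerd"
LISTENER = re.compile(
    r"^(tcp|udp)\s+\d+\s+\d+\s+127\.0\.0\.11:(\d+)\s", re.MULTILINE
)

# Sample taken from the netstat output quoted in this issue.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.11:32987        0.0.0.0:*               LISTEN      3624/dockerd
udp        0      0 127.0.0.11:53982        0.0.0.0:*                           3624/dockerd
"""


def dns_proxy_ports(netstat_output):
    """Return a {protocol: port} mapping for listeners on 127.0.0.11."""
    return {proto: int(port) for proto, port in LISTENER.findall(netstat_output)}


def equivalent_rules(ports):
    """Build iptables argument strings shaped like the ones Docker failed to set up."""
    rules = []
    for proto in ("udp", "tcp"):
        port = ports[proto]
        rules.append(
            f"-t nat -I DOCKER_OUTPUT -d 127.0.0.11 -p {proto} --dport 53 "
            f"-j DNAT --to-destination 127.0.0.11:{port}"
        )
        rules.append(
            f"-t nat -I DOCKER_POSTROUTING -s 127.0.0.11 -p {proto} --sport {port} "
            f"-j SNAT --to-source :53"
        )
    return rules


if __name__ == "__main__":
    for rule in equivalent_rules(dns_proxy_ports(SAMPLE)):
        print(rule)
```

An nftables-mode implementation would need to emit nft rules rather than iptables arguments; the strings here only show which ports go where.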

Alternatively, dfw could run its own DNS proxy, obtaining container names from the network configuration data to provide 'equivalent' behaviour for docker stacks which use container names to connect services together.

For anyone who stumbles on this from Google, my temporary workaround is to bind-mount the host's /etc/resolv.conf file into the container, which prevents Docker from overriding the DNS with its 127.0.0.11 address. This allows me to run single containers, but not stacks.
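A minimal sketch of that workaround, written as a helper that builds the `docker run` command line. The function name and the `alpine` image are placeholders chosen for illustration; the read-only (`:ro`) mount is my own choice and not something the workaround requires.

```python
def run_with_host_resolv_conf(image, *command):
    """Build a `docker run` argv that bind-mounts the host's resolv.conf."""
    return [
        "docker", "run", "--rm",
        # Mount the host file over the container's copy, read-only.
        "-v", "/etc/resolv.conf:/etc/resolv.conf:ro",
        image,
        *command,
    ]


if __name__ == "__main__":
    # Placeholder image and command; pass the list to subprocess.run() to execute it.
    print(" ".join(run_with_host_resolv_conf("alpine", "nslookup", "example.com")))
```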


Looks like https://crates.io/crates/procfs gives us the netstat bits, but I'm not sure how that will interact with dfw itself being inside a container; may have to bind-mount the host's /proc or something?

Some playing with nsenter and netstat indicates that the answer to that seems to be yes; at least read-only access to /proc will be needed to get the network namespace of the other docker container.
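For reference, here is a sketch of what reading those ports straight from procfs might look like, without the procfs crate. It parses text in the format of `/proc/<pid>/net/tcp`, where `<pid>` would be any process inside the target container. The helper names are hypothetical, and the address decoding assumes a little-endian host.

```python
import socket
import struct

LISTEN = "0A"  # socket state code for LISTEN as printed in /proc/net/tcp


def decode_endpoint(hex_endpoint):
    """Decode an 'ADDR:PORT' field such as '0B00007F:80DB'."""
    hex_addr, hex_port = hex_endpoint.split(":")
    # The address is a native-endian 32-bit value printed in hex;
    # packing it little-endian assumes a little-endian host (e.g. x86).
    addr = socket.inet_ntoa(struct.pack("<I", int(hex_addr, 16)))
    return addr, int(hex_port, 16)


def listening_ports(proc_net_tcp_text, address="127.0.0.11"):
    """Return the TCP ports in LISTEN state that are bound to `address`."""
    ports = []
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        addr, port = decode_endpoint(fields[1])
        if addr == address and fields[3] == LISTEN:
            ports.append(port)
    return ports


# Hand-written sample in the /proc/net/tcp layout (inode and trailing values are made up).
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0B00007F:80DB 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 12345 1 0000000000000000 100 0 0 10 0
   1: 0100007F:0035 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 12346 1 0000000000000000 100 0 0 10 0
"""

if __name__ == "__main__":
    print(listening_ports(SAMPLE))
```

`/proc/<pid>/net/udp` uses the same address encoding, so the UDP port could be read the same way with a different state filter.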

@pitkley
Owner

pitkley commented Jan 23, 2021

Hi @Georgiy-Tugai, thank you for reaching out and for all the detail you provided!

As documented in the getting started guide for nftables, Docker should be configured to not manipulate any iptables rules by setting the daemon option iptables to false. This is required for dfw to function correctly, and in my experience does not break DNS resolution; at least I never had issues.

I cannot rule out that Docker >=20.10.0 might have broken compatibility by introducing new DNS-resolution or firewall-handling behaviour. A quick check of the release notes does not necessarily suggest this, but I currently cannot confirm it either way, since I have not yet upgraded the Docker daemons in my setup past 19.03.

Maybe you can check if you have iptables-handling in your Docker daemon deactivated, and if it isn't, whether deactivating fixes your issues? I'll also try to get around to upgrading to Docker 20.10 to test this myself, too!

@Georgiy-Tugai
Contributor Author

Thanks for your quick response! 😄

I do have iptables set to false, of course. Going by the Docker debug logs (see my first post) and moby/libnetwork#1085 (comment), Docker ignores the iptables setting specifically when it comes to the eDNS system.

/etc/conf.d/docker

DOCKER_OPTS="--storage-driver zfs --data-root /var/lib/docker --cgroup-parent docker --init --log-driver json-file --log-opt max-size=10m --log-opt max-file=3 --log-opt compress=true --iptables=false"

# prevent race between dfw and nftables init
rc_after="nftables"

The breakage initially happened when I upgraded from app-emulation/docker-19.03.8 to app-emulation/docker-19.03.14 and from sys-kernel/gentoo-sources-4.19.86 to sys-kernel/gentoo-sources-5.4.80-r1. I later upgraded to app-emulation/docker-20.10.1 hoping that it might help.

I also suspect kernel configuration issues, given the major version upgrade, but I haven't found anything weird in the iptables/nftables section so far. I have both iptables and nftables built as modules, and ipt_MASQUERADE blacklisted since it is known to interact poorly with the nftables version.

@Georgiy-Tugai
Contributor Author

I attempted to reproduce the issue on a Debian Buster VM. DNS resolution in that environment worked correctly, which led me to re-examine the kernel configuration aspect.

Manually invoking the failing iptables commands in the container's namespace gave cryptic error messages; could not find "tcp" and the like. Translating the rule via iptables-translate allowed the rule to install successfully via an nft invocation, but that is not what Docker is doing.

In the end, it turned out that I was missing the nft_compat module (CONFIG_NFT_COMPAT) which seems to be required even if iptables-nft is being used rather than iptables-legacy.
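A quick way to catch this earlier might be to scan the kernel configuration for the options in question. The sketch below works on config text passed in as a string; the `REQUIRED` tuple is an assumed minimal set based on this thread, not an authoritative list, and the helper names are made up.

```python
import re

# Assumption: the minimal set this thread points at, not a complete requirement list.
REQUIRED = ("CONFIG_NF_TABLES", "CONFIG_NFT_COMPAT")


def enabled_options(config_text):
    """Collect option names that are built in (=y) or built as modules (=m)."""
    matches = re.findall(r"^(CONFIG_\w+)=([ym])$", config_text, re.MULTILINE)
    return {name for name, _value in matches}


def missing_options(config_text, required=REQUIRED):
    """Return the required options that are neither =y nor =m."""
    enabled = enabled_options(config_text)
    return [opt for opt in required if opt not in enabled]


if __name__ == "__main__":
    broken = "CONFIG_NF_TABLES=m\n# CONFIG_NFT_COMPAT is not set\n"
    print(missing_options(broken))
```

On kernels that expose their configuration, the text could come from something like `gzip.open("/proc/config.gz", "rt").read()`, though that file only exists when the kernel was built with support for it.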

@pitkley This can now be demoted from "blocking issue" to "neat feature maybe"; might be worth adding some blurb about this to the nftables guide, I guess, if nothing else.
Sorry for bothering you with what turned out to be my kernel config silliness 😄

For people who might find this later, here's the relevant fragment of my kernel configuration; I'm using a config-merging tool based on https://github.com/ulfalizer/Kconfiglib so this is a "delta" config relative to x86+Gentoo defaults.

CONFIG_BRIDGE=y
CONFIG_IPVLAN=m
CONFIG_MACVLAN=m
CONFIG_TUN=m
CONFIG_IPVTAP=m
CONFIG_MACVTAP=m
CONFIG_VETH=m
CONFIG_VXLAN=m

CONFIG_NET_KEY=y

# IP Payload Compression Protocol (RFC3173), typically needed for IPsec
CONFIG_INET6_IPCOMP=y

# Support for INET socket monitoring, used by tools such as ss
CONFIG_INET_DIAG=y
CONFIG_INET_TCP_DIAG=y
CONFIG_INET_UDP_DIAG=y
CONFIG_INET_RAW_DIAG=y
CONFIG_UNIX_DIAG=y

# allow setting weird netfilter modules
CONFIG_NETFILTER_ADVANCED=y

# let arp/iptables see bridged traffic... does this apply to nftables?
CONFIG_BRIDGE_NETFILTER=m

# Ethernet bridge stuff
CONFIG_BRIDGE_NF_EBTABLES=m
CONFIG_BRIDGE_EBT_DNAT=m
CONFIG_BRIDGE_EBT_IP=m
CONFIG_BRIDGE_EBT_SNAT=m
CONFIG_BRIDGE_EBT_T_NAT=m

# turn off iptables
CONFIG_IP_NF_IPTABLES=n
CONFIG_IP6_NF_IPTABLES=n
CONFIG_IP_NF_FILTER=n
CONFIG_IP6_NF_FILTER=n
CONFIG_IP_NF_TARGET_REJECT=n
CONFIG_IP6_NF_TARGET_REJECT=n
CONFIG_IP_NF_NAT=n
CONFIG_IP_NF_TARGET_MASQUERADE=n
CONFIG_IP_NF_MANGLE=n
CONFIG_IP6_NF_MANGLE=n
CONFIG_IP6_NF_MATCH_IPV6HEADER=n

# shut up linter
CONFIG_NETFILTER_XT_TARGET_NFLOG=m
CONFIG_NETFILTER_XT_TARGET_TCPMSS=m
CONFIG_NETFILTER_XT_MATCH_CONNTRACK=m
CONFIG_NETFILTER_XT_MATCH_POLICY=m
CONFIG_NETFILTER_XT_MATCH_STATE=m

# nftables
CONFIG_NF_TABLES=m
CONFIG_NF_TABLES_INET=y
CONFIG_NF_TABLES_IPV4=y
CONFIG_NF_TABLES_IPV6=y
CONFIG_NF_TABLES_BRIDGE=m

CONFIG_NF_FLOW_TABLE=m

# needed for some iptables-compatibility stuff... sigh
CONFIG_NETFILTER_XTABLES=m
CONFIG_NFT_COMPAT=m

CONFIG_NFT_NAT=m

CONFIG_NFT_MASQ=m

CONFIG_NFT_CONNLIMIT=m
CONFIG_NFT_COUNTER=m
CONFIG_NFT_CT=m
CONFIG_NFT_LIMIT=m
CONFIG_NFT_LOG=m
CONFIG_NFT_OBJREF=m
CONFIG_NFT_QUOTA=m

CONFIG_NFT_REDIR=m

CONFIG_NFT_REJECT=m
CONFIG_NFT_REJECT_INET=m
CONFIG_NFT_REJECT_IPV4=m
CONFIG_NFT_REJECT_IPV6=m

CONFIG_NFT_TUNNEL=m
CONFIG_NF_TABLES_SET=m

# stuff that doesn't work because SECURITY=n
CONFIG_NETLABEL=n
CONFIG_NETFILTER_XT_TARGET_CONNSECMARK=n
CONFIG_NETFILTER_XT_TARGET_SECMARK=n

# docker wants these
CONFIG_IP_VS=m
CONFIG_IP_VS_PROTO_TCP=y
CONFIG_IP_VS_PROTO_UDP=y
CONFIG_IP_VS_NFCT=y
CONFIG_IP_VS_RR=m
CONFIG_DUMMY=y
CONFIG_NETFILTER_XT_MATCH_ADDRTYPE=m
CONFIG_NETFILTER_XT_MATCH_IPVS=m

pitkley added a commit that referenced this issue Jan 26, 2021
bors bot added a commit that referenced this issue Jan 26, 2021
375: Add troubleshooting docs for issue described in #370 r=pitkley a=pitkley

Closes #370.

Co-authored-by: Pit Kleyersburg <pitkley@googlemail.com>
@bors bors bot closed this as completed in f251d4a Jan 26, 2021
@pitkley
Owner

pitkley commented Jan 26, 2021

I'm glad you were able to resolve it, and I greatly appreciate the amount of detail you have provided. I have added a section to the troubleshooting document which references this issue! 👍
