wireguard: Peer name resolution fails with dnscrypt-proxy2 enabled #171079

Open
anpandey opened this issue Apr 30, 2022 · 2 comments
Labels
0.kind: bug 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md

Comments

@anpandey
Contributor

Describe the bug

I'm using dnscrypt-proxy2, listening on localhost port 53, as my system DNS server. I also have networking.wireguard enabled, with the interface in a network namespace and a domain name as the peer endpoint. However, wg fails to resolve the endpoint address when the generated systemd service runs.

Apr 30 14:55:22 thinkpad-x1 wireguard-wg1-peer-<snip>-refresh-start[3270]: Name or service not known: `example.tld:50000'

This is what the relevant part of my configuration.nix looks like:

  networking = {
    nameservers = [ "127.0.0.1" "::1" ];
    networkmanager = {
      enable = true;
      dns = "none";
    };
    wireguard.interfaces = {
      wg1 = {
        preSetup = "${pkgs.iproute}/bin/ip netns add sn";
        postShutdown = "${pkgs.iproute}/bin/ip netns del sn";
        ips = [ "10.100.0.2/32" ];
        interfaceNamespace = "sn";
        listenPort = 50000;
        peers = [
          {
            allowedIPs = [ "0.0.0.0/0" ];
            endpoint = "example.tld:50000";
            dynamicEndpointRefreshSeconds = 14400;
          }
        ];
      };
    };
  };

Additional context

I can confirm with dnscrypt-proxy2 disabled and NetworkManager DNS enabled, name resolution for the endpoint works. My guess is that the wg invocations are run in the specified network namespace (where dnscrypt-proxy2 is not reachable), so DNS resolution fails.

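Interestingly, when I use the endpoint's IP address directly (so that the connection is usable), curl inside the namespace is able to reach the locally running dnscrypt-proxy2 instance. The lookup of a nonexistent domain fails, but the query and its NXDomain reply show up on lo: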

$ sudo -E ip netns exec sn curl http://areallylongdomain.com/
curl: (6) Could not resolve host: areallylongdomain.com

and running tcpdump:

$ sudo tcpdump -i lo 'port 53'
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on lo, link-type EN10MB (Ethernet), snapshot length 262144 bytes
16:32:30.720305 IP localhost.40461 > localhost.domain: 24251+ [1au] A? areallylongdomain.com. (50)
16:32:30.737990 IP localhost.domain > localhost.40461: 24251 NXDomain 0/1/1 (123)

My guess is that wg is doing something different for name resolution (but it looks like it uses getaddrinfo()), or that the endpoint needs to be fully set up for name resolution to work.

A fix for this might be similar to what's needed in #169128: resolve the endpoint's domain name before any other configuration happens (e.g. before moving the wg1 interface to its own namespace).

Notify maintainers

Metadata

Please run nix-shell -p nix-info --run "nix-info -m" and paste the result.

$ nix-shell -p nix-info --run "nix-info -m"
 - system: `"x86_64-linux"`
 - host os: `Linux 5.17.3, NixOS, 21.11 (Porcupine)`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.3.16`
 - channels(root): `"nixos-21.11.337193.5fb3a179605, nixos-unstable, nixpkgs"`
@anpandey
Contributor Author

After a bit of investigation, wg isn't actually doing anything different from programs like curl for DNS resolution.

This is happening because dnscrypt-proxy2 starts listening for requests only after it detects that the network is up. If wireguard-wg1-peer-xxxx.service starts before that, my guess is that wg exits immediately (probably because the DNS query gets a connection refused) instead of retrying, as it does in cases such as the network being down.

@anpandey
Contributor Author

I found a workaround to have systemd do the retries since wg thinks it's an unrecoverable error:

systemd.services."wireguard-wg1-peer-xxxxx-refresh" = {
  serviceConfig = {
    Restart = "on-failure";
    RestartSec = 60;
  };
};
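
As a possible complement to the restart workaround above, the refresh unit could also be ordered after the resolver. This is an untested sketch: it assumes the resolver runs as `dnscrypt-proxy2.service` (the unit name the `services.dnscrypt-proxy2` module is assumed to create), and it reuses the placeholder unit name from the workaround. Ordering only guarantees that the resolver unit has been started, not that it is already answering queries, so the `Restart` settings are kept.

```nix
{
  # Placeholder unit name, as in the workaround above; substitute the real peer unit.
  systemd.services."wireguard-wg1-peer-xxxxx-refresh" = {
    # Assumption: the local resolver runs as dnscrypt-proxy2.service.
    after = [ "dnscrypt-proxy2.service" ];
    wants = [ "dnscrypt-proxy2.service" ];
    serviceConfig = {
      # Keep retrying: the resolver may not be listening yet even once started.
      Restart = "on-failure";
      RestartSec = 60;
    };
  };
}
```

If the resolver really does wait for the network before it binds its socket, the ordering alone would not be enough, which is why the retry settings stay in place.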

@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Nov 12, 2022