Wireguard doesn't bring up peers #63869
Comments
I am experiencing the same issue, since #62325.
Based upon this comment, I suspected this to be the culprit, but a rebuild with that changed to @zx2c4: From reading #61971, I understand why you wished to improve the retry logic within My primary concerns are:
Instead, given Type=simple
Restart=on-failure Can you foresee any issues with this? |
I think re-adding
should be done as I can't see any problem with it. |
What's the status here? Wireguard still seems to be broken. I have the same bug and can't use it.
|
This is not a reason to script things improperly. Familiarize yourself and use the right tool for the task.
If your resolver is configured correctly, you should be able to distinguish between "domain record doesn't exist on the internet" and "don't have a responding DNS resolver yet". |
Hello, I'm a bot and I thank you in the name of the community for opening this issue. To help our human contributors focus on the most-relevant reports, I check up on old issues to see if they're still relevant. This issue has had no activity for 180 days, and so I marked it as stale, but you can rest assured it will never be closed by a non-human. The community would appreciate your effort in checking if the issue is still valid. If it isn't, please close it. If the issue persists, and you'd like to remove the stale label, you simply need to leave a comment. Your comment can be as simple as "still important to me". If you'd like it to get more attention, you can ask for help by searching for maintainers and people that previously touched related code and @ mention them in a comment. You can use Git blame or GitHub's web interface on the relevant files to find them. Lastly, you can always ask for help at our Discourse Forum or at #nixos' IRC channel. |
I still have this problem. |
When dns resolution fails with a permanent error ("Name or service not known" instead of "Temporary failure in name resolution"), wireguard won't retry despite WG_ENDPOINT_RESOLUTION_RETRIES=infinity. Ideally, dns would probably never report a permanent error for an existing name, but unfortunately this *does* happen (maybe *especially* with dynamic dns?) and cannot easily be fixed by the wireguard setup's admin. I can't think of a scenario where it is *essential* to not retry after a negative dns response (given that the endpoint has been configured, the dns name quite certainly exists), right?. On the other hand, a machine that drops out of the vpn can be very annoying... -> This change should improve reliability/connectivity. somewhat related thread: NixOS#63869 (cherry picked from commit d53ea20f47160624cdeb0589a5c65f609dd8cffb)
On the same machine I've got Tailscale enabled with MagicDNS turned on. However, in my case the issue was in
(In output above the real name was replaced with This specific issue was FIXED BY disallowing IPv4LL for dhcpcd:
Now, dhcpcd is considered to be started only after it has obtained a real IPv4 lease and set the default route:
Although I fixed it for myself by addressing the dhcpcd IPv4LL culprit, I believe the wireguard setup should be more robust and resilient against unstable network configuration, mainly because the bug is present in the default "out of the box" configuration and the user must do some amount of research to mitigate the issue. Also, this will break again in the future when something else breaks. |
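The exact snippet from this comment is not preserved here. As a rough sketch of what disallowing IPv4LL for dhcpcd can look like on NixOS, assuming dhcpcd is the DHCP client in use, something along these lines should work (`noipv4ll` is a standard dhcpcd.conf directive; whether this matches the commenter's configuration is an assumption):

```nix
{ ... }:
{
  # Assumption: dhcpcd manages the interface. With IPv4LL disabled, dhcpcd
  # no longer falls back to a 169.254.x.x link-local address when no DHCP
  # lease has been obtained yet.
  networking.dhcpcd.extraConfig = ''
    noipv4ll
  '';
}
```

This only addresses the dhcpcd-specific trigger described above. It does not make the wireguard peer units themselves retry.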
I'm also having this issue (NixOS 22.05 on a Surface Pro 3). I worked around it by forcing the old behavior:
{ config, lib, ... }:
# Workaround for an issue where the Wireguard module doesn't bring up peers
# when the peer unit fails, often because of DNS not being available at
# system startup. See: https://github.com/NixOS/nixpkgs/issues/63869
#
# Also watch: https://github.com/NixOS/nixpkgs/pull/140890
with lib;
let
  peerUnitServiceName = peer:
    let
      dynamicRefreshEnabled = peer.peer.dynamicEndpointRefreshSeconds != 0;
      keyToUnitName = replaceChars
        [ "/" "-" " " "+" "=" ]
        [ "-" "\\x2d" "\\x20" "\\x2b" "\\x3d" ];
      unitName = keyToUnitName peer.peer.publicKey;
      refreshSuffix = optionalString dynamicRefreshEnabled "-refresh";
    in
      "wireguard-${peer.interfaceName}-peer-${unitName}${refreshSuffix}";
  cfg = config.networking.wireguard;
  allPeers = flatten
    (mapAttrsToList (interfaceName: interfaceCfg:
      map (peer: { inherit interfaceName peer; }) interfaceCfg.peers
    ) cfg.interfaces);
  peerServiceNames = map peerUnitServiceName allPeers;
  serviceOverride = serviceName:
    nameValuePair serviceName {
      serviceConfig = {
        Type = mkForce "simple";
        Restart = "on-failure";
        RestartSec = "5";
      };
    };
in {
  systemd.services = listToAttrs (map serviceOverride peerServiceNames);
}
+1, a minimal Wireguard setup should not require comparatively arcane DHCP tweaks. |
@Majiir you might want to use this PR #140890
{ ... }: {
  disabledModules = [ "services/networking/wireguard.nix" ];
  imports = [
    # rest of the imports
    ./path/to/downloaded-pr-wireguard-module.nix
  ];
}
Make the dynamic-dns refresh systemd service (controlled via the preexisting option dynamicEndpointRefreshSeconds) robust to e.g. dns failures that happen on intermittent network connections. Background: When dns resolution fails with a 'permanent' error ("Name or service not known" instead of "Temporary failure in name resolution"), wireguard won't retry despite WG_ENDPOINT_RESOLUTION_RETRIES=infinity. -> This change should improve reliability/connectivity. somewhat related thread: NixOS#63869
Make the dynamic-dns refresh systemd service (controlled via the preexisting option dynamicEndpointRefreshSeconds) robust to e.g. dns failures that happen on intermittent network connections. Background: When dns resolution fails with a 'permanent' error ("Name or service not known" instead of "Temporary failure in name resolution"), wireguard won't retry despite WG_ENDPOINT_RESOLUTION_RETRIES=infinity. -> This change should improve reliability/connectivity. somewhat related thread: #63869 (cherry picked from commit 82c5c3c)
Unfortunately #140890, which landed, did not appear to resolve this for me. On boot, the wireguard peer still fails to establish despite having WG_ENDPOINT_RESOLUTION_RETRIES=infinity in the peer unit. Did anyone else see this fixed or otherwise find a workaround? |
Aha. I am on unstable. But what I didn't realise is that it's necessary to add some additional configuration (dynamicEndpointRefreshRestartSeconds or dynamicEndpointRefreshSeconds) in order for it to retry. |
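For anyone else landing here, a minimal sketch of the additional configuration mentioned above might look like the following. The interface name `wg0`, the endpoint, the allowed IPs, and the interval values are all hypothetical placeholders; only the two `dynamicEndpointRefresh*` option names come from this thread:

```nix
{ ... }:
{
  networking.wireguard.interfaces.wg0 = {
    # ips, privateKeyFile, etc. omitted for brevity
    peers = [
      {
        publicKey = "<peer public key>";      # placeholder
        endpoint = "vpn.example.org:51820";   # hypothetical DNS endpoint
        allowedIPs = [ "10.8.0.0/24" ];       # hypothetical
        # Periodically re-resolve the endpoint (value in seconds, illustrative).
        dynamicEndpointRefreshSeconds = 300;
        # Delay before retrying after a failed refresh (illustrative value).
        dynamicEndpointRefreshRestartSeconds = 5;
      }
    ];
  };
}
```

With `dynamicEndpointRefreshSeconds` left at its default of 0, the peer unit resolves the endpoint only once at setup time, which appears to be why the retry behavior is not active by default.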
Given that this can result in a lockout would it be better to default |
I suppose the rationale behind it is to respect the wireguard default (which is: try to resolve once at setup time), at least that's what I get from the I agree that "please ignore the changed address" should be opt-in. Also, we could treat all the hosts as dynamic and remove one branch IMHO |
Issue description
When using wireguard on a server with several peers added, it looks like the peers aren't brought up properly, and since the unit type is set to "oneshot" it won't even retry. The problem seems to be related to DNS not being available.
Steps to reproduce
Set up a WG server somewhere
Set up wg on NixOS where the server is set as a peer and a domain name is used instead of an IP address (never tried with an IP alone though), like
Rebuild
Reboot
Try to ping wg server at 10.8.0.1
--> 100% packet loss
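The configuration referred to in the steps above is not preserved here. A hypothetical sketch of a client setup of that shape is shown below; the interface name `wg_jl` and the server address 10.8.0.1 come from this report, while the client address, key path, endpoint name, and port are assumptions for illustration only:

```nix
{ ... }:
{
  networking.wireguard.interfaces.wg_jl = {
    ips = [ "10.8.0.2/24" ];                   # hypothetical client address
    privateKeyFile = "/path/to/private.key";   # hypothetical path
    peers = [
      {
        publicKey = "<server public key>";     # placeholder
        # A domain name rather than an IP address, so the peer unit
        # depends on DNS being available when it starts.
        endpoint = "wg.example.org:51820";     # hypothetical
        allowedIPs = [ "10.8.0.0/24" ];        # hypothetical
      }
    ];
  };
}
```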
For some reason systemctl does show it as started:
Looking at that start unit file it has this content:
and
ip addr show
also lists the ip:
Looking at the unit files
systemctl list-unit-files | grep wireguard
this pops up:
(How to make list-unit-files provide the full name of the file?)
Looking at the status of the peer unit file systemctl status wireguard-wg_jl-peer-{public key}.service, this is returned:
The unit file /nix/store/f5kp5jammckjcpgjd7r8fa8gh0y5kzrj-unit-wireguard-wg_jl-peer-{public key}.service/wireguard-wg_jl-peer-{public key}.service contains:
And the ExecStart /nix/store/08ilzr9yicqvb5wz70c3przyl235vy9w-unit-script-wireguard-wg_jl-peer-{public key}-start contains:
So, once the system is up and running, I have to re-issue
systemctl restart 'wireguard-wg_jl-peer-{public key}.service'
and then it works.
For some reason those peer unit files aren't properly executed, likely because DNS isn't available at that point.
Changing them from oneshot to simple with retry on failure could improve the situation.
Technical details
Please run
nix-shell -p nix-info --run "nix-info -m"
and paste the results.
"x86_64-linux"
Linux 4.19.55, NixOS, 19.09pre183832.20b993ef2c9 (Loris)
yes
yes
nix-env (Nix) 2.2.2
""
"nixos-19.09pre183832.20b993ef2c9"
/nix/var/nix/profiles/per-user/root/channels/nixos