Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/initrd-ssh: switch from Dropbear to OpenSSH #82603

Merged
merged 1 commit into from
Mar 28, 2020

Conversation

emilazy
Copy link
Member

@emilazy emilazy commented Mar 14, 2020

Motivation for this change

Dropbear lags behind OpenSSH significantly in both support for modern
key formats like ssh-ed25519, let alone the recently-introduced
U2F/FIDO2-based sk-ssh-ed25519@openssh.com (as I found when I switched
my authorizedKeys over to it and promptly locked myself out of my
server's initrd SSH, breaking reboots), as well as security features
like multiprocess isolation. Using the same SSH daemon for stage-1 and
the main system ensures key formats will always remain compatible, as
well as more conveniently allowing the sharing of configuration and
host keys.

The main reason to use Dropbear over OpenSSH would be initrd space
concerns, but NixOS initrds are already large (17 MiB currently on my
server), and the size difference between the two isn't huge (the test's
initrd goes from 9.7 MiB to 12 MiB with this change). If the size is
still a problem, then it would be easy to shrink sshd down to a few
hundred kilobytes by using an initrd-specific build that uses musl and
disables things like Kerberos support.

This passes the test and works on my server, but more rigorous testing
and review from people who use initrd SSH would be appreciated!

@GrahamcOfBorg test initrd-network-ssh

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.

@emilazy emilazy force-pushed the nixos-initrd-openssh branch from e96e443 to 9c00766 Compare March 14, 2020 18:29
@ofborg ofborg bot added 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: changelog 8.has: documentation This PR adds or changes documentation 8.has: module (update) This PR changes an existing module in `nixos/` 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10 labels Mar 14, 2020
@emilazy
Copy link
Member Author

emilazy commented Mar 14, 2020

@GrahamcOfBorg test initrd-network-ssh

@bachp
Copy link
Member

bachp commented Mar 14, 2020

I think it's a good idea to not have two SSH servers to maintain.

Copy link
Member

@lukateras lukateras left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Works fine locally for me (tested with qemu-vm.nix and separately with a system with bootloader that supports initrd secrets).

Note reverse-incompatible boot.initrd.network.host{RSA,DSS,ECDSA}Key option removal.

Should be good to merge once test passes.

@alyssais
Copy link
Member

Test failed.-

@lukateras
Copy link
Member

OpenSSH takes some time to start up, as a result test non-deterministically fails. This fixes the issue:

diff --git a/nixos/tests/initrd-network-ssh/default.nix b/nixos/tests/initrd-network-ssh/default.nix
index cc193c6cb6e..7e0930617f6 100644
--- a/nixos/tests/initrd-network-ssh/default.nix
+++ b/nixos/tests/initrd-network-ssh/default.nix
@@ -56,7 +56,17 @@ import ../make-test-python.nix ({ lib, ... }:
   testScript = ''
     start_all()
     client.wait_for_unit("network.target")
-    client.wait_until_succeeds("ping -c 1 server")
+
+
+    def ssh_is_up(_) -> bool:
+        status, _ = client.execute("nc -z server 22")
+        return status == 0
+
+
+    with client.nested("waiting for SSH server to come up"):
+        retry(ssh_is_up)
+
+
     client.succeed(
         "ssh -i /etc/sshKey -o UserKnownHostsFile=/etc/knownHosts server 'touch /fnord'"
     )

@emilazy emilazy force-pushed the nixos-initrd-openssh branch from 9c00766 to b550f41 Compare March 14, 2020 21:22
@emilazy
Copy link
Member Author

emilazy commented Mar 14, 2020

Applied @yegortimoshenko's patch, thanks!

@GrahamcOfBorg test initrd-network-ssh

@emilazy emilazy force-pushed the nixos-initrd-openssh branch from b550f41 to 7fae94b Compare March 19, 2020 00:45
@emilazy
Copy link
Member Author

emilazy commented Mar 19, 2020

Rebased to fix conflict in the release notes.

@GrahamcOfBorg test initrd-network-ssh

cc @volth @Mic92 for having touched this module (relatively) recently

@emilazy emilazy force-pushed the nixos-initrd-openssh branch from 7fae94b to 6bb0441 Compare March 21, 2020 00:38
@emilazy
Copy link
Member Author

emilazy commented Mar 21, 2020

Updated to remove the key generation functionality (and thus remove ssh-keygen from initrds).

@emilazy emilazy force-pushed the nixos-initrd-openssh branch from 6bb0441 to 86898be Compare March 21, 2020 00:40
@emilazy
Copy link
Member Author

emilazy commented Mar 21, 2020

@GrahamcOfBorg test initrd-network-ssh

@emilazy emilazy force-pushed the nixos-initrd-openssh branch 3 times, most recently from caec075 to d35cad4 Compare March 25, 2020 08:24
Dropbear lags behind OpenSSH significantly in both support for modern
key formats like `ssh-ed25519`, let alone the recently-introduced
U2F/FIDO2-based `sk-ssh-ed25519@openssh.com` (as I found when I switched
my `authorizedKeys` over to it and promptly locked myself out of my
server's initrd SSH, breaking reboots), as well as security features
like multiprocess isolation. Using the same SSH daemon for stage-1 and
the main system ensures key formats will always remain compatible, as
well as more conveniently allowing the sharing of configuration and
host keys.

The main reason to use Dropbear over OpenSSH would be initrd space
concerns, but NixOS initrds are already large (17 MiB currently on my
server), and the size difference between the two isn't huge (the test's
initrd goes from 9.7 MiB to 12 MiB with this change). If the size is
still a problem, then it would be easy to shrink sshd down to a few
hundred kilobytes by using an initrd-specific build that uses musl and
disables things like Kerberos support.

This passes the test and works on my server, but more rigorous testing
and review from people who use initrd SSH would be appreciated!
@emilazy emilazy force-pushed the nixos-initrd-openssh branch from d35cad4 to d930466 Compare March 25, 2020 08:27
@emilazy
Copy link
Member Author

emilazy commented Mar 25, 2020

Rebased again.

@GrahamcOfBorg test initrd-network-ssh

@lukateras
Copy link
Member

LGTM. I'd like to merge this on Mar 27 unless anyone objects. cc @Mic92 @adisbladis @cole-h @andrew-d @misuzu

@lukateras lukateras merged commit 5626cb9 into NixOS:master Mar 28, 2020
@emilazy emilazy deleted the nixos-initrd-openssh branch March 28, 2020 19:00
@etu
Copy link
Contributor

etu commented Apr 12, 2020

This PR broke the build of my system, which was to be expected. I just found the PR after I updated the channel on a system that use initrd ssh.

And I read the manual for the new setting and followed the setup of creating keys and configured as described in the example of the manual.

Then it didn't work. And it died on my with the following error (I promise, the file exists):

cp: cannot stat '/persistent/etc/initrd-ssh/ssh_host_ed_25519_key': No such file or directory

So I read the test that was mentioned above after finding this issue through searching the first error message (regarding the old setting being removed) and doing a git blame and looking a bit in the log. And I've read the test to see that the test specify boot.initrd.network.ssh.hostKeys as a list of paths and not as list of strings (as it's described in the manual for boot.initrd.network.ssh.hostKeys).

Broken:

{
  boot.initrd.network.ssh.hostKeys = [
    "/persistent/etc/initrd-ssh/ssh_host_rsa_key"
    "/persistent/etc/initrd-ssh/ssh_host_ed_25519_key"
  ];
}

Works:

{
  boot.initrd.network.ssh.hostKeys = [
    /persistent/etc/initrd-ssh/ssh_host_rsa_key
    /persistent/etc/initrd-ssh/ssh_host_ed_25519_key
  ];
}

I feel that this should be mentioned in the documentation somehow. In text or through an updated example.

Also, big thanks to @emilazy for doing this work, it has always been a bit meh for me that we've had dropbear since it doesn't support all types of keys that are common today. :)

@emilazy
Copy link
Member Author

emilazy commented Apr 12, 2020

Yep, see #84976, #85004; sorry you ran into this before the latter was merged. It's unfortunate to encourage people to put bare paths in there because it'll mean secrets get copied to the store even when the bootloader supports initrd secrets, but it's the best workaround for now with the current initrd secrets implementation.

@etu
Copy link
Contributor

etu commented Apr 12, 2020

Cool! Thanks :)

Just wanted to make sure someone knew about it :)

@elitak
Copy link
Contributor

elitak commented Apr 19, 2020

This change has created many problems for me, especially when combined with nixops deployments; it smacks of fixing something that wasn't broken:

  1. I've no secure way or desire to push state (private host keys) to remote machines
  2. Old system profiles that are broken by a bad path persist in breaking the rebuilds until they are removed by --delete-generations
  3. I haven't even tried to boot these initrds yet, but it sounds like others are having problem with the openssh process starting up reliably.

At least one of these should be possible, going forward:

  1. Select between using Dropbear or OpenSSH in initrd
  2. Auto-generate keys in OpenSSH
  3. Wait for proper secret storage in the nix store to be finished (I'm unclear on the status of this)
  4. Find a way to deterministically generate the host key at run-time from the hardware

I'm willing to accept the security risks of using Dropbear and auto-generated keys. As far as I understand, any key embedded in the initrd is prone to being compromised anyway, and only through using some HW-crypto TPM scheme with secure boot could it be significantly mitigated. Keeping unprivileged users away from the host key but not root is a distinction without a difference, for my purposes.

I may be missing something simple regarding nixops/nixstore and secrets, but my initial impression of these changes is that they add configuration and runtime complexity for no real benefit. Forcing users to add more lines to their configs to do the same thing as before is regressive, and should be avoided at all costs.

@emilazy
Copy link
Member Author

emilazy commented Apr 19, 2020

This PR scratched at least one concrete itch for me (ed25519 key support), but I'm sorry that it's causing you problems.

I understand your objections primarily relate to the initrd secrets mechanism, which was already optionally used by the previous implementation. My original PR supported generating host keys at boot, like the previous implementation. I removed it at the request of the reviewer @Mic92 (#82603 (comment), #82603 (comment)) as it makes very little sense to generate SSH host keys on every boot, and the migration overhead is one ssh-keygen(1) and a single line of NixOS configuration. The functionality could easily be restored, though, and has nothing to do with Dropbear vs. OpenSSH.

I haven't even tried to boot these initrds yet, but it sounds like others are having problem with the openssh process starting up reliably.

I haven't heard any reports of this; @etu's comment was a system build-time issue related to the implementation of initrd secrets that would have happened with the same configuration before this PR, and has been fixed in master. If you know of anyone having issues, please point me to them so I can help fix it.

@emilazy
Copy link
Member Author

emilazy commented Apr 19, 2020

(As far as fixing things that weren't broken goes, I think "using Ed25519 SSH public keys in your NixOS configuration will silently brick your initrd without warning" is definitely something broken, and had to be addressed one way or another; I implemented this PR after having to manually rescue my server boot after migrating off legacy RSA keys.)

@elitak
Copy link
Contributor

elitak commented Apr 19, 2020

NixOS breaking because it no longer has an RSA hostkey to use in initrd is a problem orthogonal to these changes; checks and warnings can always be improved. Rather than continuing to try to explain why the changes bother me, let me propose a solution:

TPM is the superior option for securing the initrd environment. I'm not too familiar with it myself, but I think it should be possible to store the private part of the host key in there, so not even root can ever see it. I don't understand TPM's design well enough to entrust it with my disk keys, but for a host key it should be more than enough. Whether OpenSSH or the initrd environment can be made to use the TPM for its encryption is a matter I would need to research.

Any precautions taken in storing keys outside of TPM are half-measures. Read permissions to /boot/**/*initrd (the default, btw) give an attacker all the material he needs to spoof your host and capture its disk keys from you.

I think configuration.nix should offer admins at minimum 2 options:

  1. I don't care about security or I trust my network; generate a host key every boot
  2. Store the host key material in TPM where nobody can read it, not even root.
  3. (optional, actual) Store a fixed key somewhere on the host's FS

The second option is the best and should be encouraged, but not everyone has a TPM device available. Appropriate warnings should be shown based on whichever is chosen, but forcing admins to manage extra state by default is bad practice, so the first option should be the not-so-silent default.

@emilazy
Copy link
Member Author

emilazy commented Apr 19, 2020

I'm referring to the user authorizedKeys that are used to log in to the initrd sshd, not host keys. No amount of build-time checks can get around the fact that my YubiKey only has 4 OpenPGP key slots (only one of which is an Authentication key, i.e. usable for SSH) and all of them are Ed25519; there is literally no way for me to use my hardware security token to log in to my server without ssh-ed25519 key support (or sk-ssh-{ecdsa,ed25519}@openssh.com, since FIDO supports an arbitrary number of derived key pairs on the device, but again this is OpenSSH only). (As far as read permissions go, I have options = [ "umask=077 " ] set on /boot; arguably this should be default when using initrd secrets but it'd be fiddly to achieve, as /boot is often FAT32 which has no permission support, hence the requirement to use a mount option.)

Deriving a private key from a TPM would potentially be possible for those platforms that support it, but it would be a lot of engineering work (likely requiring extensive patches to the SSH daemon, as TPMs generally do not disclose a private key blob to you but act as an HSM), and we'd still need a basic mechanism to copy things to the initrd for the simple case. Currently NixOS has no real strong verified boot chain or TPM support, so I don't think that pervasively using the TPM really makes sense, but it would definitely be an interesting project to implement.

The following patch should enable generating keys at boot again:

diff --git a/nixos/modules/system/boot/initrd-ssh.nix b/nixos/modules/system/boot/initrd-ssh.nix
index 5a334e69056..30da7b533a6 100644
--- a/nixos/modules/system/boot/initrd-ssh.nix
+++ b/nixos/modules/system/boot/initrd-ssh.nix
@@ -133,19 +133,13 @@ in
         assertion = cfg.authorizedKeys != [];
         message = "You should specify at least one authorized key for initrd SSH";
       }
-
-      {
-        assertion = cfg.hostKeys != [];
-        message = ''
-          You must now pre-generate the host keys for initrd SSH.
-          See the boot.initrd.network.ssh.hostKeys documentation
-          for instructions.
-        '';
-      }
     ];
 
     boot.initrd.extraUtilsCommands = ''
       copy_bin_and_libs ${pkgs.openssh}/bin/sshd
+      ${optionalString (cfg.hostKeys == []) ''
+        copy_bin_and_libs ${pkgs.openssh}/bin/ssh-keygen
+      ''}
       cp -pv ${pkgs.glibc.out}/lib/libnss_files.so.* $out/lib
     '';
 
@@ -181,6 +175,11 @@ in
         chmod 0600 "${initrdKeyPath path}"
       '')}
 
+      ${optionalString (cfg.hostKeys == []) ''
+        ssh-keygen -q -t rsa -N "" -f /etc/ssh/ssh_host_rsa_key
+        ssh-keygen -q -t ed25519 -N "" -f /etc/ssh/ssh_host_ed25519_key
+      ''}
+
       /bin/sshd -e
     '';

If this works for you, I can open a new PR and see what people think about restoring this functionality upstream. I think that recommending people set static host keys is still the most reasonable default, though; ssh(1) really doesn't like host keys changing on every connect and storing the keys in an initrd secret is both simple to set up to be relatively secure (only root or your hosting provider can see the keys), and is never more insecure than generating keys on boot even when the bootloader doesn't support initrd secrets natively.

@edef1c
Copy link
Member

edef1c commented May 11, 2020

While I'm excited to see this PR (since I would rather be rid of Dropbear either way), it seems worth noting that the concrete itch mentioned (Ed25519 support) got covered by Dropbear prior to this getting merged: mkj/dropbear#91

@elitak
Copy link
Contributor

elitak commented May 12, 2020

For something as mission-critical and prone to breakage as the initrd environment, I find it imprudent to make such a substantial change and not leave behind an option to keep the old behavior, at least until the next major release, warning users until then. Being forced to face the potential of bricking your remote host with an update to initrd you have no use for is not a pleasant experience.

I also don't understand the disdain for Dropbear; it does the job of creating a secure channel over which to receive the unlock key. I'm confident many users like myself can trust their networks, and don't have to worry about MITM attacks for stealing keys that could probably be more practically stolen by reading kernel memory on the booted host. It should probably always be an option, if only to keep the initrd closure size down by a few MB.

@emilazy
Copy link
Member Author

emilazy commented May 12, 2020

@edef1c

It's good to hear that Dropbear supports Ed25519 keys now, since it's widely deployed in a lot of embedded environments and Ed25519 keys are getting very common. Still, there's a lot of functionality with no support in Dropbear; other than U2F/FIDO keys, to my knowledge Dropbear doesn't support SSH CAs at all, so even if it had equally good privilege separation there are a lot of configurations that would require OpenSSH.

The supported set of ciphers is also not ideal by modern cryptography standards, with no support for AES-GCM or ChaCha20-Poly1305, our top-preference OpenSSH ciphers (though this too seems like it's being addressed, it's several years after OpenSSH, libssh, tinyssh, ...).

@elitak

Unless you have any concrete problems with this PR that you didn't mention in previous comments, I'm afraid I don't have much more to say than my existing replies. The functionality is tested, this change has been in unstable for a month and a half now, and nobody has had any issues to my knowledge that weren't a result of initrd secrets misconfiguration/infelicities (which are tracked in other issues (#85000, #85563) and orthogonal to the choice of SSH implementation), so while I appreciate that changes to initrd should be made carefully, I don't think your stated reasons for conservatism override the benefits of using OpenSSH (better compatibility, better security architecture, the maintenance and support benefits of only having one SSH implementation used in NixOS).

I believe that maintaining parallel support for Dropbear would be a significant maintenance and testing burden and lead to a lot of additional code complexity for no concrete benefit. If initrd closure size is an issue, then there are several potential avenues to reducing the OpenSSH closure size mentioned in the original commit message that could be explored; I'd be happy to review a PR adding such an option if you'd like to write one. If you're having specific problems with initrd SSH after this change, I'd be happy to help debug them. Otherwise, I understand that you don't personally agree with this change, but I hope you can also understand that I think the benefits outweigh the costs of switching implementation.

then path
else let name = builtins.baseNameOf path; in
builtins.unsafeDiscardStringContext ("/etc/ssh/" +
substring 1 (stringLength name) name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this necessary? My testing with a NixOS test seems to indicate that store paths work fine without this. Only builtins.unsafeDiscardStringContext is needed because the path ends up in an attrset key.

Also, paths like "${./some_file}" are strings (isString == true) but still have context that needs to be removed, causing the current implementation to fail.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For context: #91744

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/full-encrypted-nixos-system-on-legacy-boot-with-secrets-and-remote-unlock-for-unstable-20-03/8279/1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: changelog 8.has: documentation This PR adds or changes documentation 8.has: module (update) This PR changes an existing module in `nixos/` 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants