zfs-mount fails because directory isn't empty, screws up bind mounts and NFS #4784

Closed
brando56894 opened this issue Jun 22, 2016 · 15 comments
Labels: Status: Inactive (Not being actively updated)

Comments

@brando56894

I posted this over at the Arch forums but I haven't gotten any responses after a week. I've been running ZoL on Arch for probably about 6 months and ZoL on Ubuntu 14.10 LTS for about 3 months prior to that, before that I was running it on FreeNAS 9.

Everything was working fine up until a month or so ago, when the zfs-mount.service daemon failed to start after a reboot. That causes my bind mounts to fail, which in turn screws up my NFS mounts, since they end up mounting empty directories, and breaks my Usenet KVM since it runs out of space.

I can't seem to figure out what the issue is. It seems like Linux is trying to double-mount my pools for some reason: even though the daemon fails, all the pools (3 in total) are mounted successfully, and they appear to get mounted before zfs-mount even runs, since they're already mounted by the time it starts. If I disable the zfs-mount service, the pools never get mounted at all.

Here's the journal error from zfs-mount.service

 [root@nas ~]# status zfs-mount
● zfs-mount.service - Mount ZFS filesystems
   Loaded: loaded (/usr/lib/systemd/system/zfs-mount.service; static; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2016-06-21 21:39:40 EDT; 14h ago
  Process: 4414 ExecStart=/usr/bin/zfs mount -a (code=exited, status=1/FAILURE)
 Main PID: 4414 (code=exited, status=1/FAILURE)

Jun 21 21:39:40 nas.brandongolway.us systemd[1]: Starting Mount ZFS filesystems...
Jun 21 21:39:40 nas.brandongolway.us zfs[4414]: cannot mount '/mnt/downloads': directory is not empty
Jun 21 21:39:40 nas.brandongolway.us zfs[4414]: cannot mount '/mnt/downloads/usenet': directory is not empty
Jun 21 21:39:40 nas.brandongolway.us zfs[4414]: cannot mount '/mnt/safekeeping': directory is not empty
Jun 21 21:39:40 nas.brandongolway.us zfs[4414]: cannot mount '/mnt/storage': directory is not empty
Jun 21 21:39:40 nas.brandongolway.us zfs[4414]: cannot mount '/mnt/storage/multimedia': directory is not empty
Jun 21 21:39:40 nas.brandongolway.us systemd[1]: zfs-mount.service: Main process exited, code=exited, status=1/FAILURE
Jun 21 21:39:40 nas.brandongolway.us systemd[1]: Failed to start Mount ZFS filesystems.
Jun 21 21:39:40 nas.brandongolway.us systemd[1]: zfs-mount.service: Unit entered failed state.
Jun 21 21:39:40 nas.brandongolway.us systemd[1]: zfs-mount.service: Failed with result 'exit-code'.

Once the system is fully loaded I have to destroy my Usenet KVM, manually mount the bind mounts (mount -a doesn't work for some reason, even though they're in /etc/fstab), then restart the Usenet KVM and all is well. I've also tried using service files for the bind mounts but that didn't seem to work either.
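
For context, the bind mounts in question are ordinary /etc/fstab entries along these lines (a sketch only; the exact paths are assumptions, not copied from the real fstab):

# /etc/fstab (illustrative paths)
# re-export ZFS mountpoints under the NFS export root via bind mounts
/mnt/storage/multimedia  /srv/nfs/multimedia  none  bind  0 0
/mnt/downloads/usenet    /srv/nfs/usenet      none  bind  0 0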

I'm using the zfs-linux-git package 0.6.5_r62_g16fc1ec_4.6.2_1-1.

@FransUrbo
Contributor

Everything was working fine up until a month or so ago, when the zfs-mount.service daemon failed to start after a reboot. That causes my bind mounts to fail, which in turn screws up my NFS mounts, since they end up mounting empty directories, and breaks my Usenet KVM since it runs out of space.

This happens every now and then but has proven almost impossible to debug (because it can't be reproduced reliably).

The "only" (real) way to solve this is to boot into rescue mode, import your pool WITHOUT mouthing anything but the base filesystem. Then remove everything in there.

Then run "zfs mount -a" until it possibly fail on some other directory. Then make sure nothing is mounted below that, remove all mount points (make sure they're empty!) and then try "mount -a" again. Do this until it can mount and unmount a couple of times.

Then unmount all filesystems, export the pool and reboot. This time it should boot correctly.
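
A rough shell sketch of that recovery procedure (the pool name "storage" and the mountpoints are examples; adapt them to whatever "zfs mount -a" complains about):

# from a rescue environment: import the pool without mounting any filesystems
zpool import -N storage

# mount only the pool's base filesystem and clear out anything left behind
# under its mountpoint (files written there while the datasets were unmounted)
zfs mount storage
rm -rf /mnt/storage/*    # careful: make sure only stale leftovers live here

# try mounting everything; repeat the cleanup for any dataset it complains about
zfs mount -a

# once mount/unmount works cleanly a couple of times, export and reboot
zfs umount -a
zpool export storage
reboot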

@brando56894
Author

brando56894 commented Jun 22, 2016

Ah that's wonderful hahaha Luckily I don't reboot often and I'm only using ZoL until FreeNAS 10 becomes usable later this year. I'll give it a try later on. Thanks for the info.

I was thinking that my best solution would be to just forgo the bind mounts and export the mount points directly so it doesn't matter if the zfs-mount fails since the pools still get mounted either way, I'm the only one using my server so security isn't really an issue.

Also by "base filesystem" do you mean / or the base level of my pool (for example I have a pool called Storage with Multimedia and VM datasets, are you saying just mount Storage as /mnt/storage and do an rm -rf /mnt/storage/* ?)

Now that I recall I did have files being written to a top level directory (ex. Storage) instead of a dataset (ex. VMs) and that was confusing the hell out of me because it was mounted in the VM but the file existed in one location but not in another. I'm guessing something like that is what you're referring to?

@tuxoko
Contributor

tuxoko commented Jun 23, 2016

@FransUrbo
So, mount would fail on a non-empty mount point?
But doesn't Linux allow mounting on a non-empty directory?

@GregorKopka
Contributor

But doesn't linux allow mounting on non-empty directory?

While Linux allows mounting on a non-empty directory, Solaris (where ZFS comes from) didn't.

With a current enough ZoL you can zfs set overlay=on on the dataset(s) in question (or a parent dataset, since the property is inherited) to make ZFS behave like Linux.
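
For example (dataset names are placeholders):

# allow mounting over a non-empty directory for a single dataset...
zfs set overlay=on storage/multimedia

# ...or for the whole pool, since child datasets inherit the property
zfs set overlay=on storage
zfs get -r overlay storage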

@FransUrbo
Contributor

@brando56894 I mean the base of your pool.

@tuxoko, what @GregorKopka said. It was decided that ZFS is ZFS and should behave like ZFS, not Linux. Maybe that doesn't make huge sense for the larger [Linux] community, but it makes more sense when you think of ZFS as a multi-OS service.

@richardelling
Contributor

@GregorKopka your history is incorrect. Solaris always did allow overlay mounts, which led to a significant number of service calls. Service calls are expensive. Thus when ZFS came along, they learned from their prior mistake and restricted overlay mounts. That said, in modern times, one could argue there are use cases for overlay mounts, even though the cost of service calls continues to rise. Gun loaded, pointed down, hoping to miss foot.

@GregorKopka
Contributor

@richardelling the SUN didn't shine on me prior to ZFS, sorry for that.

@kpande, following that reasoning, zfs send/recv shouldn't be recommended either, because you'll eventually be hit by #4811. I think this line of thinking is a bad idea and should not be pursued any further...

@GregorKopka
Contributor

@kpande my point was that the bug you referenced should be fixed instead; let's continue the discussion there.

@tuxoko
Contributor

tuxoko commented Jul 1, 2016

@FransUrbo What about making the mountpoint readonly? That should prevent anyone from accidentally putting stuff in it, no?
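
One way to approximate that today is to make the empty mountpoint directory immutable (a sketch, not something the ZFS tooling does for you; it assumes the filesystem holding the mountpoint directory supports the immutable attribute via chattr):

# with the dataset unmounted, make its empty mountpoint directory immutable
# so nothing can be written into it before ZFS mounts over it
zfs umount storage/multimedia
chattr +i /mnt/storage/multimedia
zfs mount storage/multimedia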

@pyropeter

This happens because the order in which fstab-mounts and zfs-mounts happen is undefined.

See zfs-mount.service and one of the auto-generated mount-units.

Systemd orders the auto-generated mounts by filesystem hierarchy. See systemd.mount(5).
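
One way to make the ordering explicit is to hang the fstab bind mounts off zfs-mount.service with the standard x-systemd.requires= mount option from systemd.mount(5) (paths here are placeholders):

# /etc/fstab - the generated mount unit is then ordered after zfs-mount.service
/mnt/storage/multimedia  /srv/nfs/multimedia  none  bind,x-systemd.requires=zfs-mount.service  0 0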

@Jip-Hop

Jip-Hop commented Jan 17, 2018

Any news on this issue?

Because I think I'm affected too.
I followed the Ubuntu-16.04-Root-on-ZFS guide and after a few reboots I'm running into this:

systemctl status zfs-mount.service
● zfs-mount.service - Mount ZFS filesystems
   Loaded: loaded (/lib/systemd/system/zfs-mount.service; static; vendor preset: enabled)
   Active: failed (Result: exit-code) since Wed 2018-01-17 19:20:15 CET; 1min 19s ago
  Process: 1568 ExecStart=/sbin/zfs mount -a (code=exited, status=1/FAILURE)
 Main PID: 1568 (code=exited, status=1/FAILURE)

Jan 17 19:20:15 systemd[1]: Starting Mount ZFS filesystems...
Jan 17 19:20:15 zfs[1568]: cannot mount '/root': directory is not empty
Jan 17 19:20:15 systemd[1]: zfs-mount.service: Main process exited, code=exited, status=1/FAILURE
Jan 17 19:20:15 systemd[1]: Failed to start Mount ZFS filesystems.
Jan 17 19:20:15 systemd[1]: zfs-mount.service: Unit entered failed state.
Jan 17 19:20:15 systemd[1]: zfs-mount.service: Failed with result 'exit-code'.

Any advice on how to fix this?
I guess this is the cause of a lot of other problems I'm seeing, e.g. apache2 not starting because SSLCertificateFile: file '/root/.acme.sh/domain/domain.cer' does not exist or is empty.

Even though it was there for sure before rebooting.

@Jip-Hop

Jip-Hop commented Jan 17, 2018

Like I said I followed the guide in the wiki. Therefore I ended up with this version of ZoL:

dmesg | grep ZFS
[    0.000000] Command line: BOOT_IMAGE=/ROOT/ubuntu@/boot/vmlinuz-4.4.0-109-generic root=ZFS=rpool/ROOT/ubuntu ro
[    0.000000] Kernel command line: BOOT_IMAGE=/ROOT/ubuntu@/boot/vmlinuz-4.4.0-109-generic root=ZFS=rpool/ROOT/ubuntu ro
[    3.963731] ZFS: Loaded module v0.6.5.6-0ubuntu16, ZFS pool version 5000, ZFS filesystem version 5

That's not the most recent version I guess. Can this problem be solved by updating to the new release? If so, how should I do that?

@pjgoodall

pjgoodall commented Nov 20, 2018

I have a pool I can mount with an alternate root. When I export it, it leaves the alternate root directory behind.

Ubuntu 18.04.1 LTS
libzfs2linux/bionic-updates,now 0.7.5-1ubuntu16.4 amd64 [installed,automatic]
zfs-zed/bionic-updates,now 0.7.5-1ubuntu16.4 amd64 [installed,automatic]
zfsutils-linux/bionic-updates,now 0.7.5-1ubuntu16.4 amd64 [installed]
The following can be repeated forever:

 > zpool import -R /recover/foo my-pool
 > ls -l /recover
 drwxr-xr-x 6 root root 4096 Nov 20 08:13 foo
 > zpool export my-pool
 > ls -l /recover
 drwxr-xr-x 6 root root 4096 Nov 20 08:13 foo
 > rm -rf /recover/foo

The same thing happens with a normal import.
