This repository has been archived by the owner on Feb 24, 2020. It is now read-only.

Unable to GC some containers #1922

Open
jcollie opened this issue Dec 28, 2015 · 24 comments

Comments

@jcollie

jcollie commented Dec 28, 2015

Using CentOS 7 (kernel 3.10.0-327.3.1.el7.x86_64), rkt 0.14.0, I'm unable to GC some containers:

[root@svr05 ~]# rkt gc
Garbage collecting pod "42e78965-c60b-4f4f-b412-484cd381fe90"
Error getting stage1 treeStoreID: no such file or directory
Skipping stage1 GC
Unable to remove pod "42e78965-c60b-4f4f-b412-484cd381fe90": remove /var/lib/rkt/pods/exited-garbage/42e78965-c60b-4f4f-b412-484cd381fe90/stage1/rootfs: device or resource busy
[root@svr05 ~]# mount | fgrep 42e78965
[root@svr05 ~]# lsof +D /var/lib/rkt/pods/exited-garbage/42e78965-c60b-4f4f-b412-484cd381fe90
[root@svr05 ~]#  uname -a
Linux svr05.ocjtech.us 3.10.0-327.3.1.el7.x86_64 #1 SMP Wed Dec 9 14:09:15 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
[root@svr05 ~]# rkt version
rkt version 0.14.0
appc version 0.7.4
[root@svr05 ~]#  rkt list
1 error(s) encountered when listing pods:
----------------------------------------
Unable to read pod 42e78965-c60b-4f4f-b412-484cd381fe90 manifest:
  no such file or directory
----------------------------------------
Misc:
  rkt's appc version: 0.7.4
----------------------------------------

UUID      APP              IMAGE NAME                   STATE    NETWORKS
09914839  sabnzbd          ocjtech.us/sabnzbd:0.12      running
bd8a5ffe  sonarr           ocjtech.us/sonarr:0.10       running
e40e09e7  privateinternet  ocjtech.us/pia:0.4           exited
          transmission     ocjtech.us/transmission:0.2
[root@svr05 ~]# mount
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime,seclabel)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
devtmpfs on /dev type devtmpfs (rw,nosuid,seclabel,size=1889732k,nr_inodes=472433,mode=755)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,seclabel)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,nodev,seclabel,mode=755)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,seclabel,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpuacct,cpu)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/net_cls type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
configfs on /sys/kernel/config type configfs (rw,relatime)
/dev/mapper/centos-root on / type xfs (rw,relatime,seclabel,attr2,inode64,noquota)
selinuxfs on /sys/fs/selinux type selinuxfs (rw,relatime)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=28,pgrp=1,timeout=300,minproto=5,maxproto=5,direct)
mqueue on /dev/mqueue type mqueue (rw,relatime,seclabel)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,seclabel)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
/dev/mapper/centos-home on /home type xfs (rw,relatime,seclabel,attr2,inode64,noquota)
/dev/sda1 on /boot type xfs (rw,relatime,seclabel,attr2,inode64,noquota)
192.168.4.101,192.168.4.102,192.168.4.103:/ on /mnt/ceph type ceph (rw,relatime,name=admin,secret=<hidden>)
tmpfs on /run/user/0 type tmpfs (rw,nosuid,nodev,relatime,seclabel,size=380028k,mode=700)
overlay on /var/lib/rkt/pods/run/bd8a5ffe-0480-4196-aab4-5e8040d7cb6e/stage1/rootfs type overlay (rw,relatime,context="system_u:object_r:svirt_sandbox_file_t:s0:c30,c943",lowerdir=/var/lib/rkt/cas/tree/deps-sha512-78d2e3ef53a4963b3af0dcf533fb91874dc4642b7723e24fd5bf40e1f99ca9df/rootfs,upperdir=/var/lib/rkt/pods/run/bd8a5ffe-0480-4196-aab4-5e8040d7cb6e/overlay/deps-sha512-78d2e3ef53a4963b3af0dcf533fb91874dc4642b7723e24fd5bf40e1f99ca9df/upper,workdir=/var/lib/rkt/pods/run/bd8a5ffe-0480-4196-aab4-5e8040d7cb6e/overlay/deps-sha512-78d2e3ef53a4963b3af0dcf533fb91874dc4642b7723e24fd5bf40e1f99ca9df/work)
overlay on /var/lib/rkt/pods/run/bd8a5ffe-0480-4196-aab4-5e8040d7cb6e/stage1/rootfs/opt/stage2/sonarr/rootfs type overlay (rw,relatime,context="system_u:object_r:svirt_sandbox_file_t:s0:c30,c943",lowerdir=/var/lib/rkt/cas/tree/deps-sha512-b6214b00f1d2cdac36c5bd4b378b108539a9c8b739b450a1405d0020ce09571c/rootfs,upperdir=/var/lib/rkt/pods/run/bd8a5ffe-0480-4196-aab4-5e8040d7cb6e/overlay/deps-sha512-b6214b00f1d2cdac36c5bd4b378b108539a9c8b739b450a1405d0020ce09571c/upper/sonarr,workdir=/var/lib/rkt/pods/run/bd8a5ffe-0480-4196-aab4-5e8040d7cb6e/overlay/deps-sha512-b6214b00f1d2cdac36c5bd4b378b108539a9c8b739b450a1405d0020ce09571c/work/sonarr)
overlay on /var/lib/rkt/pods/run/09914839-3a18-46e4-afe7-e70810683b18/stage1/rootfs type overlay (rw,relatime,context="system_u:object_r:svirt_sandbox_file_t:s0:c149,c713",lowerdir=/var/lib/rkt/cas/tree/deps-sha512-78d2e3ef53a4963b3af0dcf533fb91874dc4642b7723e24fd5bf40e1f99ca9df/rootfs,upperdir=/var/lib/rkt/pods/run/09914839-3a18-46e4-afe7-e70810683b18/overlay/deps-sha512-78d2e3ef53a4963b3af0dcf533fb91874dc4642b7723e24fd5bf40e1f99ca9df/upper,workdir=/var/lib/rkt/pods/run/09914839-3a18-46e4-afe7-e70810683b18/overlay/deps-sha512-78d2e3ef53a4963b3af0dcf533fb91874dc4642b7723e24fd5bf40e1f99ca9df/work)
overlay on /var/lib/rkt/pods/run/09914839-3a18-46e4-afe7-e70810683b18/stage1/rootfs/opt/stage2/sabnzbd/rootfs type overlay (rw,relatime,context="system_u:object_r:svirt_sandbox_file_t:s0:c149,c713",lowerdir=/var/lib/rkt/cas/tree/deps-sha512-5a982fb08d2a7d8a774f2b85eecf0326482a0c3e9aafc18bacb64422a3cbe20f/rootfs,upperdir=/var/lib/rkt/pods/run/09914839-3a18-46e4-afe7-e70810683b18/overlay/deps-sha512-5a982fb08d2a7d8a774f2b85eecf0326482a0c3e9aafc18bacb64422a3cbe20f/upper/sabnzbd,workdir=/var/lib/rkt/pods/run/09914839-3a18-46e4-afe7-e70810683b18/overlay/deps-sha512-5a982fb08d2a7d8a774f2b85eecf0326482a0c3e9aafc18bacb64422a3cbe20f/work/sabnzbd)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,relatime)
overlay on /var/lib/rkt/pods/exited-garbage/e40e09e7-ff47-47b7-a821-eb0d1f319524/stage1/rootfs type overlay (rw,relatime,context="system_u:object_r:svirt_sandbox_file_t:s0:c126,c297",lowerdir=/var/lib/rkt/cas/tree/deps-sha512-78d2e3ef53a4963b3af0dcf533fb91874dc4642b7723e24fd5bf40e1f99ca9df/rootfs,upperdir=/var/lib/rkt/pods/run/e40e09e7-ff47-47b7-a821-eb0d1f319524/overlay/deps-sha512-78d2e3ef53a4963b3af0dcf533fb91874dc4642b7723e24fd5bf40e1f99ca9df/upper,workdir=/var/lib/rkt/pods/run/e40e09e7-ff47-47b7-a821-eb0d1f319524/overlay/deps-sha512-78d2e3ef53a4963b3af0dcf533fb91874dc4642b7723e24fd5bf40e1f99ca9df/work)
overlay on /var/lib/rkt/pods/exited-garbage/e40e09e7-ff47-47b7-a821-eb0d1f319524/stage1/rootfs/opt/stage2/privateinternet/rootfs type overlay (rw,relatime,context="system_u:object_r:svirt_sandbox_file_t:s0:c126,c297",lowerdir=/var/lib/rkt/cas/tree/deps-sha512-bf60d75289e2254fb14edde2de6742da5d4806c57b88634a55f643b08dae7120/rootfs,upperdir=/var/lib/rkt/pods/run/e40e09e7-ff47-47b7-a821-eb0d1f319524/overlay/deps-sha512-bf60d75289e2254fb14edde2de6742da5d4806c57b88634a55f643b08dae7120/upper/privateinternet,workdir=/var/lib/rkt/pods/run/e40e09e7-ff47-47b7-a821-eb0d1f319524/overlay/deps-sha512-bf60d75289e2254fb14edde2de6742da5d4806c57b88634a55f643b08dae7120/work/privateinternet)
overlay on /var/lib/rkt/pods/exited-garbage/e40e09e7-ff47-47b7-a821-eb0d1f319524/stage1/rootfs/opt/stage2/transmission/rootfs type overlay (rw,relatime,context="system_u:object_r:svirt_sandbox_file_t:s0:c126,c297",lowerdir=/var/lib/rkt/cas/tree/deps-sha512-afb419a3ff4475f3b62926fc90a7b093276d853454d560bc9b254f3d5ec01143/rootfs,upperdir=/var/lib/rkt/pods/run/e40e09e7-ff47-47b7-a821-eb0d1f319524/overlay/deps-sha512-afb419a3ff4475f3b62926fc90a7b093276d853454d560bc9b254f3d5ec01143/upper/transmission,workdir=/var/lib/rkt/pods/run/e40e09e7-ff47-47b7-a821-eb0d1f319524/overlay/deps-sha512-afb419a3ff4475f3b62926fc90a7b093276d853454d560bc9b254f3d5ec01143/work/transmission)
proc on /var/lib/rkt/pods/exited-garbage/e40e09e7-ff47-47b7-a821-eb0d1f319524/netns type proc (rw,nosuid,nodev,noexec,relatime)
[root@svr05 ~]# 
@alban
Member

alban commented Dec 28, 2015

According to rmdir(2), rmdir returns EBUSY when "pathname is currently used as a mount point or is the root directory of the calling process".

First, can you check whether any process is using the directory that cannot be removed as its root directory? Something like:

# ROOTFS_INODE=$(stat --format=%i /var/lib/rkt/pods/run/8a45bd13-4625-44d5-b937-8a04e5b9a52c/stage1/rootfs/)
# for i in /proc/[0-9]*/ ; do echo -n "$i $(cat "$i/comm" 2>/dev/null) " ; stat -L "$i/root" 2>/dev/null | grep Device ; done | grep -w "$ROOTFS_INODE"

If you find remaining processes, check whether they are in different pid and mnt namespaces (/proc/[0-9]*/ns/{mnt,pid}).

If it is not that, maybe pathname is still used as a mount point in some namespace. Before kernel 3.18, rmdir could return EBUSY when the directory is a mount point in another namespace. This could lead to DoS where the host cannot delete files because they are used as a mount point in a container. This was fixed in torvalds/linux@8ed936b. This might be why this bug is visible on old CentOS kernels but not newer ones.

But the container mnt namespace should be released when the container is terminated. I would not be surprised if old kernels were leaking mnt namespaces in some cases. There were circular references on mnt namespaces in older kernels (torvalds/linux@4ce5d2b).

I don't know how to easily check if it is the case. You could check if any process is in the container mnt namespace (/proc/[0-9]*/ns/mnt). But the mnt namespace could stay alive without any process in it if someone opens a reference on it via /proc/[0-9]*/ns/mnt or if a kernel bug maintains a reference on it.

In any case, rkt could be patched to continue to GC the other pods when one fails with EBUSY like this.
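The namespace check suggested above can be sketched as a small helper that prints the mount namespace of every process, so that processes sharing a namespace with a container are easy to spot. This is only an illustration: the function name `list_mnt_namespaces` and its optional directory argument (useful for trying it against a copy of /proc) are assumptions, not part of rkt.

```sh
#!/bin/sh
# Print "<mnt-namespace> <pid> <comm>" for every process that can be inspected.
# The optional first argument (hypothetical, for experiments) replaces /proc.
list_mnt_namespaces() {
    proc_root="${1:-/proc}"
    for dir in "$proc_root"/[0-9]*; do
        # Processes may exit while we iterate; skip anything unreadable.
        ns=$(readlink "$dir/ns/mnt" 2>/dev/null) || continue
        comm=$(cat "$dir/comm" 2>/dev/null) || comm="?"
        echo "$ns ${dir##*/} $comm"
    done | sort
}
```

Piping the result through `awk '{print $1}' | uniq -c` gives a count of processes per mount namespace. As noted above, a namespace kept alive only by an open file descriptor or a kernel reference will not show up here.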

@blalor

blalor commented Dec 28, 2015

I'm also having this problem, also on CentOS 7, same kernel as @jcollie. This host is using rkt 0.13.0.

[root@gocd-server-2b5933f2 ~]# rkt gc --grace-period=0
Garbage collecting pod "9bec4e8c-8b94-4017-b9a3-f7639ba98382"
Error getting stage1 treeStoreID: no such file or directory
[root@gocd-server-2b5933f2 ~]# ROOTFS_INODE=$(stat --format=%i /var/lib/rkt/pods/run/9bec4e8c-8b94-4017-b9a3-f7639ba98382/stage1/rootfs/)
stat: cannot stat ‘/var/lib/rkt/pods/run/9bec4e8c-8b94-4017-b9a3-f7639ba98382/stage1/rootfs/’: No such file or directory

So the rootfs doesn't exist anymore. Has the db gotten out of sync with the filesystem?

@jcollie
Author

jcollie commented Dec 28, 2015

Nothing is showing up with those commands either, so I suspect that you're right about a kernel bug. I'd switch the system to CoreOS but it's a baremetal system and I'm not local to the box at the moment, plus I'll need some time to rethink how I do my container networking.

[root@svr05 ~]# rkt gc
Garbage collecting pod "42e78965-c60b-4f4f-b412-484cd381fe90"
Error getting stage1 treeStoreID: no such file or directory
Skipping stage1 GC
Unable to remove pod "42e78965-c60b-4f4f-b412-484cd381fe90": remove /var/lib/rkt/pods/exited-garbage/42e78965-c60b-4f4f-b412-484cd381fe90/stage1/rootfs: device or resource busy
[root@svr05 ~]# stat --format=%i /var/lib/rkt/pods/exited-garbage/42e78965-c60b-4f4f-b412-484cd381fe90/stage1/rootfs
96759102
[root@svr05 ~]# ROOTFS_INODE=$(stat --format=%i /var/lib/rkt/pods/exited-garbage/42e78965-c60b-4f4f-b412-484cd381fe90/stage1/rootfs)
[root@svr05 ~]# for i in /proc/[0-9]*/ ; do echo -n "$i $(cat $i/comm) " ; stat -L $i/root | grep Device ; done | grep $ROOTFS_INODE
[root@svr05 ~]# ls /proc/[0-9]*/ns/mnt
/proc/108/ns/mnt    /proc/19960/ns/mnt  /proc/3015/ns/mnt  /proc/4937/ns/mnt
/proc/10/ns/mnt     /proc/19/ns/mnt     /proc/3030/ns/mnt  /proc/507/ns/mnt
/proc/11/ns/mnt     /proc/1/ns/mnt      /proc/3035/ns/mnt  /proc/514/ns/mnt
/proc/12238/ns/mnt  /proc/20160/ns/mnt  /proc/3036/ns/mnt  /proc/51/ns/mnt
/proc/12239/ns/mnt  /proc/20173/ns/mnt  /proc/3040/ns/mnt  /proc/534/ns/mnt
/proc/12240/ns/mnt  /proc/20185/ns/mnt  /proc/3059/ns/mnt  /proc/54/ns/mnt
/proc/12455/ns/mnt  /proc/20188/ns/mnt  /proc/3060/ns/mnt  /proc/55/ns/mnt
/proc/12473/ns/mnt  /proc/20214/ns/mnt  /proc/3064/ns/mnt  /proc/573/ns/mnt
/proc/12/ns/mnt     /proc/20216/ns/mnt  /proc/30/ns/mnt    /proc/574/ns/mnt
/proc/1343/ns/mnt   /proc/20/ns/mnt     /proc/319/ns/mnt   /proc/575/ns/mnt
/proc/13/ns/mnt     /proc/21/ns/mnt     /proc/31/ns/mnt    /proc/576/ns/mnt
/proc/14697/ns/mnt  /proc/23/ns/mnt     /proc/32/ns/mnt    /proc/577/ns/mnt
/proc/14/ns/mnt     /proc/24/ns/mnt     /proc/381/ns/mnt   /proc/578/ns/mnt
/proc/1533/ns/mnt   /proc/2581/ns/mnt   /proc/382/ns/mnt   /proc/57/ns/mnt
/proc/1535/ns/mnt   /proc/2595/ns/mnt   /proc/38/ns/mnt    /proc/581/ns/mnt
/proc/1536/ns/mnt   /proc/25/ns/mnt     /proc/391/ns/mnt   /proc/586/ns/mnt
/proc/1537/ns/mnt   /proc/26/ns/mnt     /proc/392/ns/mnt   /proc/587/ns/mnt
/proc/1539/ns/mnt   /proc/275/ns/mnt    /proc/39/ns/mnt    /proc/588/ns/mnt
/proc/1576/ns/mnt   /proc/277/ns/mnt    /proc/3/ns/mnt     /proc/589/ns/mnt
/proc/1578/ns/mnt   /proc/27/ns/mnt     /proc/407/ns/mnt   /proc/590/ns/mnt
/proc/15/ns/mnt     /proc/28/ns/mnt     /proc/408/ns/mnt   /proc/606/ns/mnt
/proc/1619/ns/mnt   /proc/290/ns/mnt    /proc/409/ns/mnt   /proc/628/ns/mnt
/proc/1620/ns/mnt   /proc/291/ns/mnt    /proc/40/ns/mnt    /proc/632/ns/mnt
/proc/16282/ns/mnt  /proc/292/ns/mnt    /proc/410/ns/mnt   /proc/634/ns/mnt
/proc/1692/ns/mnt   /proc/293/ns/mnt    /proc/411/ns/mnt   /proc/636/ns/mnt
/proc/16/ns/mnt     /proc/2972/ns/mnt   /proc/412/ns/mnt   /proc/639/ns/mnt
/proc/17/ns/mnt     /proc/297/ns/mnt    /proc/413/ns/mnt   /proc/703/ns/mnt
/proc/1878/ns/mnt   /proc/298/ns/mnt    /proc/41/ns/mnt    /proc/759/ns/mnt
/proc/18/ns/mnt     /proc/2995/ns/mnt   /proc/42/ns/mnt    /proc/761/ns/mnt
/proc/19242/ns/mnt  /proc/299/ns/mnt    /proc/43/ns/mnt    /proc/76/ns/mnt
/proc/1928/ns/mnt   /proc/29/ns/mnt     /proc/4638/ns/mnt  /proc/7/ns/mnt
/proc/19737/ns/mnt  /proc/2/ns/mnt      /proc/4641/ns/mnt  /proc/8/ns/mnt
/proc/19947/ns/mnt  /proc/300/ns/mnt    /proc/483/ns/mnt   /proc/9/ns/mnt
[root@svr05 ~]# 

@alban
Member

alban commented Dec 28, 2015

I can reproduce this on CentOS 7 as well.

For some unknown reason, all rkt mounts are still mounted in the systemd-udevd mount namespace, even though they are correctly unmounted in the host mount namespace:

# grep /var/lib/rkt/pods/ /proc/$(pidof systemd-udevd)/mountinfo
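The same check can be widened from systemd-udevd to every process, in case another service also holds the mounts. A minimal sketch, assuming the mount point is field 5 of each mountinfo line (as documented in proc(5)); the function name `find_mount_holders` and its optional second argument are hypothetical.

```sh
#!/bin/sh
# Print "<pid> <comm>" for each process whose mount table still contains
# a mount point starting with the given prefix.
# The optional second argument (hypothetical, for experiments) replaces /proc.
find_mount_holders() {
    prefix="$1"
    proc_root="${2:-/proc}"
    for dir in "$proc_root"/[0-9]*; do
        # Field 5 of a mountinfo line is the mount point.
        if awk -v p="$prefix" \
               'index($5, p) == 1 { found = 1 } END { exit !found }' \
               "$dir/mountinfo" 2>/dev/null; then
            echo "${dir##*/} $(cat "$dir/comm" 2>/dev/null)"
        fi
    done
}
```

For example, `find_mount_holders /var/lib/rkt/pods/` run as root lists every process whose namespace still sees a pod mount.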

@alban
Member

alban commented Dec 28, 2015

The mount point / in the systemd-udevd mount namespace is a slave mount of the / in the host mount namespace (see "master:1" and "shared:1"):

$ grep '/ / ' /proc/$(pidof systemd-udevd)/mountinfo
43 42 202:1 / / rw,relatime master:1 - xfs /dev/xvda1 rw,seclabel,attr2,inode64,noquota
$ grep '/ / ' /proc/1/mountinfo
60 1 202:1 / / rw,relatime shared:1 - xfs /dev/xvda1 rw,seclabel,attr2,inode64,noquota

With the new code for rkt-fly, rkt gc first sets the rkt mount points as private in the host mount namespace: see needsRemountPrivate

This blocks the umount propagation event from the host to the udevd mount namespace, so the rkt mounts are never fully unmounted and the rkt mount namespaces are not released.

The same leak exists on my Fedora 23 but since I have a recent kernel with torvalds/linux@8ed936b, I don't have the EBUSY symptom and the mount namespace leak is not visible to the user.

This was introduced by #1856

/cc @steveej
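To see which mounts are shared, slave, or private without reading mountinfo by eye, the optional fields can be extracted per mount point. A rough sketch, assuming the proc(5) layout in which the optional fields sit between field 6 and a lone "-" separator, and in which a mount with no optional fields is private; the function name `mount_propagation` is made up for this example.

```sh
#!/bin/sh
# Read mountinfo lines on stdin and print "<mount point> <propagation>".
# Optional fields (shared:N, master:N, ...) start at field 7 and end at "-".
mount_propagation() {
    awk '{
        flags = ""
        for (i = 7; i <= NF && $i != "-"; i++)
            flags = flags (flags == "" ? "" : ",") $i
        # No optional fields at all means the mount is private.
        print $5, (flags == "" ? "private" : flags)
    }'
}
```

Usage: `mount_propagation < /proc/$(pidof systemd-udevd)/mountinfo`. A pod mount that shows up as private in the host namespace can no longer propagate its umount to the udevd namespace, which is the situation described above.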

@alban
Member

alban commented Dec 28, 2015

@blalor the bug you've got seems to be different: it does not say "remove /var/lib/rkt/.../stage1/rootfs: device or resource busy" but "Error getting stage1 treeStoreID: no such file or directory". Yours should be fixed in rkt 0.14.0 by #1828.

@alban
Member

alban commented Jan 4, 2016

Since systemd-v212, systemd-udevd is started with "MountFlags=slave", see systemd-udevd.service.

The CentOS 7 release has systemd-v208 but has updates for systemd-v219. I don't see the error message on GC with systemd-v208 but I see it with systemd-v219.

In any case, other services can use systemd's "MountFlags" option, so rkt's GC needs to be fixed.
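To find out which services on a given host set this option, the unit files can be searched directly. A small sketch: the function name `units_with_mountflags` is hypothetical, and the two default directories are only the usual unit locations, which may differ by distribution.

```sh
#!/bin/sh
# List "<unit file>:<MountFlags line>" for unit files that set MountFlags=.
# Commented-out settings (lines starting with "#") are not matched.
units_with_mountflags() {
    if [ "$#" -eq 0 ]; then
        # Assumed default unit directories; pass others as arguments.
        set -- /etc/systemd/system /usr/lib/systemd/system
    fi
    grep -rHE '^[[:space:]]*MountFlags=' "$@" 2>/dev/null | sort
}
```

Each service listed with `MountFlags=slave` or `MountFlags=private` runs in its own mount namespace and could hold on to pod mounts in the same way as systemd-udevd.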

@alban
Member

alban commented Jan 5, 2016

@steveej I think GC should not set any mount point as MS_PRIVATE and rkt-fly should set all its mount points as MS_SLAVE+MS_SHARED. I tested it in this shell script and it seems to do what I want: the mount point gets umounted in the udevd namespace too:
https://gist.github.com/alban/75f605b8606b195008d6

I will continue this branch tomorrow: https://github.com/kinvolk/rkt/commits/alban/udevd

@steveej
Contributor

steveej commented Jan 6, 2016

I'm afraid I can't reproduce this locally: https://gist.github.com/steveeJ/3f87d5939b973741d227

While the behavior I see is intended for rkt's use case, I'm not sure if it is intended by Linux.
It's questionable why the umounts are propagated to the mount namespace of systemd-udevd. The mounts obviously lose the master/shared attributes after being declared MS_PRIVATE in the host's namespace.

@alban alban modified the milestones: v0.16.0, v0.15.0 Jan 7, 2016
@alban alban assigned alban and unassigned blixtra Jan 7, 2016
@jonboulle jonboulle modified the milestones: v1.0.0, v0.16.0 Jan 15, 2016
@blalor

blalor commented Jan 26, 2016

anyone else tired of trying to keep track of what's really in CentOS 7? kernel, systemd, basically everything else. :-(

@iaguis iaguis modified the milestones: v1+, v1.0.0 Jan 26, 2016
@alban alban removed their assignment Jan 26, 2016
@hightoxicity

Hi guys,

Same issue here

sudo /opt/bin/rkt gc --expire-prepared=0s --grace-period=0s
Garbage collecting pod "b49c3a51-161a-4448-a04b-2c8614c983bc"
Error getting stage1 treeStoreID: no such file or directory
Skipping stage1 GC
Unable to remove pod "b49c3a51-161a-4448-a04b-2c8614c983bc": remove /var/lib/rkt/pods/exited-garbage/b49c3a51-161a-4448-a04b-2c8614c983bc/stage1/rootfs/tmp: device or resource busy

My system is running Debian Jessie.

@jcollie
Author

jcollie commented Feb 3, 2016

Interestingly, something in the latest round of CentOS updates has fixed my GC problem. I'm now on kernel 3.10.0-327.4.5.el7.x86_64. Nothing in the changelog jumped out at me, but perhaps it was a change in some other package, not the kernel.

@alban
Member

alban commented Mar 15, 2016

I'm updating the docs about the run-time dependency on Linux 3.18: #2282

alban added a commit to kinvolk/rkt that referenced this issue Mar 16, 2016
Linux 3.18 introduced the fix for unlinking files and directories that
are mount points in another mount namespace. Before Linux 3.18,
unlinking could return EBUSY:
torvalds/linux@8ed936b

With the introduction of rkt-fly, this becomes a visible issue with rkt
gc, see discussion on:
rkt#1922 (comment)

This patch bumps the run-time dependency to Linux 3.18. This also
documents the issues in README.md.
@iaguis iaguis modified the milestones: v1.3.0, v1.2.0 Mar 18, 2016
@alban alban modified the milestones: v1.4.0, v1.3.0 Mar 31, 2016
@iaguis iaguis modified the milestones: v1.6.0, v1.4.0 Apr 14, 2016
@iaguis
Member

iaguis commented May 12, 2016

Not sure what we should do here...

@iaguis iaguis modified the milestones: v1+, v1.6.0 May 12, 2016
@artem-sidorenko
Contributor

@iaguis I'm using the mainline kernel from elrepo and it works fine. This can probably just be documented.

I'll do something similar in my chef-rkt cookbook and just emit a warning if a kernel older than 3.18 is used.
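A warning of this kind could look roughly like the following. This is a sketch in plain shell rather than the cookbook's actual code, and the function name `kernel_at_least` is an assumption; it only compares the major and minor numbers of a `uname -r` style string.

```sh
#!/bin/sh
# Succeed if a kernel release string (e.g. "3.10.0-327.3.1.el7.x86_64")
# is at least MAJOR.MINOR.
kernel_at_least() {
    release="$1"; want_major="$2"; want_minor="$3"
    major="${release%%.*}"        # text before the first dot
    rest="${release#*.}"          # text after the first dot
    minor="${rest%%[!0-9]*}"      # leading digits of the remainder
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}

if ! kernel_at_least "$(uname -r)" 3 18; then
    echo "warning: kernel older than 3.18, rkt gc may fail with EBUSY" >&2
fi
```

Note that distribution kernels sometimes backport fixes, so a version check like this can only be a hint, not proof that the problem is present.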

@cpuguy83

cpuguy83 commented Aug 2, 2017

Seeing this exact issue in docker. Switching systemd-udevd to use the host mountns (comment out MountFlags=slave) takes care of it.
What's strange is that I can also enter the mount namespace of udevd and unmount the offending entry with no issue. Once I do that, the removal in Docker succeeds.
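The comment does not say which tool was used to enter the namespace; one option is `nsenter` from util-linux. The sketch below is a dry run only: it reads a mountinfo file and prints the umount commands it would suggest, deepest mount points first, without executing anything. The function name `print_umount_commands` is hypothetical, and paths containing escaped characters such as `\040` are not handled.

```sh
#!/bin/sh
# Read mountinfo on stdin and print one "nsenter ... umount" command for each
# mount point starting with PREFIX. Reverse sorting lists nested mount points
# before their parents, so children are unmounted first.
print_umount_commands() {
    pid="$1"; prefix="$2"
    awk -v p="$prefix" 'index($5, p) == 1 { print $5 }' |
        LC_ALL=C sort -r |
        while IFS= read -r mnt; do
            printf 'nsenter -t %s -m umount %s\n' "$pid" "$mnt"
        done
}
```

For example, `pid=$(pidof systemd-udevd); print_umount_commands "$pid" /var/lib/rkt/pods/ < "/proc/$pid/mountinfo"` prints the commands for review; they would still have to be run by hand as root.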
