Possible regression when lower layer is under the target mountpoint #364

Closed
flx42 opened this issue Jul 27, 2022 · 5 comments · Fixed by #366

flx42 commented Jul 27, 2022

See NVIDIA/enroot#130 for the full details, including how this creates an almost unkillable fuse-overlayfs process when the mountpoint is above the lower dir. Red Hat support mentioned that this is an invalid use case, but I went back and verified that it worked fine in fuse-overlayfs v0.7 through v1.6 and started hanging with this seemingly unrelated commit in v1.7, so I wanted to confirm whether this pattern should indeed be avoided:

$ git bisect bad
4ad759b35a054b6010b348755921977a9dcf6e0e is the first bad commit
commit 4ad759b35a054b6010b348755921977a9dcf6e0e
Author: Giuseppe Scrivano <gscrivan@redhat.com>
Date:   Wed Jul 28 13:03:53 2021 +0200

    fuse-overlayfs: fix read xattrs for devices
    
    always use llistxattr and lgetxattr for listing and reading xattrs so
    that the open/openat2 call doesn't fail when accessing a device.
    
    Closes: https://github.com/containers/fuse-overlayfs/issues/312
    
    Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>

 direct.c                 | 20 ++------------------
 tests/fedora-installs.sh |  3 +++
 2 files changed, 5 insertions(+), 18 deletions(-)

The repro code (tested on Ubuntu with the 5.17 and 5.15 kernels):

bash -eux << 'EOF'
        rootfs="/run/user/$(id -u)/overlay"
        mkdir -p "${rootfs}"/{lower,upper,work}
        curl -SsL https://cdimage.ubuntu.com/ubuntu-base/releases/22.04/release/ubuntu-base-22.04-base-amd64.tar.gz | tar -C "${rootfs}/lower" -xz
        fuse-overlayfs --version
        fuse-overlayfs -f -o "lowerdir=${rootfs}/lower,upperdir=${rootfs}/upper,workdir=${rootfs}/work" "${rootfs}" &
        sleep 5s
        ls "${rootfs}"
EOF

Output with fuse-overlayfs 1.5, it works fine:

+ fuse-overlayfs --version
fuse-overlayfs: version 1.5
FUSE library version 3.10.5
using FUSE kernel interface version 7.31
fusermount3 version: 3.10.5
+ sleep 5s
+ fuse-overlayfs -f -o lowerdir=/run/user/1000/overlay/lower,upperdir=/run/user/1000/overlay/upper,workdir=/run/user/1000/overlay/work /run/user/1000/overlay
+ ls /run/user/1000/overlay
bin  boot  dev  etc  home  lib  lib32  lib64  libx32  media  mnt  opt  proc  root  run  sbin  srv  sys  tmp  usr  var

Output with fuse-overlayfs 1.7.1, the ls hangs:

+ fuse-overlayfs --version
fuse-overlayfs: version 1.7.1
FUSE library version 3.10.5
using FUSE kernel interface version 7.31
fusermount3 version: 3.10.5
+ sleep 5s
+ fuse-overlayfs -f -o lowerdir=/run/user/1000/overlay/lower,upperdir=/run/user/1000/overlay/upper,workdir=/run/user/1000/overlay/work /run/user/1000/overlay
+ ls /run/user/1000/overlay

And a little while later in the kernel log:

Jul 27 15:59:18 ioctl kernel: INFO: task fuse-overlayfs:1670 blocked for more than 120 seconds.
Jul 27 15:59:18 ioctl kernel:       Tainted: P           OE     5.17.0-1013-oem #14-Ubuntu
Jul 27 15:59:18 ioctl kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 27 15:59:18 ioctl kernel: task:fuse-overlayfs  state:D stack:    0 pid: 1670 ppid:  1609 flags:0x00000002
Jul 27 15:59:18 ioctl kernel: Call Trace:
Jul 27 15:59:18 ioctl kernel:  <TASK>
Jul 27 15:59:18 ioctl kernel:  __schedule+0x240/0x5a0
Jul 27 15:59:18 ioctl kernel:  schedule+0x55/0xd0
Jul 27 15:59:18 ioctl kernel:  schedule_preempt_disabled+0x15/0x20
Jul 27 15:59:18 ioctl kernel:  __mutex_lock.constprop.0+0x2f3/0x4c0
Jul 27 15:59:18 ioctl kernel:  __mutex_lock_slowpath+0x13/0x20
Jul 27 15:59:18 ioctl kernel:  mutex_lock+0x35/0x40
Jul 27 15:59:18 ioctl kernel:  fuse_lock_inode+0x30/0x40
Jul 27 15:59:18 ioctl kernel:  fuse_lookup+0x48/0x1b0
Jul 27 15:59:18 ioctl kernel:  ? d_alloc_parallel+0x2dd/0x570
Jul 27 15:59:18 ioctl kernel:  ? __legitimize_path+0x2d/0x60
Jul 27 15:59:18 ioctl kernel:  __lookup_slow+0x81/0x150
Jul 27 15:59:18 ioctl kernel:  walk_component+0x142/0x1c0
Jul 27 15:59:18 ioctl kernel:  link_path_walk.part.0.constprop.0+0x24b/0x3d0
Jul 27 15:59:18 ioctl kernel:  ? path_init+0x2c2/0x3f0
Jul 27 15:59:18 ioctl kernel:  path_lookupat+0x3e/0x1b0
Jul 27 15:59:18 ioctl kernel:  ? lru_cache_add+0x1c/0x20
Jul 27 15:59:18 ioctl kernel:  filename_lookup+0xcf/0x1d0
Jul 27 15:59:18 ioctl kernel:  ? __check_object_size+0x1a/0x20
Jul 27 15:59:18 ioctl kernel:  ? strncpy_from_user+0x44/0x140
Jul 27 15:59:18 ioctl kernel:  ? getname_flags.part.0+0x4c/0x1b0
Jul 27 15:59:18 ioctl kernel:  user_path_at_empty+0x3f/0x60
Jul 27 15:59:18 ioctl kernel:  path_getxattr+0x4a/0xb0
Jul 27 15:59:18 ioctl kernel:  __x64_sys_lgetxattr+0x21/0x30
Jul 27 15:59:18 ioctl kernel:  do_syscall_64+0x59/0xc0
Jul 27 15:59:18 ioctl kernel:  ? irqentry_exit+0x35/0x40
Jul 27 15:59:18 ioctl kernel:  ? exc_page_fault+0x89/0x180
Jul 27 15:59:18 ioctl kernel:  ? asm_exc_page_fault+0x8/0x30
Jul 27 15:59:18 ioctl kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
Jul 27 15:59:18 ioctl kernel: RIP: 0033:0x7fcf6edfb2ae
Jul 27 15:59:18 ioctl kernel: RSP: 002b:00007ffdc66ab2d8 EFLAGS: 00000202 ORIG_RAX: 00000000000000c0
Jul 27 15:59:18 ioctl kernel: RAX: ffffffffffffffda RBX: 0000556337fbb650 RCX: 00007fcf6edfb2ae
Jul 27 15:59:18 ioctl kernel: RDX: 00007ffdc66ac320 RSI: 000055633697205b RDI: 00007ffdc66ab2e0
Jul 27 15:59:18 ioctl kernel: RBP: 000055633697205b R08: 0000556337fcfb80 R09: 0000556337fd9730
Jul 27 15:59:18 ioctl kernel: R10: 0000000000000010 R11: 0000000000000202 R12: 00007ffdc66ac320
Jul 27 15:59:18 ioctl kernel: R13: 0000000000000010 R14: 00007ffdc66ab2e0 R15: 0000000000000000
Jul 27 15:59:18 ioctl kernel:  </TASK>

And the process is unkillable:

$ pgrep -a fuse-overlayfs ; kill -9 $(pidof fuse-overlayfs) ; sleep 3s ; pgrep -a fuse-overlayfs
5087 fuse-overlayfs -f -o lowerdir=/run/user/1000/overlay/lower,upperdir=/run/user/1000/overlay/upper,workdir=/run/user/1000/overlay/work /run/user/1000/overlay
5087 fuse-overlayfs -f -o lowerdir=/run/user/1000/overlay/lower,upperdir=/run/user/1000/overlay/upper,workdir=/run/user/1000/overlay/work /run/user/1000/overlay
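
The kernel trace above reports `state:D` for the stuck task, which would explain why `kill -9` has no visible effect. As a rough sketch, the state can be read directly from `/proc`; `proc_state` is a hypothetical helper name used only for this illustration, and the loop assumes `pgrep` and `awk` are available:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of fuse-overlayfs): print the one-letter
# state of a PID, taken from the "State:" line of /proc/<pid>/status.
# "D" means uninterruptible sleep.
proc_state() {
    awk '/^State:/ { print $2 }' "/proc/$1/status"
}

# List every fuse-overlayfs process together with its current state.
for pid in $(pgrep -x fuse-overlayfs); do
    printf '%s %s\n' "${pid}" "$(proc_state "${pid}")"
done
```

If the process is hung as described in this report, the second column should show `D`.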

flx42 commented Jul 28, 2022

Didn't realize the RH issue was public, it's here: https://bugzilla.redhat.com/show_bug.cgi?id=2111285

flx42 commented Jul 28, 2022

@giuseppe given that you were the one that answered on the RH bugzilla, I imagine it means that you are saying this is not a supported use case then? And it used to work by chance but it won't be fixed?

@giuseppe
Member

@flx42 we could fix this simple case to use the fd based syscalls, but I am not sure that would be enough because there might be other cases where it might break since it was never planned to be used this way.

I think it is safer if the lower dir is not under the merged directory.
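
One way to follow that advice is to make the merged directory a sibling of the layer directories and to refuse the mount when the lower dir sits under the target. The sketch below is only an illustration: `is_under`, the `merged` directory name, and the `OVERLAY_BASE` variable are assumptions that appear nowhere in fuse-overlayfs or enroot, and the script prints the mount command instead of running it. It relies on `realpath -m` from GNU coreutils.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when path $1 is the same as, or nested
# under, directory $2. Both paths are normalized with `realpath -m`,
# so they do not need to exist yet.
is_under() {
    local child parent
    child="$(realpath -m -- "$1")"
    parent="$(realpath -m -- "$2")"
    case "${child}/" in
        "${parent%/}/"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumed layout: the layers and the mountpoint are siblings under one base.
base="${OVERLAY_BASE:-/run/user/$(id -u)/overlay}"
lower="${base}/lower"
upper="${base}/upper"
work="${base}/work"
merged="${base}/merged"

if is_under "${lower}" "${merged}"; then
    echo "refusing to mount: lowerdir ${lower} is under ${merged}" >&2
else
    # Printed rather than executed, so the sketch has no side effects.
    echo "fuse-overlayfs -o lowerdir=${lower},upperdir=${upper},workdir=${work} ${merged}"
fi
```

With the layout from the original repro, where the mountpoint is `${rootfs}` itself, `is_under "${rootfs}/lower" "${rootfs}"` succeeds and the guard would reject the mount.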

giuseppe added a commit to giuseppe/fuse-overlayfs that referenced this issue Jul 29, 2022
instead of using the lgetxattr and llistxattr system calls on the
entire file path, use the /proc/self/fd/$FD/$RELATIVE_PATH path
instead so that the lookup is relative to the lower dir file
descriptor that is already open.

Closes: containers#364

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
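
The trace in the report shows the `lgetxattr` path walk entering `fuse_lookup`, which suggests the full path to the lower dir was being resolved through the FUSE mount itself. The shell sketch below illustrates only the path form named in the commit message, not the actual change to `direct.c`: a `/proc/self/fd/$FD/$RELATIVE_PATH` path is resolved starting from a directory descriptor that is already open. `fd_relative_path` is a hypothetical helper, descriptor 9 is an arbitrary choice, and `cat` stands in for the xattr call.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the fd-relative path form described in the
# commit message. A single leading "/" on the relative part is dropped.
fd_relative_path() {
    local fd="$1" rel="${2#/}"
    printf '/proc/self/fd/%s/%s\n' "${fd}" "${rel}"
}

# Throwaway directory standing in for the lower dir.
lower="$(mktemp -d)"
echo hello > "${lower}/file"

# Keep the directory open on descriptor 9, as a long-running process
# would keep its lower dir descriptor open.
exec 9< "${lower}"

# Child processes inherit descriptor 9, so their own /proc/self/fd/9
# refers to the same directory and the lookup starts from there.
cat "$(fd_relative_path 9 file)"

exec 9<&-
rm -rf "${lower}"
```

If the `attr` tools are installed, replacing `cat` with `getfattr -h -d` on the same path would be the closer analogue of `lgetxattr`, but that is an assumption about the local setup.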

@giuseppe
Member

I've opened a PR to address this specific case: #366

I don't promise it doesn't break in other weird ways :-) so please do not use a lower dir that is below the mount directory.

That said, this was an easy one to address, and it probably simplifies the lookup as well.

flx42 commented Jul 29, 2022

I will talk with @3XX0 on how to proceed for enroot, but I agree we should change our approach.

Thank you for fixing this particular case!
