Rootful container, NFS volume: lchown operation not permitted #14766

berndbausch · 2022-06-29T01:27:53Z

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description
After setting up an NFS-based volume, I try to launch a container. This fails with an error "lchown: ... operation not permitted". When I repeat the container launch command, it succeeds.

Steps to reproduce the issue:

Create an NFS-based volume
$ sudo podman volume create mynfs --driver local --opt type=nfs --opt o=rw --opt device=192.168.1.16:/srv/nfs
Launch container
$ sudo podman run -it --rm -v mynfs:/myvol alpine sh
Error: lchown /var/lib/containers/storage/volumes/mynfs/_data: operation not permitted
Launch container again
$ sudo podman run -it --rm -v mynfs:/myvol alpine sh
/ #

Describe the results you received:
After creating the volume, the first attempt to launch a container fails. All subsequent attempts succeed.

Describe the results you expected:
All container launches succeed, including the first one.

Additional information you deem important (e.g. issue happens only occasionally):
As far as I can tell, this problem occurs consistently. I tried a different image (fedora) with the same result.
Same result for Podman 3.4.4 on Ubuntu 22.04 and Podman 4.0.3 on Fedora 35.

Output of podman version:

(ubuntu) $ podman version
Version:      3.4.4
API Version:  3.4.4
Go Version:   go1.17.3
Built:        Thu Jan  1 00:00:00 1970
OS/Arch:      linux/amd64
(fedora) $ podman version
Client:       Podman Engine
Version:      4.0.3
API Version:  4.0.3
Go Version:   go1.16.15
Built:        Sat Apr  2 03:21:14 2022
OS/Arch:      linux/amd64

Output of podman info --debug (Ubuntu only):

(ubuntu) $ podman info --debug
host:
  arch: amd64
  buildahVersion: 1.23.1
  cgroupControllers:
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: 'conmon: /usr/bin/conmon'
    path: /usr/bin/conmon
    version: 'conmon version 2.0.25, commit: unknown'
  cpus: 4
  distribution:
    codename: jammy
    distribution: ubuntu
    version: "22.04"
  eventLogger: journald
  hostname: moon
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.15.0-40-generic
  linkmode: dynamic
  logDriver: journald
  memFree: 2395836416
  memTotal: 3544547328
  ociRuntime:
    name: crun
    package: 'crun: /usr/bin/crun'
    path: /usr/bin/crun
    version: |-
      crun version 0.17
      commit: 0e9229ae34caaebcb86f1fde18de3acaf18c6d9a
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: 'slirp4netns: /usr/bin/slirp4netns'
    version: |-
      slirp4netns version 1.0.1
      commit: 6a7b16babc95b6a3056b33fb45b74a6f62262dd4
      libslirp: 4.6.1
  swapFree: 3544182784
  swapTotal: 3544182784
  uptime: 1h 1m 2.98s (Approximately 0.04 days)
plugins:
  log:
  - k8s-file
  - none
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  localhost:5000:
    Blocked: false
    Insecure: true
    Location: localhost:5000
    MirrorByDigestOnly: false
    Mirrors: null
    Prefix: localhost:5000
store:
  configFile: /home/nobleprog/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.ignore_chown_errors: "true"
  graphRoot: /home/nobleprog/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 1
  runRoot: /run/user/1000/containers
  volumePath: /home/nobleprog/.local/share/containers/storage/volumes
version:
  APIVersion: 3.4.4
  Built: 0
  BuiltTime: Thu Jan  1 00:00:00 1970
  GitCommit: ""
  GoVersion: go1.17.3
  OsArch: linux/amd64
  Version: 3.4.4

Package info (e.g. output of rpm -q podman or apt list podman) (Ubuntu only):

$ apt list podman
Listing... Done
podman/jammy,now 3.4.4+ds1-1ubuntu1 amd64 [installed]

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)

The most recent Podman version I tried is 4.0.3.

I did check the Troubleshooting guide.

Additional environment details (AWS, VirtualBox, physical, etc.):

My Ubuntu host is physical (HP Thin Client t630 running Ubuntu 22.04)
My Fedora 35 host runs on VirtualBox on Windows, bridged network.
My NFS server is physical (HP Thin Client t520 running Debian 11).

NFS server setup:

$ cat /etc/exports
/srv/nfs    *(rw)
$ ls -ld /srv/nfs
drwxrwxrwx 2 root root 4096 Jun 29 09:38 /srv/nfs

NFS client:
When the container launch succeeds, I see the NFS filesystem mounted as expected (same output on the Fedora and Ubuntu servers, except for the clientaddr obviously):

$ mount -t nfs4
192.168.1.16:/srv/nfs on /var/lib/containers/storage/volumes/mynfs/_data type nfs4 (rw,relatime,vers=4.2,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.1.60,local_lock=none,addr=192.168.1.16)


vrothberg · 2022-06-29T08:53:50Z

Thank you for reaching out, @berndbausch!

@giuseppe @mheon ideas?

rhatdan · 2022-06-29T10:59:07Z

lchown is not going to be allowed on the server side of an NFS connection. The server does not understand user namespaces, so from its point of view it sees berndbausch trying to chown a file to a different UID, and prevents it. The NFS server also does not respect namespaced capabilities like CAP_CHOWN.

I am not sure why Podman is attempting the chown. One potential reason would be:
In this case Podman is trying to chown the source volume to match the target volume in the image, which does not match the user inside of the container.

rhatdan · 2022-06-29T11:01:18Z

Could you strace podman to see what UIDs it is attempting to chown to? Perhaps this is Podman trying to chown to the current user when it does not need to, and NFS blocks that. I don't know why it works later. Maybe we only chown a new volume the first time it is used after creation.

giuseppe · 2022-06-29T12:19:48Z

yes, I think we chown only the first time a volume is created. I don't think there is anything we can do from Podman, should we move this issue to a discussion (or close it)?

rhatdan · 2022-06-29T12:30:36Z

The question I have is: are we chowning without first checking whether the file is already owned by the current user?
I.e., if I am on an NFS share and my UID is 3267, and I do
chown 3267:3267 nfs/dir
on a file which is already 3267:3267,
do we report this error because NFS gives us ENOTSUPP?

If yes then we could fix podman to stat the dir before we attempt to chown it, and only chown if necessary.
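
The stat-before-chown idea can be sketched roughly as follows. This is only an illustration in shell, not Podman's actual Go code: `chown_if_needed` is a hypothetical name, and `stat -c` assumes GNU coreutils (or BusyBox) on Linux.

```shell
# chown_if_needed PATH UID GID
# Hypothetical helper: change ownership only when it differs from what is wanted,
# so that a no-op chown is never sent to the NFS server.
chown_if_needed() {
    # Read the current numeric owner and group of PATH as "uid:gid".
    current=$(stat -c '%u:%g' "$1") || return 1

    # Ownership already matches: skip the chown entirely.
    if [ "$current" = "$2:$3" ]; then
        return 0
    fi

    # -h acts on a symlink itself rather than its target, like lchown.
    chown -h "$2:$3" "$1"
}
```

With rhatdan's example, `chown_if_needed nfs/dir 3267 3267` would return success without issuing any chown if `nfs/dir` is already 3267:3267. A real ownership change on the share can still be refused by the server.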

berndbausch · 2022-06-29T14:39:36Z

podman.zip

Here is the trace, generated with

strace -f -o podman.trace podman run -it --rm -v mynfs:/myvol fedora sh

It doesn't look like it contains the desired information (i.e. the user/group that Podman wants to set the volume to). Is there an strace option I should add?
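
One possible way to narrow the trace, offered as a sketch rather than a confirmed recipe: restrict strace to the chown family of syscalls and trace the same rootful (`sudo`) invocation that fails in the report, since the command above runs Podman without `sudo`. The function names `trace_chown` and `failed_chowns` are hypothetical, and the syscall list assumes an x86_64 host where all four syscalls exist.

```shell
# Trace only chown-family syscalls of the rootful invocation from the report.
# Assumption: x86_64, where chown, lchown, fchown and fchownat are all defined.
trace_chown() {
    sudo strace -f -e trace=chown,lchown,fchown,fchownat -o podman.trace \
        podman run -it --rm -v mynfs:/myvol fedora sh
}

# Print the chown-family calls in a trace file that returned an error (-1).
failed_chowns() {
    grep -E 'chown(at)?\(' "$1" | grep -F -- '= -1'
}
```

After running `trace_chown` on a freshly created volume, `failed_chowns podman.trace` should list the failing call together with the UID and GID arguments that were passed.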

NFS servers will throw an ENOTSUPP error if you attempt to chown a directory to the same UID and GID that the directory already has. If volumes are stored on NFS directories, this throws an ugly error and then works on the next try. Bottom line: don't chown directories that already have the correct UID and GID.

Fixes: containers#14766

[NO NEW TESTS NEEDED] Difficult to set up an NFS server in testing.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>

openshift-ci bot added the kind/bug Categorizes issue or PR as related to a bug. label Jun 29, 2022

rhatdan mentioned this issue Jun 29, 2022

Use SafeChown rather then chown for volumes on NFS #14777

Merged

openshift-ci bot closed this as completed in #14777 Jul 18, 2022

github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 20, 2023

github-actions bot locked as resolved and limited conversation to collaborators Sep 20, 2023
