Rootful container, NFS volume: lchown operation not permitted #14766

Closed
berndbausch opened this issue Jun 29, 2022 · 6 comments · Fixed by #14777
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.

Comments

@berndbausch

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description
After setting up an NFS-based volume, I try to launch a container. This fails with an error "lchown: ... operation not permitted". When I repeat the container launch command, it succeeds.

Steps to reproduce the issue:

  1. Create an NFS-based volume
    $ sudo podman volume create mynfs --driver local --opt type=nfs --opt o=rw --opt device=192.168.1.16:/srv/nfs

  2. Launch container
    $ sudo podman run -it --rm -v mynfs:/myvol alpine sh
    Error: lchown /var/lib/containers/storage/volumes/mynfs/_data: operation not permitted

  3. Launch container again
    $ sudo podman run -it --rm -v mynfs:/myvol alpine sh
    / #

Describe the results you received:
After creating the volume, the first attempt to launch a container fails. All subsequent attempts succeed.

Describe the results you expected:
All container launches succeed, including the first one.

Additional information you deem important (e.g. issue happens only occasionally):
As far as I can tell, this problem occurs consistently. I tried a different image (fedora) with the same result.
Same result for Podman 3.4.4 on Ubuntu 22.04 and Podman 4.0.3 on Fedora 35.

Output of podman version:

(ubuntu) $ podman version
Version:      3.4.4
API Version:  3.4.4
Go Version:   go1.17.3
Built:        Thu Jan  1 00:00:00 1970
OS/Arch:      linux/amd64
(fedora) $ podman version
Client:       Podman Engine
Version:      4.0.3
API Version:  4.0.3
Go Version:   go1.16.15
Built:        Sat Apr  2 03:21:14 2022
OS/Arch:      linux/amd64

Output of podman info --debug (Ubuntu only):

(ubuntu) $ podman info --debug
host:
  arch: amd64
  buildahVersion: 1.23.1
  cgroupControllers:
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: 'conmon: /usr/bin/conmon'
    path: /usr/bin/conmon
    version: 'conmon version 2.0.25, commit: unknown'
  cpus: 4
  distribution:
    codename: jammy
    distribution: ubuntu
    version: "22.04"
  eventLogger: journald
  hostname: moon
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.15.0-40-generic
  linkmode: dynamic
  logDriver: journald
  memFree: 2395836416
  memTotal: 3544547328
  ociRuntime:
    name: crun
    package: 'crun: /usr/bin/crun'
    path: /usr/bin/crun
    version: |-
      crun version 0.17
      commit: 0e9229ae34caaebcb86f1fde18de3acaf18c6d9a
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: 'slirp4netns: /usr/bin/slirp4netns'
    version: |-
      slirp4netns version 1.0.1
      commit: 6a7b16babc95b6a3056b33fb45b74a6f62262dd4
      libslirp: 4.6.1
  swapFree: 3544182784
  swapTotal: 3544182784
  uptime: 1h 1m 2.98s (Approximately 0.04 days)
plugins:
  log:
  - k8s-file
  - none
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  localhost:5000:
    Blocked: false
    Insecure: true
    Location: localhost:5000
    MirrorByDigestOnly: false
    Mirrors: null
    Prefix: localhost:5000
store:
  configFile: /home/nobleprog/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.ignore_chown_errors: "true"
  graphRoot: /home/nobleprog/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 1
  runRoot: /run/user/1000/containers
  volumePath: /home/nobleprog/.local/share/containers/storage/volumes
version:
  APIVersion: 3.4.4
  Built: 0
  BuiltTime: Thu Jan  1 00:00:00 1970
  GitCommit: ""
  GoVersion: go1.17.3
  OsArch: linux/amd64
  Version: 3.4.4

Package info (e.g. output of rpm -q podman or apt list podman) (Ubuntu only):

$ apt list podman
Listing... Done
podman/jammy,now 3.4.4+ds1-1ubuntu1 amd64 [installed]

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)

The most recent Podman version I tried is 4.0.3.

I did check the Troubleshooting guide.

Additional environment details (AWS, VirtualBox, physical, etc.):

My Ubuntu host is physical (HP Thin Client t630 running Ubuntu 22.04)
My Fedora 35 host runs on VirtualBox on Windows, bridged network.
My NFS server is physical (HP Thin Client t520 running Debian 11).

NFS server setup:

$ cat /etc/exports
/srv/nfs    *(rw)
$ ls -ld /srv/nfs
drwxrwxrwx 2 root root 4096 Jun 29 09:38 /srv/nfs

NFS client:
When the container launch succeeds, I see the NFS filesystem mounted as expected (same output on the Fedora and Ubuntu servers, except for the clientaddr obviously):

$ mount -t nfs4
192.168.1.16:/srv/nfs on /var/lib/containers/storage/volumes/mynfs/_data type nfs4 (rw,relatime,vers=4.2,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.1.60,local_lock=none,addr=192.168.1.16)
@openshift-ci openshift-ci bot added the kind/bug Categorizes issue or PR as related to a bug. label Jun 29, 2022
@vrothberg
Member

Thank you for reaching out, @berndbausch!

@giuseppe @mheon ideas?

@rhatdan
Member

rhatdan commented Jun 29, 2022

lchown is not going to be allowed on the server side of an NFS connection. The server does not understand user namespaces, so from its point of view it sees berndbausch trying to chown a file to a different UID, and prevents it. The NFS server also does not respect namespaced capabilities such as CAP_CHOWN.

I am not sure why Podman is attempting the chown. One potential reason would be:
Podman is trying to chown the source volume to match the target volume in the image, which does not match the user inside of the container.

@rhatdan
Member

rhatdan commented Jun 29, 2022

Could you strace podman to see what UIDs it is attempting to chown to? Perhaps Podman is trying to chown to the current user when it does not need to, and NFS blocks that. I don't know why it works later. Maybe we only chown a new volume the first time it is used after creation.

@giuseppe
Member

Yes, I think we chown a volume only the first time it is used after creation. I don't think there is anything we can do from Podman. Should we move this issue to a discussion (or close it)?
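
As an illustration of the "chown only on first use" behavior discussed here, the following is a minimal sketch of how a one-time flag could produce the reported symptom (first launch fails, later launches succeed). It is not Podman's code: the `volume` type, the `needsChown` field, and the `prepare` method are hypothetical names, and the assumption that the flag is cleared before the chown is attempted is exactly that, an assumption made so the sketch matches the observed behavior.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotPermitted stands in for the error an NFS server might return.
var errNotPermitted = errors.New("lchown: operation not permitted")

// volume is a hypothetical stand-in for a volume's persisted state.
type volume struct {
	needsChown bool // true until the first container uses the volume
}

// prepare runs the one-time ownership fix-up. In this sketch the flag is
// cleared before chown is attempted (an assumption), so a failed chown is
// never retried on later launches.
func (v *volume) prepare(chown func() error) error {
	if !v.needsChown {
		return nil // already handled on an earlier launch
	}
	v.needsChown = false
	return chown()
}

func main() {
	v := &volume{needsChown: true}
	failing := func() error { return errNotPermitted }

	fmt.Println("first launch: ", v.prepare(failing))
	fmt.Println("second launch:", v.prepare(failing))
}
```

With a chown that always fails, the first call returns the error and the second returns nil without calling chown at all, which mirrors the sequence in the reproduction steps.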

@rhatdan
Member

rhatdan commented Jun 29, 2022

The question I have is: are we chowning without first checking whether the file is already owned by the current user?
I.e., if I am on an NFS share and my UID is 3267, and I run
chown 3267:3267 nfs/dir
on a file which is already 3267:3267,
do we report this error because NFS gives us ENOTSUPP?

If yes, then we could fix Podman to stat the dir before we attempt to chown it, and only chown if necessary.
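
The stat-before-chown idea can be sketched roughly as follows. This is a minimal illustration, not the patch that was merged: `ownerOf` and `chownIfNeeded` are hypothetical helper names, and the code assumes a Linux host where `os.FileInfo.Sys()` yields a `*syscall.Stat_t`.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// ownerOf returns the UID and GID of path without following symlinks.
func ownerOf(path string) (uid, gid int, err error) {
	info, err := os.Lstat(path)
	if err != nil {
		return 0, 0, err
	}
	st, ok := info.Sys().(*syscall.Stat_t)
	if !ok {
		return 0, 0, fmt.Errorf("no raw stat data for %s", path)
	}
	return int(st.Uid), int(st.Gid), nil
}

// chownIfNeeded (hypothetical helper) calls lchown only when the current
// owner differs from the requested one, so a no-op chown never reaches
// the NFS server. It reports whether ownership was actually changed.
func chownIfNeeded(path string, uid, gid int) (changed bool, err error) {
	curUID, curGID, err := ownerOf(path)
	if err != nil {
		return false, err
	}
	if curUID == uid && curGID == gid {
		return false, nil // already correct, skip the syscall
	}
	if err := os.Lchown(path, uid, gid); err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	dir, err := os.MkdirTemp("", "vol")
	if err != nil {
		fmt.Println("mkdir:", err)
		return
	}
	defer os.RemoveAll(dir)

	// Request the owner the directory already has: no chown is issued.
	uid, gid, err := ownerOf(dir)
	if err != nil {
		fmt.Println("stat:", err)
		return
	}
	changed, err := chownIfNeeded(dir, uid, gid)
	fmt.Println("changed:", changed, "err:", err)
}
```

When the requested owner already matches, the helper returns without issuing `lchown`, so the server never gets the chance to reject a request that would have changed nothing.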

@berndbausch
Author

berndbausch commented Jun 29, 2022

podman.zip

Here is the trace, generated with

strace -f -o podman.trace podman run -it --rm -v mynfs:/myvol fedora sh

It doesn't look like it contains the desired information (i.e. the user/group that Podman wants to set the volume to). Is there an strace option I should add?

rhatdan added a commit to rhatdan/podman that referenced this issue Jun 29, 2022
NFS servers will throw an ENOTSUPP error if you attempt to
chown a directory to the same UID and GID as the directory
already has. If volumes are stored on NFS directories this
throws an ugly error and then works on the next try.

Bottom line: don't chown directories that already have the correct
UID and GID.

Fixes: containers#14766

[NO NEW TESTS NEEDED] Difficult to set up an NFS server in testing.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
rhatdan added a commit to rhatdan/podman that referenced this issue Jul 12, 2022
mheon pushed a commit to mheon/libpod that referenced this issue Jul 26, 2022
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 20, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 20, 2023