Implement the rootless-cni-infra container imageless #8910

Luap99 · 2021-01-07T20:26:08Z

As proposed by Akihiro Suda, make the rootless-cni-infra container use
the host rootfs instead of an image. This works by mounting the host
rootfs in the user namespace at $runroot/rootless-cni-infra
and using this as the rootfs for the container.

Second, rewrite the rootless-cni-infra shell script in Go to remove the
extra cnitool dependency, which is not packaged anywhere. With that we
only need the same dependencies as rootful podman, which should
already be installed.

Advantages:

- Works for all architectures podman supports.
- Works without an internet connection.
- No maintenance of an extra image.

Disadvantages:

- Requires the dependencies to be available on the host (e.g. dnsname
  plugin). The user may not have control over those.

Problems:

- It doesn't unmount the rootfs if the rootless-cni-infra container
  is stopped directly.

Also, the image version did not respect the --cni-config-dir option
properly. It mounted the cni config dir only at container create time,
but this option can be used on podman run commands, which did not
work if the rootless-cni-infra container was already running.
Honoring the option on each command is only possible with the rootfs
version.

Live upgrading is possible. If the old infra container is still
running podman talks via the old api to the script. Once the
old infra container is deleted the new imageless infra container
will be created and podman can talk via the new api. A version
label is added to the container to distinguish between old and new.
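
A minimal sketch of how the version label could drive the choice between the old and the new API. The label key, its value, and all type and function names here are hypothetical and chosen for illustration only; the description above only states that a version label exists.

```go
package main

import "fmt"

// Hypothetical label key and value. The PR description only says that
// "a version label" is added; the real names may differ.
const (
	versionLabelKey = "rootless-cni-infra-version" // assumed name
	currentVersion  = "2"                          // assumed value
)

// infraAPI identifies which protocol podman should use to talk to an
// already existing infra container.
type infraAPI int

const (
	apiLegacyScript infraAPI = iota // old image-based container running the shell script
	apiImageless                    // new rootfs-based container
)

// pickAPI inspects the labels of an existing infra container. A container
// without the expected version label is treated as the old image-based one.
func pickAPI(labels map[string]string) infraAPI {
	if labels[versionLabelKey] == currentVersion {
		return apiImageless
	}
	return apiLegacyScript
}

func main() {
	oldCtr := map[string]string{}
	newCtr := map[string]string{versionLabelKey: currentVersion}
	fmt.Println(pickAPI(oldCtr) == apiLegacyScript)
	fmt.Println(pickAPI(newCtr) == apiImageless)
}
```

Defaulting to the legacy API when the label is missing matches the upgrade path described above: an old container keeps working until it is deleted and recreated.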

Fixes #8709

Signed-off-by: Paul Holzinger <paul.holzinger@web.de>

openshift-ci-robot · 2021-01-07T20:26:11Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Luap99

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [Luap99]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Luap99 · 2021-01-08T10:44:19Z

@rhatdan I need your selinux knowledge here. On fedora the rootless-cni-infra script is failing with the following error:

failed to list chains: running [/usr/sbin/iptables -t nat -S --wait]: exit status 4: Fatal: can't open lock file /run/xtables.lock: Permission denied

This access is blocked by selinux. I see this AVC:

type=AVC msg=audit(1610099754.089:998): avc:  denied  { write } for  pid=6975 comm="iptables" name="/" dev="tmpfs" ino=1 scontext=unconfined_u:system_r:iptables_t:s0 tcontext=system_u:object_r:container_file_t:s0:c664,c817 tclass=dir permissive=0

The rootless-cni-infra container is created with selinux disabled so I don't understand why selinux is blocking access. /run is mounted as tmpfs into the container.

On the host the file has the following label:

$ sudo ls -lZ /run | grep xtables
-rw-------.  1 root    root    system_u:object_r:iptables_var_run_t:s0          0  8. Jan 10:51 xtables.lock

Interestingly enough, it works on my test VM, which has the fedora server spin installed and not workstation.
They ship with different iptables backends: fedora workstation has iptables v1.8.5 (legacy) and fedora server has iptables v1.8.5 (nf_tables). Both are fedora 33.

rhatdan · 2021-01-09T11:57:25Z

The issue is the iptables command is attempting to create content inside of a directory labeled container_file_t. Iptables command is confined, and not allowed to write container content.
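
To make the denial easier to read, the source and target contexts can be pulled out of the AVC record. The following is only an illustrative sketch, not part of the PR; the function name `avcFields` is made up, and the parser is deliberately naive (it does not handle quoted values that contain spaces).

```go
package main

import (
	"fmt"
	"strings"
)

// avcFields splits an audit AVC record into its key=value pairs.
// Tokens without "=" (such as "denied" or "{") are skipped, and
// surrounding double quotes are stripped from values.
func avcFields(record string) map[string]string {
	fields := make(map[string]string)
	for _, tok := range strings.Fields(record) {
		i := strings.IndexByte(tok, '=')
		if i <= 0 {
			continue
		}
		fields[tok[:i]] = strings.Trim(tok[i+1:], `"`)
	}
	return fields
}

func main() {
	record := `type=AVC msg=audit(1610099754.089:998): avc:  denied  { write } for  pid=6975 comm="iptables" name="/" dev="tmpfs" ino=1 scontext=unconfined_u:system_r:iptables_t:s0 tcontext=system_u:object_r:container_file_t:s0:c664,c817 tclass=dir permissive=0`
	f := avcFields(record)
	fmt.Println("source:", f["scontext"]) // the confined iptables process
	fmt.Println("target:", f["tcontext"]) // the directory labeled as container content
	fmt.Println("class: ", f["tclass"])
}
```

Read this way, the record says that a process in the iptables_t domain tried to write to a directory labeled container_file_t, which is the combination described above.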

rhatdan · 2021-01-09T11:59:49Z

Is the /run within the container? I.e., this is not the /run on the host?

Luap99 · 2021-01-09T12:27:44Z

libpod/rootless_cni_linux.go

+	run := spec.Mount{
+		Destination: "/run",
+		Type:        "tmpfs",
+		Source:      "none",
+		Options:     []string{"rw", "nosuid", "nodev"},
+	}
+	g.AddMount(run)


/run is mounted as tmpfs into the container

Luap99 · 2021-01-09T12:32:13Z

libpod/rootless_cni_linux.go

+		WithCtrNamespace(rootlessCNIInfraContainerNamespace),
+		WithName(rootlessCNIInfraContainerName),
+		WithPrivileged(true),
+		WithSecLabels([]string{"disable"}),


selinux should be disabled for this container so I am not sure why selinux is blocking the access to this file

The problem is that the iptables command is confined. The question is who launched it, and why did it transition?

rhatdan · 2021-01-10T10:03:29Z

Hopefully I will have time early next week to attempt this and figure out what is going on.

AkihiroSuda · 2021-01-13T04:04:01Z

libpod/rootless_cni_linux.go

+		return err
+	}
+	// bind mount the rootfs in the userns read only
+	if err := mount.Mount("/", rootfs, "bind", "rbind,rprivate,ro"); err != nil {


It should be noted in comment lines that "rbind,ro" is just a "best effort" to make the tree read-only, as "ro" is not recursive.

Good point. Is it possible to mount everything as readonly?

Probably possible with fuse-overlayfs. We could also parse /proc/self/mountinfo and bind-mount each of the entries with ro, but that seems too complicated.

Kernel patch for recursive read-only is here (unmerged): https://lore.kernel.org/linux-fsdevel/20210112220124.837960-7-christian.brauner@ubuntu.com/T/#u

Would be great to get that in.
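
For reference, the "parse /proc/self/mountinfo" idea mentioned above could start from something like the sketch below. It only collects the mount points that would each need their own read-only bind mount; it does not perform any mounts. The function name `mountPoints` is hypothetical, and octal escapes in paths (e.g. `\040` for a space) are not decoded.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strings"
)

// mountPoints extracts the mount point (the fifth whitespace-separated
// field) from each line of mountinfo-formatted text. The result is sorted
// so that a parent mount always precedes the mounts below it.
func mountPoints(mountinfo string) ([]string, error) {
	var points []string
	sc := bufio.NewScanner(strings.NewReader(mountinfo))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		fields := strings.Fields(line)
		if len(fields) < 5 {
			return nil, fmt.Errorf("malformed mountinfo line: %q", line)
		}
		points = append(points, fields[4])
	}
	if err := sc.Err(); err != nil {
		return nil, err
	}
	sort.Strings(points)
	return points, nil
}

func main() {
	// Sample data; a real caller would read /proc/self/mountinfo instead.
	sample := "22 28 0:21 / /sys rw,nosuid shared:2 - sysfs sysfs rw\n" +
		"28 1 8:1 / / rw,relatime shared:1 - ext4 /dev/sda1 rw\n" +
		"30 28 0:23 / /run rw,nosuid shared:4 - tmpfs tmpfs rw\n"
	points, err := mountPoints(sample)
	if err != nil {
		panic(err)
	}
	for _, p := range points {
		// Each entry would need its own bind mount followed by a
		// read-only remount under the new rootfs.
		fmt.Println(p)
	}
}
```

Even with the list in hand, each entry needs a bind mount plus a read-only remount, and mounts can change in between, which is part of why this approach was called too complicated.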

Luap99 · 2021-02-04T14:37:35Z

I reworked this to support live migration from a previous version. If the old infra container is still running podman talks via the old api to the script. Once the old infra container is deleted the new imageless infra container will be created and podman can talk via the new api. A version label is added to the container to distinguish between old and new.

Luap99 · 2021-02-04T15:30:58Z

@rhatdan Have you looked at the selinux issue?

For now sudo alternatives --set iptables /usr/sbin/iptables-nft would work for fedora but this is a breaking change for users.

Luap99 · 2021-02-14T16:18:19Z

@rhatdan
I tried to look at the selinux issue and found the problem. Look at the following:

# selinux with container image
$ podman run alpine cat /proc/self/attr/current 
system_u:system_r:container_t:s0:c280,c796
$ podman run --security-opt label=disable alpine cat /proc/self/attr/current 
unconfined_u:system_r:spc_t:s0

# selinux with rootfs
$ mkdir -p ~/rootfs
$ podman unshare mount --rbind -r / ~/rootfs/
$ podman unshare mount -t tmpfs none ~/rootfs/run
$ podman run --rootfs ~/rootfs cat /proc/self/attr/current 
system_u:system_r:container_t:s0:c22,c289
$ podman run --security-opt label=disable --rootfs ~/rootfs cat /proc/self/attr/current 
unconfined_u:system_r:container_runtime_t:s0
# why container_runtime_t and not spc_t?

# set selinux labels manually
# spc_t fails
$ podman run --security-opt "label=user:unconfined_u" --security-opt "label=role:system_r" --security-opt "label=type:spc_t" --security-opt "label=level:s0" --rootfs ~/rootfs  cat /proc/self/attr/current 
{"msg":"exec container process `/usr/bin/cat`: Permission denied","level":"error","time":"2021-02-14T16:08:22.000844579Z"}
# unconfined_t works
$ podman run --security-opt "label=user:unconfined_u" --security-opt "label=role:system_r" --security-opt "label=type:unconfined_t" --security-opt "label=level:s0" --rootfs ~/rootfs  cat /proc/self/attr/current 
unconfined_u:system_r:unconfined_t:s0

I don't know if this is a bug or expected. Either way when I set unconfined_u:system_r:unconfined_t for the cni container it works.

rhatdan · 2021-02-15T15:48:28Z

Do you have AVC Messages?

rhatdan · 2021-02-15T15:51:39Z

ls -lZd ~/rootfs/

There are several issues here. I am thinking that we could grab the "level" label off of the rootfs and set that to run the container with so that we run as container_t, that might be the best solution for this. Check to make sure that rootfs is labeled container_file_t:MCS and then use the MCS.
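
A rough sketch of that idea, assuming the label string of the rootfs has already been read (for example from the `security.selinux` xattr): check that the type is container_file_t and return the level so it can be reused for the container process. The function name `levelForRootfs` is hypothetical and this is not code from the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// levelForRootfs takes an SELinux file label of the form
// user:role:type:level and returns the level (including any MCS
// categories) if and only if the type is container_file_t.
func levelForRootfs(label string) (string, error) {
	// The level itself may contain ":" (e.g. "s0:c664,c817"),
	// so split into at most four parts.
	parts := strings.SplitN(label, ":", 4)
	if len(parts) != 4 {
		return "", fmt.Errorf("unexpected SELinux label %q", label)
	}
	if parts[2] != "container_file_t" {
		return "", fmt.Errorf("rootfs type is %q, not container_file_t", parts[2])
	}
	return parts[3], nil
}

func main() {
	level, err := levelForRootfs("system_u:object_r:container_file_t:s0:c664,c817")
	if err != nil {
		panic(err)
	}
	// Illustrative only: build a container_t process label with the same level.
	fmt.Println("system_u:system_r:container_t:" + level)
}
```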

rhatdan · 2021-02-15T15:57:46Z

The reason you are seeing the difference on spc_t versus container_runtime_t is based on the label of ~/rootfs.

The reason spc_t is failing is that there is an entrypoint rule on spc_t, so only certain labels are allowed to "transition" to spc_t.

The label of the cat command is probably not allowed as an entrypoint to spc_t.

Similarly, we have transition rules that say when podman running as container_runtime_t runs certain executables, it will transition to spc_t. The label of cat within the rootfs does not have a transition rule, so the transition does not happen and the process continues to run as container_runtime_t.

Luap99 · 2021-02-18T14:34:58Z

OK, I don't think I can get this to work. There is a locking issue somewhere. While thinking about it, I considered not using a container at all and just creating a netns in the podman user namespace. I have opened #9423 to implement this instead. I think this is better in the long run.

baude · 2021-02-18T15:29:12Z

agree

openshift-ci-robot added the do-not-merge/work-in-progress label Jan 7, 2021

openshift-ci-robot added the approved label Jan 7, 2021

Luap99 force-pushed the rootless-cni-infra branch 3 times, most recently from 8bde21a to e26ffb8 January 7, 2021 22:00

Luap99 commented Jan 9, 2021

This was referenced Jan 9, 2021

"rootless containers and pods cannot be assigned static IP addresses" (podman-run, rootless, CNI) #7842

Closed

--network-alias doesn't seem to work #8567

Closed

Luap99 mentioned this pull request Jan 11, 2021

Add support for rootless network-aliases and static ip/mac #8585

Merged

Luap99 force-pushed the rootless-cni-infra branch from e26ffb8 to 6ee7da0 January 11, 2021 18:25

AkihiroSuda reviewed Jan 13, 2021

contrib/rootless-cni-infra/Readme.md (outdated, resolved)

AkihiroSuda reviewed Jan 13, 2021

libpod/rootless_cni_linux.go (outdated, resolved)

AkihiroSuda reviewed Jan 13, 2021

pkg/rootless/cni/rootless_cni.go (outdated, resolved)

Luap99 force-pushed the rootless-cni-infra branch 2 times, most recently from 16ed86f to 468f86c January 13, 2021 12:37

AkihiroSuda reviewed Jan 14, 2021

libpod/rootless_cni_linux.go (outdated, resolved)

openshift-ci-robot added the needs-rebase label Jan 18, 2021

Luap99 force-pushed the rootless-cni-infra branch from 468f86c to bc1f8ce February 4, 2021 14:25

openshift-ci-robot removed the needs-rebase label Feb 4, 2021

Luap99 force-pushed the rootless-cni-infra branch from bc1f8ce to 5fbe9b6 February 4, 2021 14:37

Luap99 mentioned this pull request Feb 4, 2021

[RFE]Make docker-compose work with rootless podman #9169

Closed

Luap99 force-pushed the rootless-cni-infra branch from 5fbe9b6 to 3571c2f February 4, 2021 15:27

Luap99 force-pushed the rootless-cni-infra branch 2 times, most recently from 9f105e2 to 19d65ff February 14, 2021 16:25

Luap99 force-pushed the rootless-cni-infra branch from 19d65ff to ec8b78b February 15, 2021 16:03

edsantiago mentioned this pull request Feb 15, 2021

flake: completion test: ioctl(TUNSETIFF): Operation not permitted #9086

Closed

Luap99 force-pushed the rootless-cni-infra branch 3 times, most recently from 133d8d6 to ecebe3f February 16, 2021 19:47

Luap99 force-pushed the rootless-cni-infra branch from ecebe3f to 73393fc February 16, 2021 22:16

Luap99 closed this Feb 18, 2021

github-actions bot added the locked - please file new issue/PR label Sep 23, 2023

github-actions bot locked as resolved and limited conversation to collaborators Sep 23, 2023
