Add option for ignoring volumes defined in images #1504

lorenz · 2020-06-08T18:46:51Z

This adds a new config option for ignoring volumes defined in image metadata.

In Kubernetes, volumes (even temporary ones) are generally specified in the pod configuration. Mapping the ones from the image metadata can lead to resource accounting and exhaustion issues (think of writing an unlimited number of inodes or amount of data into the volume) and policy enforcement issues (a PSP cannot disable these).

For containers started with ReadOnlyRootFilesystem specifically, it can also lead to unexpected behavior. For example, if there is a typo in the volume mount path, the container will still start because the process can write to the data location, but the data ends up in the image-defined volume, which is ephemeral and is lost when the pod gets descheduled. The user expects the process in the container to fail, since no volume is mounted at the data path and the root filesystem is read-only, but because the image volume is mounted there, the writes succeed.

k8s-ci-robot · 2020-06-08T18:47:00Z

Hi @lorenz. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

mikebrow · 2020-06-09T00:56:40Z

/ok-to-test

mikebrow

see comment

we also have a config document..
https://github.com/containerd/cri/blob/master/docs/config.md

pkg/server/container_create.go

mikebrow · 2020-06-09T01:22:09Z

@lorenz sounds like this was a concrete issue you ran into. Can you provide link(s)?

lorenz · 2020-06-09T07:05:23Z

I've added debug log output for when we're ignoring volumes and documented the option in config.md. I looked at the commit for the last-added option (TolerateMissingHugePagesCgroupController) and added mine to all the files it touched, but that option wasn't added to config.md, which is why I missed it.

The main reason I've been running this patch for quite some time is the issue where incorrectly configured volumes cause data to be written into an ephemeral containerd-allocated volume, and misbehaving containers write lots of data into the containerd directory.

lorenz · 2020-06-09T07:24:19Z

/test pull-cri-containerd-verify

mikebrow · 2020-06-09T11:12:04Z

> I've added debug log output for when we're ignoring volumes and documented the option in config.md. I looked at the commit for the last-added option (TolerateMissingHugePagesCgroupController) and added mine to all the files it touched, but that option wasn't added to config.md, which is why I missed it.
>
> The main reason I've been running this patch for quite some time is the issue where incorrectly configured volumes cause data to be written into an ephemeral containerd-allocated volume, and misbehaving containers write lots of data into the containerd directory.

Yeah, I figured as much. That was my bad, of course: either I missed it, or I let it go in that PR review and flagged it as a docs issue to be worked on later. I can't remember which; it was a week ago and feels like months :-)

pkg/server/container_create.go

Signed-off-by: Lorenz Brun <lorenz@brun.one>

mikebrow

LGTM

fuweid

LGTM

@mikebrow Do we actually need the volume here? A user from eBay filed an issue about this before: https://github.com/containerd/containerd/issues/3511.

The volumes handled by the CRI plugin don't provide an interface for the user to manage them. The CRI API spec defines volumes coming from the kubelet, not from the image config. If the user uses devicemapper as the snapshotter with a disk quota, the size of the rootfs is bounded by the quota, but volumes from the image config become a leak point.

cc @wenlxie

mikebrow · 2020-06-10T03:49:22Z

> LGTM
>
> @mikebrow Do we actually need the volume here? A user from eBay filed an issue about this before: containerd/containerd#3511.
>
> The volumes handled by the CRI plugin don't provide an interface for the user to manage them. The CRI API spec defines volumes coming from the kubelet, not from the image config. If the user uses devicemapper as the snapshotter with a disk quota, the size of the rootfs is bounded by the quota, but volumes from the image config become a leak point.
>
> cc @wenlxie

Hmm. Reading the other issue (I re-assigned it to cri), it sounds like @Random-Liu already had some ideas for properly handling image-specified volumes (report disk usage and/or ask the kubelet to manage these volumes). Maybe we can recruit him for a discussion at a future sig-node call.

Either way I think this flag provides a good ignore option for now at least until we have a more elaborate solution to map and/or report usage for the image volumes.

fuweid · 2020-06-10T05:33:40Z

> Either way I think this flag provides a good ignore option for now at least until we have a more elaborate solution to map and/or report usage for the image volumes.

SGTM

lorenz · 2020-06-13T17:08:21Z

Is this waiting on something to get merged?

mikebrow · 2020-06-13T19:34:54Z

> Is this waiting on something to get merged?

Just giving maintainers a sufficient opportunity to weigh in.

This unbreaks bbolt (as part of containerd) on 1.14+ (see etcd-io/bbolt#201 and etcd-io/bbolt#220), pulls in my patch to ignore image-defined volumes (containerd/cri#1504), and gets us some robustness fixes in the containerd CNI/CRI integration (containerd/cri#1405). This also updates K8s at the same time, since the two share a lot of dependencies and updating only one is very annoying.

On the K8s side we mostly get the standard stream of fixes, plus some patches that are no longer necessary. One annoyance on the K8s side (with no impact on functionality) is these messages in the logs of various components:

```
W0714 11:51:26.323590 1 warnings.go:67] policy/v1beta1 PodSecurityPolicy is deprecated in v1.22+, unavailable in v1.25+
```

They are caused by KEP-1635, but there's no explanation of why this gets logged so aggressively, considering that operators cannot do anything about it. There's no newer version of PodSecurityPolicy, and you are pretty much required to use it if you use RBAC.

Test Plan: Covered by existing tests

Bug: T753

X-Origin-Diff: phab/D597

GitOrigin-RevId: f6c447da1de037c27646f9ec9f45ebd5d6660ab0

k8s-ci-robot added the needs-ok-to-test label Jun 8, 2020

k8s-ci-robot added the size/S label Jun 8, 2020

k8s-ci-robot added ok-to-test and removed needs-ok-to-test labels Jun 9, 2020

mikebrow requested review from dmcgowan and Random-Liu June 9, 2020 01:07

mikebrow reviewed Jun 9, 2020

pkg/server/container_create.go (resolved)

lorenz force-pushed the ignore-image-defined-volumes branch from ff69933 to 398fb1e Compare June 9, 2020 06:39

mikebrow reviewed Jun 9, 2020

pkg/server/container_create.go (outdated, resolved)

Add option for ignoring volumes defined in images

5a1d49b

Signed-off-by: Lorenz Brun <lorenz@brun.one>

lorenz force-pushed the ignore-image-defined-volumes branch from 398fb1e to 5a1d49b Compare June 9, 2020 19:02

mikebrow approved these changes Jun 9, 2020

fuweid approved these changes Jun 10, 2020

mikebrow merged commit b661ad7 into containerd:master Jun 14, 2020
