Support running podman under a root v2 cgroup #14308

Merged 1 commit on May 25, 2022

n1hility
Member

@n1hility n1hility commented May 21, 2022

Handles the case where podman is launched under the root cgroup in cgroups v2.

Since the swap validation check requires memory controller limit files (not present on root), attempt to create a group to read from.
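
As an illustration of the root-cgroup case described above, here is a minimal sketch of how a process could tell that it sits directly in the root cgroup on a v2 host. It is not the code in this PR: the helper names `ownCgroupV2` and `isRootCgroup` are hypothetical, and it only assumes the standard `0::<path>` entry format of `/proc/self/cgroup` on the unified hierarchy.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"os"
	"strings"
)

// ownCgroupV2 (hypothetical helper) extracts the cgroup path from the
// contents of /proc/self/cgroup. On cgroups v2 the relevant entry has the
// form "0::<path>": hierarchy ID 0 and an empty controller list.
func ownCgroupV2(procSelfCgroup string) (string, error) {
	scanner := bufio.NewScanner(strings.NewReader(procSelfCgroup))
	for scanner.Scan() {
		// Each line is "hierarchy-ID:controller-list:path".
		parts := strings.SplitN(scanner.Text(), ":", 3)
		if len(parts) == 3 && parts[0] == "0" && parts[1] == "" {
			return parts[2], nil
		}
	}
	if err := scanner.Err(); err != nil {
		return "", err
	}
	return "", errors.New("no cgroup v2 entry found")
}

// isRootCgroup (hypothetical helper) reports whether the path names the
// root cgroup, which has no memory controller limit files.
func isRootCgroup(path string) bool {
	return path == "/"
}

func main() {
	data, err := os.ReadFile("/proc/self/cgroup")
	if err != nil {
		fmt.Println("could not read /proc/self/cgroup:", err)
		return
	}
	own, err := ownCgroupV2(string(data))
	if err != nil {
		fmt.Println("no v2 entry:", err)
		return
	}
	fmt.Printf("own cgroup: %q, root: %v\n", own, isRootCgroup(own))
}
```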

Fixes #14236

Fix memory limit failures when running under a root cgroup

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress and do-not-merge/release-note-label-needed labels May 21, 2022
@n1hility n1hility force-pushed the root-cgroup branch 2 times, most recently from 45d6068 to e88861a May 21, 2022 05:23
Signed-off-by: Jason T. Greene <jason.greene@redhat.com>
@n1hility n1hility changed the title [WIP] Support running podman under a root v2 cgroup Support running podman under a root v2 cgroup May 23, 2022
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress label May 23, 2022
@n1hility
Member Author

PTAL @giuseppe

@skepticoitusInteruptus

Schweet! You rock, @n1hility 🎸

So I'm guessing: Did the original implementation assume Podman was running as a child cgroup spawned by systemd?

Or something like that?

@mheon
Member

mheon commented May 23, 2022

@giuseppe PTAL

Member

@giuseppe giuseppe left a comment

would it be enough if we skip the check when running from the root cgroup?

if own == "/" {
	// If running under the root cgroup try to create or reuse a "probe" cgroup to read memory values
	own = "podman_probe"
	_ = os.MkdirAll(filepath.Join("/sys/fs/cgroup", own), 0o755)
Member

is it fine to leak the cgroup here?

Member Author

It seemed reasonable to me to leak it, since it's one global value for the system, the memory overhead is tiny, and it avoids the race of a cleanup/recreate. IIRC we leak in other cases (libpod_parent), and I assume for similar reasons. Additionally, in the case where it is used, there is likely no systemd running, which would otherwise have created many more cgroups than the single probe group used here. If it's a concern, we could overload libpod_parent by pre-creating it, but keeping it isolated eliminates any future conflict.

@n1hility
Member Author

> would it be enough if we skip the check when running from the root cgroup?

At first that's what I thought we should do, but looking into the history, #6365 seems to indicate that setups without swap accounting are still common. So if the check is skipped, a process running in the root cgroup could still fail. When it does fail, it will be confusing as to why, since it's not immediately clear that the process cgroup is relevant (since in truth it's not). Granted, we could add some debug logging to point it out. The cost of creating an ad hoc group seems small, so IMO it's worth the tradeoff for the added robustness. It could fail, but if it does, the container cgroup creation is also likely to fail.

@openshift-ci openshift-ci bot added release-note and removed do-not-merge/release-note-label-needed labels May 25, 2022
Member

@giuseppe giuseppe left a comment

LGTM

@openshift-ci
Contributor

openshift-ci bot commented May 25, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: giuseppe, n1hility

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label May 25, 2022
@rhatdan
Member

rhatdan commented May 25, 2022

/lgtm

Labels: approved, lgtm, locked - please file new issue/PR, release-note

Successfully merging this pull request may close these issues.

Podman's vs Docker's approach to container id inspection in cgroup v2 environments
6 participants