Fix OOM watcher for cgroupv2 oom_kill events #1054

Merged 1 commit on Jan 31, 2023

saschagrunert

What type of PR is this?

/kind bug

What this PR does / why we need it:

The container process may already have been signaled while it is still running. In that case the file watcher delivers an other event rather than a modify event. Besides that, the oom entry of memory.events is not updated in this situation, but the oom_kill entry is, which provides another indicator of a possible out-of-memory kill.

Ref: https://www.kernel.org/doc/Documentation/cgroup-v2.txt
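As an illustration of the check described above, here is a minimal Rust sketch that reads both the oom and oom_kill counters from the contents of a memory.events file. It is not the code from this PR: the names `OomCounters`, `parse_memory_events`, and `indicates_oom` are hypothetical, and treating any non-zero counter as an OOM indication is an assumption made for the example.

```rust
/// Counters of interest from a cgroup v2 `memory.events` file.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Default, PartialEq)]
struct OomCounters {
    oom: u64,
    oom_kill: u64,
}

/// Parse the flat "key value" lines of `memory.events`,
/// ignoring keys other than `oom` and `oom_kill`.
fn parse_memory_events(content: &str) -> OomCounters {
    let mut counters = OomCounters::default();
    for line in content.lines() {
        let mut parts = line.split_whitespace();
        let key = parts.next();
        let value = parts.next().and_then(|v| v.parse::<u64>().ok());
        match (key, value) {
            (Some("oom"), Some(v)) => counters.oom = v,
            (Some("oom_kill"), Some(v)) => counters.oom_kill = v,
            _ => {} // unrelated key or malformed line
        }
    }
    counters
}

/// Assumption: either counter being non-zero counts as an OOM indication.
fn indicates_oom(counters: &OomCounters) -> bool {
    counters.oom > 0 || counters.oom_kill > 0
}
```

For a file containing `oom 0` and `oom_kill 1`, `indicates_oom` returns true even though the oom counter is still zero, which is the edge case this PR targets.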

Which issue(s) this PR fixes:

Addresses cri-o/cri-o#6580 for the conmon-rs usage

Special notes for your reviewer:

None

Does this PR introduce a user-facing change?

Fixed OOM detection for cgroup v2 edge cases where the container has already been signaled but has not yet been fully OOM killed.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
openshift-ci bot commented Jan 31, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: saschagrunert

@rphillips

/lgtm
