Check that cgroup is empty before deleting #228

Merged on Oct 17, 2022 (3 commits)

jseba (Contributor) commented Apr 11, 2022

The kernel will block an attempt to rmdir a cgroup path that still has running processes in it. Since the removal code uses the standard `os.RemoveAll` function, the Go runtime will helpfully fall back to an `rm -rf`-like removal algorithm if the cheap `unlink`/`rmdir` attempts fail.

Since cgroup files cannot be removed, the caller ends up with an unhelpful "unlinkat /sys/fs/cgroup/.../cgroup.events: operation not permitted" error message that doesn't actually give any actionable information because the original error was swallowed by the runtime.

This changes the `Delete` functions to detect cgroups with still running processes and return an error indicating that removal cannot be done.

Signed-off-by: Josh Seba <sebajosh@outlook.com>

jseba added 2 commits April 11, 2022 15:18

    return err
}
if len(processes) > 0 {
    return fmt.Errorf("cgroups: unable to remove path %q: still contains running processes", c.path)
Member

Is it possible to have a similar message on cgroup.go?

Contributor Author

The v1 Delete just accumulates a list of subsystems that couldn't be removed and doesn't report the underlying error. Would making the append more like

errs = append(errs, fmt.Sprintf("%s (contains running processes)", string(s.Name())))

be a solution? Then the resulting error would be akin to

cgroups: unable to remove paths memory (contains running processes), cpu (contains running processes)
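
As a small illustration of how the accumulated entries would turn into that message, here is a sketch under stated assumptions. `busy` and `removalError` are hypothetical helper names, and joining the entries with `", "` is an assumption inferred from the example message above rather than a quote of the library code.

```go
package main

import (
	"fmt"
	"strings"
)

// busy (hypothetical helper) formats the entry for a subsystem that
// still has running processes.
func busy(name string) string {
	return fmt.Sprintf("%s (contains running processes)", name)
}

// removalError (hypothetical helper) folds the accumulated entries into
// a single error, or returns nil when nothing failed.
func removalError(failed []string) error {
	if len(failed) == 0 {
		return nil
	}
	return fmt.Errorf("cgroups: unable to remove paths %s", strings.Join(failed, ", "))
}

func main() {
	var errs []string
	for _, name := range []string{"memory", "cpu"} {
		errs = append(errs, busy(name))
	}
	// Prints: cgroups: unable to remove paths memory (contains running processes), cpu (contains running processes)
	fmt.Println(removalError(errs))
}
```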

Member

sounds good to me!

Contributor Author

Apologies for not getting back to this sooner. I added the "contains running processes" reason to the errors list in cgroup.go.

Signed-off-by: Josh Seba <sebajosh@outlook.com>

jseba commented Jun 21, 2022

If someone gets a chance to take a look at this again, I'd appreciate it! It bubbled up to the top of our ticket tracker to remind me to check on it. If there are any blockers, let me know!

@@ -247,6 +256,7 @@ func (c *cgroup) Delete() error {
if err := remove(path); err != nil {
    errs = append(errs, path)
}
continue
Member

Is this `continue` necessary? I don't see how it changes the loop flow, since this is the end of the `for` body and the loop will naturally proceed through the rest of the subsystems.

Contributor Author

It's not strictly necessary, no. I added it because, from my reading of the code's structure here, it works like a switch case where only one branch should be valid, so if there were ever a third case (very unlikely since cgroup v1 is now legacy code), this would be correct for that situation.

Looking at it a bit more, I think it would be better if those two `if` blocks were restructured to better represent the intent that only one interface is expected to be matched. What do you think of something more like this? Would that be more readable?

switch s.(type) {
case deleter:
    ...
case pather:
    ...
}

Member

`continue` looks okay to me, to be honest. A `switch` could work, but it couldn't express the `if len(procs) > 0 {` check, which is another `continue` case.
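
For readers following the thread, the loop shape under discussion can be sketched like this. Everything here is a simplified, hypothetical stand-in: the `subsystem`, `deleter`, and `pather` interfaces, the fake implementations, the `procs` map, and the `/demo` and `/fake/` paths are assumptions for illustration, not the library's definitions. The point is only that each guard ends with `continue`, so at most one branch runs per subsystem, and the process check sits alongside the two interface checks.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Hypothetical, simplified stand-ins for the interfaces named in the review.
type subsystem interface{ Name() string }

type deleter interface {
	subsystem
	Delete(path string) error
}

type pather interface {
	subsystem
	Path(path string) string
}

type fakeDeleter struct {
	name string
	fail bool
}

func (f fakeDeleter) Name() string { return f.name }
func (f fakeDeleter) Delete(string) error {
	if f.fail {
		return errors.New("delete failed")
	}
	return nil
}

type fakePather struct{ name string }

func (f fakePather) Name() string         { return f.name }
func (f fakePather) Path(p string) string { return "/fake/" + f.name + p }

// deleteAll follows the guard-then-continue shape: process check first,
// then the deleter branch, then the pather branch.
func deleteAll(subs []subsystem, procs map[string]int, remove func(string) error) error {
	var errs []string
	for _, s := range subs {
		if procs[s.Name()] > 0 {
			errs = append(errs, fmt.Sprintf("%s (contains running processes)", s.Name()))
			continue
		}
		if d, ok := s.(deleter); ok {
			if err := d.Delete("/demo"); err != nil {
				errs = append(errs, s.Name())
			}
			continue
		}
		if p, ok := s.(pather); ok {
			path := p.Path("/demo")
			if err := remove(path); err != nil {
				errs = append(errs, path)
			}
			continue
		}
	}
	if len(errs) > 0 {
		return fmt.Errorf("cgroups: unable to remove paths %s", strings.Join(errs, ", "))
	}
	return nil
}

func main() {
	subs := []subsystem{fakeDeleter{name: "memory"}, fakePather{name: "cpu"}}
	procs := map[string]int{"memory": 2} // pretend "memory" still has 2 processes
	fmt.Println(deleteAll(subs, procs, func(string) error { return nil }))
}
```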

Member

kzys commented Oct 17, 2022

Confirmed with @estesp offline that this is not a blocker.
