Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libct/cg/stats: support PSI for cgroup v2 #3900

Merged
merged 1 commit into from
Jul 14, 2023

Conversation

kolyshkin
Copy link
Contributor

This is a carry over of #3679, adding support for PSI stats for cgroup v2 to cgroup manager' Stats(), as well as `runc events'. Mostly written by @dqminh, amended by @szuecs and @kolyshkin.

Closes: #3679
Closes: #3358
Closes: #3627

@kolyshkin
Copy link
Contributor Author

The full diff from PR #3679 is here: #3679 (comment)

@kolyshkin kolyshkin force-pushed the psi branch 2 times, most recently from 9b47353 to 22836b3 Compare June 13, 2023 16:54
@kolyshkin
Copy link
Contributor Author

time="2023-06-13T03:05:17Z" level=error msg="unable to get container cgroup stats: error while statting cgroup v2: [unable to parse /sys/fs/cgroup/system.slice/runc-test_busybox.scope/cpu.pressure: read /sys/fs/cgroup/system.slice/runc-test_busybox.scope/cpu.pressure: operation not supported unable to parse /sys/fs/cgroup/system.slice/runc-test_busybox.scope/memory.pressure: read /sys/fs/cgroup/system.slice/runc-test_busybox.scope/memory.pressure: operation not supported unable to parse /sys/fs/cgroup/system.slice/runc-test_busybox.scope/io.pressure: read /sys/fs/cgroup/system.slice/runc-test_busybox.scope/io.pressure: operation not supported]"

CentOS 9 kernel returns ENOTSUP on read() not open(). The fix is:

diff --git a/libcontainer/cgroups/fs2/psi.go b/libcontainer/cgroups/fs2/psi.go
index 603c6eaf..0d5f1d7a 100644
--- a/libcontainer/cgroups/fs2/psi.go
+++ b/libcontainer/cgroups/fs2/psi.go
@@ -16,12 +16,9 @@ import (
 func statPSI(dirPath string, file string) (*cgroups.PSIStats, error) {
        f, err := cgroups.OpenFile(dirPath, file, os.O_RDONLY)
        if err != nil {
-               if errors.Is(err, os.ErrNotExist) || errors.Is(err, unix.ENOTSUP) {
-                       // Open *.pressure file returns:
-                       // - ErrNotExist when kernel < 4.20 or when CONFIG_PSI is disabled;
-                       // - ENOTSUP when psi=1 kernel cmdline is required for PSI support.
-                       // In both cases, stats are not available, so return nil.
-                       err = nil
+               if errors.Is(err, os.ErrNotExist) {
+                       // Kernel < 4.20, or CONFIG_PSI is not set.
+                       return nil, nil
                }
                return nil, err
        }
@@ -46,6 +43,11 @@ func statPSI(dirPath string, file string) (*cgroups.PSIStats, error) {
                }
        }
        if err := sc.Err(); err != nil {
+               if errors.Is(err, unix.ENOTSUP) {
+                       // Some kernels (e.g. CS9) may return ENOTSUP on read
+                       // if psi=1 kernel cmdline parameter is required.
+                       return nil, nil
+               }
                return nil, &parseError{Path: dirPath, File: file, Err: err}
        }
        return &psistats, nil

@kolyshkin kolyshkin marked this pull request as ready for review June 13, 2023 20:41
@Dragoncell
Copy link

Thanks for the work. The change looks pretty good to me !

@szuecs
Copy link
Contributor

szuecs commented Jun 13, 2023

Thanks @kolyshkin !

We read output from the following files if they exists:
- cpu.pressure
- memory.pressure
- io.pressure

Each are in format:

```
some avg10=0.00 avg60=0.00 avg300=0.00 total=0
full avg10=0.00 avg60=0.00 avg300=0.00 total=0
```

Signed-off-by: Daniel Dao <dqminh89@gmail.com>
Co-authored-by: Sandor Szücs <sandor.szuecs@zalando.de>
Co-authored-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
@bobbypage
Copy link

Thanks all, and @kolyshkin for getting this into ready state! It looks good to me, would be great to get this in so we can make use of it it in cAdvisor, and later in k8s.

@kolyshkin
Copy link
Contributor Author

@opencontainers/runc-maintainers PTAL

1 similar comment
@kolyshkin
Copy link
Contributor Author

@opencontainers/runc-maintainers PTAL

Copy link
Contributor

@hqhq hqhq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@lifubang lifubang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -114,6 +114,17 @@ func (m *Manager) GetStats() (*cgroups.Stats, error) {
if err := statCpu(m.dirPath, st); err != nil && !os.IsNotExist(err) {
errs = append(errs, err)
}
// PSI (since kernel 4.20).
var err error
Copy link
Member

@lifubang lifubang Jul 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kolyshkin I think this one could be optimized. We should write their own PSI state function for cpu, memory and io. For example: add statCpuPressure in cpu.go, add statMemoryPressure in memory.go, add statIOPressure in io.go. Then, we can keep the code style like statCpu.
But this one could be done in a seperate PR if the upstream projects are very anxious to need this feature.

@lifubang
Copy link
Member

The code looks good now. Let's go ahead, and test it in practice, welcome contibutors/maintainers to do some refactor jobs if need.

@dqminh
Copy link
Contributor

dqminh commented Jul 14, 2023

Thanks everyone for pushing this one over the line !!! 🥳

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants