Measuring perf events - chapter I #2419

iwankgb · 2020-03-06T16:41:44Z

The purpose of this PR is to introduce support for measuring perf events in cAdvisor. It will eventually consist of the following:

Configuration primitives for describing which events are to be measured (user-friendly, using libpfm to translate event names to perf_event_attr)
Perf events collector (ungrouped core events support)
Perf events manager responsible for container lifecycle control (ungrouped core events support)

This is part of the effort discussed in #2388.

k8s-ci-robot · 2020-03-06T16:41:59Z

Hi @iwankgb. Thanks for your PR.

I'm waiting for a google member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

dashpole · 2020-03-06T17:48:42Z

/ok-to-test

iwankgb · 2020-03-13T11:29:20Z

@dashpole, I would appreciate it if you took a look at this PR. At the moment it consists of documentation and a proposed configuration API. Before proceeding with further changes, I want to know what you think about it and make sure that we can agree on a user-friendly implementation.

Originally, we were planning to use runc to handle perf subsystem but to do so we would have to update all the projects that interact with runc including projects that interact indirectly. For instance, if we updated runc configuration and OCI runtime spec then we would have to update containerd too and Kubernetes' CRI so that it is possible to pass perf configuration from PodSpec through CRI to containerd that would pass it to runc using OCI runtime spec.

We believe that configuring perf subsystem in cAdvisor still makes sense as this subsystem is not responsible for any resource allocation but purely for monitoring. We also assume that same events will be measured for all the containers.

dashpole · 2020-03-20T17:21:38Z

Whoops, missed your comment earlier... I had been ignoring it because of the WIP tag. I'll take a look today

dashpole · 2020-03-21T00:55:59Z

> Originally, we were planning to use runc to handle perf subsystem but to do so we would have to update all the projects that interact with runc including projects that interact indirectly. For instance, if we updated runc configuration and OCI runtime spec then we would have to update containerd too and Kubernetes' CRI so that it is possible to pass perf configuration from PodSpec through CRI to containerd that would pass it to runc using OCI runtime spec.

Yeah, I doubt it would be accepted in the pod spec. Do you know if perf events could be enabled using OCI hooks? Those seem to be the standard place to do container configuration.

> We believe that configuring perf subsystem in cAdvisor still makes sense as this subsystem is not responsible for any resource allocation but purely for monitoring. We also assume that same events will be measured for all the containers.

I definitely agree that cAdvisor collecting these events would be useful. I have seen some pretty neat sidecar-container perf tools, so I know these metrics can be useful. I was initially surprised that we would want to measure the same events across all containers, as I assumed it would hurt application performance. But from some reading, it doesn't seem to be that bad...

We haven't done a great job in the past with having this sort of configuration in cAdvisor. If there were a way to just have cAdvisor collect the metrics, and have something else do the configuration, that would fit more cleanly within our scope. But absent that, I think this is a reasonable feature to add.

iwankgb · 2020-03-26T09:45:24Z

Sorry for the delayed reply, I had to take a few days off.
@dashpole, I don't think it is practical (although it is technically possible) to handle perf event configuration in an OCI hook. To do so, one would have to call the perf_event_open syscall in the hook and then do one of the following:

pass the returned file descriptor to cAdvisor - doable but imperfect (only containers launched after cAdvisor could be monitored, or we would have to create a long-running hook process)
use the OCI hook to launch a sidecar container responsible for perf event handling - this is the scenario that we want to deprecate by adding perf support to cAdvisor.

I can't think of any other way of handling the configuration step outside of cAdvisor.

What do you think about providing two configuration schemas:

advanced - as already described in the PR
simple - limiting the number of events to the number of available counters, using only named events, with no grouping support?
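The constraints proposed for the "simple" schema can be sketched as a small validation step. This is only an illustration: `SimpleConfig`, `validateSimple` and the `availableCounters` parameter are hypothetical names that do not come from this PR, and how the number of hardware counters would be discovered is left out.

```go
package main

import (
	"errors"
	"fmt"
)

// SimpleConfig is a hypothetical shape for the "simple" schema:
// a flat list of named events, with no grouping.
type SimpleConfig struct {
	Events []string
}

// validateSimple enforces the two constraints mentioned in the comment:
// every event must be named, and there must be no more events than
// available counters. availableCounters is supplied by the caller.
func validateSimple(cfg SimpleConfig, availableCounters int) error {
	if len(cfg.Events) > availableCounters {
		return fmt.Errorf("%d events requested but only %d counters available",
			len(cfg.Events), availableCounters)
	}
	for _, name := range cfg.Events {
		if name == "" {
			return errors.New("event name must not be empty")
		}
	}
	return nil
}

func main() {
	cfg := SimpleConfig{Events: []string{"instructions", "cycles"}}
	fmt.Println(validateSimple(cfg, 4)) // fits within four counters
	fmt.Println(validateSimple(cfg, 1)) // more events than counters
}
```

Rejecting an oversized list up front would avoid counter multiplexing, which is presumably the reason for limiting the simple schema to the number of available counters.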

dashpole · 2020-03-26T17:25:12Z

docs/runtime_options.md

+```
+
+In the example above number:
+* `INST_RETIRED.ANY_P` (number of instructions retired on 2nd Generation Intel® Xeon® Scalable Processor) identified by


dashpole · 2020-03-26

Are events specific to processors? Are there any common ones that nearly everyone would be able to use?

iwankgb · 2020-03-26

Yes, events differ even between generations of Xeon processors. AMD's Naples and Rome CPUs show similar differences, not to mention other architectures such as ARM.

iwankgb · 2020-03-26

I missed one of your questions: yes, some of the events (e.g. instructions retired, cycles) are supported across all or the majority of Linux architectures.

dashpole · 2020-03-26

Great! Once we have this in, it would be useful to have an example configmap in deploy/kubernetes as a follow-up, so it is easy for users to get started with the feature.
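To make the distinction between portable and processor-specific events concrete, here is a minimal sketch of how a configuration file might separate the two. The JSON layout, the `Config` and `CustomEvent` types, and the field names are assumptions made for illustration and are not taken from this PR; the `config` value is a placeholder, while `type` 4 is the value of `PERF_TYPE_RAW` in the Linux perf ABI.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Config is a hypothetical layout: portable events are referenced by
// name, processor-specific ones are described as raw custom events.
type Config struct {
	Events       []string      `json:"events"`
	CustomEvents []CustomEvent `json:"custom_events"`
}

// CustomEvent mirrors the perf_event_attr fields needed to describe a
// raw event. The field names are illustrative.
type CustomEvent struct {
	Name   string `json:"name"`
	Type   uint32 `json:"type"`   // 4 is PERF_TYPE_RAW
	Config uint64 `json:"config"` // placeholder value in the example
}

const example = `{
  "events": ["instructions", "cycles"],
  "custom_events": [
    {"name": "vendor_specific_event", "type": 4, "config": 192}
  ]
}`

// parseConfig decodes the hypothetical JSON layout shown above.
func parseConfig(data []byte) (Config, error) {
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return Config{}, fmt.Errorf("parsing perf config: %w", err)
	}
	return cfg, nil
}

func main() {
	cfg, err := parseConfig([]byte(example))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("named events:", cfg.Events)
	fmt.Println("custom events:", len(cfg.CustomEvents))
}
```

A file along these lines is the kind of content an example configmap could carry, with the named events being the part most users could reuse unchanged.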

dashpole · 2020-03-26T17:46:09Z

As long as we have good documentation and examples, I prefer having a single configuration option.

I thought about it after suggesting it, and the file-descriptor API makes it pretty much impossible to implement cleanly.

In general, this is probably the right direction to go.

dashpole · 2020-03-26T17:46:46Z

container/libcontainer/helpers.go

+	"blkio":      {},
+	"io":         {},
+	"devices":    {},
+	"perf_event": {},


dashpole · 2020-03-26

Does this exist and work similarly in cgroups v2?

iwankgb · 2020-03-26

It does exist, and I believe it is similar, but I have not investigated it yet.

iwankgb · 2020-03-30T17:46:05Z

A quick note after a chat with @dashpole on Slack:

It's fine to use libpfm to make perf event configuration more user-friendly
It must be possible to build cAdvisor without cgo and/or libpfm
There must be a clear message (perhaps a failure to start cAdvisor) if it's built without cgo and/or libpfm support and the user tries to launch perf event monitoring.

dashpole · 2020-04-09T16:48:23Z

After this is rebased, I think we can merge and iterate from there.

iwankgb · 2020-04-09T16:49:59Z

@dashpole rebased.

dashpole

LGTM

dashpole · 2020-04-09T16:58:38Z

The remaining follow-ups I'm aware of are:

handle groups
example kubernetes configmap
cgroups v2

k8s-ci-robot added the needs-ok-to-test label Mar 6, 2020

k8s-ci-robot added ok-to-test and removed needs-ok-to-test labels Mar 6, 2020

iwankgb force-pushed the perf_events_support branch 2 times, most recently from fc92381 to d07e06c on March 12, 2020 10:53

iwankgb force-pushed the perf_events_support branch 3 times, most recently from 47b3ace to daf1da4 on March 17, 2020 10:19

iwankgb mentioned this pull request Mar 17, 2020

Moving Nvidia interfaces #2432

Merged

iwankgb force-pushed the perf_events_support branch 3 times, most recently from 8dd743e to 42600b5 on March 17, 2020 19:05

dashpole reviewed Mar 26, 2020

iwankgb force-pushed the perf_events_support branch 8 times, most recently from b16b9a4 to 45d4ec4 on April 3, 2020 17:25

Maciej "Iwan" Iwanowski added 19 commits April 9, 2020 18:41

Removing non-sense struct tag

a38779f

Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@intel.com>

Little fixes to doc strings

3f416b7

Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@intel.com>

Returning more error information from stats.Manager and stats.Collector

b60e720

Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@intel.com>

Moved to stats package

c4a66d9

Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@intel.com>

Noop implementations moved to noop.go

7471aba

Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@intel.com>

Using constants instead of magic numbers

7df5f2c

Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@intel.com>

Some structs have been moved to stats

3df55b5

Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@intel.com>

Bunch of various small changes, mainly originating in make presubmit

b7eaf69

Improving logging and error returning

08615ee

Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@intel.com>

Reverting some behaviour to be backward compatible

6e1344e

Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@intel.com>

Erroring on grouped events in perf.NewManager

c16a2a1

Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@intel.com>

Using noop collector by default

fb4f7cb

Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@intel.com>

Killing stats.Collector.Setup()

4a66ac7

Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@intel.com>

Killing stats.Manager.Setup()

4656af8

Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@intel.com>

Decreasing log message importance

2a6a961

Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@intel.com>

Destroying on failure

b7e2b2d

Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@intel.com>

Moving Event-to-CustomEvent mapping to collector

702d46f

Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@intel.com>

Fixed typo

76041c0

Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@intel.com>

Final logging tweaks

e1d8e05

Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@intel.com>

dashpole reviewed Apr 9, 2020

Adding missing configuration directive

8cd8b3b

Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@intel.com>

iwankgb force-pushed the perf_events_support branch from bbfaafd to e1d8e05 on April 9, 2020 16:49

Fixing log message priorities

05fcd96

Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@intel.com>

dashpole approved these changes Apr 9, 2020

dashpole merged commit e7efc0a into google:master Apr 9, 2020

iwankgb deleted the perf_events_support branch April 9, 2020 18:49

dashpole mentioned this pull request Nov 13, 2020

Don't fail permenantly when nvml isn't installed #2732

Merged
