Measuring perf events - chapter I #2419
Hi @iwankgb. Thanks for your PR. I'm waiting for a google member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/ok-to-test |
@dashpole, I would appreciate it if you took a look at this PR. At the moment it consists of documentation and a proposed configuration API. Before proceeding with further changes, I want to know what you think about it and make sure that we can agree on a user-friendly implementation. Originally, we were planning to use We believe that configuring the perf subsystem in cAdvisor still makes sense, as this subsystem is not responsible for any resource allocation but purely for monitoring. We also assume that the same events will be measured for all the containers.
Whoops, missed your comment earlier... I had been ignoring it because of the WIP tag. I'll take a look today |
Yeah, I doubt it would be accepted in the pod spec. Do you know if perf events could be enabled using OCI hooks? Those seem to be the standard place to do container configuration.
I definitely agree that cAdvisor collecting these events would be useful. I have seen some pretty neat sidecar-container perf tools, so I know these metrics can be useful. I was initially surprised that we would want to measure the same events across all containers, as I assumed it would hurt application performance. But from some reading, it doesn't seem to be that bad... We haven't done a great job in the past with having this sort of configuration in cAdvisor. If there were a way to just have cAdvisor collect the metrics, and have something else do the configuration, that would fit more cleanly within our scope. But absent that, I think this is a reasonable feature to add. |
Sorry for the delayed reply, I had to take a few days off.
What do you think about providing two configuration schemas:
docs/runtime_options.md (outdated)
In the example above number:
* `INST_RETIRED.ANY_P` (number of instructions retired on 2nd Generation Intel® Xeon® Scalable Processor) identified by
Are events specific to processors? Are there any common ones that nearly everyone would be able to use?
Yes, events differ even between generations of Xeons. If you take AMD's Naples and Rome CPUs you will see similar differences, not to mention other architectures such as ARM.
I missed one of your questions: yes, some of the events (e.g. instructions retired, cycles) are supported across all or the majority of Linux architectures.
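To illustrate the difference between common and processor-specific events, here is a minimal Go sketch. The constants mirror values from the kernel's `linux/perf_event.h`; the `eventSpec` type, the `genericEvents` table and the `rawX86` helper are hypothetical names made up for this example and are not part of this PR. The raw encoding assumes the x86 layout (unit mask in bits 8-15, event select in bits 0-7).

```go
package main

import "fmt"

// Values copied from the Linux uapi header linux/perf_event.h.
const (
	perfTypeHardware = 0 // PERF_TYPE_HARDWARE: generic, portable events
	perfTypeRaw      = 4 // PERF_TYPE_RAW: CPU-specific event codes

	perfCountHWCPUCycles    = 0 // PERF_COUNT_HW_CPU_CYCLES
	perfCountHWInstructions = 1 // PERF_COUNT_HW_INSTRUCTIONS
)

// eventSpec is a hypothetical, simplified stand-in for the "type" and
// "config" fields of the kernel's perf_event_attr structure.
type eventSpec struct {
	Type   uint32
	Config uint64
}

// genericEvents lists events the kernel exposes under the same IDs on
// most architectures, so no per-CPU knowledge is needed to request them.
var genericEvents = map[string]eventSpec{
	"cycles":       {Type: perfTypeHardware, Config: perfCountHWCPUCycles},
	"instructions": {Type: perfTypeHardware, Config: perfCountHWInstructions},
}

// rawX86 builds a CPU-specific event from an event-select code and a
// unit mask, assuming the x86 layout: config = umask<<8 | event.
func rawX86(event, umask uint8) eventSpec {
	return eventSpec{Type: perfTypeRaw, Config: uint64(umask)<<8 | uint64(event)}
}

func main() {
	// Portable: the same request works on most Linux architectures.
	fmt.Printf("%+v\n", genericEvents["instructions"])
	// Processor-specific: INST_RETIRED.ANY_P is event select 0xC0 with
	// unit mask 0x00 on Intel CPUs; other vendors use different codes.
	fmt.Printf("%+v\n", rawX86(0xC0, 0x00))
}
```

A name-to-code translation layer such as libpfm would, roughly speaking, produce the raw values so that users can write event names instead of hexadecimal codes.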
Great! Once we have this in, it would be useful to have an example configmap in deploy/kubernetes as a follow-up, so it is easy for users to get started with the feature.
As long as we have good documentation and examples, I prefer having a single configuration option. I thought about it after suggesting it, and the file-descriptor API makes it pretty much impossible to implement cleanly. In general, this is probably the right direction to go. |
"blkio": {},
"io": {},
"devices": {},
"perf_event": {},
Does this exist and work similarly in cgroups v2?
It does exist and I believe it is similar, but I have not investigated it yet.
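For context, a rough Go sketch of where the perf-related cgroup directory would be looked up in each cgroup version. It assumes the common layout: a dedicated `perf_event` hierarchy under the cgroup mount point in v1, and the single unified hierarchy in v2 (where the kernel enables `perf_event` automatically). The helper name `perfCgroupDir` and the example paths are hypothetical; real mount points may differ.

```go
package main

import (
	"fmt"
	"path"
)

// perfCgroupDir returns the directory identifying a container's cgroup
// for perf monitoring. root is the cgroup mount point (assumed here to
// be /sys/fs/cgroup), unified selects the cgroup v2 layout, and cgroup
// is the container's cgroup name relative to the hierarchy root.
func perfCgroupDir(root string, unified bool, cgroup string) string {
	if unified {
		// cgroup v2: one hierarchy, no per-controller subdirectory.
		return path.Join(root, cgroup)
	}
	// cgroup v1: perf_event is mounted as its own hierarchy.
	return path.Join(root, "perf_event", cgroup)
}

func main() {
	fmt.Println(perfCgroupDir("/sys/fs/cgroup", false, "docker/abc"))
	fmt.Println(perfCgroupDir("/sys/fs/cgroup", true, "docker/abc"))
}
```

With `perf_event_open(2)` and the `PERF_FLAG_PID_CGROUP` flag, the `pid` argument is a file descriptor opened on such a directory, so only the path differs between the two layouts in this sketch.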
A quick note after chat with @dashpole on Slack:
Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@intel.com>
After this is rebased, I think we can merge and iterate from there. |
@dashpole rebased. |
LGTM
The remaining follow-ups I'm aware of are:
The purpose of this PR is to introduce support for measuring perf events to cAdvisor. It will eventually consist of the following:
libpfm to translate event names to `perf_event_attr`
This is part of the effort discussed in #2388