Add cpu_speed (in mhz) to cpu or system measurement #4256
Comments
A quick look suggests that we could use the
type InfoStat struct {
    CPU        int32    `json:"cpu"`
    VendorID   string   `json:"vendorId"`
    Family     string   `json:"family"`
    Model      string   `json:"model"`
    Stepping   int32    `json:"stepping"`
    PhysicalID string   `json:"physicalId"`
    CoreID     string   `json:"coreId"`
    Cores      int32    `json:"cores"`
    ModelName  string   `json:"modelName"`
    Mhz        float64  `json:"mhz"`
    CacheSize  int32    `json:"cacheSize"`
    Flags      []string `json:"flags"`
    Microcode  string   `json:"microcode"`
}
This reads and parses |
Well it depends on what we're really looking for here. Are we wanting maximum speed, or current speed? What about the max or min limits? |
I was looking for just the "CPU MHz" field from the Linux command 'lscpu', which appears to be the same as the "cpu MHz" field of each CPU core from 'cat /proc/cpuinfo'. This doesn't change for me and matches the CPU description, like "Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz". I'm on a VMware infrastructure, though. I was not looking for instantaneous frequency or even max boost frequency, just the frequency that corresponds to the CPU description, which, when combined with the CPU core count, can help give some comparative sense of capacity among VMs and environments. |
This field is the instantaneous frequency of the processor, but there are also min and max values; here they are on my laptop:
Even though min/max don't change, I can see the usefulness across a fleet of systems of collecting them. I think the main thing we should decide is if we want collecting this data to be opt-in or if it is light enough we should just add it. I think we can just add these 3 fields in as part of the standard fields collected by the cpu plugin since it should be a fairly light amount of extra load. |
What about the limits? Limits might be useful on embedded (or other) systems which adjust the limits to conserve power. Basically all the fields and their relationships with each other are: |
This feature request should probably also be reconciled with this PR: |
I tried in the past to read lscpu and to input the data into InfluxDB using the exec input plugin. The values were always higher than what I would get by running the same command from the command line, because by the time Telegraf gets to run the plugin, the CPU or kernel has already increased the frequency. I would say that the plugin makes little sense unless it is proven to provide reliable values. I'm running an E3-1220 v2 on Ubuntu 18.04. |
I see what you mean. In use cases like mine (an HPC machine with XX cores), one could assume that the noise introduced by Telegraf itself affects just a few (?) cores. Anyway, going to write some exec() collection of |
Here is a proof-of-concept-grade collector meant to be used as an exec input in Telegraf: https://github.com/jose-d/telegraf-collectors/blob/master/cpufreq-monitor/give_stats.py At the end I collect the data from
screenshot from Grafana: (it's actually showing the reason why this monitoring is useful for me - detecting suboptimal usage of CPU resources by $users ) |
Just looking at |
Add cpu_speed to cpu or system measurement
We are currently in the process of converting from Ganglia to Telegraf. (yeah!) Unfortunately, we have some existing dependence on the Ganglia cpu_speed metric. This is not found in Telegraf.
Proposal:
Add a cpu_speed field or equivalent to the cpu or system measurement. This would be in MHz.
Use case:
This helps mostly in the capacity management area, when mapping the CPU MHz of an application group that is targeted for migration to new hosts. We can get the CPU speed other ways, of course, but having it directly and natively in Telegraf would be optimal.