Add cpu_speed (in mhz) to cpu or system measurement #4256

randallt · 2018-06-08T19:05:41Z

Add cpu_speed to cpu or system measurement

We are currently in the process of converting from Ganglia to Telegraf. (yeah!) Unfortunately, we have some existing dependence on the Ganglia cpu_speed metric. This is not found in Telegraf.

Proposal:

Add a cpu_speed field or equivalent to the cpu or system measurement. This would be in MHz.

Use case:

This helps mostly in the capacity management area, when mapping cpu mhz of an application group that is targeted for migration to new hosts. We can get the cpu speed other ways, of course, but having it directly and natively in Telegraf would be optimal.

danielnelson · 2018-06-08T20:51:30Z

A quick look suggests that we could use the cpu.Info() function from gopsutil to pull in some additional cpu fields:

type InfoStat struct {
	CPU        int32    `json:"cpu"`
	VendorID   string   `json:"vendorId"`
	Family     string   `json:"family"`
	Model      string   `json:"model"`
	Stepping   int32    `json:"stepping"`
	PhysicalID string   `json:"physicalId"`
	CoreID     string   `json:"coreId"`
	Cores      int32    `json:"cores"`
	ModelName  string   `json:"modelName"`
	Mhz        float64  `json:"mhz"`
	CacheSize  int32    `json:"cacheSize"`
	Flags      []string `json:"flags"`
	Microcode  string   `json:"microcode"`
}

This reads and parses /proc/cpuinfo on Linux.

phemmer · 2018-06-08T23:12:17Z

Well it depends on what we're really looking for here. Are we wanting maximum speed, or current speed? What about the max or min limits?

randallt · 2018-06-11T13:52:08Z

I was looking for just the "CPU MHz" field from the Linux command 'lscpu', which appears to be the same as the "cpu MHz" field of each CPU core from 'cat /proc/cpuinfo'. This doesn't change for me and matches the CPU description, like "Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz". I'm on a VMware infrastructure, though.

I was not looking for instantaneous frequency or even max boost frequency, just the frequency that corresponds to the CPU description--which when combined with the CPU core count can help give some comparative sense of capacity among VMs and environments.

danielnelson · 2018-06-12T01:34:53Z

This field is the instantaneous frequency of the processor, but there are also min and max values. Here is what it looks like on my laptop:

CPU MHz:             499.877
CPU max MHz:         3400.0000
CPU min MHz:         400.0000

Even though min/max don't change, I can see the usefulness across a fleet of systems of collecting them. I think the main thing we should decide is if we want collecting this data to be opt-in or if it is light enough we should just add it. I think we can just add these 3 fields in as part of the standard fields collected by the cpu plugin since it should be a fairly light amount of extra load.

phemmer · 2018-06-12T01:49:26Z

What about the limits? Limits might be useful on embedded (or other) systems which adjust the limits to conserve power.
Dunno if gopsutil provides them all in one spot, but they can all be obtained from /sys/devices/system/cpu/cpu*/cpufreq/

Basically all the fields and their relationships with each other are:
cpuinfo_min_freq <= scaling_min_freq <= scaling_cur_freq <= scaling_max_freq <= cpuinfo_max_freq
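The chain above can be sketched in Go as follows. This is an illustrative reading of the five sysfs files, not gopsutil or Telegraf code; the helper names readCPUFreq and firstViolation are hypothetical. The cpufreq sysfs files report values in kHz, and a running system may not satisfy the chain at every instant, so the check is reported rather than enforced.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// The five cpufreq files, listed in the order of the inequality chain.
var chain = []string{
	"cpuinfo_min_freq", "scaling_min_freq", "scaling_cur_freq",
	"scaling_max_freq", "cpuinfo_max_freq",
}

// readCPUFreq reads the chain files from one cpufreq directory (values in kHz).
// Hypothetical helper for illustration.
func readCPUFreq(dir string) (map[string]uint64, error) {
	vals := make(map[string]uint64, len(chain))
	for _, name := range chain {
		b, err := os.ReadFile(filepath.Join(dir, name))
		if err != nil {
			return nil, err
		}
		v, err := strconv.ParseUint(strings.TrimSpace(string(b)), 10, 64)
		if err != nil {
			return nil, fmt.Errorf("%s: %w", name, err)
		}
		vals[name] = v
	}
	return vals, nil
}

// firstViolation returns the first adjacent pair that breaks the chain,
// or an empty string if the values are ordered as expected.
func firstViolation(vals map[string]uint64) string {
	for i := 1; i < len(chain); i++ {
		if vals[chain[i-1]] > vals[chain[i]] {
			return chain[i-1] + " > " + chain[i]
		}
	}
	return ""
}

func main() {
	dirs, _ := filepath.Glob("/sys/devices/system/cpu/cpu[0-9]*/cpufreq")
	for _, dir := range dirs {
		vals, err := readCPUFreq(dir)
		if err != nil {
			fmt.Fprintln(os.Stderr, "skipping", dir, ":", err)
			continue
		}
		fmt.Println(dir, vals, "violation:", firstViolation(vals))
	}
}
```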

randallt · 2018-06-12T12:01:15Z

This feature request should probably also be reconciled with this PR:
#4215

dewi-ny-je · 2020-03-02T22:57:41Z

I tried in the past to read lscpu and to input the data into InfluxDB using the exec input plugin.

The values were always higher than what I would get by running the same command from the command line because by the time telegraf gets to run the plugin, the CPU or kernel already increased the frequency.

I would say that the plugin makes little sense, unless it is proven to provide reliable values.

I'm running an E3-1220 v2 on Ubuntu 18.04.

jose-d · 2020-07-15T10:40:31Z

> I tried in the past to read lscpu and to input the data into InfluxDB using the exec input plugin.
> The values were always higher than what I would get by running the same command from the command line because by the time telegraf gets to run the plugin, the CPU or kernel already increased the frequency.
>
> I would say that the plugin makes little sense, unless it is proven to provide reliable values.

I see what you mean. In use cases like mine (HPC machines with XX cores), one could assume that the noise introduced by Telegraf itself affects just a few (?) cores (?). Anyway, I'm going to write an exec() collector for /sys/devices/system/cpu/cpuXXX/cpufreq/cpuinfo_cur_freq and keep it running for some weeks on a few compute nodes to see the real-life results.

jose-d · 2020-07-15T12:45:59Z

Here is a proof-of-concept-grade collector meant to be used as an exec input in Telegraf:

https://github.com/jose-d/telegraf-collectors/blob/master/cpufreq-monitor/give_stats.py

In the end I collect the data from /sys/devices/system/cpu/cpuNN/cpufreq/scaling_cur_freq, as it is readable by a non-root user (on CentOS 7).

Screenshot from Grafana (image not preserved):

(it's actually showing the reason why this monitoring is useful for me - detecting suboptimal usage of CPU resources by $users )
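For readers who want the general shape of such an exec collector without following the link, a minimal Go sketch is below. It is not a port of the linked Python script: the measurement name "cpufreq", the field name "scaling_cur_khz", and the helper formatLine are hypothetical choices made for this example. It prints one InfluxDB line-protocol record per CPU, which the exec input can parse when configured with the influx data format.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// formatLine renders one sample in InfluxDB line protocol.
// "cpufreq" and "scaling_cur_khz" are hypothetical names chosen for this sketch.
func formatLine(cpu string, khz uint64) string {
	return fmt.Sprintf("cpufreq,cpu=%s scaling_cur_khz=%di", cpu, khz)
}

func main() {
	paths, _ := filepath.Glob("/sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_cur_freq")
	for _, p := range paths {
		b, err := os.ReadFile(p)
		if err != nil {
			continue // unreadable for this user or CPU offline; skip it
		}
		khz, err := strconv.ParseUint(strings.TrimSpace(string(b)), 10, 64)
		if err != nil {
			continue
		}
		// p looks like .../cpu/cpu12/cpufreq/scaling_cur_freq; take "cpu12".
		cpu := filepath.Base(filepath.Dir(filepath.Dir(p)))
		fmt.Println(formatLine(cpu, khz))
	}
}
```

Each run samples the frequency at a single instant, so the caveat raised earlier in the thread about the collector itself disturbing the reading still applies.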

quentinmit · 2022-06-19T05:43:02Z

Just looking at scaling_cur_freq or cpuinfo_cur_freq is going to cause a whole bunch of aliasing because the frequency normally changes much more often than the Telegraf update interval. It would be better to collect stats/time_in_state which gives a cumulative counter of time spent at each state, which could correctly show you if half the time is spent at one frequency and half at another.

danielnelson added the feature request and area/system labels Jun 8, 2018

This was referenced May 7, 2019

CPU Arch Support as tag in CPU plugin #5806

Closed

CPU Arch Support in Telegraf #5805

Closed

danielnelson mentioned this issue Jun 3, 2019

pull metrics for physical cpu but not cores #5949

Closed

sjwang90 mentioned this issue Mar 5, 2021

Input Plugin for CPU Frequency (Linux) #8342

Closed


fabianishere mentioned this issue Mar 15, 2021

feat(inputs.linux_cpu): Add CPUFreq input plugin for Linux (v3) #8988

Merged


reimda closed this as completed in #8988 Aug 24, 2022
