Wrong values for perf core events when starting the measurement #2629
Comments
Good point, I agree with your suggestion.
I would expect this code snippet to prevent the problem that you are describing:
    scalingRatio := 1.0
    if perfData.TimeEnabled != 0 {
        scalingRatio = float64(perfData.TimeRunning) / float64(perfData.TimeEnabled)
    }
    stat := info.PerfStat{
        Value:        uint64(float64(perfData.Value) / scalingRatio),
        Name:         name,
        ScalingRatio: scalingRatio,
        Cpu:          cpu,
    }
If
I would too. But with a simple test I noticed that 0.0/0.0 evaluates to NaN, and casting that result to uint64 returns a ridiculous number.
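For reference, a simple test along those lines might look like the sketch below. The helper name `scaledRaw` is made up for illustration; it repeats the arithmetic from the snippet above but stops before the uint64 conversion so the intermediate float can be inspected.

```go
package main

import (
	"fmt"
	"math"
)

// scaledRaw is a hypothetical helper that mirrors the scaling arithmetic
// from the snippet above and returns the float before it is cast to uint64.
func scaledRaw(value, timeRunning, timeEnabled uint64) float64 {
	scalingRatio := 1.0
	if timeEnabled != 0 {
		// TimeRunning == 0 with TimeEnabled != 0 makes the ratio 0.0.
		scalingRatio = float64(timeRunning) / float64(timeEnabled)
	}
	return float64(value) / scalingRatio
}

func main() {
	// The problematic reading: Value 0, Time Running 0, Time Enabled non-zero.
	raw := scaledRaw(0, 0, 1000)
	fmt.Println(math.IsNaN(raw)) // true

	// Converting a NaN to uint64 is implementation-dependent in Go,
	// so the printed number can differ between platforms.
	fmt.Println(uint64(raw))
}
```

The `TimeEnabled != 0` check only protects against a zero denominator in the ratio; it does not stop the ratio itself from being zero.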
Hi,
I noticed weird behavior of metrics when using cAdvisor with perf + Prometheus + Grafana. If I use more core perf events than the platform has counters (which triggers event scaling), I get spikes at the beginning of data collection for a given container. I did some investigation and found this function.
In some cases, the reading returns Value 0, Time Running 0 and Time Enabled non-zero. This leads to the function returning an undefined number, by which I mean some crazy high value that breaks all the stats. I looked into why this is happening. The only thing I was able to find is this part of the perf tool documentation, which says in the context of event scaling:
This led me to the conclusion that we may be facing an issue where, although the PID can be measured on a core, for some reason it is not, so we get a non-zero time enabled but no value or time running. This pretty much corrupts the metrics (rate charts have spikes that make them unreadable).
My suggestion is to check if the value returned is equal to 0, and if it is, return PerfStat with all fields set to 0, as if nothing happened.
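A rough sketch of what such a guard could look like is below. The types `perfReading` and `perfStat` and the function `scale` are simplified stand-ins made up for illustration, not the actual cAdvisor types. The sketch also checks Time Running rather than Value, which is a variation on the suggestion: a zero running time is what makes the scaling ratio zero.

```go
package main

import "fmt"

// perfReading is a hypothetical stand-in for the raw counter data.
type perfReading struct {
	Value       uint64
	TimeEnabled uint64
	TimeRunning uint64
}

// perfStat is a hypothetical, trimmed-down stand-in for info.PerfStat.
type perfStat struct {
	Value        uint64
	ScalingRatio float64
}

// scale applies event scaling, returning a zeroed stat when the event was
// enabled but never actually ran on a counter.
func scale(r perfReading) perfStat {
	scalingRatio := 1.0
	if r.TimeEnabled != 0 {
		if r.TimeRunning == 0 {
			// Nothing was measured: avoid dividing by a zero ratio.
			return perfStat{}
		}
		scalingRatio = float64(r.TimeRunning) / float64(r.TimeEnabled)
	}
	return perfStat{
		Value:        uint64(float64(r.Value) / scalingRatio),
		ScalingRatio: scalingRatio,
	}
}

func main() {
	// The problematic reading from this report.
	fmt.Printf("%+v\n", scale(perfReading{Value: 0, TimeEnabled: 1000, TimeRunning: 0}))
	// A reading that ran for half of the enabled time.
	fmt.Printf("%+v\n", scale(perfReading{Value: 10, TimeEnabled: 1000, TimeRunning: 500}))
}
```

Whether a zeroed stat or skipping the sample entirely is the better choice probably depends on how the exported metric is consumed.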