[receiver/prometheusreceiver] Fix staleness issue for histograms and summaries #8561

Merged: 3 commits, Mar 23, 2022

gracewehner
Contributor

Description: Fixes a bug where the staleness NaN cannot be sent for histogram buckets and summary quantiles. The buckets and quantiles are not sent with the datapoint when it is marked stale, so components like the PRW exporter cannot mark these series as stale. The actual values of these buckets and quantiles would be the stale NaN, but this change sends the value as 0 instead because these need to be type uint64.

Link to tracking Issue: #8492

Testing: Tested with prometheus receiver -> OTLP exporter. Added test cases for otlp_metricfamily.go similar to the ones that already exist for histogram and quantiles, but instead with stale NaN values.

Member

Aneurysm9 left a comment

Please add a CHANGELOG.md entry. LGTM conceptually, just a couple minor questions/suggestions.

receiver/prometheusreceiver/internal/otlp_metricfamily.go: two outdated review threads (resolved)
```go
	}

	point.SetExplicitBounds(bounds)
	point.SetBucketCounts(bucketCounts)
```
Contributor

Do we even need to set bucket counts? IIUC, we just needed the bounds to be able to manufacture stale bucket series with NaN values later.

Contributor Author

That's true, we don't necessarily need the bucket counts to set the NaN values later. Dropping them would also require a change in the PRW exporter, at the line `if index >= len(pt.BucketCounts()) {`:

```go
for index, bound := range pt.ExplicitBounds() {
	if index >= len(pt.BucketCounts()) {
		break
	}
	cumulativeCount += pt.BucketCounts()[index]
	bucket := &prompb.Sample{
		Value:     float64(cumulativeCount),
		Timestamp: time,
	}
	if pt.Flags().HasFlag(pdata.MetricDataPointFlagNoRecordedValue) {
		bucket.Value = math.Float64frombits(value.StaleNaN)
	}
```

Contributor

It can also be done as a follow-up. I wouldn't be surprised if many exporters don't handle that case correctly.

4 participants