Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prometheus Receiver metric type fixes to match Prometheus functionality #3853

Closed

Conversation

gracewehner
Copy link

@gracewehner gracewehner commented Aug 18, 2021

Description:

  1. The prometheus receiver sets metrics without the # TYPE hint as UNSPECIFIED which converts to empty OTLP metrics: such as {"metrics":[{},{},{}]} in the OTLP json output and considers these metric invalid. Prometheus, however, will treat these metrics as a Gauge type and these metrics show up correctly in the Prometheus expression browser. With these changes, if the metadata type is not found, the type is set to textparse.MetricTypeUnknown so that in the function convToOCAMetricType(metadata.Type), the type is converted to metricspb.MetricDescriptor_GAUGE_DOUBLE.
  2. Issue Prometheus "Counter" metrics are skipped (sent from Python Prometheus client) #3557 was created since some counter metrics sent when using the python prometheus sdk were being considered invalid metrics. This is because in the Prometheus metadata lookup, their name excluded the _total, so their metadata couldn't be found. receiver/prometheus: add _total to list of trimmable metric name suffixes #3603 fixed the issue by checking the name without the _total suffix to lookup the metadata. However, the family name without the _total suffix is used as the metric name, so the metric python_gc_objects_collected_total is converted to the name python_gc_objects_collected whereas Prometheus still includes the _total suffix. This is because previously the metric family name only differed from metric names when the metric was a summary and histogram that have the metrics with _sum and _count, which are added back in later. There isn't this special logic for counters, so the family name needs to be set to the actual metric name when it is a counter and has the _total suffix.

Link to tracking Issue: #3852 #3557

Testing:
Tested out on a Kubernetes cluster with Prometheus scrape targets that have metrics without the # TYPE hint for (1) and metrics using the python sdk sending metrics such as below for testing (2):

python_gc_objects_collected_total
python_gc_objects_uncollectable_total 
python_gc_collections_total
process_cpu_seconds_total

Changed tests for metrics without the type hint to expect the gauge type.
Added test case for situation where metadata key excludes _total suffix but metric name should still have _total suffix

@gracewehner gracewehner requested a review from a team August 18, 2021 00:09
@@ -544,7 +544,7 @@ func Test_metricBuilder_untype(t *testing.T) {
{
MetricDescriptor: &metricspb.MetricDescriptor{
Name: "something_not_exists",
Type: metricspb.MetricDescriptor_UNSPECIFIED,
Type: metricspb.MetricDescriptor_GAUGE_DOUBLE,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't actually see where you set the type to GAUGE_DOUBLE anywhere in this PR. Is this a work in progress, or did you forget to commit some changes?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change is a bit hidden. This change sets the metadata.Type to textparse.MetricTypeUnknown which is already being done in line 66 here, just only in the situation where the metric name is not equal to the family name but the metadata still cannot be found:

metadata, ok := mc.Metadata(familyName)
if !ok && metricName != familyName {
// use the original metricName as metricFamily
familyName = metricName
// perform a 2nd lookup with the original metric name. it can happen if there's a metric which is not histogram
// or summary, but ends with one of those _count/_sum suffixes
metadata, ok = mc.Metadata(metricName)
// still not found, this can happen when metric has no TYPE HINT
if !ok {
metadata.Metric = familyName
metadata.Type = textparse.MetricTypeUnknown
}

The function convToOCAMetricType() is converting the textparse.MetricTypeUnknown to a GAUGE_DOUBLE, whereas before the change it was defaulting to UNSPECIFIED:
func convToOCAMetricType(metricType textparse.MetricType) metricspb.MetricDescriptor_Type {
switch metricType {
case textparse.MetricTypeCounter:
// always use float64, as it's the internal data type used in prometheus
return metricspb.MetricDescriptor_CUMULATIVE_DOUBLE
// textparse.MetricTypeUnknown is converted to gauge by default to fix Prometheus untyped metrics from being dropped
case textparse.MetricTypeGauge, textparse.MetricTypeUnknown:
return metricspb.MetricDescriptor_GAUGE_DOUBLE
case textparse.MetricTypeHistogram:
return metricspb.MetricDescriptor_CUMULATIVE_DISTRIBUTION
// dropping support for gaugehistogram for now until we have an official spec of its implementation
// a draft can be found in: https://docs.google.com/document/d/1KwV0mAXwwbvvifBvDKH_LU1YjyXE_wxCkHNoCGq1GX0/edit#heading=h.1cvzqd4ksd23
// case textparse.MetricTypeGaugeHistogram:
// return metricspb.MetricDescriptor_GAUGE_DISTRIBUTION
case textparse.MetricTypeSummary:
return metricspb.MetricDescriptor_SUMMARY
default:
// including: textparse.MetricTypeInfo, textparse.MetricTypeStateset
return metricspb.MetricDescriptor_UNSPECIFIED
}
}

@bogdandrutu
Copy link
Member

@gracewehner thanks a lot for the PR. We are in process of changing the use of OpenCensus in the receiver #3741 and would be good if we can fix this in the pdata version and avoid using opencensus dependencies.

@bogdandrutu
Copy link
Member

@gracewehner sorry, unfortunately this needs to be moved to contrib, since we moved the prometheusreceiver there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants