Prometheus "Counter" metrics are skipped (sent from Python Prometheus client) #3557

Closed
nayasam opened this issue Jul 2, 2021 · 8 comments
Labels
bug Something isn't working

nayasam commented Jul 2, 2021

Describe the bug
We observed that Prometheus "Counter" metrics exposed by the Python prometheus_client API are skipped by the OpenTelemetry collector when the PrometheusReceiver is configured to scrape them from an HTTP endpoint. The issue does not occur when the metrics are exposed by the C# Prometheus client.

With the Python prometheus_client API, Counter metrics have a _total suffix appended (this is not the case with the C# Prometheus client API). It appears that this _total suffix causes the Counter metrics to be skipped in the OpenTelemetry collector.

Checking the Python prometheus_client code (https://github.com/prometheus/client_python/blob/master/prometheus_client/registry.py#L58-L70), we see that the _total suffix is appended to Counter metric names, the _sum and _count suffixes are appended to Summary metric names, and the _bucket suffix is appended to Histogram metric names.
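The naming scheme described above can be sketched in Go as follows. This is only an illustration: `suffixesByType` and `seriesNames` are hypothetical names, and the table holds just the suffixes mentioned in this report rather than the client's full list.

```go
package main

import "fmt"

// suffixesByType is a hypothetical table holding only the suffixes named in
// this report. It is not the Python client's complete table.
var suffixesByType = map[string][]string{
	"counter":   {"_total"},
	"summary":   {"_sum", "_count"},
	"histogram": {"_bucket"},
}

// seriesNames returns the suffixed series names for a metric family of the
// given type. Types missing from the table yield the family name unchanged.
func seriesNames(family, metricType string) []string {
	suffixes, ok := suffixesByType[metricType]
	if !ok {
		return []string{family}
	}
	names := make([]string, 0, len(suffixes))
	for _, suffix := range suffixes {
		names = append(names, family+suffix)
	}
	return names
}

func main() {
	fmt.Println(seriesNames("http_requests", "counter"))
	fmt.Println(seriesNames("request_latency", "summary"))
}
```

The point is that a Counter registered as `http_requests` is scraped under the name `http_requests_total`, so the receiver has to map the suffixed name back to the family.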

In the OpenTelemetry collector code (version 0.28.0), the suffixes appended to Summary and Histogram metric names are trimmed, but the _total suffix appended to Counter metric names is not:
https://github.com/open-telemetry/opentelemetry-collector/blob/main/receiver/prometheusreceiver/internal/metricsbuilder.go#L33-L46
https://github.com/open-telemetry/opentelemetry-collector/blob/main/receiver/prometheusreceiver/internal/metricsbuilder.go#L215-L222

However, when I update metricsbuilder.go in the OpenTelemetry collector as follows, so that the _total suffix of Counters is also trimmed (see the lines marked "added"), the Prometheus Counter metrics are properly exported by the PrometheusExporter of the collector and sent to the Prometheus backend.

    const (
        metricsSuffixCount      = "_count"
        metricsSuffixBucket     = "_bucket"
        metricsSuffixSum        = "_sum"
        startTimeMetricName     = "process_start_time_seconds"
        scrapeUpMetricName      = "up"
        metricsSuffixTotalCount = "_total" // added
    )

    var (
        // added: metricsSuffixTotalCount at the end of the list
        trimmableSuffixes     = []string{metricsSuffixBucket, metricsSuffixCount, metricsSuffixSum, metricsSuffixTotalCount}
        errNoDataToBuild      = errors.New("there's no data to build")
        errNoBoundaryLabel    = errors.New("given metricType has no BucketLabel or QuantileLabel")
        errEmptyBoundaryLabel = errors.New("BucketLabel or QuantileLabel is empty")
    )
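To illustrate why the extra suffix might matter, here is a minimal, self-contained sketch. It is not the collector's implementation: `lookupFamily` is a hypothetical helper, and the sketch assumes that metric metadata is keyed by the family name without the suffix.

```go
package main

import (
	"fmt"
	"strings"
)

var (
	// originalSuffixes mirrors the list described for v0.28.0 in this report.
	originalSuffixes = []string{"_bucket", "_count", "_sum"}
	// patchedSuffixes adds "_total", as in the change shown above.
	patchedSuffixes = []string{"_bucket", "_count", "_sum", "_total"}
)

// lookupFamily is a hypothetical helper. It tries the sample name as is, then
// retries with each trimmable suffix removed, and reports whether any
// candidate is present in metadata (assumed to be keyed by family name).
func lookupFamily(metadata map[string]string, sample string, trimmable []string) (string, bool) {
	if _, ok := metadata[sample]; ok {
		return sample, true
	}
	for _, suffix := range trimmable {
		if !strings.HasSuffix(sample, suffix) {
			continue
		}
		base := strings.TrimSuffix(sample, suffix)
		if _, ok := metadata[base]; ok {
			return base, true
		}
	}
	return "", false
}

func main() {
	metadata := map[string]string{"jobs": "counter", "latency": "histogram"}
	for _, sample := range []string{"latency_bucket", "jobs_total"} {
		_, before := lookupFamily(metadata, sample, originalSuffixes)
		_, after := lookupFamily(metadata, sample, patchedSuffixes)
		fmt.Printf("%s: original list found=%t, patched list found=%t\n", sample, before, after)
	}
}
```

Under that assumption, `jobs_total` resolves to the `jobs` family only with the patched list, while `latency_bucket` resolves with either list.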

Why are Prometheus Counter metrics skipped by the collector when they are sent from the Python Prometheus client API, but not when they are sent from other Prometheus client APIs (e.g., C#)? It appears that the Python Prometheus client supports OpenMetrics, which requires the _total suffix to be appended to Counters, whereas the C# Prometheus client does not support OpenMetrics.

This issue is similar to https://github.com/open-telemetry/opentelemetry-collector/issues/3118 (which is closed)

Steps to reproduce
Use the Python prometheus_client API (https://github.com/prometheus/client_python) to expose Prometheus Counter metrics, and scrape them with the OpenTelemetry collector.

What did you expect to see?
Prometheus Counter metrics properly received by the PrometheusReceiver (of the collector) and exported.

What did you see instead?
Prometheus Counter metrics are skipped.

What version did you use?
Prometheus client version 0.11.0 (https://github.com/prometheus/client_python), OpenTelemetry collector version 0.28.0

Environment
OS: Linux

Additional context
The same question was put to the maintainers of the Python Prometheus client (see prometheus/client_python#678), and they believe this is an issue with the OpenTelemetry collector ("All counters should end with a _total suffix, and that is actually required for OpenMetrics..... opentelemetry collector negotiates OpenMetrics preferentially, so it definitely needs to handle _total suffixes on counters.")

@nayasam added the bug label Jul 2, 2021
@bogdandrutu
Member

/cc @rakyll @alolita @dashpole

@rakyll
Contributor

rakyll commented Jul 8, 2021

cc @Aneurysm9

@Aneurysm9
Member

The use of _total as a suffix for counter metric names seems to be all over the place. In #2993 I removed logic to force all counter metric names sent by the PRW exporter to end in _total after confirming with the compliance test suite authors that it was not required. This seems almost the inverse of that.

It isn't clear to me why counters would be dropped if they have that suffix, but removing it doesn't seem to cause any compliance test failures. I will put up a PR shortly to add _total to the list of suffixes to trim.

tigrannajaryan pushed a commit that referenced this issue Jul 13, 2021
…ffixes (#3603)

**Description:** attempts to ensure that the prometheus receiver correctly ingests counter metrics, regardless whether the producing system includes a `_total` suffix on the counter metric name.

**Fixes:** #3557 

Signed-off-by: Anthony J Mirabella <a9@aneurysm9.com>
@bogdandrutu
Member

@Aneurysm9 should this issue be closed?

@bogdandrutu
Member

@nayasam can you give the newly released 0.30.0 a try?

@nayasam
Author

nayasam commented Jul 15, 2021 via email

@nayasam
Author

nayasam commented Jul 16, 2021

Since the issue reported is solved, this issue can be closed.

The observation I mentioned about failing to translate metrics in PrometheusExporter should be a separate issue. I will create one for that.

@alolita
Member

alolita commented Jul 28, 2021

Closing this issue as resolved.
