All metrics scraped from push gateway have same job label #34237

Closed

kpanic9 opened this issue Jul 24, 2024 · 5 comments
Labels: bug (Something isn't working), receiver/prometheus (Prometheus receiver)

Comments

kpanic9 commented Jul 24, 2024

Component(s)

receiver/prometheus

What happened?

Description

In our setup, we have a push gateway that jobs push metrics to. Each metric pushed by the jobs carries a different job label, and we want to preserve the job label set by the publishing job. However, when we set honor_labels: true in the OTel Prometheus receiver configuration for the scrape job, all metrics scraped from the push gateway end up with a single value for the job label, taken from one set of metrics pushed by a single job.

Steps to Reproduce

Configure a push gateway and push a few metrics to it with different values for the job label (a sketch of this step follows below).
Configure the OTel Collector to scrape the push gateway.
Check the values of the job label on the scraped metrics.
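
For reference, a minimal sketch of the first step using Go and the prometheus/client_golang push package (the Pushgateway address, metric name, and job names here are illustrative assumptions):

package main

import (
	"log"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/push"
)

func main() {
	// Push the same counter under two different job names so the scraped
	// samples carry distinct values for the job label.
	for _, job := range []string{"job-a", "job-b"} {
		c := prometheus.NewCounter(prometheus.CounterOpts{
			Name: "demo_pushed_total",
			Help: "Example counter pushed to the Pushgateway.",
		})
		c.Inc()
		// Assumes a Pushgateway reachable at localhost:9091.
		if err := push.New("http://localhost:9091", job).Collector(c).Push(); err != nil {
			log.Fatalf("push for %s failed: %v", job, err)
		}
	}
}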

Expected Result

Metrics scraped from the push gateway should retain the job label value set by the metric publisher.

Actual Result

All metrics scraped from the push gateway have a single value for the job label.

Collector version

v0.103.0

Environment information

No response

OpenTelemetry Collector configuration

receivers:
      prometheus/2:
        config:
          scrape_configs:
          - honor_labels: true
            job_name: app-platform-pushgateway
            kubernetes_sd_configs:
            - namespaces:
                names:
                - app-platform-monitoring
              role: pod
            relabel_configs:
            - action: keep
              regex: Running
              source_labels:
              - __meta_kubernetes_pod_phase
            - action: keep
              regex: true
              source_labels:
              - __meta_kubernetes_pod_ready
            - action: keep
              regex: metrics
              source_labels:
              - __meta_kubernetes_pod_container_port_name
            scheme: http
            scrape_interval: 30s
            scrape_timeout: 10s

Log output

No response

Additional context

No response

github-actions bot commented Jul 24, 2024

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

bacherfl commented Jul 24, 2024

I was looking into this to get a better understanding of this receiver and reproduced the behavior. Looking at the code, it seems that after every scrape for a given scrape config, all gathered metrics are put into the same resource, which is created in the initTransaction method: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/prometheusreceiver/internal/transaction.go#L358 - the name of the resource is set based on the job label, which is the name of the scrape config when honor_labels is false; with honor_labels set to true, however, there can be multiple different values for that label.
When attaching the data points of metrics with the same name, the job and instance labels are among the labels that are not added to the data point attributes (see func getSortedNotUsefulLabels(mType pmetric.MetricType) []string), so from my understanding this could explain why all the different instances of a metric end up aggregated under the same job label.
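
To make that concrete, here is a rough sketch (not the receiver's actual code) of the kind of filtering described above, where the job and instance labels are withheld from the data point attributes; the label list and helper name are illustrative:

package sketch

// notUsefulLabels stands in for the labels the receiver treats as belonging
// to the resource (or metric metadata) rather than to individual data points.
var notUsefulLabels = map[string]bool{
	"__name__": true,
	"job":      true,
	"instance": true,
}

// dataPointAttributes copies every scraped label except the "not useful"
// ones, so job/instance never appear as data point attributes and are only
// represented on the (single, per-transaction) resource.
func dataPointAttributes(labels map[string]string) map[string]string {
	attrs := make(map[string]string, len(labels))
	for name, value := range labels {
		if notUsefulLabels[name] {
			continue
		}
		attrs[name] = value
	}
	return attrs
}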

Question to the code owners @Aneurysm9 @dashpole: is the creation of one resource per transaction intended, or does the logic need to be adjusted to account for the possibility of different values for the job label when honor_labels is set to true, i.e. should multiple resources be created in the same transaction in that case, based on the set of distinct job label values?

dashpole commented:

Good find. The logic needs to be adjusted to account for multiple resources. We should create a new, unique resource for each combination of job + instance.
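
A rough sketch of that grouping with the pdata API (illustrative only, not the actual patch; it assumes the receiver's usual mapping of job to service.name and instance to service.instance.id):

package sketch

import "go.opentelemetry.io/collector/pdata/pmetric"

type sample struct {
	Job, Instance, Name string
	Value               float64
}

// groupByJobInstance creates one ResourceMetrics per distinct job/instance
// pair, so samples with different job labels no longer share a resource.
func groupByJobInstance(samples []sample) pmetric.Metrics {
	out := pmetric.NewMetrics()
	byKey := map[string]pmetric.ResourceMetrics{}
	for _, s := range samples {
		key := s.Job + "\x00" + s.Instance
		rm, ok := byKey[key]
		if !ok {
			rm = out.ResourceMetrics().AppendEmpty()
			// Assumed resource attribute mapping for job and instance.
			rm.Resource().Attributes().PutStr("service.name", s.Job)
			rm.Resource().Attributes().PutStr("service.instance.id", s.Instance)
			rm.ScopeMetrics().AppendEmpty()
			byKey[key] = rm
		}
		m := rm.ScopeMetrics().At(0).Metrics().AppendEmpty()
		m.SetName(s.Name)
		m.SetEmptyGauge().DataPoints().AppendEmpty().SetDoubleValue(s.Value)
	}
	return out
}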

bacherfl commented:

> Good find. The logic needs to be adjusted to account for multiple resources. We should create a new, unique resource for each combination of job + instance.

Thanks for the response @dashpole. I would be happy to work on a PR for this. I already have a PoC implementation that should fix this; it needs some polishing and tests, but I should have a PR ready this week.

evan-bradley added a commit that referenced this issue Aug 27, 2024
…rom `job`/`instance` label pairs (#34344)

**Description:** This PR solves a bug where metrics with different
`job`/`instance` labels were added to the same resource. This can
happen when `honor_labels` is set to `true`, in which case those
labels are taken not from the scrape config but from the individual data
points that are aggregated during a scrape iteration.

This change also affects the use of relabel configs if they change the
job or instance labels of gathered metrics. In that case a new resource
is created for each distinct job/instance label pair, with the matching
metrics added to it. The additional scrape metrics (number of scraped
samples, scrape duration, up, etc.) are put into a resource representing
the scrape config.

**Link to tracking Issue:** #34237

**Testing:** Added unit tests and adapted the relevant e2e tests

---------

Signed-off-by: Florian Bacher <florian.bacher@dynatrace.com>
Co-authored-by: Evan Bradley <11745660+evan-bradley@users.noreply.github.com>
f7o pushed a commit to f7o/opentelemetry-collector-contrib that referenced this issue Sep 12, 2024
kpanic9 commented Sep 17, 2024

Tested the fix, it works. Thank you!
