
Filter not working #93

Closed
anderson4u2 opened this issue Feb 22, 2019 · 6 comments
anderson4u2 commented Feb 22, 2019

I'm setting the filter as follows:

        - '--filter=__name__="consumer_group_backlog_avg_10m"'

However, when I tail the sidecar's logs, it stops here and doesn't seem to forward any metrics:

level=info ts=2019-02-22T12:52:07.654292427Z caller=main.go:256 msg="Starting Stackdriver Prometheus sidecar" version="(version=0.4.0, branch=master, revision=e246041acf99c8487e1ac73552fb8625339c64a1)"
level=info ts=2019-02-22T12:52:07.654367128Z caller=main.go:257 build_context="(go=go1.11.4, user=kbuilder@kokoro-gcp-ubuntu-prod-217445279, date=20190221-15:24:24)"
level=info ts=2019-02-22T12:52:07.654414564Z caller=main.go:258 host_details="(Linux 4.14.65+ #1 SMP Thu Oct 25 10:42:50 PDT 2018 x86_64 prometheus-84b8bdf44-6kcw8 (none))"
level=info ts=2019-02-22T12:52:07.654645769Z caller=main.go:259 fd_limits="(soft=1048576, hard=1048576)"
level=info ts=2019-02-22T12:52:07.658270228Z caller=main.go:463 msg="Web server started"
level=info ts=2019-02-22T12:52:07.658797109Z caller=main.go:444 msg="Stackdriver client started"
level=info ts=2019-02-22T12:53:10.664382837Z caller=manager.go:150 component="Prometheus reader" msg="Starting Prometheus reader..."
level=info ts=2019-02-22T12:53:10.668043076Z caller=manager.go:211 component="Prometheus reader" msg="reached first record after start offset" start_offset=0 skipped_records=0

When I curl the Prometheus server that the sidecar should be reading metrics from, it does have the metric I'm trying to filter for:

root@myserver:/# curl prometheus:9090/api/v1/query?query=consumer_group_backlog_avg_10m | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   313  100   313    0     0  43745      0 --:--:-- --:--:-- --:--:-- 44714
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "consumer_group_backlog_avg_10m",
          "consumer_group": "media-extractor"
        },
        "value": [
          1550838129.789,
          "892269.2"
        ]
      },
      {
        "metric": {
          "__name__": "consumer_group_backlog_avg_10m",
          "consumer_group": "summarizer"
        },
        "value": [
          1550838129.789,
          "548159.4"
        ]
      }
    ]
  }
}

Does anything seem clearly wrong? What could be the issue? Thanks!


jkohen commented Feb 22, 2019

Thanks for the report. Nothing obvious jumps to mind. What is the full command line? Note that if you have multiple filters, they all have to pass.

Do you see any metrics in Stackdriver if you remove all filters (the sidecar should forward all metrics by default)? Knowing this would help us eliminate the filter as a problem.
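
For reference, since repeated --filter flags all have to pass, they behave as an AND. A sketch reusing the metric and label names from your report (the second matcher is only illustrative, not a recommendation):

        args:
        # Only series matching BOTH matchers are forwarded to Stackdriver.
        - '--filter=__name__="consumer_group_backlog_avg_10m"'
        - '--filter=consumer_group="media-extractor"'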

jkohen self-assigned this Feb 22, 2019

anderson4u2 commented Feb 22, 2019

Hi Javier, thanks for picking this up.

I'm passing the args through a k8s manifest. The full command is:

      - name: stackdriver-prometheus-sidecar
        image: gcr.io/stackdriver-prometheus/stackdriver-prometheus-sidecar:0.4.0
        imagePullPolicy: Always
        args:
        - --stackdriver.project-id={{ project-id }}
        - --prometheus.wal-directory=/prometheus/data/wal
        - --stackdriver.kubernetes.location={{ gcp_region }}
        - --stackdriver.kubernetes.cluster-name={{ kube_cluster }}
        - --stackdriver.use-gke-resource
        - '--filter=__name__="consumer_group_backlog_avg_10m"'
        ports:
        - name: sidecar
          containerPort: 9091
        volumeMounts:
        - name: tmp-data-dir
          mountPath: /prometheus/data

Yes, without a filter all metrics are exported, and other filters work. For example, the filter '--filter=consumer_group=~".+"' exports some Kafka metrics. Unfortunately it doesn't export the metric I'm interested in, which also has the consumer_group label populated.

The problem may lie in a small detail: the metric I'm trying to export (consumer_group_backlog_avg_10m) actually comes from a recording rule. The rule config is:

groups:
- name: consumer-groups
  rules:
  - record: consumer_group_backlog_avg_10m
    expr: avg_over_time(consumer_group_backlog_k8s[10m])
  - record: consumer_group_backlog_k8s
    expr: sum(kafka_consumer_group_total_lag) by (consumer_group)

The contents of the metric are in my initial post, which at least shows that it is available in Prometheus.
Thanks!


jkohen commented Feb 25, 2019

Anderson, thanks for the clarification. The issue is indeed caused by recording rules, as you suspected. I can see two options:

  • You can ingest the raw metric into Stackdriver and use Stackdriver's query-time aggregations. In this case, ingest kafka_consumer_group_total_lag and query with 'mean' aggregation, a '10m' window, and 'group by consumer_group label'. Would this work for your case? Raw metrics have the advantage that you can explore the data interactively in the Metrics Explorer and Stackdriver dashboards, filter and group by metadata, etc.
  • Add a static_metadata entry to the collector's config (docs); see the sketch after this list. As long as you preserve the job and instance labels (i.e. don't aggregate them away in your query), which is the case in this rule, it should work. If you go with this option, I would recommend ingesting consumer_group_backlog_k8s instead of consumer_group_backlog_avg_10m, so you can still change the aggregation (though not the group-by) at query time.
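
For the second option, a minimal sketch of a sidecar config file, assuming the sidecar's --config-file flag and the static_metadata format from the docs (the type, value_type, help text, and file path below are illustrative, not prescribed):

    # sidecar-config.yml, mounted into the sidecar container and passed via
    # --config-file=/etc/sidecar/sidecar-config.yml (path is an example).
    static_metadata:
      - metric: consumer_group_backlog_k8s
        type: gauge          # the rule is a sum(), so gauge semantics fit here
        value_type: double
        help: Total Kafka consumer group lag, summed by consumer_group (recording rule).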

I will add an entry to our website explaining this in a bit more detail. Thanks for bringing it up!


jkohen commented Feb 26, 2019

I'm going to go ahead and close this request. If there's anything else I can do to help you, let me know.

jkohen closed this as completed Feb 26, 2019
anderson4u2 (Author) commented:

Thanks a lot for your reply!
The first solution wouldn't work for me, because I'm planning to autoscale (HPA) on these metrics.
The second solution works like a charm! Thanks!
I saw you've already updated the Google docs as well, cool!


jkohen commented Feb 27, 2019

Glad it helped, thanks for the update!
