Unable to scrape metrics through kubeletstats receiver #26481
Comments
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
@anand3493 the receiver is unable to scrape metrics from
@TylerHelmuth True, but I am using the default configuration for the endpoint, ${K8S_NODE_NAME}:10250, and I can confirm the host name is a valid one, as I see the same names in the output of kubectl get nodes.
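For context, the name that ${K8S_NODE_NAME} expands to is normally injected into the collector container through the Kubernetes downward API. A minimal sketch of that env entry (an assumption about the chart's templates, not copied from them):

# Sketch only: injecting the node name via the downward API (spec.nodeName).
env:
  - name: K8S_NODE_NAME
    valueFrom:
      fieldRef:
        apiVersion: v1
        fieldPath: spec.nodeName

The default endpoint ${K8S_NODE_NAME}:10250 can therefore only work if that node name is resolvable by the cluster DNS from inside the pod.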
Are you able to hit the endpoint successfully?
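One way to check (a diagnostic sketch, assuming curl is available in the image and the default service-account token mount exists) is to call the kubelet summary endpoint directly from inside a collector pod:

# Hypothetical check from a shell inside the collector pod; adjust paths and names as needed.
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
curl -sk -H "Authorization: Bearer $TOKEN" "https://${K8S_NODE_NAME}:10250/stats/summary" | head

If the node name cannot be resolved, this fails with the same "no such host" error; substituting the node IP should succeed.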
We're having this issue as well, even on version 0.85:

2023-09-14T08:12:11.091Z error kubeletstatsreceiver@v0.85.0/scraper.go:68 call to /stats/summary endpoint failed {"kind": "receiver", "name": "kubeletstats", "data_type": "metrics", "error": "Get \"https://<#####>:10250/stats/summary\": dial tcp: lookup <#####> on 100.64.0.10:53: no such host"}
github.com/open-telemetry/opentelemetry-collector-contrib/receiver/kubeletstatsreceiver.(*kubletScraper).scrape

We basically followed the getting-started installation process, enabling the kubeletMetrics preset and providing a custom OTel endpoint. However, we seem to have found a workaround by using the node's hostIP: according to downward-api/#available-fields, the field status.hostIP can be exposed to the container through the downward API.
In order to apply the workaround we performed these steps:
[...]
env:
  - name: NODE_IP
    valueFrom:
      fieldRef:
        apiVersion: v1
        fieldPath: status.hostIP
[...]
[...]
config:
  receivers:
    kubeletstats:
      collection_interval: 20s
      auth_type: 'serviceAccount'
      endpoint: 'https://${env:NODE_IP}:10250'
[...]
helm upgrade otel-collector --values values.yaml <path-to-cloned-repo>/charts/opentelemetry-collector/

Maybe this is not the right thing to do, but nevertheless it might point in the right direction.
@sspieker if the node name is not working for you, then the node IP is a valid workaround. You don't have to modify the helm chart, though; it supports adding extra env vars:

mode: daemonset
presets:
  kubeletMetrics:
    enabled: true
extraEnvs:
  - name: NODE_IP
    valueFrom:
      fieldRef:
        apiVersion: v1
        fieldPath: status.hostIP
config:
  receivers:
    kubeletstats:
      endpoint: 'https://${env:NODE_IP}:10250'
As it happens, that works too. Thanks @TylerHelmuth, this makes stuff quite a bit easier for us!
@TylerHelmuth This NODE_IP suggestion worked for me as well. The us-east-1 based nodes have the private IP DNS name in the format ip-xx-xxx-xxx-xx.ec2.internal, while the eu-west-1 based nodes have it in the format ip-xx-xxx-xxx-xxx.eu-west-1.compute.internal. This may be why NODE_NAME is not working on my European cluster. NODE_IP works fine.
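To confirm whether this is a DNS issue, one option (a diagnostic sketch using standard kubectl tooling; the node name below is taken from the log in this issue) is to compare node names with their internal IPs and try resolving a name from inside the cluster:

kubectl get nodes -o wide   # compare the NAME and INTERNAL-IP columns
# run a throwaway pod and try the lookup against the cluster DNS
kubectl run dns-test --rm -it --image=busybox --restart=Never -- \
  nslookup ip-10-166-222-111.eu-west-1.compute.internal

If the lookup fails against the cluster DNS (172.20.0.10 in the log above), using the node IP via the downward API is the practical workaround.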
Component(s)
receiver/kubeletstats
What happened?
Description
Getting this error from the OpenTelemetry Collector agent pods:

scraperhelper/scrapercontroller.go:200 Error scraping metrics {"kind": "receiver", "name": "kubeletstats", "data_type": "metrics", "error": "Get \"https://ip-10-166-222-111.eu-west-1.compute.internal:10250/stats/summary\": dial tcp: lookup ip-10-166-222-111.eu-west-1.compute.internal on 172.20.0.10:53: no such host", "scraper": "kubeletstats"}
The nodes are definitely present; I can see them with kubectl get nodes. The error occurs in every pod of the collector DaemonSet.
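For reference, a sketch of the receiver configuration the kubeletMetrics preset produces by default (an assumption based on the defaults mentioned earlier in this thread; the exact configuration in use was not posted):

receivers:
  kubeletstats:
    collection_interval: 20s
    auth_type: 'serviceAccount'
    endpoint: '${K8S_NODE_NAME}:10250'

The error above is the cluster DNS failing to resolve whatever value ${K8S_NODE_NAME} expands to.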
Steps to Reproduce
Expected Result
Metrics are scraped from the kubelet and sent to the configured exporter.
Actual Result
Erroring out at:

scraperhelper/scrapercontroller.go:200 Error scraping metrics {"kind": "receiver", "name": "kubeletstats", "data_type": "metrics", "error": "Get \"https://ip-10-166-222-111.eu-west-1.compute.internal:10250/stats/summary\": dial tcp: lookup ip-10-166-222-111.eu-west-1.compute.internal on 172.20.0.10:53: no such host", "scraper": "kubeletstats"}
Collector version
v0.83.0
Environment information
Environment
Kubernetes
OpenTelemetry Collector configuration
Log output
Additional context
No response