prometheus is crashing after sidecar injection #107
Comments
The sidecar requires access to the Prometheus WAL, and it's stopping (crashing) because it can't find it. Let me know if there is something else I can do to help. |
Hi @jkohen,
sidecar mount:
  - mountPath: /data
    name: prometheus-prometheus-prometheus-oper-prometheus-db
prometheus mount:
  - mountPath: /prometheus
    name: prometheus-prometheus-prometheus-oper-prometheus-db
We have changed the argument in the script to the line below.
This time the Prometheus pod initialized successfully, but it is stuck, as in issue #83, at:
level=info ts=2019-01-26T15:29:42.652116635Z caller=manager.go:150 component="Prometheus reader" msg="Starting Prometheus reader..." |
Glad it helped. The sidecar may take a few seconds to start, and then it will actually go silent, so those logs could be fine. Have you checked in Metrics Explorer whether metrics are showing up? See https://cloud.google.com/monitoring/kubernetes-engine/prometheus#viewing_metrics Also make sure that your Prometheus version is in the compatibility matrix: https://github.com/Stackdriver/stackdriver-prometheus-sidecar#compatibility |
Hi @jkohen, in Metrics Explorer I searched for external/xxx, but no luck. |
@StevenYCChou, you looked into #91. Can you help us diagnose this? @Pamir, can you include full logs from the sidecar? Can you share your project ID and cluster name with us so we can take a second look? If you have a Cloud Support contract, please also contact us through that channel to ensure we have all the important information. |
Hi @StevenYCChou @jkohen https://github.com/Pamir/stackdriver-prometheus-sidecar-configuration |
Thanks @Pamir for creating the repo with the files. Let me look into them, and I will get back to you if I have any questions. |
Can you double-check the version of Prometheus you use? I ask because the repo you provided uses prometheus-operator v0.26.0, which uses Prometheus v2.5.0 according to the prometheus-operator release page. Besides that, does your Prometheus server scrape the sidecar for metrics? Could you check the metrics If you can provide logs from the Prometheus server, that would help me understand more about how the Prometheus server is doing. |
Hi,
image:
  repository: quay.io/prometheus/prometheus
  tag: v2.6.1
In Prometheus there is no such metric.
|
Hi @StevenYCChou
|
I see a lot of errors in your Prometheus logs. Is Prometheus working well? What are some working metrics? Do you have any metrics that aren't recording rules? There are some limitations for recording rules at the time: https://cloud.google.com/monitoring/kubernetes-engine/prometheus#prometheus_integration_issues |
Hi @jkohen |
Hi @Pamir, we added debug logs with #110, and they are included in release v0.4.1. If no data shows up in Stackdriver, you can turn on debug logging by following the instructions in the section "No data shows up in Stackdriver" of https://cloud.google.com/monitoring/kubernetes-engine/prometheus#prometheus_integration_issues, and feel free to share the debugging info with us. |
I also ran into the I'm running the full https://github.com/coreos/kube-prometheus stack on GKE. I've fixed my version to v2.6.1 and have mounted a PVC at /prometheus. The WAL definitely exists in there and contains data. For brevity, I've only included my
If I change the Here is a log from the container
I tried launching with I'm not able to see any Prometheus data reporting into Stackdriver. |
Hi @matthewgoslett, Thanks for the report. I haven't used kube-prometheus yet, so I need to gather some further information.
Can you check where your Prometheus stores data?
|
Hi @StevenYCChou with Prometheus Image: quay.io/prometheus/prometheus:v2.7.1 level=info ts=2019-07-29T09:21:09.70048798Z caller=main.go:298 host_details="(Linux 4.14.91+ #1 SMP Wed Jan 23 21:34:58 PST 2019 x86_64 prometheus-monitoring-prometheus-oper-prometheus-0 (none))" |
Hello @Pamir,
In my prometheus-operator Helm config I had 'subPath' with the value 'prometheus-db' specified in the volumeMounts section of the configuration. Because of that, the 'prometheus-prometheus-operator-prometheus-db' volume was mounted to /prometheus/prometheus-db, so in the WAL directory variable I had to specify the extended path like this: --prometheus.wal-directory=/prometheus/prometheus-db/wal
After changing the WAL directory path from /prometheus/wal to /prometheus/prometheus-db/wal, the sidecar came up perfectly and the metrics from Prometheus were sent to Stackdriver.
As I mentioned, I found this out completely by accident: I was trying to 'shell' into the sidecar container while running sidecar version 0.7.3, but I could not because 'sh' was not found in the container, so as a test I re-patched using a much older sidecar version (ex: 0.3.2). Hope it helps! |
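For anyone hitting the same "no such file or directory" error, the path arithmetic described above can be sketched in a few lines of shell. This is only an illustration under assumptions: `sidecar_wal_dir` is a hypothetical helper, not something shipped with the sidecar or prometheus-operator, and it assumes the sidecar mounts the whole data volume while Prometheus writes under a subPath of that volume. Only the `--prometheus.wal-directory` flag and the example paths come from this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of the sidecar or prometheus-operator):
# build the WAL path as seen from inside the sidecar container.
#   $1: mountPath of the data volume in the sidecar container
#   $2: subPath that Prometheus uses on that volume (may be empty)
sidecar_wal_dir() {
    mount_path=${1%/}          # drop one trailing slash, if any
    sub_path=${2#/}            # drop one leading slash, if any
    sub_path=${sub_path%/}     # drop one trailing slash, if any
    if [ -n "$sub_path" ]; then
        printf '%s/%s/wal\n' "$mount_path" "$sub_path"
    else
        printf '%s/wal\n' "$mount_path"
    fi
}

# Example with the values reported in the comment above.
echo "--prometheus.wal-directory=$(sidecar_wal_dir /prometheus prometheus-db)"
```

With an empty subPath the helper falls back to `<mountPath>/wal`, which corresponds to the /data/wal and /prometheus/wal paths mentioned earlier in the thread. Whether a subPath is in play depends on your chart values, so check the volumeMounts of both containers before trusting the result.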
Hi, my Prometheus pod and sidecar container are running, but I don't see any Prometheus-related metrics in Stackdriver. level=info ts=2020-07-14T16:23:49.975Z caller=main.go:293 msg="Starting Stackdriver Prometheus sidecar" version="(version=0.7.5, branch=master, revision=c8c0bfb1a5e22f5838eb6bb86608b29ef0eca0ef)" |
The prometheus sidecar is no longer a recommended Google Cloud solution. It has been superseded by Google Cloud Managed Service for Prometheus. |
Error message: Tailing WAL failed: retrieve last checkpoint: open /data/wal: no such file or directory
export KUBE_NAMESPACE=monitoring
export GCP_PROJECT=<project_name>
export GCP_REGION=us-central1
export KUBE_CLUSTER=standard-cluster-1
export SIDECAR_IMAGE_TAG=release-0.4.0
prometheus operator values.yaml