When the collector receives a SIGHUP signal, it reloads the current configuration and calls the Shutdown method. The Shutdown method of the prometheusexporter, however, only closes the listener, not the http.Server. This means the old server no longer accepts new connections, but all existing connections may live on indefinitely. If a client (such as Prometheus) keeps the connection alive between scrapes, its requests are still served by the old instance, so its metrics never receive any updates until the connection disappears. This is especially difficult to debug, since requests made with curl or wget open new connections, are served by the new instance, and therefore show completely different metrics.
Steps to Reproduce
Setup otel-collector with the prometheus exporter
Setup prometheus to scrape this instance every 60s
Send a SIGHUP signal to the otel-collector
No metric updates will be recorded by prometheus after the SIGHUP
Expected Result
The old server must be shut down, so that prometheus reconnects and receives the latest metrics
Actual Result
The old HTTP connection continues to live on and metrics go stale in prometheus
Component(s)
exporter/prometheus
I will provide a Pull Request for this issue
Collector version
v0.109.0/v0.110.0
Environment information
Environment
OS: (e.g., "Ubuntu 20.04")
Compiler (if manually compiled): (e.g., "go 14.2")
OpenTelemetry Collector configuration
No response
Log output
No response
Additional context
No response