Add Prometheus endpoint and metrics #3456

Merged: 1 commit, Jul 5, 2023

@andybug andybug commented Jul 2, 2023

Discussed a bit in #3294.

  • Add a server for serving Prometheus metrics.
  • Include a configuration block in the config file.
  • Provide HTTP metrics on the API, along with process-level metrics and DB pool metrics.
  • The Prometheus components are compiled in only when the prom feature flag is enabled.
  • Reserved port 10002 for Lemmy in the list of well-known Prometheus ports.
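
As a rough illustration of how the configuration block and the feature gate described above could fit together, here is a minimal std-only sketch. It is not the code from this PR: the `PrometheusConfig` struct, its `bind` and `port` field names, the loopback default, and the `metrics_addr` helper are all assumptions made for the example. Only the `prom` feature name and port 10002 come from the description.

```rust
use std::net::{IpAddr, Ipv4Addr, SocketAddr};

/// Hypothetical settings block for the metrics server (field names are assumptions).
#[derive(Debug, Clone, PartialEq)]
struct PrometheusConfig {
    bind: IpAddr,
    port: u16,
}

impl Default for PrometheusConfig {
    fn default() -> Self {
        Self {
            // Assumption: listen on loopback only unless configured otherwise.
            bind: IpAddr::V4(Ipv4Addr::LOCALHOST),
            // Port 10002 is the one the PR description says was reserved for Lemmy.
            port: 10002,
        }
    }
}

impl PrometheusConfig {
    /// Combine the bind address and port into a socket address.
    fn socket_addr(&self) -> SocketAddr {
        SocketAddr::new(self.bind, self.port)
    }
}

/// Returns the address metrics would be served on, or `None` when the
/// binary was built without the `prom` feature.
fn metrics_addr(config: &PrometheusConfig) -> Option<SocketAddr> {
    if cfg!(feature = "prom") {
        Some(config.socket_addr())
    } else {
        None
    }
}

fn main() {
    let config = PrometheusConfig::default();
    match metrics_addr(&config) {
        Some(addr) => println!("metrics would be served on {addr}"),
        None => println!("built without the `prom` feature; no metrics server"),
    }
}
```

Keeping the gate in one place means a default build never opens the extra port, which matches the intent of making the feature opt-in.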

Follow-up Ideas

  • Federation metrics, captured per peer
  • DB query metrics
  • Community metrics (sub count, posts)
  • User metrics (user count, bot users, active users)

Example Metrics

The lemmy_api_* metrics populate as requests come in to different endpoints. The sample below shows that only one request has come in so far, on the "/" endpoint.

# HELP lemmy_api_http_requests_duration_seconds HTTP request duration in seconds for all requests
# TYPE lemmy_api_http_requests_duration_seconds histogram
lemmy_api_http_requests_duration_seconds_bucket{endpoint="/",method="GET",status="200",le="0.005"} 0
lemmy_api_http_requests_duration_seconds_bucket{endpoint="/",method="GET",status="200",le="0.01"} 0
lemmy_api_http_requests_duration_seconds_bucket{endpoint="/",method="GET",status="200",le="0.025"} 1
lemmy_api_http_requests_duration_seconds_bucket{endpoint="/",method="GET",status="200",le="0.05"} 1
lemmy_api_http_requests_duration_seconds_bucket{endpoint="/",method="GET",status="200",le="0.1"} 1
lemmy_api_http_requests_duration_seconds_bucket{endpoint="/",method="GET",status="200",le="0.25"} 1
lemmy_api_http_requests_duration_seconds_bucket{endpoint="/",method="GET",status="200",le="0.5"} 1
lemmy_api_http_requests_duration_seconds_bucket{endpoint="/",method="GET",status="200",le="1"} 1
lemmy_api_http_requests_duration_seconds_bucket{endpoint="/",method="GET",status="200",le="2.5"} 1
lemmy_api_http_requests_duration_seconds_bucket{endpoint="/",method="GET",status="200",le="5"} 1
lemmy_api_http_requests_duration_seconds_bucket{endpoint="/",method="GET",status="200",le="10"} 1
lemmy_api_http_requests_duration_seconds_bucket{endpoint="/",method="GET",status="200",le="+Inf"} 1
lemmy_api_http_requests_duration_seconds_sum{endpoint="/",method="GET",status="200"} 0.01122634
lemmy_api_http_requests_duration_seconds_count{endpoint="/",method="GET",status="200"} 1
# HELP lemmy_api_http_requests_total Total number of HTTP requests
# TYPE lemmy_api_http_requests_total counter
lemmy_api_http_requests_total{endpoint="/",method="GET",status="200"} 1
# HELP lemmy_db_pool_available_connections Number of available connections in the pool
# TYPE lemmy_db_pool_available_connections gauge
lemmy_db_pool_available_connections 1
# HELP lemmy_db_pool_connections Current number of connections in the pool
# TYPE lemmy_db_pool_connections gauge
lemmy_db_pool_connections 1
# HELP lemmy_db_pool_max_connections Maximum number of connections in the pool
# TYPE lemmy_db_pool_max_connections gauge
lemmy_db_pool_max_connections 5
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1024
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 89
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 70168576
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1688323261
# HELP process_threads Number of OS threads in the process.
# TYPE process_threads gauge
process_threads 37
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 2602901504
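
For readers unfamiliar with Prometheus histograms, the `_bucket` lines above are cumulative: each `le` bucket counts every observation less than or equal to its bound. The sketch below is a hand-rolled, std-only illustration of that bookkeeping, not the client library used in this PR. The `Histogram` type is hypothetical; the bucket bounds and the single observation are taken from the sample output.

```rust
/// Number of finite buckets in the sample output (`+Inf` is implicit).
const N: usize = 11;

/// Upper bounds (`le` labels) copied from the sample output.
const BUCKETS: [f64; N] = [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0];

/// Hypothetical minimal histogram with cumulative bucket counts.
struct Histogram {
    bucket_counts: [u64; N],
    count: u64,
    sum: f64,
}

impl Histogram {
    fn new() -> Self {
        Self { bucket_counts: [0; N], count: 0, sum: 0.0 }
    }

    /// Record one observation: bump every bucket whose bound is >= the value.
    fn observe(&mut self, seconds: f64) {
        for (bound, slot) in BUCKETS.iter().zip(self.bucket_counts.iter_mut()) {
            if seconds <= *bound {
                *slot += 1;
            }
        }
        self.count += 1;
        self.sum += seconds;
    }
}

fn main() {
    let mut h = Histogram::new();
    // The single request from the sample took about 11 ms.
    h.observe(0.01122634);

    for (bound, n) in BUCKETS.iter().zip(h.bucket_counts.iter()) {
        println!("le=\"{bound}\" {n}");
    }
    // The `+Inf` bucket always equals the total observation count.
    println!("le=\"+Inf\" {}", h.count);
    println!("sum={} count={}", h.sum, h.count);
}
```

With the one 11 ms observation, the 0.005 and 0.01 buckets stay at 0 and every bucket from 0.025 upward reads 1, which is the pattern shown in the sample.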

@andybug andybug force-pushed the prometheus branch 2 times, most recently from 57fa92d to b2a1780 Compare July 3, 2023 03:08
Member

@dessalines dessalines left a comment

Could you provide some instructions on how to run this? Possibly by including it in the docker-compose.yml?

People did the same for otel, and still haven't provided any instructions on how to use it.

Review threads (collapsed): crates/utils/src/settings/structs.rs (resolved); Cargo.toml (outdated, resolved).
@Nutomic
Member

Nutomic commented Jul 4, 2023

Generally looks good. Would be best if you can already write instructions for this in lemmy-docs, so we can review it together with the code before merging both.

Add a server for serving Prometheus metrics. Include a configuration
block in the config file. Provide HTTP metrics on the API, along with
process-level metrics and DB pool metrics.
@andybug
Contributor Author

andybug commented Jul 4, 2023

Added documentation in the above PR.

@Nutomic Nutomic merged commit 1e99e8b into LemmyNet:main Jul 5, 2023
@Nutomic
Member

Nutomic commented Jul 5, 2023

Thanks!

@phiresky
Collaborator

phiresky commented Jul 5, 2023

Can this serve any metrics? https://docs.rs/tokio-metrics/latest/tokio_metrics/ would be very interesting. See also https://github.com/Hanaasagi/tokio-metrics-collector

@andybug
Contributor Author

andybug commented Jul 5, 2023

Yes, it can serve any metrics, and will do so for any metrics added to the default registry. Glancing at those links, I think the task metrics seem easy enough, but the runtime ones require tokio unstable, so they are probably not viable for quite some time.
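
To make the "default registry" idea concrete: an exporter that renders whatever has been registered in a shared, process-wide registry will pick up metrics from any module without being changed itself. The following is a toy std-only sketch of that pattern, not the API of any real Prometheus client crate. `Counter`, `default_registry`, `register`, `gather`, and the metric name are all hypothetical.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Mutex, OnceLock};

/// A toy counter; real client crates provide richer metric types.
struct Counter {
    name: &'static str,
    help: &'static str,
    value: AtomicU64,
}

impl Counter {
    const fn new(name: &'static str, help: &'static str) -> Self {
        Self { name, help, value: AtomicU64::new(0) }
    }

    fn inc(&self) {
        self.value.fetch_add(1, Ordering::Relaxed);
    }
}

/// Hypothetical process-wide registry, standing in for a library's default registry.
fn default_registry() -> &'static Mutex<Vec<&'static Counter>> {
    static REGISTRY: OnceLock<Mutex<Vec<&'static Counter>>> = OnceLock::new();
    REGISTRY.get_or_init(|| Mutex::new(Vec::new()))
}

/// Add a counter to the shared registry so the exporter can see it.
fn register(counter: &'static Counter) {
    default_registry().lock().unwrap().push(counter);
}

/// Render every registered counter in the Prometheus text format.
fn gather() -> String {
    let mut out = String::new();
    for c in default_registry().lock().unwrap().iter() {
        out.push_str(&format!(
            "# HELP {} {}\n# TYPE {} counter\n{} {}\n",
            c.name,
            c.help,
            c.name,
            c.name,
            c.value.load(Ordering::Relaxed)
        ));
    }
    out
}

/// A metric defined by some unrelated module (illustrative name only).
static TASKS_SPAWNED: Counter =
    Counter::new("example_tasks_spawned_total", "Tasks spawned (illustrative)");

fn main() {
    register(&TASKS_SPAWNED);
    TASKS_SPAWNED.inc();
    // The exporter only calls gather(); it needs no knowledge of TASKS_SPAWNED.
    print!("{}", gather());
}
```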

@andybug andybug deleted the prometheus branch July 6, 2023 01:46
@ubergeek77

It doesn't look like the official images are compiling with this enabled. At least not as far as I can see in Woodpecker.

Am I mistaken, or is this a "compile only" feature for now?

@andybug
Contributor Author

andybug commented Jul 7, 2023

> It doesn't look like the official images are compiling with this enabled. At least not as far as I can see in Woodpecker.
>
> Am I mistaken, or is this a "compile only" feature for now?

This is compile-only for now. I didn't want to introduce any potential bugs and performance problems to existing users since I expect that the number of people who want this feature is rather small.

If it gets some uptake maybe we can talk with the maintainers about making it part of the default build.

If you are interested in building Lemmy with this feature enabled, the details are in the documentation here.
