Port udf execution metrics to be bucketed based on time (#32048)
The dashboard provides simple timeseries data for different counters (function counts, cache hit rates, and rows read or written to a table) and histograms (percentiles for function execution time). These timeseries are stored naively: we keep the original samples as-is in a circular buffer of the past 1000 samples per metric. This leads to confusing behavior on a few fronts: very active functions can end up with only a very short window of data in the dashboard, and different timeseries may have different validity windows.

This PR switches in-memory metrics to use a different approach:

- We default to storing timeseries data at 1m granularity, retaining an hour's worth of data. The data structure stores buckets sparsely: only buckets that have a sample take up memory. We could store more, but we currently reset metrics on backend restart, so only showing the past hour makes this less disruptive.
- For counters (e.g. database rows read), this is 8 bytes * 60 buckets = ~0.5KB of data per metric (see the counter sketch after this list). We log ~5 metrics per function and ~2 metrics per table => we shouldn't use more than 5MB of RAM in the worst case.
- For histograms (e.g. function latency), we use an HDR histogram configured to roughly 1.5KB per bucket, i.e. ~90KB of data per metric across the 60 buckets (see the histogram sketch after this list). With ~1000 active functions, this will be at most 90MB of memory.

Eventually, we'd like to set up a victoriametrics cluster for customers and just use that, but this will unlock a few more analyses for the insights project. (For example, we can efficiently compute the top K functions for a given metric.) It's API compatible with the old stuff, so we shouldn't need to change the dashboard.
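To make the counter layout concrete, here is a minimal sketch of a sparse minute-bucketed counter. Rust is an assumption about the backend language, and all names here (`BucketedCounter`, etc.) are hypothetical, not taken from this PR:

```rust
use std::collections::BTreeMap;
use std::time::{SystemTime, UNIX_EPOCH};

struct BucketedCounter {
    /// Minute timestamp (unix seconds / 60) -> accumulated count.
    /// Only minutes that received a sample occupy memory.
    buckets: BTreeMap<u64, u64>,
    /// Number of one-minute buckets to retain (60 => one hour).
    retention_buckets: u64,
}

impl BucketedCounter {
    fn new() -> Self {
        Self { buckets: BTreeMap::new(), retention_buckets: 60 }
    }

    fn increment(&mut self, delta: u64) {
        let now_minute = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .expect("clock before epoch")
            .as_secs() / 60;
        *self.buckets.entry(now_minute).or_insert(0) += delta;
        // Prune buckets that have aged out of the retention window.
        let cutoff = now_minute.saturating_sub(self.retention_buckets);
        self.buckets = self.buckets.split_off(&cutoff);
    }

    /// Total count over the retained window.
    fn total(&self) -> u64 {
        self.buckets.values().sum()
    }
}
```

An ordered map keeps pruning cheap (`split_off` at the cutoff minute) and means an idle metric holds no buckets at all, which is where the "at most 60 buckets of 8-byte counts" bound comes from.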
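For the histogram side, one shape that matches the description is an HDR histogram per minute bucket. The sketch below uses the open-source `hdrhistogram` crate with illustrative bounds (1ms..60s at 2 significant figures); the actual configuration that yields ~1.5KB per bucket may differ, and `BucketedHistogram` is again a hypothetical name:

```rust
use std::collections::BTreeMap;
use hdrhistogram::Histogram;

// Hypothetical sketch: one HDR histogram per one-minute bucket, so the
// dashboard can report latency percentiles per minute over the past hour.
struct BucketedHistogram {
    buckets: BTreeMap<u64, Histogram<u64>>,
}

impl BucketedHistogram {
    fn new() -> Self {
        Self { buckets: BTreeMap::new() }
    }

    fn record(&mut self, minute: u64, latency_ms: u64) {
        let hist = self.buckets.entry(minute).or_insert_with(|| {
            // Illustrative bounds: track 1ms..60s at 2 significant figures.
            Histogram::new_with_bounds(1, 60_000, 2).expect("valid bounds")
        });
        // saturating_record clamps out-of-range values to the trackable
        // bounds instead of returning an error.
        hist.saturating_record(latency_ms);
    }

    /// p99 latency for a given minute, if any samples were recorded.
    fn p99(&self, minute: u64) -> Option<u64> {
        self.buckets.get(&minute).map(|h| h.value_at_quantile(0.99))
    }
}
```

Pruning aged-out minutes is omitted here for brevity but would mirror the counter sketch above.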
GitOrigin-RevId: 08a9de42d4263b3b4a2b07d3366a2275a80d7df4