-
-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add gateway histogram metrics #8443
Conversation
No real preference but something like |
Do you want both |
|
@schomatis : can you please review/merge? |
@aschmahmann : I'm moving this to "ready for review" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense but I know nothing of our metric system.
@aschmahmann : any concern with merging? |
I will update this PR to work with non-unixfs response types added in #8758 |
Testing if this removes me from the review list and removes the PR from my view in the board (I think it doesn't but want to confirm).
0176b66
to
914ed40
Compare
I rebased this PR on top of New Histogram Metrics
👉 as per usual, feedback on names / descriptions / usefullness is appreciated. Demohttp://127.0.0.1:5001/debug/metrics/prometheus will have histogram per namespace # HELP ipfs_http_gw_first_content_block_get_latency_seconds The time till the first content block is received on GET from the gateway.
# TYPE ipfs_http_gw_first_content_block_get_latency_seconds histogram
ipfs_http_gw_first_content_block_get_latency_seconds_bucket{gateway="ipfs",le="0.1"} 3
ipfs_http_gw_first_content_block_get_latency_seconds_bucket{gateway="ipfs",le="0.5"} 3
ipfs_http_gw_first_content_block_get_latency_seconds_bucket{gateway="ipfs",le="1"} 3
ipfs_http_gw_first_content_block_get_latency_seconds_bucket{gateway="ipfs",le="2"} 3
ipfs_http_gw_first_content_block_get_latency_seconds_bucket{gateway="ipfs",le="3"} 3
ipfs_http_gw_first_content_block_get_latency_seconds_bucket{gateway="ipfs",le="5"} 3
ipfs_http_gw_first_content_block_get_latency_seconds_bucket{gateway="ipfs",le="8"} 3
ipfs_http_gw_first_content_block_get_latency_seconds_bucket{gateway="ipfs",le="13"} 3
ipfs_http_gw_first_content_block_get_latency_seconds_bucket{gateway="ipfs",le="+Inf"} 3
ipfs_http_gw_first_content_block_get_latency_seconds_sum{gateway="ipfs"} 0.024477469
ipfs_http_gw_first_content_block_get_latency_seconds_count{gateway="ipfs"} 3
...
ipfs_http_gw_first_content_block_get_latency_seconds_bucket{gateway="ipns",le="0.1"} 60
ipfs_http_gw_first_content_block_get_latency_seconds_bucket{gateway="ipns",le="0.5"} 63
ipfs_http_gw_first_content_block_get_latency_seconds_bucket{gateway="ipns",le="1"} 63
ipfs_http_gw_first_content_block_get_latency_seconds_bucket{gateway="ipns",le="2"} 63
ipfs_http_gw_first_content_block_get_latency_seconds_bucket{gateway="ipns",le="3"} 63
ipfs_http_gw_first_content_block_get_latency_seconds_bucket{gateway="ipns",le="5"} 63
ipfs_http_gw_first_content_block_get_latency_seconds_bucket{gateway="ipns",le="8"} 63
ipfs_http_gw_first_content_block_get_latency_seconds_bucket{gateway="ipns",le="13"} 63
ipfs_http_gw_first_content_block_get_latency_seconds_bucket{gateway="ipns",le="+Inf"} 63
ipfs_http_gw_first_content_block_get_latency_seconds_sum{gateway="ipns"} 0.9177464280000002
ipfs_http_gw_first_content_block_get_latency_seconds_count{gateway="ipns"} 63
... # HELP ipfs_http_gw_unixfs_gen_dir_listing_get_duration_seconds The time it takes till generated HTML with directory listing is served on GET from the gateway.
# TYPE ipfs_http_gw_unixfs_gen_dir_listing_get_duration_seconds histogram
ipfs_http_gw_unixfs_gen_dir_listing_get_duration_seconds_bucket{gateway="ipfs",le="0.1"} 1
ipfs_http_gw_unixfs_gen_dir_listing_get_duration_seconds_bucket{gateway="ipfs",le="0.5"} 1
ipfs_http_gw_unixfs_gen_dir_listing_get_duration_seconds_bucket{gateway="ipfs",le="1"} 1
ipfs_http_gw_unixfs_gen_dir_listing_get_duration_seconds_bucket{gateway="ipfs",le="2"} 1
ipfs_http_gw_unixfs_gen_dir_listing_get_duration_seconds_bucket{gateway="ipfs",le="3"} 1
ipfs_http_gw_unixfs_gen_dir_listing_get_duration_seconds_bucket{gateway="ipfs",le="5"} 1
ipfs_http_gw_unixfs_gen_dir_listing_get_duration_seconds_bucket{gateway="ipfs",le="8"} 1
ipfs_http_gw_unixfs_gen_dir_listing_get_duration_seconds_bucket{gateway="ipfs",le="13"} 1
ipfs_http_gw_unixfs_gen_dir_listing_get_duration_seconds_bucket{gateway="ipfs",le="+Inf"} 2
ipfs_http_gw_unixfs_gen_dir_listing_get_duration_seconds_sum{gateway="ipfs"} 19.474375763
ipfs_http_gw_unixfs_gen_dir_listing_get_duration_seconds_count{gateway="ipfs"} 2 |
Awesome @lidel ! Is review needed here now? |
The new metrics look good, only suggestions is that we add a couple more buckets. |
|
||
// Legacy Metrics | ||
// ---------------------------- | ||
unixfsGetMetric: newGatewaySummaryMetric( // TODO: remove? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's low-effort to keep right? Might as well keep it around to avoid breaking the monitoring of gateway operators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, my plan is to keep it at least for one/two releases.
Adds: - response-type agnostic firstContentBlockGetMetric which counts the latency til the first content block. - car/block/file/gen-dir-index duration histogram metrics that show how long each response type takes
Co-authored-by: Gus Eggert <gus@gus.dev>
42fac3b
to
ae5de99
Compare
|
0.05, 0.1, 0.25, 0.5, 1, 2, 5, 10, 30, 60 as suggested in reviews: #8443
I've changed buckets to include asks/suggestions from the reviews, we now have 10 of them: |
* feat(gw): response type histogram metrics - response-type agnostic firstContentBlockGetMetric which counts the latency til the first content block. - car/block/file/gen-dir-index duration histogram metrics that show how long each response type takes * docs: improve metrics descriptions * feat: more gw histogram buckets 0.05, 0.1, 0.25, 0.5, 1, 2, 5, 10, 30, 60 secs as suggested in reviews at ipfs/kubo#8443 Co-authored-by: Marcin Rataj <lidel@lidel.org> Co-authored-by: Gus Eggert <gus@gus.dev> This commit was moved from ipfs/kubo@beaa8fc
fixes #8441
@gmasgras did you want both the summary and histogram exposed, or should I just switch the summary over to a histogram?
Any preference on the metric name? For example, if replacing the summary metric we may still want a different name so infra that's running the older version and the newer version can have the metrics play together a little nicer.