-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use method and path as labels for http metrics #8271
Conversation
Hello, and thank you for contributing this pull request! I think the changes you propose are correct, but we will very likely need to keep the existing metrics for backwards compatibility. I think we can emit both metrics, and rely on the What I'm not sure about is if we should start by disabling the new labelled version by default and require someone to enable it in config (we can change the default in a future release), or if we should emit both by default. Another thing we might want to do is change the name of the metric so that it is easier to include or exclude using a prefix. I suspect a filter of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your quick response! I updated the PR to send both metrics, and I changed the new metric name to consul.api.http
.
Also, should I add -consul.api.http
in the default prefix_filter
option then? I guess the change should be here
consul/agent/config/default.go
Lines 123 to 125 in 40f0805
telemetry = { | |
metrics_prefix = "consul" | |
filter_default = true |
@@ -177,7 +177,7 @@ This is a full list of metrics emitted by Consul. | |||
| `consul.dns.stale_queries` | This increments when an agent serves a query within the allowed stale threshold. | queries | counter | | |||
| `consul.dns.ptr_query.` | This measures the time spent handling a reverse DNS query for the given node. | ms | timer | | |||
| `consul.dns.domain_query.` | This measures the time spent handling a domain query for the given node. | ms | timer | | |||
| `consul.http..` | This tracks how long it takes to service the given HTTP request for the given verb and path. Paths do not include details like service or key names, for these an underscore will be present as a placeholder (eg. `consul.http.GET.v1.kv._`) | ms | timer | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I leave this line in the documentation since both metrics are sent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dnephin I'm not sure I understand why the CI is failing. Any suggestion?
@@ -123,6 +123,7 @@ func DefaultSource() Source { | |||
telemetry = { | |||
metrics_prefix = "consul" | |||
filter_default = true | |||
prefix_filter = [ "-consul.api.http" ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added it here, please tell me if it should be added elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be correct. I'll need to double check. A comment explaining why this is disabled by default would be great. Something about it being a new metric and will be enabled by default in a future version.
Thanks! This is looking good I think CI is failing because of the change to the default config source. The expected value will need to be updated to account for the new defaults. |
Thanks @dnephin ! I updated the expected value in the test and added a small comment |
Hi @dnephin not sure how this change affected a cache test here |
Hi @dnephin, sorry to bother you again, but how is there any news here? |
Hey @coignetp, sorry for all the delays on this and your Raft contribution (you can blame me for those)! Thank you for taking the time to write this out and keep up with its status. Great work overall! I'm going to merge in your PR. After that I'll follow up with a PR of my own with a few changes to how we're handling these deprecated metrics. We want to add a warning for users when they're using the old metrics, and have the new This path has been something we've been discussing internally, so I apologize that that decision was not available when you picked up the initial issue. Even though we're landing it in a slightly different way, I want to make sure your commits still make it in. Again, thank you for submitting this PR. It may be small but you can see from how many issues it's going to close out that it's high impact. |
Hey!
I'm currently working on the consul integration at Datadog, and we want to scrape the prometheus endpoint to get consul metrics. In the current state several metrics are not consistently named, and contain parts that would be good as labels instead, and that makes scraping the endpoint and converting to Datadog metrics troublesome.
This PR use the method and the path as labels of the
consul.http
metric, and remove it from the metric name.Example:
consul_http_GET_v1_agent_metrics
->consul_http
with labelsmethod = GET
,path = v1_agent_metrics
.This change is not backward compatible since the metric names will be different. Maybe you'd like to keep sending metrics like
consul_http_GET_v1_agent_metrics
as well.Please tell me what you think about this change! 😄
P.S.: I also opened a PR for the raft module hashicorp/raft#409 for a similar change.