Categorize Prometheus metrics and add Connection metrics #1143
This change does the following:
- Categorize Antrea Agent Prometheus metrics and provide a way to flexibly configure them.
- Remove the host/node name metric. I changed antrea-prometheus.yaml to set the instance label to the node name instead of IP:port, which makes PromQL queries easier. I do not know if there is any other benefit to the host/node name metric.
- Add connection metrics when the flow exporter feature is enabled: specifically, the total connection count in the conntrack table and the connection count in the Antrea connection store.
Just a general comment for now before I review further: what's the rationale for having individual configuration switches for different categories of metrics?
# EnablePrometheusMetrics is a map of metric categories to bool flags that enables or disables those metrics exposure.
enablePrometheusMetrics:
# Metrics in all below categories can be enabled or disabled through AllMetrics.
# AllMetrics: false
# Metrics related to Pods. They can be enabled or disabled through PodMetrics.
# PodMetrics: false
# Metrics related to Network Policies. They can be enabled or disabled through NetworkPolicyMetrics.
# NetworkPolicyMetrics: false
# Metrics related to OVS switch. They can be enabled or disabled through OVSMetrics.
# OVSMetrics: false
# Metrics related to connections when FlowExporter feature is enabled. They can be enabled or disabled through ConnectionMetrics.
# ConnectionMetrics: false
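To make the proposed switches concrete, here is a minimal Go sketch of how a map of category flags like the one above could gate metric registration. It is not taken from the PR: the function name `metricsEnabled` is hypothetical, and the rule that `AllMetrics: true` overrides the per-category switches is an assumption about the intended semantics.

```go
package main

import "fmt"

// allMetrics is the catch-all key from the proposed config snippet.
const allMetrics = "AllMetrics"

// metricsEnabled reports whether metrics in the given category should be
// registered. ASSUMPTION: AllMetrics=true turns every category on; otherwise
// each category follows its own flag, and a missing key means "disabled".
func metricsEnabled(flags map[string]bool, category string) bool {
	if flags[allMetrics] {
		return true
	}
	return flags[category]
}

func main() {
	// Example config: only Network Policy metrics are switched on.
	flags := map[string]bool{"NetworkPolicyMetrics": true}

	categories := []string{"PodMetrics", "NetworkPolicyMetrics", "OVSMetrics", "ConnectionMetrics"}
	for _, c := range categories {
		// A real agent would register or skip its collectors here.
		fmt.Printf("%s enabled: %v\n", c, metricsEnabled(flags, c))
	}
}
```

With this shape, each new category adds one more key that users and maintainers have to reason about, which is the configuration cost being questioned in the review.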
what's the advantage of exposing all these configuration switches?
unless this change is driven by some specific use case, I think we should look into addressing #723 first.
Hi Antonin, the idea is that a user might be interested in specific metrics only; say, they only want to track the number of network policies at the Antrea agent and not other metrics. When we have a large number of metrics (not now, but as we keep adding them), categorizing them would help optimize resources both when tracking at the agent and during scraping by the Prometheus server. I agree that the functionality can be enhanced once we can change the configMap dynamically without needing a restart (#723).
@srikartati I haven't checked but do other projects (e.g. k8s) support conditionally enabling some Prometheus metrics? I am not sure the optimization on the Agent side is worth introducing new configuration dimensions. On the Prometheus server side, can't scraping be configured on a per-metric basis?
@antoninbas Hubble from Cilium supports conditionally enabling metrics, but the metric list is passed through cobra command flags and not the configMap. I know that on the Prometheus server side we can drop some metrics before ingestion by relabeling them with action drop. However, scraping will still happen. I am not aware of another way to avoid scraping a subset of metrics.
Had an offline discussion with @antoninbas. As there is a significant change in the configMap format, this may need more discussion in a wider forum. Connection metrics will be handled separately.