-
Notifications
You must be signed in to change notification settings - Fork 303
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add label propagation metrics #2076
Add label propagation metrics #2076
Conversation
Hi @songrx1997. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
3e1ba5e
to
0c140d0
Compare
/ok-to-test |
) | ||
|
||
const ( | ||
ratioWithAnnotation = "ratio_endpoints_with_annotation" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of ratio, lets say percentage
prometheus.CounterOpts{ | ||
Subsystem: negControllerSubsystem, | ||
Name: totalLabelError, | ||
Help: "The total number of errors of label propagation", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: the total number of errors occurred for label propagation
prometheus.CounterOpts{ | ||
Subsystem: negControllerSubsystem, | ||
Name: labelTruncationErrorCount, | ||
Help: "The number of label truncation", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the number of labels truncated
LabelTruncationErrorCount = prometheus.NewCounter( | ||
prometheus.CounterOpts{ | ||
Subsystem: negControllerSubsystem, | ||
Name: labelTruncationErrorCount, | ||
Help: "The number of label truncation", | ||
}, | ||
) | ||
|
||
LabelFailureCount = prometheus.NewCounter( | ||
prometheus.CounterOpts{ | ||
Subsystem: negControllerSubsystem, | ||
Name: labelFailureCount, | ||
Help: "The number of label truncation failures", | ||
}, | ||
) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can these two metrics be combined and use a label to track truncation failures, truncated, and all labels added?
EndpointsWithAnnotations = prometheus.NewGauge( | ||
prometheus.GaugeOpts{ | ||
Subsystem: negControllerSubsystem, | ||
Name: endpointWithAnnotation, | ||
Help: "The number of endpoints with annotations", | ||
}, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this one lets name number_of_endpoints
and then you would have a label that say with annotations.
This way we track total and number with annotations with a single metric
RatioWithAnnotation = prometheus.NewGauge( | ||
prometheus.GaugeOpts{ | ||
Subsystem: negControllerSubsystem, | ||
Name: ratioWithAnnotation, | ||
Help: "The ratio between the number of endpoints with annotation and the total number", | ||
}, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't need to have this metric specifically. with the one above, we can calculate the ratio after the fact. For metrics, generally we try to stay away from doing any calculations when emitting/publishing them.
AverageLabelNumber = prometheus.NewGauge( | ||
prometheus.GaugeOpts{ | ||
Subsystem: negControllerSubsystem, | ||
Name: averageLabelNumber, | ||
Help: "The average number of labels per endpoint", | ||
}, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one should be a distribution and we just emit the number of labels per endpoint. That way prometheus will do the calculation.
AverageAnnotationSize = prometheus.NewGauge( | ||
prometheus.GaugeOpts{ | ||
Subsystem: negControllerSubsystem, | ||
Name: averageAnnotationSize, | ||
Help: "The average size of endpoint annotations", | ||
}, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as above, change this to a distribution
// LabelPropagationStat contains stats related to label propagation. | ||
type LabelPropagationStats struct { | ||
EndpointCount int | ||
EndpointWithAnnotation int | ||
PodLabelCount int | ||
AnnotationSize int | ||
TotalLabelError int | ||
LabelTruncationCount int | ||
LabelFailureCount int | ||
} | ||
|
||
type LabelPropagationMetrics struct { | ||
TotalLabelError int | ||
LabelTruncationCount int | ||
LabelFailureCount int | ||
EndpointWithAnnotation int | ||
RatioWithAnnotation float64 | ||
AverageLabelNum float64 | ||
AverageAnnotationSize float64 | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the labels should go in the same file as the metrics that will use them
ee00d32
to
497126d
Compare
LabelNumber = prometheus.NewHistogram( | ||
prometheus.HistogramOpts{ | ||
Subsystem: negControllerSubsystem, | ||
Name: labelNumber, | ||
Help: "The number of labels per endpoint", | ||
}, | ||
) | ||
|
||
AnnotationSize = prometheus.NewHistogram( | ||
prometheus.HistogramOpts{ | ||
Subsystem: negControllerSubsystem, | ||
Name: annotationSize, | ||
Help: "The size of endpoint annotations per endpoint", | ||
}, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should add custom buckets. Otherwise the max bucket is 10 and anything over is not recorded.
For the annotation we should indicate that this is in bytes
bef626e
to
088e9b6
Compare
088e9b6
to
213c56c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: songrx1997, swetharepakula The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/assign @swetharepakula