-
Notifications
You must be signed in to change notification settings - Fork 351
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change condition of PHC to reduce amount of produced logs #3046
Conversation
👍 |
0acc8f2
to
c50f132
Compare
c50f132
to
dae713c
Compare
It is hardcoded to 5% now, but we plan to make it configurable. Signed-off-by: Roman Zavodskikh <roman.zavodskikh@zalando.de>
dae713c
to
62a3f33
Compare
👍 |
1 similar comment
👍 |
@@ -22,7 +22,7 @@ func (h *healthyEndpoints) filterHealthyEndpoints(ctx *context, endpoints []rout | |||
filtered := make([]routing.LBEndpoint, 0, len(endpoints)) | |||
for _, e := range endpoints { | |||
dropProbability := e.Metrics.HealthCheckDropProbability() | |||
if p < dropProbability { | |||
if dropProbability > 0.05 && p < dropProbability { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, this does not look right as it not only influences logging but a logic of dropping the endpoint.
We can enhance logging to log once a second/minute and also keep drop counter in the e.Metrics for logging but maybe we should just remove logs and rely on the passive-health-check.endpoints.dropped
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And when probability goes up this will flood logs anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The min probability cutoff (if we really want it) should be implemented in updateStats()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As long as we have no metrics that are good enough we let the logs.
The idea was that we cut the min. There was a lot of logs that were way too close to 0 to make sense. If you have all endpoints having a low error rate you just cycle through all of them, likely something we also want to address.
It is hardcoded to 5% now, but we plan to make it configurable. Signed-off-by: Roman Zavodskikh <roman.zavodskikh@zalando.de> Co-authored-by: Roman Zavodskikh <roman.zavodskikh@zalando.de>
Logging every request produces too many log lines.
However, we can still keep the logs of stats updater https://github.com/zalando/skipper/blob/v0.21.65/routing/endpointregistry.go#L172 because their amount does not depend on requests count.