Change condition of PHC to reduce amount of produced logs #3046

Merged
merged 1 commit into from
Apr 26, 2024

2 changes: 1 addition & 1 deletion proxy/healthy_endpoints.go
@@ -22,7 +22,7 @@ func (h *healthyEndpoints) filterHealthyEndpoints(ctx *context, endpoints []rout
filtered := make([]routing.LBEndpoint, 0, len(endpoints))
for _, e := range endpoints {
dropProbability := e.Metrics.HealthCheckDropProbability()
- if p < dropProbability {
+ if dropProbability > 0.05 && p < dropProbability {
Member

Hm, this does not look right: it changes not only the logging but also the logic that drops the endpoint.

We could rate-limit the log to once per second or minute and keep a drop counter in e.Metrics for logging, but maybe we should just remove the logs and rely on the passive-health-check.endpoints.dropped counter.

Member

And when the probability goes up, this will flood the logs anyway.
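To make the two points above concrete, here is a small sketch that compares the condition before and after the change. The function names `dropOld` and `dropNew` are hypothetical, and `p` is assumed to be a random draw in [0, 1); the surrounding code is not shown in the diff, so that is an assumption.

```go
package main

import "fmt"

// dropOld mirrors the condition before this change.
func dropOld(p, dropProbability float64) bool {
	return p < dropProbability
}

// dropNew mirrors the condition after this change.
func dropNew(p, dropProbability float64) bool {
	return dropProbability > 0.05 && p < dropProbability
}

func main() {
	p := 0.01 // assumed random draw in [0, 1)
	for _, dp := range []float64{0.03, 0.05, 0.30} {
		fmt.Printf("dropProbability=%.2f old=%t new=%t\n", dp, dropOld(p, dp), dropNew(p, dp))
	}
}
```

For drop probabilities at or below 0.05 the new condition never drops the endpoint, which is a behavior change and not only a logging change. Above 0.05 both conditions agree, so the number of log lines there is unchanged.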

Member @AlexanderYastrebov, Apr 26, 2024

The minimum probability cutoff (if we really want it) should be implemented in updateStats().
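One way this could look, as a sketch under stated assumptions: the body of `updateStats()` is not shown in this diff, so `clampDropProbability` below is a hypothetical helper that such a function might call before storing the value. The 0.05 cutoff is taken from the diff above.

```go
package main

import "fmt"

// minDropProbability is a hypothetical cutoff; 0.05 is the value used in the diff.
const minDropProbability = 0.05

// clampDropProbability is a hypothetical helper, not Skipper's updateStats().
// It zeroes out probabilities at or below the cutoff so that the filter can
// keep the plain `p < dropProbability` condition.
func clampDropProbability(dropProbability float64) float64 {
	if dropProbability <= minDropProbability {
		return 0
	}
	return dropProbability
}

func main() {
	for _, dp := range []float64{0.01, 0.05, 0.20} {
		fmt.Printf("raw=%.2f stored=%.2f\n", dp, clampDropProbability(dp))
	}
}
```

Applying the cutoff where the probability is computed would keep the policy in one place: the filter, the log line, and the counter would all see the same value.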

Member

As long as we have no metrics that are good enough, we keep the logs.
The idea was to cut off the minimum: there were a lot of logs for probabilities too close to 0 to be meaningful. If all endpoints have a low error rate, you just cycle through all of them, which is likely something we also want to address.

ctx.Logger().Infof("Dropping endpoint %q due to passive health check: p=%0.2f, dropProbability=%0.2f",
e.Host, p, dropProbability)
metrics.IncCounter("passive-health-check.endpoints.dropped")