*: log server-side /health checks #11704

Merged
merged 1 commit into from
Mar 18, 2020


@gyuho gyuho commented Mar 18, 2020

To make it easier to root-cause /health check failures.
For example, we use a load balancer to health-check
each etcd instance, and when one etcd node gets terminated,
it's hard to tell whether the etcd "server" was really failing
or the client (or load balancer) failed to reach the etcd cluster,
which also shows up as a failed load balancer health check.

{"level":"info","ts":"2020-03-18T10:56:05.206-0700","caller":"etcdhttp/metrics.go:65","msg":"/health OK","http-error-code":200}

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
@wenjiaswe

/lgtm thanks!

@gyuho gyuho merged commit b50c92c into etcd-io:master Mar 18, 2020
@gyuho gyuho deleted the log-health branch March 18, 2020 19:33
gyuho added a commit that referenced this pull request Mar 18, 2020
ref.
#11704

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
gyuho added a commit that referenced this pull request Mar 19, 2020
ref.
#11704

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
chaochn47 added a commit to chaochn47/etcd that referenced this pull request Feb 9, 2021
… in debug level

When an external component checks /health periodically, the
etcd server logs can become quite verbose (e.g., DDoS-ing an insecure
etcd health check can fill the disk with large log files).

This change was introduced in etcd-io#11704.

While we keep the warning logs for etcd health check failures, the
success (or OK) log level should be set to DEBUG.

Fixes etcd-io#12676
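
The level split described in this commit message can be sketched as follows. This is an illustration using the Go standard library `log/slog` (Go 1.21+), not the actual etcd change; `levelFor`, `visibleAtInfo`, and `logHealth` are hypothetical names, and the "/health error" message is an assumption.

```go
package main

import (
	"context"
	"log/slog"
	"net/http"
)

// levelFor picks the log level for a /health response:
// DEBUG for success, WARN for anything else.
func levelFor(code int) slog.Level {
	if code == http.StatusOK {
		return slog.LevelDebug
	}
	return slog.LevelWarn
}

// visibleAtInfo reports whether a response with this status code would be
// written by a logger whose minimum level is INFO.
func visibleAtInfo(code int) bool {
	return levelFor(code) >= slog.LevelInfo
}

// logHealth emits one record per health check at the chosen level.
func logHealth(lg *slog.Logger, code int) {
	msg := "/health OK"
	if code != http.StatusOK {
		msg = "/health error" // assumed message for the failure case
	}
	lg.Log(context.Background(), levelFor(code), msg, "http-error-code", code)
}
```

With a logger configured at INFO, frequent successful probes produce no output, while failures are still logged as warnings.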