*: log server-side /health checks #11704

gyuho · 2020-03-18T18:09:54Z

This makes it easier to root-cause /health check failures.
For example, we use a load balancer to health check
each etcd instance, and when one etcd node gets terminated,
it's hard to tell whether the etcd "server" was really failing
or the client (or load balancer) failed to reach the etcd cluster,
which also shows up as a failure in the load balancer health check.

{"level":"info","ts":"2020-03-18T10:56:05.206-0700","caller":"etcdhttp/metrics.go:65","msg":"/health OK","http-error-code":200}

gyuho · 2020-03-18T18:10:24Z

/cc @wenjiaswe @jingyih @spzala @hexfusion

This makes it easier to root-cause /health check failures. For example, we use a load balancer to health check each etcd instance, and when one etcd node gets terminated, it's hard to tell whether the etcd "server" was really failing or the client (or load balancer) failed to reach the etcd cluster, which also shows up as a failure in the load balancer health check.

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>

wenjiaswe · 2020-03-18T18:21:39Z

/lgtm thanks!

ref. #11704 Signed-off-by: Gyuho Lee <leegyuho@amazon.com>

… in debug level

When an external component checks /health periodically, the etcd server logs can become quite verbose (e.g., DDoS-ing the insecure etcd health check endpoint can fill the disk with large log files). The per-request health check logging was introduced in etcd-io#11704. While we keep the warning logs for etcd health check failures, the success (or OK) log level should be set to DEBUG.

Fixes etcd-io#12676

gyuho force-pushed the log-health branch from f45f0e5 to 92f180c Compare March 18, 2020 18:14

gyuho merged commit b50c92c into etcd-io:master Mar 18, 2020

gyuho deleted the log-health branch March 18, 2020 19:33

gyuho added a commit that referenced this pull request Mar 18, 2020

etcdserver/api/etcdhttp: log server-side /health checks

6373507

ref. #11704 Signed-off-by: Gyuho Lee <leegyuho@amazon.com>

gyuho added a commit that referenced this pull request Mar 18, 2020

etcdserver/api/etcdhttp: log server-side /health checks

8e01b77

ref. #11704 Signed-off-by: Gyuho Lee <leegyuho@amazon.com>

gyuho added a commit that referenced this pull request Mar 18, 2020

etcdserver/api/etcdhttp: log server-side /health checks

7a86aa3

ref. #11704 Signed-off-by: Gyuho Lee <leegyuho@amazon.com>

gyuho added a commit that referenced this pull request Mar 18, 2020

etcdserver/api/etcdhttp: log server-side /health checks

30aaceb

ref. #11704 Signed-off-by: Gyuho Lee <leegyuho@amazon.com>

gyuho added a commit that referenced this pull request Mar 18, 2020

etcdserver/api/etcdhttp: log server-side /health checks

0b9cfa8

ref. #11704 Signed-off-by: Gyuho Lee <leegyuho@amazon.com>

gyuho added a commit that referenced this pull request Mar 19, 2020

etcdserver/api/etcdhttp: log server-side /health checks

42d7490

ref. #11704 Signed-off-by: Gyuho Lee <leegyuho@amazon.com>

gyuho mentioned this pull request Feb 9, 2021

Reduce verbosity of etcd server-side health check logs #12676

Closed

chaochn47 mentioned this pull request Feb 9, 2021

etcdserver/api/etcdhttp: log successful etcd server side health check in debug level #12677

Merged

hexfusion mentioned this pull request May 11, 2021

Bug 1958405: UPSTREAM: <carry>: *: log server-side /health checks openshift/etcd#79

Merged
