Enhance the detection mechanism for the unhealthy etcd node #7730

Closed
JmPotato opened this issue Jan 18, 2024 · 2 comments
Labels
component/election Election related logic. report/customer Customers have encountered this bug. type/enhancement The issue or PR belongs to an enhancement.

Comments

@JmPotato
Member

Part of #7499.

In various tests, we always observe continuous unavailability when injecting IO latency and other chaos into the PD leader node, as in #6291. Upon further investigation of the logs, we discovered that the detection and eviction of unhealthy nodes are not always accurate. As a result, problematic etcd nodes can persistently impact our requests because of the round-robin balancer used by the etcd client. Especially during a leader switch, this problem can prevent the PD leader from stabilizing and prolong the election, which significantly affects availability.

2024-01-17 12:42:26 | {"level":"INFO","namespace":"endless-ha-test-tps-6150264-1-163","pod":"tc-pd-1","container":"pd","log":"[etcdutil.go:314] [\"update endpoints\"] [num-change=3->2] [last-endpoints=\"[http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-1.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"] [endpoints=\"[http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"]"} | 
2024-01-17 12:41:33 | {"pod":"tc-pd-0","container":"pd","level":"INFO","log":"[etcdutil.go:314] [\"update endpoints\"] [num-change=2->3] [last-endpoints=\"[http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"] [endpoints=\"[http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-1.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"]","namespace":"endless-ha-test-tps-6150264-1-163"} |  
2024-01-17 12:41:33 | {"pod":"tc-pd-0","container":"pd","level":"INFO","log":"[etcdutil.go:314] [\"update endpoints\"] [num-change=2->3] [last-endpoints=\"[http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"] [endpoints=\"[http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-1.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"]","namespace":"endless-ha-test-tps-6150264-1-163"} |  
2024-01-17 12:41:31 | {"pod":"tc-pd-0","container":"pd","level":"INFO","log":"[etcdutil.go:314] [\"update endpoints\"] [num-change=3->2] [last-endpoints=\"[http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-1.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"] [endpoints=\"[http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"]","namespace":"endless-ha-test-tps-6150264-1-163"} |  
2024-01-17 12:41:31 | {"pod":"tc-pd-0","container":"pd","level":"INFO","log":"[etcdutil.go:314] [\"update endpoints\"] [num-change=3->2] [last-endpoints=\"[http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-1.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"] [endpoints=\"[http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"]","namespace":"endless-ha-test-tps-6150264-1-163"} |  
2024-01-17 12:40:47 | {"level":"INFO","pod":"tc-pd-2","container":"pd","log":"[etcdutil.go:314] [\"update endpoints\"] [num-change=2->3] [last-endpoints=\"[http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"] [endpoints=\"[http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-1.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"]","namespace":"endless-ha-test-tps-6150264-1-163"} |  
2024-01-17 12:40:47 | {"level":"INFO","namespace":"endless-ha-test-tps-6150264-1-163","pod":"tc-pd-1","container":"pd","log":"[etcdutil.go:314] [\"update endpoints\"] [num-change=2->3] [last-endpoints=\"[http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"] [endpoints=\"[http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-1.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"]"} |  
2024-01-17 12:40:46 | {"level":"INFO","namespace":"endless-ha-test-tps-6150264-1-163","pod":"tc-pd-1","container":"pd","log":"[etcdutil.go:314] [\"update endpoints\"] [num-change=3->2] [last-endpoints=\"[http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-1.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"] [endpoints=\"[http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"]"} | 
2024-01-17 12:40:45 | {"level":"INFO","pod":"tc-pd-2","container":"pd","log":"[etcdutil.go:314] [\"update endpoints\"] [num-change=3->2] [last-endpoints=\"[http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-1.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"] [endpoints=\"[http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"]","namespace":"endless-ha-test-tps-6150264-1-163"} |  
2024-01-17 12:40:45 | {"level":"INFO","pod":"tc-pd-2","container":"pd","log":"[etcdutil.go:314] [\"update endpoints\"] [num-change=3->2] [last-endpoints=\"[http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-1.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"] [endpoints=\"[http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"]","namespace":"endless-ha-test-tps-6150264-1-163"} |  
2024-01-17 12:40:39 | {"pod":"tc-pd-0","container":"pd","level":"INFO","log":"[etcdutil.go:314] [\"update endpoints\"] [num-change=2->3] [last-endpoints=\"[http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"] [endpoints=\"[http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-1.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"]","namespace":"endless-ha-test-tps-6150264-1-163"} |  
2024-01-17 12:40:39 | {"pod":"tc-pd-0","container":"pd","level":"INFO","log":"[etcdutil.go:314] [\"update endpoints\"] [num-change=2->3] [last-endpoints=\"[http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"] [endpoints=\"[http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-1.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"]","namespace":"endless-ha-test-tps-6150264-1-163"} |  
2024-01-17 12:40:31 | {"pod":"tc-pd-0","container":"pd","level":"INFO","log":"[etcdutil.go:314] [\"update endpoints\"] [num-change=3->2] [last-endpoints=\"[http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-1.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"] [endpoints=\"[http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"]","namespace":"endless-ha-test-tps-6150264-1-163"} |  
2024-01-17 12:40:31 | {"pod":"tc-pd-0","container":"pd","level":"INFO","log":"[etcdutil.go:314] [\"update endpoints\"] [num-change=3->2] [last-endpoints=\"[http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-1.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"] [endpoints=\"[http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"]","namespace":"endless-ha-test-tps-6150264-1-163"} |  
2024-01-17 12:39:27 | {"pod":"tc-pd-0","container":"pd","level":"INFO","log":"[etcdutil.go:314] [\"update endpoints\"] [num-change=2->3] [last-endpoints=\"[http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"] [endpoints=\"[http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-1.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"]","namespace":"endless-ha-test-tps-6150264-1-163"} |  
2024-01-17 12:39:27 | {"pod":"tc-pd-0","container":"pd","level":"INFO","log":"[etcdutil.go:314] [\"update endpoints\"] [num-change=2->3] [last-endpoints=\"[http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"] [endpoints=\"[http://tc-pd-0.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-2.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379,http://tc-pd-1.tc-pd-peer.endless-ha-test-tps-6150264-1-163.svc:2379]\"]","namespace":"endless-ha-test-tps-6150264-1-163"}

For now, we just use 10 seconds as the timeout for the health check, which is too loose in some cases. We need a more precise detection mechanism that promptly removes an unhealthy etcd node from the available endpoints and prevents it from rejoining before it has truly recovered.

@JmPotato JmPotato added type/enhancement The issue or PR belongs to an enhancement. component/election Election related logic. labels Jan 18, 2024
ti-chi-bot bot pushed a commit that referenced this issue Jan 22, 2024
ref #7730

Move the health checker into a separate file.

Signed-off-by: JmPotato <ghzpotato@gmail.com>
ti-chi-bot bot pushed a commit that referenced this issue Jan 30, 2024
…#7737)

ref #7730

Consider the latency while patrolling the healthy endpoints to reduce the effect of slow nodes.
Now, there are the following strategies to select and remove unhealthy endpoints:

- Choose only the healthy endpoints within the lowest acceptable latency range.
- An evicted endpoint can rejoin only after it has been selected again three consecutive times.

Signed-off-by: JmPotato <ghzpotato@gmail.com>
ti-chi-bot bot pushed a commit that referenced this issue Jan 31, 2024
ref #7499, ref #7730

Return the originally picked endpoints directly if all are evicted to gain better availability.
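
A minimal sketch of this fallback, assuming the evicted endpoints are kept in a set; `filterEvicted` is a hypothetical helper name, not the function from the commit:

```go
package main

// filterEvicted drops evicted endpoints from the picked list. If every
// picked endpoint is evicted, it returns the picked list unchanged so the
// client is never left with zero endpoints.
func filterEvicted(picked []string, evicted map[string]struct{}) []string {
	kept := make([]string, 0, len(picked))
	for _, ep := range picked {
		if _, isEvicted := evicted[ep]; !isEvicted {
			kept = append(kept, ep)
		}
	}
	if len(kept) == 0 {
		return picked // all evicted: fall back to the original picks
	}
	return kept
}
```

The trade-off is that a possibly degraded endpoint is preferred over having no endpoint at all.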

Signed-off-by: JmPotato <ghzpotato@gmail.com>
@JmPotato
Member Author

JmPotato commented Feb 6, 2024

Close with #7737.

@JmPotato JmPotato closed this as completed Feb 6, 2024
@seiya-annie

/found customer

@ti-chi-bot ti-chi-bot bot added the report/customer Customers have encountered this bug. label Jun 4, 2024