Overview of the Issue
We found a case where the topology watcher and the health check can deadlock. These two pieces of logic interact:
https://github.com/vitessio/vitess/blob/release-20.0/go/vt/discovery/topology_watcher.go#L126-L130
https://github.com/vitessio/vitess/blob/release-20.0/go/vt/discovery/healthcheck.go#L550
loadTablets() in the topology watcher can call into healthcheck methods that need to acquire its mutex, and then the updateHealth method can get stuck holding that same mutex waiting for a chan send on the loadTabletsTrigger that the topology watcher goroutine is supposed to be draining.
Here are the stack traces for the deadlock:
This problem has existed since the trigger logic was introduced in #12893.
Reproduction Steps
Run Vitess health check and topology watcher and hope to see the deadlock, or write a unit test :3!
Binary Version
Operating System and Environment details
Log Fragments
No response