forked from vitessio/vitess
-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Backport 14693 - Use GetTabletsByCell in healthcheck #514
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This backports upstram PR vitessio#14693, with a few minor changes to make it work with the Go version we are using and a small change to topology_watcher.go so that test cases reflect and test for the same behavior as the upstream code. The description of the original PR follows: VTGate's healthcheck module currently calls GetTablet for each tablet alias that it discovers in a cell. Instead we can use GetTabletsForCell to fetch all tablets for a cell at once. This PR does a few more things: * GetTabletsForCell now handles the case where the response size violates gRPC limits by falling back to one tablet at a time in case of error. * Previously, the one tablet at a time method had unlimited concurrency. In this PR we introduce a configuration option for concurrency. * We pass topoReadConcurrency from healthcheck into GetTabletsForCell. * The behavior of --refresh_known_tablets flag is different now. Previously we would not read those tablets at all, now we do read them, but ignore any changes if they are already known. The basic fix has already been tried in production and shown to reduce the number of Get calls from vtgate -> topo from O(n) to O(1). We can consider deprecating and deleting --refresh_known_tablets in a future release. The concerns that originally motivated adding that flag in vitessio#3965 are alleviated by fetching all tablets in one call to the topo.
timvaillancourt
added
Component: VTGate
Component: VTTablet
Component: Query Serving
Type: Performance
upstream-backport
An upstream backport
v15
v19-backport
labels
Sep 18, 2024
makinje16
approved these changes
Sep 26, 2024
tanjinx
approved these changes
Oct 1, 2024
timvaillancourt
approved these changes
Oct 2, 2024
ejortegau
added a commit
that referenced
this pull request
Oct 10, 2024
vitessio#16871 added filtering in the health check, but later #514 undid that. So aiming at fixing it here. With this change, essentially, the filters passed in NewHealthCheck() get actually applied to the topology watchers, instead of it using a newly instantiated, empty filter as it's currently happening. Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>
Merged
timvaillancourt
added a commit
that referenced
this pull request
Oct 10, 2024
* Fix healthcheck filters vitessio#16871 added filtering in the health check, but later #514 undid that. So aiming at fixing it here. With this change, essentially, the filters passed in NewHealthCheck() get actually applied to the topology watchers, instead of it using a newly instantiated, empty filter as it's currently happening. Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com> * Improve tests by getting them from vitessio#16871 Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com> --------- Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com> Co-authored-by: Tim Vaillancourt <tim@timvaillancourt.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This backports upstream PR vitessio#14693, with a few minor changes to make it work with the Go
version we are using and a small change to topology_watcher.go so that test cases reflect
and test for the same behavior as the upstream code. The description of the original PR
follows:
VTGate's healthcheck module currently calls GetTablet for each tablet alias that it discovers
in a cell. Instead we can use GetTabletsForCell to fetch all tablets for a cell at once.
This PR does a few more things:
falling back to one tablet at a time in case of error.
introduce a configuration option for concurrency.
read those tablets at all, now we do read them, but ignore any changes if they are
already known.
The basic fix has already been tried in production and shown to reduce the number of Get
calls from vtgate -> topo from O(n) to O(1).
We can consider deprecating and deleting --refresh_known_tablets in a future release.
The concerns that originally motivated adding that flag in vitessio#3965 are alleviated by fetching
all tablets in one call to the topo.