store/copr: use a ttl duration to protect a new recovered tiflash node from processing mpp tasks. #26793
add ttl for store fail time; set var; tiny fix; cancel retry; make retry time more accurate; fix tiny bug; fix bug; change default wait time
see hanfei1991#16
* amendment for readability.
* refine name
* Update mpp.go
some tiny change suggested by qizhi
LGTM
@LittleFall: Thanks for your review. The bot only counts approvals from reviewers and higher roles in list, but you're still welcome to leave your comments.
LGTM
LGTM
@windtalker: Thanks for your review. The bot only counts approvals from reviewers and higher roles in list, but you're still welcome to leave your comments.
/merge
LGTM
/merge
This pull request has been accepted and is ready to merge. Commit hash: e907210
/run-build
/run-check_dev_2
/run-build
/run-unit-test
cherry pick to release-5.1 failed
…e from processing mpp tasks. (pingcap#26793)
What problem does this PR solve?
Problem Summary:
Right now, we use an "mpp.IsAlive" RPC request to check whether a TiFlash node is available. However, in some cases, such as when the network service has been inactive for a while, the TiFlash node cannot recover promptly. Consequently, even if a TiFlash node responds to "IsAlive", that does not mean it can serve requests at once. There is a period during which the node processes MPP tasks very slowly.
Until TiFlash can tell precisely whether it is able to provide service, we introduce a workaround that avoids sending queries to a just-recovered TiFlash node for a period of time. We add a variable "tidb_mpp_fail_store_ttl", which specifies how long a just-recovered TiFlash node is treated as unavailable.
What is changed and how it works?
Check List
Tests
Side effects
Documentation
Release note