Improve robustness of Backup/Restore involving external_storage (#7917) #7965
Signed-off-by: sre-bot <sre-bot@pingcap.com>
/run-all-tests
Please fix conflicts
7750c03 to 6559431
Signed-off-by: kennytm <kennytm@gmail.com>
6559431 to 1a39666
/run-integration-ddl-test /run-integration-compatibility-test
/merge
Your auto merge job has been accepted, waiting for:
/run-integration-ddl-test
/run-all-tests
@sre-bot merge failed.
/run-all-tests
cherry-pick #7917 to release-3.1
What problem does this PR solve?
Issue Number: close #7375, close #7850, close #7880, close tidb-challenge-program/bug-hunting-issue#72
Problem Summary:
Backup/restore had neither a timeout nor a retry. As a result, a single spurious network error could either catastrophically kill the entire BR task or block the BR thread pool forever.
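The retry half of the problem can be illustrated with a minimal sketch. This is not the code from the PR: `retry_with_backoff`, its parameters, and the doubling delay are hypothetical choices made for illustration, and the sketch is synchronous and uses only the standard library, whereas the real storage layer may well be asynchronous.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical helper (not taken from the PR): calls `op` up to
/// `max_attempts` times and doubles the delay after each failure.
/// `op` receives the 1-based attempt number.
pub fn retry_with_backoff<T, E, F>(
    max_attempts: u32,
    initial_delay: Duration,
    mut op: F,
) -> Result<T, E>
where
    F: FnMut(u32) -> Result<T, E>,
{
    assert!(max_attempts > 0, "need at least one attempt");
    let mut delay = initial_delay;
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: hand the last error back to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                // Treat the failure as transient: wait, then try again.
                sleep(delay);
                delay = delay.saturating_mul(2);
                attempt += 1;
            }
        }
    }
}
```

With a wrapper like this, one spurious error costs a short delay instead of failing the whole task. A production version would presumably also distinguish retryable errors from permanent ones.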
What is changed and how it works?
What's Changed:
[…] (ExternalStorage::write in particular) to fix #7850 ("AWS br backup failed when collation is open").
[…] tame-gcs does not support resumable upload (EmbarkStudios/tame-gcs#24, "Support resumable upload"), so this won't work for GCS. (This also fixes #7375, "external_storage: if S3 upload cannot finish within 15 minutes it will fail".)
Replaced reqwest by hyper, so convert_request and convert_response are no longer needed.
Added a timeout to ExternalStorage::read, so that a stuck download won't block the thread pool forever, to fix #7880 ("After backup/restore stuck, new backup cannot work until all tikv are deleted").
Related changes
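To illustrate the ExternalStorage::read timeout described in the list above, here is a minimal standard-library sketch of bounding a read with a deadline. `read_with_timeout` is a hypothetical name, and running the read on a helper thread is an assumption made to keep the example self-contained; it is not how the PR implements the timeout.

```rust
use std::io::{self, Read};
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical helper (not taken from the PR): reads all of `reader`
/// on a helper thread and stops waiting once `timeout` has elapsed.
pub fn read_with_timeout<R>(mut reader: R, timeout: Duration) -> io::Result<Vec<u8>>
where
    R: Read + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let mut buf = Vec::new();
        let outcome = reader.read_to_end(&mut buf);
        // The receiver may already have timed out and been dropped,
        // so a failed send is ignored.
        let _ = tx.send(outcome.map(|_| buf));
    });
    match rx.recv_timeout(timeout) {
        Ok(result) => result,
        Err(mpsc::RecvTimeoutError::Timeout) => Err(io::Error::new(
            io::ErrorKind::TimedOut,
            "read did not finish in time",
        )),
        Err(mpsc::RecvTimeoutError::Disconnected) => Err(io::Error::new(
            io::ErrorKind::Other,
            "reader thread exited without a result",
        )),
    }
}
```

Note the limitation: the caller is released after `timeout`, but the helper thread stays blocked until the underlying read returns. In an asynchronous implementation the pending read could be dropped or cancelled instead.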
Check List
Tests
[…] minio S3 storage.
Side effects
Release note