Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

operator: fast-fail if the leader changed to the RemovePeer (#2530) #2551

Merged
merged 1 commit into from
Jun 18, 2020

Conversation

nolouch
Copy link
Contributor

@nolouch nolouch commented Jun 18, 2020

Signed-off-by: nolouch nolouch@gmail.com

What problem does this PR solve?

Fix #2493
If the leader changed cause by TiKV(maybe too busy or network problem), the operator cannot step to the final state, such as

PD:
try to remove peer 24
TiKV:
reject remove peer 24 because it is a leader.

Then the operator needs 10mins to timeout, which may cause the rule sync learner became slow.

What is changed and how it works?

Check List

Tests

  • Unit test

Release note

  • Let the operator fast fail if the leader changed to the RemovePeer

@nolouch nolouch added the priority/release-blocker This issue blocks a release. Please solve it ASAP. label Jun 18, 2020
@nolouch nolouch added this to the v4.0.2 milestone Jun 18, 2020
@nolouch nolouch requested review from rleungx and disksing June 18, 2020 07:58
@nolouch nolouch added the component/schedule Scheduling logic. label Jun 18, 2020
@codecov-commenter
Copy link

Codecov Report

Merging #2551 into release-4.0 will decrease coverage by 0.10%.
The diff coverage is 64.13%.

Impacted file tree graph

@@               Coverage Diff               @@
##           release-4.0    #2551      +/-   ##
===============================================
- Coverage        77.16%   77.06%   -0.11%     
===============================================
  Files              204      204              
  Lines            21940    22011      +71     
===============================================
+ Hits             16931    16963      +32     
- Misses            3724     3744      +20     
- Partials          1285     1304      +19     
Impacted Files Coverage Δ
pkg/mock/mockoption/mockoption.go 93.81% <33.33%> (-1.94%) ⬇️
server/schedule/operator/step.go 65.40% <37.50%> (-9.78%) ⬇️
server/schedule/operator_controller.go 84.07% <96.77%> (+3.23%) ⬆️
server/statistics/hot_peer_cache.go 92.61% <100.00%> (+0.26%) ⬆️
server/kv/etcd_kv.go 75.32% <0.00%> (-12.99%) ⬇️
server/region_syncer/client.go 80.15% <0.00%> (-4.59%) ⬇️
server/tso/tso.go 77.37% <0.00%> (-2.92%) ⬇️
pkg/etcdutil/etcdutil.go 88.40% <0.00%> (-2.90%) ⬇️
server/cluster/coordinator.go 74.52% <0.00%> (-2.20%) ⬇️
server/grpc_service.go 58.80% <0.00%> (-0.59%) ⬇️
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b81f851...cb86d13. Read the comment docs.

@HunDunDM
Copy link
Member

/merge

@ti-srebot ti-srebot added the status/can-merge Indicates a PR has been approved by a committer. label Jun 18, 2020
@ti-srebot
Copy link
Contributor

/run-all-tests

@ti-srebot
Copy link
Contributor

@nolouch merge failed.

@lhy1024
Copy link
Contributor

lhy1024 commented Jun 18, 2020

/merge

@ti-srebot
Copy link
Contributor

/run-all-tests

1 similar comment
@HunDunDM
Copy link
Member

/run-all-tests

@ti-srebot ti-srebot merged commit c459c25 into tikv:release-4.0 Jun 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/schedule Scheduling logic. priority/release-blocker This issue blocks a release. Please solve it ASAP. status/can-merge Indicates a PR has been approved by a committer.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants