upgrade and scaling in tikv at the same time will cause upgrade failure #2631

Closed
DanielZhangQD opened this issue on Jun 5, 2020 · 1 comment · Fixed by #2705
Labels: priority:P1, status/help-wanted (Extra attention is needed), status/WIP (Issue/PR is being worked on)
Milestone: v1.1.1

DanielZhangQD (Contributor) commented:

Bug Report

What version of Kubernetes are you using?

1.12.8
What version of TiDB Operator are you using?

v1.1.0
What storage classes exist in the Kubernetes cluster and what are used for PD/TiKV pods?

local-storage
What's the status of the TiDB cluster pods?

Running
What did you do?

Update spec.version from 3.1.1 to 3.1.2
Update spec.tikv.replicas from 6 to 3
What did you expect to see?
The cluster is upgraded successfully and TiKV is scaled in to 3 replicas.
What did you see instead?
tikv-5 is scaled in before the TiKV upgrade starts, and the upgrade hangs after tikv-5 is upgraded because tikv-5 can never become ready again: its store is already in the Offline state.
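
For illustration only, here is a minimal, self-contained sketch of the ordering guard this failure calls for (assumed simplified types and helper names; not the actual tidb-operator code and not necessarily what #2705 implements): a scale-in should not offline a store while an upgrade of the same component is pending, otherwise the upgrader later waits on a pod whose store can never come back Up.

```go
package main

import (
	"errors"
	"fmt"
)

var errRequeue = errors.New("requeue")

// tikvStatus is a stand-in for the controller's view of the TiKV component.
type tikvStatus struct {
	currentRevision string         // revision the pods are currently running
	updateRevision  string         // revision the spec asks for
	storeStates     map[int]string // ordinal -> "Up" | "Offline" | "Tombstone"
}

// scaleIn offlines the store of the highest ordinal, but only when no
// upgrade is pending. deleteStore stands in for the PD "delete store" call.
func scaleIn(st *tikvStatus, replicas int, deleteStore func(ordinal int) error) error {
	if st.currentRevision != st.updateRevision {
		// An upgrade is pending or in progress; postpone the scale-in.
		return fmt.Errorf("tikv is upgrading, can not scale in until upgrade completes: %w", errRequeue)
	}
	ordinal := replicas - 1
	if err := deleteStore(ordinal); err != nil {
		return err
	}
	st.storeStates[ordinal] = "Offline"
	return nil
}

func main() {
	st := &tikvStatus{
		currentRevision: "rev-1",
		updateRevision:  "rev-2", // spec.version was just bumped from 3.1.1 to 3.1.2
		storeStates:     map[int]string{0: "Up", 1: "Up", 2: "Up", 3: "Up", 4: "Up", 5: "Up"},
	}
	err := scaleIn(st, 6, func(int) error { return nil })
	fmt.Println(err) // the scale-in is requeued instead of offlining store tikv-5
}
```

The operator already logs a similar refusal in the trace below ("can not scale in until upgrade have completed"), but in this run the store deletion at 08:37:51 happened before the upgrade was detected, so the check fired too late.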

DanielZhangQD added this to the v1.1.1 milestone on Jun 5, 2020
DanielZhangQD (Contributor, Author) commented:

I0605 08:37:46.246529       1 stateful_set_control.go:81] TidbCluster: [dan106/dan106]'s StatefulSet: [dan106/dan106-pd] updated successfully
I0605 08:37:50.454438       1 tikv_scaler.go:89] scaling in tikv statefulset dan106/dan106-tikv, ordinal: 5 (replicas: 5, delete slots: [])
I0605 08:37:51.785155       1 tikv_scaler.go:114] tikv scale in: delete store 154 for tikv dan106/dan106-tikv-5 successfully
I0605 08:37:51.792941       1 tidbcluster_control.go:68] TidbCluster: [dan106/dan106] updated successfully
I0605 08:37:51.792974       1 tidb_cluster_controller.go:295] TidbCluster: dan106/dan106, still need sync: TiKV dan106/dan106-tikv-5 store 154  still in cluster, state: Up, requeuing
I0605 08:38:00.038326       1 utils.go:149] set dan106/dan106-pd partition to 1
I0605 08:38:00.038372       1 utils.go:149] set dan106/dan106-pd partition to 0
I0605 08:38:00.045450       1 stateful_set_control.go:81] TidbCluster: [dan106/dan106]'s StatefulSet: [dan106/dan106-pd] updated successfully
I0605 08:38:06.991063       1 tidbcluster_control.go:68] TidbCluster: [dan106/dan106] updated successfully
E0605 08:38:06.991123       1 tidb_cluster_controller.go:297] TidbCluster: dan106/dan106, sync failed Get http://dan106-pd.dan106:2379/pd/api/v1/stores: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers), requeuing
I0605 08:38:44.243548       1 stateful_set_control.go:81] TidbCluster: [dan106/dan106]'s StatefulSet: [dan106/dan106-pd] updated successfully
I0605 08:38:45.655040       1 tikv_scaler.go:84] the TidbCluster: [dan106/dan106]'s tikv is upgrading,can not scale in until upgrade have completed
I0605 08:38:45.664532       1 stateful_set_control.go:81] TidbCluster: [dan106/dan106]'s StatefulSet: [dan106/dan106-tikv] updated successfully
I0605 08:38:48.589849       1 pvc_control.go:90] update PVC: [dan106/pd-dan106-pd-0] successfully, TidbCluster: dan106
I0605 08:38:48.589909       1 pvc_cleaner.go:266] cluster dan106/dan106, clean pvc pd-dan106-pd-0 pod scheduling annotation successfully
I0605 08:38:48.601360       1 tidbcluster_control.go:68] TidbCluster: [dan106/dan106] updated successfully
I0605 08:38:57.233423       1 utils.go:149] set dan106/dan106-tikv partition to 3
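
To show why the upgrade then hangs, here is a minimal model of the partition-based rolling upgrade behind the "set ... partition" lines above (assumed simplified types; not tidb-operator's actual upgrader): the partition is only lowered once the pod at the current boundary has a store reporting Up, so a store left Offline by the earlier scale-in blocks progress indefinitely.

```go
package main

import "fmt"

// storeStates maps a TiKV ordinal to its PD store state.
type storeStates map[int]string // "Up" | "Offline" | "Tombstone"

// advancePartition is one reconcile step of the simplified model: pods with
// ordinal >= partition already run the new revision, and the partition may
// only be lowered once the pod at the boundary has its store back Up.
func advancePartition(partition int, states storeStates) (int, bool) {
	if partition == 0 {
		return 0, false // upgrade already complete
	}
	if states[partition] != "Up" {
		return partition, false // requeue; an Offline store never becomes Up again
	}
	return partition - 1, true
}

func main() {
	// tikv-5's store was deleted by the scale-in, so it stays Offline.
	states := storeStates{0: "Up", 1: "Up", 2: "Up", 3: "Up", 4: "Up", 5: "Offline"}
	partition := 5 // tikv-5 has just been recreated with the new image
	for i := 0; i < 3; i++ {
		var progressed bool
		partition, progressed = advancePartition(partition, states)
		fmt.Printf("partition=%d progressed=%v\n", partition, progressed) // stays stuck at 5
	}
}
```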
