cluster: updateTopology maybe hang forever #1144

Merged

Contributor

@9547 9547 commented Feb 21, 2021

What problem does this PR solve?

fix #333

What is changed and how it works?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Code changes

  • Has exported function/method change
  • Has exported variable/fields change
  • Has interface methods change
  • Has persistent data change

Side effects

  • Possible performance regression
  • Increased code complexity
  • Breaking backward compatibility

Related changes

  • Need to cherry-pick to the release branch
  • Need to update the documentation

Release notes:

Fixed an issue where updating the topology could hang forever.

@ti-chi-bot ti-chi-bot requested a review from nrc February 21, 2021 17:09
@ti-chi-bot ti-chi-bot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Feb 21, 2021
@codecov-io

codecov-io commented Feb 21, 2021

Codecov Report

Merging #1144 (36d2c41) into master (30b7746) will decrease coverage by 12.68%.
The diff coverage is 100.00%.


@@             Coverage Diff             @@
##           master    #1144       +/-   ##
===========================================
- Coverage   53.66%   40.98%   -12.69%     
===========================================
  Files         285      284        -1     
  Lines       20317    20285       -32     
===========================================
- Hits        10904     8313     -2591     
- Misses       7748    10692     +2944     
+ Partials     1665     1280      -385     
Flag         Coverage Δ
cluster      31.17% <100.00%> (-13.92%) ⬇️
dm           ?
integrate    34.49% <100.00%> (-13.54%) ⬇️
playground   2.93% <ø> (ø)
tiup         16.36% <ø> (ø)
unittest     22.89% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown.

Impacted Files                            Coverage Δ
pkg/cluster/task/update_topology.go       72.22% <100.00%> (+1.06%) ⬆️
components/dm/main.go                     0.00% <0.00%> (-100.00%) ⬇️
components/dm/spec/bindversion.go         0.00% <0.00%> (-100.00%) ⬇️
pkg/cluster/template/config/config.go     0.00% <0.00%> (-100.00%) ⬇️
pkg/cluster/template/scripts/scripts.go   0.00% <0.00%> (-100.00%) ⬇️
components/dm/spec/cluster.go             0.00% <0.00%> (-87.50%) ⬇️
pkg/queue/any_queue.go                    0.00% <0.00%> (-83.34%) ⬇️
pkg/cluster/template/scripts/tiflash.go   0.00% <0.00%> (-79.60%) ⬇️
components/dm/spec/logic.go               0.60% <0.00%> (-79.27%) ⬇️
pkg/cluster/manager/reload.go             0.00% <0.00%> (-77.56%) ⬇️
... and 109 more


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 30b7746...36d2c41.

@ti-chi-bot
Member

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • AstroProfundis

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by writing /lgtm in a comment.
Reviewer can cancel approval by writing /lgtm cancel in a comment.

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Feb 22, 2021
@AstroProfundis
Contributor

AstroProfundis commented Feb 22, 2021

A timeout is good, but I think we should also refine the process of checking PD status before updating the topology, and fail there. (I believe we should have a timeout set for HTTP clients.)

@AstroProfundis
Contributor

/merge

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: 36d2c41

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Feb 22, 2021
@ti-chi-bot ti-chi-bot merged commit ffed4ff into pingcap:master Feb 22, 2021
@AstroProfundis AstroProfundis added this to the v1.3.3 milestone Mar 1, 2021
lucklove pushed a commit to lucklove/tiup that referenced this pull request Mar 4, 2021
* fix(cluster): UpdateTopology maybe hangs forever

* tests: add prometheus,grafana start testcase
@9547 9547 deleted the fix/update-topology-maybe-hang-forever branch April 6, 2021 23:05
Labels
size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
status/can-merge Indicates a PR has been approved by a committer.
status/LGT1 Indicates that a PR has LGTM 1.
Development

Successfully merging this pull request may close these issues.

Add timeout when updatetopology
5 participants