Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cli(ticdc): Allow overwriting checkpoint ts when resuming a changefeed #6020

Conversation

zhaoxinyu
Copy link
Contributor

@zhaoxinyu zhaoxinyu commented Jun 24, 2022

What problem does this PR solve?

Issue Number: close #5684 #6021

What is changed and how it works?

  1. Adding a new cli cmd parameter --overwrite-checkpoint-ts in changefeed resume, which allows users to specify a new changefeed checkpoint ts in the range of (GCSafeTime, CurrentTSO]. By this way, a changefeed in failed state can be resumed with an overwritten checkpoint ts instead of removing and then recreating a changefeed.
  2. In owner(ticdc): Add backoff mechanism into changefeed restart logic (#4262) #4337, we add a defaultBackoffMaxElapsedTime to control the total retry time of changefeed. If that time elapses, the changefeed will enter into failed state. However, some users think that it's not a good idea because gc-ttl is still in valid range. So in this PR, we remove defaultBackoffMaxElapsedTime and allow changefeed retry continuously if gc-ttl is in that valid range.

Check List

Tests

  • Unit test
  • Integration test

Questions

Will it cause performance regression or break compatibility?

No

Do you need to update user documentation, design documentation or monitoring documentation?

Yes

Release note

None

@ti-chi-bot
Copy link
Member

ti-chi-bot commented Jun 24, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • asddongmen
  • sdojjy

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jun 24, 2022
@zhaoxinyu
Copy link
Contributor Author

/cc @asddongmen
/cc @sdojjy
/cc @overvenus

@@ -120,7 +116,7 @@ func (m *feedStateManager) shiftStateWindow(state model.FeedState) {
m.stateHistory[defaultStateWindowSize-1] = state
}

func (m *feedStateManager) Tick(state *orchestrator.ChangefeedReactorState) {
func (m *feedStateManager) Tick(state *orchestrator.ChangefeedReactorState) (adminJobPending bool) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is a good idea to return a value in the tick method ?

@sdojjy
Copy link
Member

sdojjy commented Jun 24, 2022

/run-integration-tests

cdc/api/v2/changefeed.go Outdated Show resolved Hide resolved
@zhaoxinyu
Copy link
Contributor Author

A failed unit test is under fixing.

@zhaoxinyu
Copy link
Contributor Author

/cc @liuzix

@ti-chi-bot ti-chi-bot requested a review from liuzix June 27, 2022 07:25
cdc/owner/changefeed.go Show resolved Hide resolved
cdc/scheduler/internal/base/owner_scheduler.go Outdated Show resolved Hide resolved
pkg/cmd/cli/cli_changefeed_resume.go Outdated Show resolved Hide resolved
@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 27, 2022
@zhaoxinyu zhaoxinyu changed the title cli(ticdc): Allow overwrite checkpoint ts when resuming a changefeed cli(ticdc): Allow overwriting checkpoint ts when resuming a changefeed Jun 27, 2022
@zhaoxinyu zhaoxinyu force-pushed the optimize_changefeed_resume_experience branch from 7b49cf7 to 9b51968 Compare June 27, 2022 09:48
@ti-chi-bot ti-chi-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 27, 2022
@zhaoxinyu
Copy link
Contributor Author

/run-integration-tests
/run-unit-tests

@zhaoxinyu
Copy link
Contributor Author

/run-unit-tests

@zhaoxinyu zhaoxinyu requested review from sdojjy and liuzix June 28, 2022 02:07
@zhaoxinyu
Copy link
Contributor Author

/hold
There remains some code conflicts to be resolved.

@ti-chi-bot ti-chi-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 28, 2022
@zhaoxinyu
Copy link
Contributor Author

/unhold

@ti-chi-bot ti-chi-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 28, 2022
@zhaoxinyu
Copy link
Contributor Author

/run-integration-tests

@zhaoxinyu
Copy link
Contributor Author

/cc @crelax

@ti-chi-bot ti-chi-bot requested a review from crelax June 29, 2022 02:31
@zhaoxinyu
Copy link
Contributor Author

/run-verify

@codecov-commenter
Copy link

Codecov Report

Merging #6020 (4c1714f) into cli-use-open-api (6a92960) will decrease coverage by 6.4267%.
The diff coverage is n/a.

Flag Coverage Δ
cdc ?
dm 51.9889% <ø> (+0.0685%) ⬆️
engine ?

Flags with carried forward coverage won't be shown. Click here to find out more.

@@                   Coverage Diff                    @@
##           cli-use-open-api      #6020        +/-   ##
========================================================
- Coverage           58.4156%   51.9889%   -6.4267%     
========================================================
  Files                   708        240       -468     
  Lines                 83743      39393     -44350     
========================================================
- Hits                  48919      20480     -28439     
+ Misses                30373      16810     -13563     
+ Partials               4451       2103      -2348     

@zhaoxinyu
Copy link
Contributor Author

/run-integration-tests

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Jun 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants