This repository has been archived by the owner on Nov 24, 2023. It is now read-only.

*: use dumpling's finish building connection location to leave safe mode #915

Merged
lance6716 merged 17 commits into pingcap:master from safe-mode on Sep 3, 2020

Collaborator

@lance6716 lance6716 commented Aug 21, 2020

What problem does this PR solve?

close #907

What is changed and how it works?

When the dump unit cannot guarantee consistency (for example, when migrating from Aurora, or in other scenarios we may support later to avoid FTWRL), binlog events written between the start and the end of dumpling's connection creation may duplicate data that is already in the dump result, so safe mode should be enabled for that period. Previously, safe mode was simply turned on for 5 minutes; this PR ensures that safe mode stays on for long enough.
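The rule described above can be sketched as a comparison between the syncer's current binlog location and the exit location recorded by the dump. This is a minimal illustration, not DM's actual code: `Location`, `SafeModeGate`, and `Enabled` are hypothetical names, and binlog file names are compared as plain strings, which assumes a shared prefix and an equal-width sequence suffix.

```go
package main

import "fmt"

// Location is a simplified, hypothetical binlog position.
type Location struct {
	Name string // binlog file name, e.g. "mysql-bin.000003"
	Pos  uint32 // byte offset inside that file
}

// Compare returns -1, 0 or 1 when l is before, equal to or after o.
// File names are compared as strings, which is only valid when they share
// a prefix and use an equal-width sequence suffix (an assumption here).
func (l Location) Compare(o Location) int {
	switch {
	case l.Name < o.Name:
		return -1
	case l.Name > o.Name:
		return 1
	case l.Pos < o.Pos:
		return -1
	case l.Pos > o.Pos:
		return 1
	}
	return 0
}

// SafeModeGate decides whether safe mode is still needed.
type SafeModeGate struct {
	// Exit is nil when the dump was consistent and no exit location exists.
	Exit *Location
}

// Enabled reports whether an event at current may duplicate dumped data,
// that is, whether current is still before the exit location.
func (g SafeModeGate) Enabled(current Location) bool {
	return g.Exit != nil && current.Compare(*g.Exit) < 0
}

func main() {
	exitLoc := Location{Name: "mysql-bin.000003", Pos: 500}
	gate := SafeModeGate{Exit: &exitLoc}
	for _, cur := range []Location{
		{Name: "mysql-bin.000003", Pos: 120},
		{Name: "mysql-bin.000003", Pos: 500},
		{Name: "mysql-bin.000004", Pos: 4},
	} {
		fmt.Printf("%s:%d safe mode=%v\n", cur.Name, cur.Pos, gate.Enabled(cur))
	}
}
```

Tying the exit to a location rather than to a fixed timer means a slow dump cannot outlive the safe mode window.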

We should probably also save the exit location to the checkpoint table, to handle a worker switch that happens before safe mode is exited. That will be done in another PR, which will also cover the update strategy.

Check List

Tests

  • Unit test
  • Integration test

Code changes

Side effects

  • Increased code complexity

Related changes

  • Need to cherry-pick to the release branch
  • Need to update the documentation
  • Need to be included in the release note

@lance6716 lance6716 added needs-cherry-pick-release-1.0 This PR should be cherry-picked to release-1.0. Remove this label after cherry-picked to release-1.0 needs-cherry-pick-release-2.0 This PR should be cherry-picked to release-2.0. Remove this label after cherry-picked to release-2.0 needs-update-docs Should update docs after this PR is merged. Remove this label once the docs are updated needs-update-release-note This PR should be added into release notes. Remove this label once the release notes are updated priority/normal Minor change, requires approval from ≥1 primary reviewer status/DNM Do not merge, test is failing or blocked by another PR labels Aug 21, 2020
@lance6716 lance6716 changed the title [DNM]*: use dump unit exit location to leave safe mode *: use dump unit exit location to leave safe mode Aug 21, 2020
@lance6716 lance6716 added status/WIP This PR is still work in progress and removed status/DNM Do not merge, test is failing or blocked by another PR labels Aug 21, 2020
@lance6716 lance6716 added status/PTAL This PR is ready for review. Add this label back after committing new changes and removed status/WIP This PR is still work in progress labels Aug 24, 2020
@lance6716
Collaborator Author

/run-all-tests

@lance6716
Collaborator Author

Waiting for dumpling to record the exit binlog location in its metadata files.

@GMHDBJD GMHDBJD added status/WIP This PR is still work in progress and removed status/PTAL This PR is ready for review. Add this label back after committing new changes labels Aug 28, 2020
@lance6716 lance6716 removed the needs-cherry-pick-release-1.0 This PR should be cherry-picked to release-1.0. Remove this label after cherry-picked to release-1.0 label Aug 31, 2020
@@ -0,0 +1,181 @@
// Copyright 2019 PingCAP, Inc.
Collaborator Author

@lance6716 lance6716 Aug 31, 2020

Renamed from pkg/utils/mydumper.go to avoid an import cycle.

Member

@csuzhangxc csuzhangxc left a comment

maybe we should save exit location to checkpoint table, to handle worker switch before exit safe mode.

the case:

  1. start task with --consistency none and enter sync stage
  2. flush checkpoint for sync unit
  3. pause-task
  4. resume-task

Should we save the exit location to the checkpoint table?
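The pause/resume case above can be illustrated with a small sketch in which the exit location travels with the flushed checkpoint, so a resumed task can tell whether it is still inside the safe mode window. Everything here is hypothetical: `Checkpoint`, `Flush`, `Resume`, and `NeedSafeMode` are illustrative names, and JSON bytes stand in for a row in the downstream checkpoint table.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Location is a simplified, hypothetical binlog position.
type Location struct {
	Name string `json:"name"`
	Pos  uint32 `json:"pos"`
}

// before reports whether a is strictly earlier than b. File names are
// compared as strings (assumes equal-width sequence suffixes).
func before(a, b Location) bool {
	if a.Name != b.Name {
		return a.Name < b.Name
	}
	return a.Pos < b.Pos
}

// Checkpoint is a hypothetical checkpoint record that also carries the
// location at which safe mode may be left.
type Checkpoint struct {
	Synced       Location  `json:"synced"`
	SafeModeExit *Location `json:"safe_mode_exit,omitempty"`
}

// Flush serializes the checkpoint; a real implementation would write it
// to the downstream checkpoint table instead.
func (c Checkpoint) Flush() ([]byte, error) { return json.Marshal(c) }

// Resume rebuilds a checkpoint from persisted bytes, as a resumed task would.
func Resume(raw []byte) (Checkpoint, error) {
	var c Checkpoint
	err := json.Unmarshal(raw, &c)
	return c, err
}

// NeedSafeMode reports whether syncing has not yet reached the exit location.
func (c Checkpoint) NeedSafeMode() bool {
	return c.SafeModeExit != nil && before(c.Synced, *c.SafeModeExit)
}

func main() {
	cp := Checkpoint{
		Synced:       Location{Name: "mysql-bin.000003", Pos: 200},
		SafeModeExit: &Location{Name: "mysql-bin.000003", Pos: 500},
	}
	raw, err := cp.Flush() // step 2: flush checkpoint for the sync unit
	if err != nil {
		panic(err)
	}
	resumed, err := Resume(raw) // steps 3 and 4: pause-task, then resume-task
	if err != nil {
		panic(err)
	}
	fmt.Println("safe mode after resume:", resumed.NeedSafeMode())
}
```

Without the persisted exit location, the resumed task would have no way to know that events before that location may still duplicate dumped rows.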

pkg/dumpling/utils.go (outdated)
syncer/checkpoint.go (outdated)
dm/config/task.go (outdated)
@@ -1094,7 +1094,7 @@ func (s *Syncer) Run(ctx context.Context) (err error) {
 	if s.cfg.Mode == config.ModeAll {
 		if err = s.flushCheckPoints(); err != nil {
 			s.tctx.L().Warn("fail to flush checkpoints when starting task", zap.Error(err))
-		} else {
+		} else if s.cfg.CleanDumpFile {
Member

Might this fix the error in #941?

Collaborator Author

CleanDumpFile is enabled by default, so even with this line added, #941 still happens. Maybe we should check the downstream checkpoint and, if it does not exist, load from the file 😢
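The fallback suggested here could look roughly like the sketch below: prefer the downstream checkpoint, and read the dump metadata file only when no checkpoint exists. The names `StartLocation`, `Loader`, and `ErrNoCheckpoint` are hypothetical and not part of DM; the two loaders are passed in as functions so the sketch stays independent of any real storage.

```go
package main

import (
	"errors"
	"fmt"
)

// Location is a simplified, hypothetical binlog position.
type Location struct {
	Name string
	Pos  uint32
}

// ErrNoCheckpoint is a hypothetical sentinel meaning "the downstream
// checkpoint does not exist", as opposed to "reading it failed".
var ErrNoCheckpoint = errors.New("no downstream checkpoint")

// Loader returns a start location from one source.
type Loader func() (Location, error)

// StartLocation prefers the downstream checkpoint and falls back to the
// dump metadata file only when the checkpoint is missing. Any other
// checkpoint error is returned as-is rather than masked by the fallback.
func StartLocation(fromCheckpoint, fromDumpFile Loader) (Location, error) {
	loc, err := fromCheckpoint()
	if err == nil {
		return loc, nil
	}
	if !errors.Is(err, ErrNoCheckpoint) {
		return Location{}, fmt.Errorf("load checkpoint: %w", err)
	}
	loc, err = fromDumpFile()
	if err != nil {
		return Location{}, fmt.Errorf("load dump metadata: %w", err)
	}
	return loc, nil
}

func main() {
	missing := func() (Location, error) { return Location{}, ErrNoCheckpoint }
	fromFile := func() (Location, error) {
		return Location{Name: "mysql-bin.000003", Pos: 154}, nil
	}
	loc, err := StartLocation(missing, fromFile)
	fmt.Println(loc, err)
}
```

If the dump files have already been cleaned, the file loader fails and the error surfaces to the caller, which is the situation this comment is worried about.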

Member

Could the previous "if fresh" check achieve this?

@lance6716
Collaborator Author

lance6716 commented Sep 2, 2020

Should we save the exit location to the checkpoint table?

Yes. In a new PR I am going to change the checkpoint structure, implement the DM version update changes, and handle the "check the downstream checkpoint and, if it does not exist, load from the file" idea.

Member

@csuzhangxc csuzhangxc left a comment

LGTM

@csuzhangxc csuzhangxc added status/LGT1 One reviewer already commented LGTM and removed status/PTAL This PR is ready for review. Add this label back after committing new changes labels Sep 2, 2020
@lance6716
Collaborator Author

/run-all-tests

Collaborator

@GMHDBJD GMHDBJD left a comment

LGTM

@GMHDBJD GMHDBJD added status/LGT2 Two reviewers already commented LGTM, ready for merge and removed status/LGT1 One reviewer already commented LGTM labels Sep 3, 2020
@lance6716 lance6716 merged commit b16f201 into pingcap:master Sep 3, 2020
@lance6716 lance6716 deleted the safe-mode branch September 3, 2020 02:38
@ti-srebot

cherry pick to release-2.0 failed

lance6716 added a commit to lance6716/dm that referenced this pull request Sep 3, 2020
@lance6716 lance6716 added already-cherry-pick-2.0 The related PR is already cherry-picked to release-2.0. Add this label once the PR is cherry-picked and removed needs-cherry-pick-release-2.0 This PR should be cherry-picked to release-2.0. Remove this label after cherry-picked to release-2.0 labels Sep 3, 2020
@csuzhangxc csuzhangxc added already-update-release-note The release note is updated. Add this label once the release note is updated already-update-docs The docs related to this PR already updated. Add this label once the docs are updated and removed needs-update-release-note This PR should be added into release notes. Remove this label once the release notes are updated needs-update-docs Should update docs after this PR is merged. Remove this label once the docs are updated labels Oct 27, 2020
Labels
already-cherry-pick-2.0 The related PR is already cherry-picked to release-2.0. Add this label once the PR is cherry-picked already-update-docs The docs related to this PR already updated. Add this label once the docs are updated already-update-release-note The release note is updated. Add this label once the release note is updated priority/normal Minor change, requires approval from ≥1 primary reviewer status/LGT2 Two reviewers already commented LGTM, ready for merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

more actions on Dump unit
4 participants