Skip to content
This repository has been archived by the owner on Nov 24, 2023. It is now read-only.

syncer: fix syncer gtid auto switch from off to on #1723

Merged
merged 19 commits into from
Jun 3, 2021

Conversation

lichunzhu
Copy link
Contributor

What problem does this PR solve?

#1586 part 1

What is changed and how it works?

Add auto adjust gtid before we start a task.

Check List

Tests

  • Unit test
  • Integration test

Code changes

  • Has exported function/method change
  • Has exported variable/fields change

Side effects

  • Possible performance regression
  • Increased code complexity

Related changes

  • Need to cherry-pick to the release branch
  • Need to be included in the release note

@lichunzhu lichunzhu added the needs-cherry-pick-release-2.0 This PR should be cherry-picked to release-2.0. Remove this label after cherry-picked to release-2.0 label May 25, 2021
@lichunzhu lichunzhu added status/PTAL This PR is ready for review. Add this label back after committing new changes and removed size/XXL labels May 25, 2021
Copy link
Collaborator

@lance6716 lance6716 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will review later

Is the test case needed to be added to other_integrations.txt?

@@ -254,8 +254,10 @@ func (s *Server) doStartKeepAlive() {
}

func (s *Server) stopKeepAlive() {
s.kaCancel()
s.kaWg.Wait()
if s.kaCancel != nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in which case stopKeepAlive is called before or concurrent with startKeepAlive? we may need a lock if so 😂

Copy link
Contributor Author

@lichunzhu lichunzhu May 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. dm-worker/main.go s.Start()
  2. Start() return an error before s.startKeepAlive(). e.g.: worker port already in use
  3. dm-worker/main.go s.Close() -> s.stopKeepAlive(), report a nil pointer info in stdout

Comment on lines 331 to 336
if cmp = compareIndex(location1.Suffix, location2.Suffix); cmp != 0 {
return cmp
}
if location1.gtidSet != nil && location1.gtidSet.String() != "" {
return cmp
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if both gtid sets are empty, but suffix are not? e.g. location1:(pos1,"",1), location2:(pos2,"",0)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Suffix is only used in handle-error?

check_checkpoint $SOURCE_ID1 $name1 $pos1 $gtid1
check_checkpoint $SOURCE_ID2 $name2 $pos2 $gtid2
dmctl_stop_task_with_retry $TASK_NAME $MASTER_PORT
clean_gtid
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe start a source without gtid and than restart with gtid closer to the user scenario? Moreover, adjust gtid in relay seems not be implemented and tested in this test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I think cleaning gtid data is the same. Operation source again is a little complicated so I use this way to simplify the test.
  2. adjust GTID in relay will be implemented in the next PR

dmctl_stop_task_with_retry $TASK_NAME $MASTER_PORT
clean_gtid

export GO_FAILPOINTS='github.com/pingcap/dm/syncer/AdjustGTIDExit=return(true)'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dumplicate export

@purelind
Copy link

/run-all-tests

@@ -56,6 +56,7 @@ func TestRunMain(_ *testing.T) {
defer func() { utils.OsExit = oldOsExit }()
utils.OsExit = func(code int) {
log.L().Info("os exits", zap.Int("exit code", code))
_ = utils.KillMySelf()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't remember why we need this helper function now. maybe the process won't exit after this test function returns?

if so, I think the reason may be that lthe test function should be named TestMain not TestRunMain, please check if this works.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -56,6 +56,7 @@ func TestRunMain(_ *testing.T) {
defer func() { utils.OsExit = oldOsExit }()
utils.OsExit = func(code int) {
log.L().Info("os exits", zap.Int("exit code", code))
_ = utils.KillMySelf()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pkg/binlog/position.go Outdated Show resolved Hide resolved
pkg/binlog/position.go Show resolved Hide resolved
pkg/binlog/reader/reader.go Show resolved Hide resolved
@lance6716 lance6716 added the needs-update-release-note This PR should be added into release notes. Remove this label once the release notes are updated label Jun 1, 2021
@lance6716 lance6716 added this to the v2.0.4 milestone Jun 1, 2021
Copy link
Collaborator

@lance6716 lance6716 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rest lgtm

defer r.Close()

// GetGTIDsForPosStreamer tries to get GTID sets for the specified binlog position (for the corresponding txn) from a Streamer.
func GetGTIDsForPosStreamer(ctx context.Context, r Streamer, endPos gmysql.Position) (gtid.Set, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func GetGTIDsForPosStreamer(ctx context.Context, r Streamer, endPos gmysql.Position) (gtid.Set, error) {
func GetGTIDsForPosFromStreamer(ctx context.Context, r Streamer, endPos gmysql.Position) (gtid.Set, error) {

@@ -781,7 +781,8 @@ func (cp *RemoteCheckPoint) Load(tctx *tcontext.Context) error {
gset,
)
if isGlobal {
if binlog.CompareLocation(location, binlog.NewLocation(cp.cfg.Flavor), cp.cfg.EnableGTID) > 0 {
// Use CompareLocationAsPossible here to make sure checkpoint can be updated if gset is empty
if binlog.CompareLocationAsPossible(location, binlog.NewLocation(cp.cfg.Flavor), cp.cfg.EnableGTID) > 0 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this function always called with location2 as empty location? If so, we could not be bothered to add an unit test, just simplifying the logic and tighten function name

syncer/syncer.go Show resolved Hide resolved
tests/adjust_gtid/run.sh Show resolved Hide resolved
tests/adjust_gtid/run.sh Outdated Show resolved Hide resolved
@GMHDBJD
Copy link
Collaborator

GMHDBJD commented Jun 3, 2021

/lgtm

@ti-chi-bot
Copy link
Member

@GMHDBJD: Please use GitHub review feature instead of /lgtm [cancel] when you want to submit review to the pull request.
For how to use GitHub review feature, see also this document provided by GitHub.

For the reason we drop support to the commands, see also this page.
This reply is being used as a temporary reply during the migration of review process and will be removed on July 1st.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

Copy link
Collaborator

@GMHDBJD GMHDBJD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@ti-chi-bot
Copy link
Member

@GMHDBJD: Please use GitHub review feature instead of /lgtm [cancel] when you want to submit review to the pull request.
For how to use GitHub review feature, see also this document provided by GitHub.

For the reason we drop support to the commands, see also this page.
This reply is being used as a temporary reply during the migration of review process and will be removed on July 1st.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot
Copy link
Member

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • lance6716

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added the status/LGT1 One reviewer already commented LGTM label Jun 3, 2021
@lance6716
Copy link
Collaborator

add require-LGT1 for GMHDBJD

@lance6716 lance6716 added the require-LGT1 for small PR, LGT1 is enough label Jun 3, 2021
@lance6716
Copy link
Collaborator

/merge

@ti-chi-bot
Copy link
Member

This pull request has been accepted and is ready to merge.

Commit hash: 4c39ddf

@ti-chi-bot ti-chi-bot merged commit 5c1377e into pingcap:master Jun 3, 2021
Copy link
Collaborator

@GMHDBJD GMHDBJD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ti-chi-bot
Copy link
Member

In response to a cherrypick label: new pull request created: #1745.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
needs-cherry-pick-release-2.0 This PR should be cherry-picked to release-2.0. Remove this label after cherry-picked to release-2.0 needs-update-release-note This PR should be added into release notes. Remove this label once the release notes are updated require-LGT1 for small PR, LGT1 is enough size/XXL status/can-merge status/LGT1 One reviewer already commented LGTM status/PTAL This PR is ready for review. Add this label back after committing new changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants