txn: fix the ttlmanager and cleanup logic for 1pc and async commit #23342

Merged: 7 commits, Mar 18, 2021

@cfzjywxk cfzjywxk commented Mar 16, 2021

What problem does this PR solve?

Issue Number: close #23331

Problem Summary:
There are two problems:

  • Pessimistic locks left behind are not cleaned up when committing with the 1PC protocol, which blocks other concurrent transactions on these keys.
  • The ttlManager is not closed if a max ts calculation error is reported while using async commit or 1PC, so the leftover pessimistic locks cannot be resolved by the DDL backfill worker.

What is changed and how it works?

What's Changed:

  1. Do cleanup if an error is reported when the 1PC protocol is used.
  2. Close the ttlManager if the twoPhaseCommitter execution returns an error.

How it Works:

Related changes

  • Need to cherry-pick to the release branch

Check List

Tests

  • Unit test

Side effects

Release note

  • No release note.

@cfzjywxk cfzjywxk added type/bugfix This PR fixes a bug. sig/transaction SIG:Transaction labels Mar 16, 2021
@ti-chi-bot ti-chi-bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Mar 16, 2021
@cfzjywxk cfzjywxk requested a review from zyguan March 16, 2021 07:24
@@ -920,6 +920,9 @@ func (c *twoPhaseCommitter) execute(ctx context.Context) (err error) {
	if c.isOnePC() {
		// The error means the 1PC transaction failed.
		if err != nil {
			if c.getUndeterminedErr() == nil {
				c.cleanup(ctx)
Contributor Author

Should we just use something like pessimisticRollbackMutations instead of cleanup if it's a pessimistic transaction?

Contributor

I think there won't be too much difference, but maybe we can skip this for optimistic 1PC transactions. @sticnarf What do you think?

Contributor

@sticnarf sticnarf Mar 16, 2021

We can always use pessimistic rollback, because an optimistic 1PC transaction won't leave anything behind to clean up.

Contributor

How about changing cleanup()?

		cleanupKeysCtx := context.WithValue(context.Background(), TxnStartKey, ctx.Value(TxnStartKey))
		var err error
		if c.isPessimistic && c.isOnePC() {
			err = c.pessimisticRollbackMutations(NewBackofferWithVars(cleanupKeysCtx, cleanupMaxBackoff, c.txn.vars), c.mutations)
		} else {
			err = c.cleanupMutations(NewBackofferWithVars(cleanupKeysCtx, cleanupMaxBackoff, c.txn.vars), c.mutations)
		}
		if err != nil {

Contributor

@youjiali1995 We needn't do cleanup for optimistic 1PC, so I suggest this:

if !c.isOnePC() {
	err = c.cleanupMutations(NewBackofferWithVars(cleanupKeysCtx, cleanupMaxBackoff, c.txn.vars), c.mutations)
} else if c.isPessimistic {
	err = c.pessimisticRollbackMutations(NewBackofferWithVars(cleanupKeysCtx, cleanupMaxBackoff, c.txn.vars), c.mutations)
}
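
The branch structure suggested here can be checked as a small truth table. The sketch below is illustrative only: `cleanupAction` and `chooseCleanup` are hypothetical names, and the constants merely label which of the real methods would be called.

```go
package main

import "fmt"

// cleanupAction names the three outcomes discussed in this review thread.
type cleanupAction int

const (
	noCleanup           cleanupAction = iota // optimistic 1PC: nothing to clean up
	cleanupMutations                         // ordinary (non-1PC) path
	pessimisticRollback                      // pessimistic 1PC: roll back the pessimistic locks
)

func (a cleanupAction) String() string {
	switch a {
	case cleanupMutations:
		return "cleanupMutations"
	case pessimisticRollback:
		return "pessimisticRollbackMutations"
	default:
		return "no cleanup"
	}
}

// chooseCleanup follows the if / else-if shape of the suggestion:
// non-1PC transactions use the regular cleanup, pessimistic 1PC uses
// pessimistic rollback, and optimistic 1PC does nothing.
func chooseCleanup(isOnePC, isPessimistic bool) cleanupAction {
	switch {
	case !isOnePC:
		return cleanupMutations
	case isPessimistic:
		return pessimisticRollback
	default:
		return noCleanup
	}
}

func main() {
	for _, onePC := range []bool{false, true} {
		for _, pessimistic := range []bool{false, true} {
			fmt.Printf("onePC=%-5v pessimistic=%-5v -> %v\n",
				onePC, pessimistic, chooseCleanup(onePC, pessimistic))
		}
	}
}
```

The only case that differs from the earlier `if c.isPessimistic && c.isOnePC() { ... } else { ... }` variant is optimistic 1PC, which that variant would route to cleanupMutations.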

Contributor

My mistake.

@github-actions github-actions bot added the sig/sql-infra SIG: SQL Infra label Mar 16, 2021
Contributor

@MyonKeminta MyonKeminta left a comment

LGTM

@ti-chi-bot
Member

@MyonKeminta: Please use /LGTM instead of LGTM when you want to approve the pull request by comment.
If you use the GitHub review feature, please approve the PR directly, the comment will not take effect in the GitHub review feature.
If you have any questions, please refer to lgtm command help or lgtm plugin design.

If you have approved this PR, please ignore this reply. This reply is being used as a temporary reply during the migration of the new bot and will be removed on April 1.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Mar 16, 2021
@ichn-hu ichn-hu mentioned this pull request Mar 16, 2021
@cfzjywxk
Contributor Author

@sticnarf @youjiali1995 PTAL

Contributor

@youjiali1995 youjiali1995 left a comment

Rest LGTM

@@ -250,7 +250,11 @@ func (txn *KVTxn) Commit(ctx context.Context) error {
 	}
 	defer func() {
 		// For async commit transactions, the ttl manager will be closed in the asynchronous commit goroutine.
-		if !committer.isAsyncCommit() {
+		if committer.isAsyncCommit() || committer.isOnePC() {
Contributor Author

@sticnarf @youjiali1995

Could we just always close the ttl manager here? We are already in the commit phase, and async commit/1PC transactions will be small.

Contributor

I think it is okay. The chance is small that TTL expires before the primary lock is committed.

Contributor

I think yes because SafeWindow(2s) is smaller than the interval of keepAlive(10s). It's nearly useless here.

Contributor

It seems to me not related to SafeWindow. SafeWindow only constrains the prewrite time.

@@ -899,15 +899,20 @@ func (c *twoPhaseCommitter) cleanup(ctx context.Context) {
 		})

 		cleanupKeysCtx := context.WithValue(context.Background(), TxnStartKey, ctx.Value(TxnStartKey))
-		err := c.cleanupMutations(NewBackofferWithVars(cleanupKeysCtx, cleanupMaxBackoff, c.txn.vars), c.mutations)
+		var err error
+		if c.isPessimistic && c.isOnePC() {
Contributor

How about skipping cleanup for optimistic one pc? https://github.com/pingcap/tidb/pull/23342/files#r596528085

Contributor

@youjiali1995 youjiali1995 left a comment

@cfzjywxk
Contributor Author

@sticnarf
Contributor

> Should also remove https://github.com/pingcap/tidb/pull/23342/files#diff-7352f99d1adcf12eded39ff8934684296725268e9116dcecfa16f560827f9271R1173

I cannot see the line you refer to. I guess it's the defer c.ttlManager.close() in the asynchronous commit stage?

@youjiali1995
Contributor

> I cannot see the line you refer to. I guess it's the defer c.ttlManager.close() in the asynchronous commit stage?

Yes.

@cfzjywxk
Contributor Author

> > Should also remove https://github.com/pingcap/tidb/pull/23342/files#diff-7352f99d1adcf12eded39ff8934684296725268e9116dcecfa16f560827f9271R1173
>
> I cannot see the line you refer to. I guess it's the defer c.ttlManager.close() in the asynchronous commit stage?

Got it, it's removed.

@ti-chi-bot
Member

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • MyonKeminta
  • youjiali1995

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by writing /lgtm in a comment.
Reviewer can cancel approval by writing /lgtm cancel in a comment.

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Mar 18, 2021
Contributor

@sticnarf sticnarf left a comment

Lgtm

@cfzjywxk
Contributor Author

/merge

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: 76838b2

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Mar 18, 2021
@ti-chi-bot ti-chi-bot merged commit 1ab3b48 into pingcap:master Mar 18, 2021
ti-srebot pushed a commit to ti-srebot/tidb that referenced this pull request Mar 18, 2021
Signed-off-by: ti-srebot <ti-srebot@pingcap.com>
@ti-srebot
Contributor

cherry pick to release-5.0 in PR #23388

Labels
needs-cherry-pick-release-5.0 sig/sql-infra SIG: SQL Infra sig/transaction SIG:Transaction size/M Denotes a PR that changes 30-99 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. type/bugfix This PR fixes a bug.
Development

Successfully merging this pull request may close these issues.

The ddl test suite of stmtflow failed with 1pc enabled
6 participants