memdb: fix data race between mutators and UnionScan #20365

Merged: 2 commits merged into pingcap:release-4.0 on Oct 9, 2020

Contributor

@bobotu bobotu commented Oct 9, 2020

What problem does this PR solve?

Fix a data race found in https://internal.pingcap.net/idc-jenkins/blue/organizations/jenkins/tidb_ghpr_unit_test/detail/tidb_ghpr_unit_test/54258/pipeline/

[2020-10-09T06:22:15.312Z] WARNING: DATA RACE
[2020-10-09T06:22:15.312Z] Write at 0x00c0363c5188 by goroutine 548:
[2020-10-09T06:22:15.312Z]   github.com/pingcap/tidb/kv/memdb.(*arena).enlarge()
[2020-10-09T06:22:15.312Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/kv/memdb/arena.go:125 +0x1b9
[2020-10-09T06:22:15.312Z]   github.com/pingcap/tidb/kv/memdb.(*arena).alloc()
[2020-10-09T06:22:15.312Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/kv/memdb/arena.go:112 +0xde
[2020-10-09T06:22:15.312Z]   github.com/pingcap/tidb/kv/memdb.(*arena).newNode()
[2020-10-09T06:22:15.312Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/kv/memdb/arena.go:84 +0x68
[2020-10-09T06:22:15.312Z]   github.com/pingcap/tidb/kv/memdb.(*Sandbox).Put()
[2020-10-09T06:22:15.312Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/kv/memdb/memdb.go:89 +0x2fb
[2020-10-09T06:22:15.312Z]   github.com/pingcap/tidb/kv.(*memDbBuffer).Set()
[2020-10-09T06:22:15.312Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/kv/memdb_buffer.go:100 +0x27f
[2020-10-09T06:22:15.312Z]   github.com/pingcap/tidb/table/tables.(*TableCommon).UpdateRecord()
[2020-10-09T06:22:15.312Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/table/tables/tables.go:379 +0x11cc
[2020-10-09T06:22:15.312Z]   github.com/pingcap/tidb/executor.updateRecord()
[2020-10-09T06:22:15.312Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/executor/write.go:196 +0x1aa6
[2020-10-09T06:22:15.312Z]   github.com/pingcap/tidb/executor.(*UpdateExec).exec()
[2020-10-09T06:22:15.312Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/executor/update.go:97 +0x840
[2020-10-09T06:22:15.312Z]   github.com/pingcap/tidb/executor.(*UpdateExec).updateRows()
[2020-10-09T06:22:15.312Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/executor/update.go:176 +0x66c
[2020-10-09T06:22:15.312Z]   github.com/pingcap/tidb/executor.(*UpdateExec).Next()
[2020-10-09T06:22:15.312Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/executor/update.go:127 +0xa0
[2020-10-09T06:22:15.312Z]   github.com/pingcap/tidb/executor.Next()
[2020-10-09T06:22:15.312Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/executor/executor.go:253 +0x27d
[2020-10-09T06:22:15.312Z]   github.com/pingcap/tidb/executor.(*ExecStmt).handleNoDelayExecutor()
[2020-10-09T06:22:15.312Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/executor/adapter.go:515 +0x38e
[2020-10-09T06:22:15.312Z]   github.com/pingcap/tidb/executor.(*ExecStmt).handleNoDelay()
[2020-10-09T06:22:15.312Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/executor/adapter.go:397 +0x24d
[2020-10-09T06:22:15.312Z]   github.com/pingcap/tidb/executor.(*ExecStmt).Exec()
[2020-10-09T06:22:15.312Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/executor/adapter.go:353 +0x3f6
[2020-10-09T06:22:15.312Z]   github.com/pingcap/tidb/session.runStmt()
[2020-10-09T06:22:15.312Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/session/tidb.go:286 +0x2f2
[2020-10-09T06:22:15.312Z]   github.com/pingcap/tidb/session.(*session).executeStatement()
[2020-10-09T06:22:15.312Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/session/session.go:1061 +0xd9
[2020-10-09T06:22:15.312Z]   github.com/pingcap/tidb/session.(*session).execute()
[2020-10-09T06:22:15.312Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/session/session.go:1173 +0xaa7
[2020-10-09T06:22:15.312Z]   github.com/pingcap/tidb/session.(*session).Execute()
[2020-10-09T06:22:15.312Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/session/session.go:1104 +0xee
[2020-10-09T06:22:15.312Z]   github.com/pingcap/tidb/util/testkit.(*TestKit).Exec()
[2020-10-09T06:22:15.312Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/util/testkit/testkit.go:151 +0x103
[2020-10-09T06:22:15.312Z]   github.com/pingcap/tidb/util/testkit.(*TestKit).MustExec()
[2020-10-09T06:22:15.312Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/util/testkit/testkit.go:189 +0x91
[2020-10-09T06:22:15.312Z]   github.com/pingcap/tidb/executor_test.(*testSuite7).TestUpdateScanningHandles()
[2020-10-09T06:22:15.312Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/executor/union_scan_test.go:351 +0x491
[2020-10-09T06:22:15.312Z]   runtime.call32()
[2020-10-09T06:22:15.312Z]       /usr/local/go/src/runtime/asm_amd64.s:539 +0x3a
[2020-10-09T06:22:15.312Z]   reflect.Value.Call()
[2020-10-09T06:22:15.312Z]       /usr/local/go/src/reflect/value.go:321 +0xd3
[2020-10-09T06:22:15.312Z]   github.com/pingcap/check.(*suiteRunner).forkTest.func1()
[2020-10-09T06:22:15.312Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/pkg/mod/github.com/pingcap/check@v0.0.0-20200212061837-5e12011dc712/check.go:850 +0x9aa
[2020-10-09T06:22:15.312Z]   github.com/pingcap/check.(*suiteRunner).forkCall.func1()
[2020-10-09T06:22:15.312Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/pkg/mod/github.com/pingcap/check@v0.0.0-20200212061837-5e12011dc712/check.go:739 +0x113
[2020-10-09T06:22:15.312Z] 
[2020-10-09T06:22:15.312Z] Previous read at 0x00c0363c5188 by goroutine 292:
[2020-10-09T06:22:15.312Z]   github.com/pingcap/tidb/kv/memdb.(*Sandbox).findGreaterEqual()
[2020-10-09T06:22:15.313Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/kv/memdb/arena.go:95 +0x18f
[2020-10-09T06:22:15.313Z]   github.com/pingcap/tidb/kv/memdb.(*Iterator).Seek()
[2020-10-09T06:22:15.313Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/kv/memdb/iterator.go:60 +0x6d
[2020-10-09T06:22:15.313Z]   github.com/pingcap/tidb/kv.(*memDbBuffer).Iter()
[2020-10-09T06:22:15.313Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/kv/memdb_buffer.go:63 +0x1b8
[2020-10-09T06:22:15.313Z]   github.com/pingcap/tidb/executor.iterTxnMemBuffer()
[2020-10-09T06:22:15.313Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/executor/mem_reader.go:311 +0x376
[2020-10-09T06:22:15.313Z]   github.com/pingcap/tidb/executor.(*memTableReader).getMemRows()
[2020-10-09T06:22:15.313Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/executor/mem_reader.go:202 +0x143
[2020-10-09T06:22:15.313Z]   github.com/pingcap/tidb/executor.(*UnionScanExec).open()
[2020-10-09T06:22:15.313Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/executor/union_scan.go:145 +0x2ac
[2020-10-09T06:22:15.313Z]   github.com/pingcap/tidb/executor.(*dataReaderBuilder).buildUnionScanForIndexJoin()
[2020-10-09T06:22:15.313Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/executor/builder.go:2691 +0x302
[2020-10-09T06:22:15.313Z]   github.com/pingcap/tidb/executor.(*dataReaderBuilder).buildExecutorForIndexJoinInternal()
[2020-10-09T06:22:15.313Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/executor/builder.go:2655 +0x682
[2020-10-09T06:22:15.313Z]   github.com/pingcap/tidb/executor.(*dataReaderBuilder).buildExecutorForIndexJoin()
[2020-10-09T06:22:15.313Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/executor/builder.go:2642 +0x11c
[2020-10-09T06:22:15.313Z]   github.com/pingcap/tidb/executor.(*innerWorker).fetchInnerResults()
[2020-10-09T06:22:15.313Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/executor/index_lookup_join.go:658 +0x1ec
[2020-10-09T06:22:15.313Z]   github.com/pingcap/tidb/executor.(*innerWorker).handleTask()
[2020-10-09T06:22:15.313Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/executor/index_lookup_join.go:512 +0xf9
[2020-10-09T06:22:15.313Z]   github.com/pingcap/tidb/executor.(*innerWorker).run()
[2020-10-09T06:22:15.313Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/executor/index_lookup_join.go:491 +0x168
[2020-10-09T06:22:15.313Z] 
[2020-10-09T06:22:15.313Z] Goroutine 548 (running) created at:
[2020-10-09T06:22:15.313Z]   github.com/pingcap/check.(*suiteRunner).forkCall()
[2020-10-09T06:22:15.313Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/pkg/mod/github.com/pingcap/check@v0.0.0-20200212061837-5e12011dc712/check.go:734 +0x4a3
[2020-10-09T06:22:15.313Z]   github.com/pingcap/check.(*suiteRunner).forkTest()
[2020-10-09T06:22:15.313Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/pkg/mod/github.com/pingcap/check@v0.0.0-20200212061837-5e12011dc712/check.go:832 +0x1b9
[2020-10-09T06:22:15.313Z]   github.com/pingcap/check.(*suiteRunner).doRun()
[2020-10-09T06:22:15.313Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/pkg/mod/github.com/pingcap/check@v0.0.0-20200212061837-5e12011dc712/check.go:666 +0x13a
[2020-10-09T06:22:15.313Z]   github.com/pingcap/check.(*suiteRunner).asyncRun.func1()
[2020-10-09T06:22:15.313Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/pkg/mod/github.com/pingcap/check@v0.0.0-20200212061837-5e12011dc712/check.go:650 +0xf7
[2020-10-09T06:22:15.313Z] 
[2020-10-09T06:22:15.313Z] Goroutine 292 (running) created at:
[2020-10-09T06:22:15.313Z]   github.com/pingcap/tidb/executor.(*IndexLookUpJoin).startWorkers()
[2020-10-09T06:22:15.313Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/executor/index_lookup_join.go:204 +0x343
[2020-10-09T06:22:15.313Z]   github.com/pingcap/tidb/executor.(*IndexLookUpJoin).Open()
[2020-10-09T06:22:15.313Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/executor/index_lookup_join.go:186 +0x469
[2020-10-09T06:22:15.313Z]   github.com/pingcap/tidb/executor.(*UpdateExec).Open()
[2020-10-09T06:22:15.313Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/executor/update.go:269 +0x2e9
[2020-10-09T06:22:15.313Z]   github.com/pingcap/tidb/executor.(*ExecStmt).Exec()
[2020-10-09T06:22:15.313Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/executor/adapter.go:321 +0x299
[2020-10-09T06:22:15.313Z]   github.com/pingcap/tidb/session.runStmt()
[2020-10-09T06:22:15.313Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/session/tidb.go:286 +0x2f2
[2020-10-09T06:22:15.313Z]   github.com/pingcap/tidb/session.(*session).executeStatement()
[2020-10-09T06:22:15.313Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/session/session.go:1061 +0xd9
[2020-10-09T06:22:15.313Z]   github.com/pingcap/tidb/session.(*session).execute()
[2020-10-09T06:22:15.313Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/session/session.go:1173 +0xaa7
[2020-10-09T06:22:15.313Z]   github.com/pingcap/tidb/session.(*session).Execute()
[2020-10-09T06:22:15.313Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/session/session.go:1104 +0xee
[2020-10-09T06:22:15.313Z]   github.com/pingcap/tidb/util/testkit.(*TestKit).Exec()
[2020-10-09T06:22:15.313Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/util/testkit/testkit.go:151 +0x103
[2020-10-09T06:22:15.313Z]   github.com/pingcap/tidb/util/testkit.(*TestKit).MustExec()
[2020-10-09T06:22:15.313Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/util/testkit/testkit.go:189 +0x91
[2020-10-09T06:22:15.313Z]   github.com/pingcap/tidb/executor_test.(*testSuite7).TestUpdateScanningHandles()
[2020-10-09T06:22:15.313Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/executor/union_scan_test.go:351 +0x491
[2020-10-09T06:22:15.313Z]   runtime.call32()
[2020-10-09T06:22:15.313Z]       /usr/local/go/src/runtime/asm_amd64.s:539 +0x3a
[2020-10-09T06:22:15.313Z]   reflect.Value.Call()
[2020-10-09T06:22:15.313Z]       /usr/local/go/src/reflect/value.go:321 +0xd3
[2020-10-09T06:22:15.314Z]   github.com/pingcap/check.(*suiteRunner).forkTest.func1()
[2020-10-09T06:22:15.314Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/pkg/mod/github.com/pingcap/check@v0.0.0-20200212061837-5e12011dc712/check.go:850 +0x9aa
[2020-10-09T06:22:15.314Z]   github.com/pingcap/check.(*suiteRunner).forkCall.func1()
[2020-10-09T06:22:15.314Z]       /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/pkg/mod/github.com/pingcap/check@v0.0.0-20200212061837-5e12011dc712/check.go:739 +0x113

What is changed and how it works?

After #19276, the UnionScan executor reads the immutable snapshot part of memdb to avoid data races. Unfortunately, the collection of underlying blocks may still be updated while UnionScan is collecting data.

To resolve this issue, I simply use a Mutex to protect operations on arena.blocks. I chose Mutex over RWMutex because the Mutex fast path is a single atomic CAS instruction, which is much faster than RWMutex.
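As a rough illustration of the approach described above, here is a minimal sketch of an arena whose block list is guarded by a `sync.Mutex`. The type, field, and method names (`block`, `mu`, `getBlock`, `numBlocks`) are assumptions made for this example and are not taken from the actual TiDB source; only `arena`, `blocks`, and `enlarge` echo names that appear in the race report.

```go
package main

import (
	"fmt"
	"sync"
)

// block is a hypothetical fixed-size chunk of arena memory.
type block struct {
	buf []byte
}

// arena is a simplified stand-in for the memdb arena. The layout is an
// assumption for illustration, not the real TiDB struct.
type arena struct {
	mu     sync.Mutex // guards blocks
	blocks []block
}

// enlarge appends a new block. append may reallocate the slice's backing
// array, so it must not overlap with an unsynchronized read of a.blocks.
func (a *arena) enlarge(size int) {
	a.mu.Lock()
	a.blocks = append(a.blocks, block{buf: make([]byte, size)})
	a.mu.Unlock()
}

// getBlock reads the block list under the same lock, so a reader
// (for example a scan running in another goroutine) never observes
// the slice header while a writer is updating it.
func (a *arena) getBlock(idx int) (block, bool) {
	a.mu.Lock()
	defer a.mu.Unlock()
	if idx < 0 || idx >= len(a.blocks) {
		return block{}, false
	}
	return a.blocks[idx], true
}

// numBlocks reports how many blocks the arena currently holds.
func (a *arena) numBlocks() int {
	a.mu.Lock()
	defer a.mu.Unlock()
	return len(a.blocks)
}

func main() {
	a := &arena{}
	a.enlarge(64)

	var wg sync.WaitGroup
	wg.Add(2)
	// Writer: plays the role of the mutator growing the arena.
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			a.enlarge(64)
		}
	}()
	// Reader: plays the role of a concurrent scan looking up blocks.
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			a.getBlock(0)
		}
	}()
	wg.Wait()

	fmt.Println(a.numBlocks()) // prints 1001
}
```

Running this sketch with `go run -race` should report no race, whereas removing the locking from `getBlock` would be expected to reproduce a report similar in shape to the one quoted above.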

Check List

Tests

  • Unit test

Release note

  • No release note

@bobotu bobotu added the type/bugfix This PR fixes a bug. label Oct 9, 2020
Contributor

@crazycs520 crazycs520 left a comment

LGTM

@ti-srebot
Contributor

@crazycs520, thanks for your review; however, we are sorry that your vote won't be counted.

@crazycs520
Contributor

/run-all-tests

Member

@zz-jason zz-jason left a comment

LGTM

@ti-srebot ti-srebot added the status/LGT1 Indicates that a PR has LGTM 1. label Oct 9, 2020
@zz-jason
Member

zz-jason commented Oct 9, 2020

/merge

@ti-srebot
Contributor

@zz-jason Oops! This PR requires at least 2 LGTMs to merge. The current number of LGTM is 1

Member

@jackysp jackysp left a comment

LGTM

@ti-srebot ti-srebot removed the status/LGT1 Indicates that a PR has LGTM 1. label Oct 9, 2020
@ti-srebot ti-srebot added the status/LGT2 Indicates that a PR has LGTM 2. label Oct 9, 2020
@jackysp
Member

jackysp commented Oct 9, 2020

/merge

@ti-srebot ti-srebot added the status/can-merge Indicates a PR has been approved by a committer. label Oct 9, 2020
@ti-srebot
Contributor

/run-all-tests

@ti-srebot ti-srebot merged commit 2f98a29 into pingcap:release-4.0 Oct 9, 2020
Labels
status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. type/bugfix This PR fixes a bug.