mvcc/kvstore: Optimize compaction #11150
Conversation
Remove the auth validation loop in v3_server.raftRequest(). Re-validation when the error ErrAuthOldRevision occurs should be handled on the client side.
…ompact take too long and blocks other requests
Could you elaborate on how this change helps? This is still the same amount of work, just done at a different time? Do you have test data showing the improvement? Thanks! Also, if we move index compaction to the scheduled FIFO, we need to adjust the metric "indexCompactionPauseMs" accordingly.
Yes, just let the index compaction run in another goroutine. First, I put 1,000,000 or more key-values; then I run a compaction. While the compaction is in progress, other new put requests are blocked, and completion latency exceeds 1 sec. The reason is that index compaction takes too long to build the map[] when the tree is big enough. Moving index compaction to the scheduled FIFO lets the compaction avoid blocking new requests. TEST: (put 5 keys while compacting) After modification: Response time histogram:
bill-of-materials was fixed for module-aware 'go list' as part of coreos/license-bill-of-materials#17, so we can re-enable the bom tests. Fixes etcd-io#11132
Thanks for doing the benchmark testing! With this change, when the actual index compaction happens later, it will still block other requests such as put requests. Is my understanding correct?
Remotes is not a valid git command. Also, set the environment variable correctly.
With this modification, compaction will no longer block other request messages. ;)
hack: fix cherrypick instruction
Sorry, I don't quite understand. This modification basically postpones the compaction work in the index tree to a later time. So I can understand that the compaction request does not block other requests immediately. But later, when the actual compaction work happens, it has the same effect on other requests, right?
Hi,
travis: re-enable bom tests
integration: fix bug in for loop, make it break properly
Thanks, I see your point. My previous thinking was that the index compaction holds a write lock most of the time during its execution, so it blocks other requests (which also need to read or write the same index tree) anyway. Another potential concern is that the function returns without actually doing the compaction work in the index tree (because it is just scheduled). So depending on when the index compaction happens, the returned "keep" could be different, meaning the keys to be deleted in the database could be different. I don't think there are any correctness concerns, but it makes consistency checking among nodes more difficult. - not completely sure about what I said here, need some time to read and think through.
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
Since NewMutex will append a slash to pfx, there is no need to append a slash beforehand.
clientv3/concurrency: remove the unneeded slash
Update the code of conduct. Remove notice file.
*: update project code of conduct
clientv3: remove the redundant CancelFunc invocation
test(functional): remove unknown field Etcd.Debug
Add slack chat contact.
*: add slack contact
Added link and removed wrongly copied text
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
scripts/release: list GPG key only when tagging is needed
I think my previous statement was wrong. "keep" does not depend on key modifications after the compaction rev. So this should be safe.
LGTM
Oh, actually we also need to adjust the calculation of the metrics.
We can remove this step in that case. It will be timed with the backend together as a whole scheduled job.
fileutil, src: format errors
@shenjiangc Could you please rebase to current master? And then:
Disable TestV3AuthOldRevConcurrent for now. See etcd-io#10468 (comment)
etcdserver: remove infinite loop for auth in raftRequest
To prevent the purge file loop from accidentally acquiring the file lock and removing the files during server shutdown.
etcdserver: wait purge file loop to finish during shutdown
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
clientv3: fix retry/streamer error message
lease: Add Unlock before break in loop
mvcc: Add Unlock before panic to prevent double lock
etcdserver: fix a bug that appended objects to a newly allocated sized slice
…ompact take too long and blocks other requests
I made some mistakes; the new pull request is #11330. Please take a look.
mvcc/kvstore: Optimize compaction, solve conflict for #11150
mvcc/kvstore: Optimize compaction
When the number of key-value index entries is greater than one million, compaction takes too long and blocks other requests.
Move the `kvindex.Compact` call into a `fifoSched.Schedule` job.