Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: fix memory leak introduced by timer.After #6720

Merged
merged 11 commits into from
Jul 3, 2023
Merged

Conversation

lhy1024
Copy link
Contributor

@lhy1024 lhy1024 commented Jun 29, 2023

What problem does this PR solve?

Issue Number: Close #6719

What is changed and how does it work?

  1. use sync.Pool to maintain TSDeadline

  2. replace time.After with NewTimer and timer.Stop, or ticker

  • for loop retry use ticker, if the loop is broken, we will stop it by deferring or checking each branch

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)

only bench with forward c.serviceModeUpdateCb(pdpb.ServiceMode_PD_SVC_MODE)

~/pingcap/tiup/bin/tiup-playground v6.6.0 --pd.micro '[{"api":true, "amount":3},{"tso":true,"amount":3}]' --pd.binpath ~/pd-cse/bin/pd-server --db.binpath ~/tidb-cse/bin/tidb-server --kv 3 --db 1 --tiflash 0 --db.port 14000 --host 192.168.8.88

./pd-tso-bench -v -duration 250000s -client 10 -c 1 -interval 3s -pd 192.168.8.88:2379

Before this pr

71f02e8f-43b9-4cc7-8121-663d5408b5c4

After this pr
image

Release note

None.

@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Jun 29, 2023

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • binshi-bing
  • rleungx

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot bot added the release-note-none Denotes a PR that doesn't merit a release note. label Jun 29, 2023
@ti-chi-bot ti-chi-bot bot requested review from disksing and JmPotato June 29, 2023 11:10
@ti-chi-bot ti-chi-bot bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jun 29, 2023
Signed-off-by: lhy1024 <admin@liudos.us>
@codecov
Copy link

codecov bot commented Jun 29, 2023

Codecov Report

Patch coverage: 75.65% and project coverage change: +0.30 🎉

Comparison is base (05f71e0) 74.27% compared to head (a40b03a) 74.58%.

❗ Current head a40b03a differs from pull request most recent head ac7bb65. Consider uploading reports for the commit ac7bb65 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6720      +/-   ##
==========================================
+ Coverage   74.27%   74.58%   +0.30%     
==========================================
  Files         410      418       +8     
  Lines       42871    43706     +835     
==========================================
+ Hits        31844    32599     +755     
- Misses       8169     8215      +46     
- Partials     2858     2892      +34     
Flag Coverage Δ
unittests 74.58% <75.65%> (+0.30%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/keyspace/keyspace.go 62.32% <0.00%> (+0.11%) ⬆️
pkg/mcs/discovery/register.go 61.53% <33.33%> (-13.98%) ⬇️
server/grpc_service.go 52.87% <33.33%> (+0.52%) ⬆️
pkg/election/lease.go 85.88% <50.00%> (-3.87%) ⬇️
client/timerpool/pool.go 60.00% <60.00%> (ø)
client/client.go 62.66% <66.66%> (-0.30%) ⬇️
client/resource_manager_client.go 70.69% <66.66%> (+1.68%) ⬆️
client/tso_stream.go 74.46% <66.66%> (-1.62%) ⬇️
pkg/mcs/resourcemanager/server/server.go 53.38% <66.66%> (+0.39%) ⬆️
pkg/mcs/utils/util.go 61.53% <66.66%> (+6.99%) ⬆️
... and 9 more

... and 38 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

Signed-off-by: lhy1024 <admin@liudos.us>
Copy link
Contributor

@binshi-bing binshi-bing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left one comment. The reset LGTM

@@ -208,8 +214,10 @@ func NewTSDeadline(
done chan struct{},
cancel context.CancelFunc,
) *TSDeadline {
timer := timerPool.Get().(*time.Timer)
timer.Reset(timeout)
return &TSDeadline{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we use sync.Pool for TSDeadline and a timer is always attached to a TSDeadline and can be reused and reset every time? In this way, we use sync.Pool for both TSDeadline and Timer in fact.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will make code complex and test show that done and cancel don't cost some memory.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. Timer is the main issue and we can keep smaller change.

Copy link
Contributor

@binshi-bing binshi-bing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -208,8 +214,10 @@ func NewTSDeadline(
done chan struct{},
cancel context.CancelFunc,
) *TSDeadline {
timer := timerPool.Get().(*time.Timer)
timer.Reset(timeout)
return &TSDeadline{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. Timer is the main issue and we can keep smaller change.

@ti-chi-bot ti-chi-bot bot added the status/LGT1 Indicates that a PR has LGTM 1. label Jun 29, 2023
client/tso_dispatcher.go Outdated Show resolved Hide resolved
client/tso_dispatcher.go Outdated Show resolved Hide resolved
server/grpc_service.go Outdated Show resolved Hide resolved
Signed-off-by: lhy1024 <admin@liudos.us>
2. replace timer with ticker in for-retry
3. add reset prefunc in lease.go

Signed-off-by: lhy1024 <admin@liudos.us>
@lhy1024 lhy1024 requested a review from rleungx June 30, 2023 05:25
client/tso_dispatcher.go Outdated Show resolved Hide resolved
Signed-off-by: lhy1024 <admin@liudos.us>
@ti-chi-bot ti-chi-bot bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 30, 2023
Copy link
Contributor

@binshi-bing binshi-bing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

client/tso_dispatcher.go Outdated Show resolved Hide resolved
client/tso_dispatcher.go Outdated Show resolved Hide resolved
reset ticker to support dynamic config

Signed-off-by: lhy1024 <admin@liudos.us>
Signed-off-by: lhy1024 <admin@liudos.us>
Signed-off-by: lhy1024 <admin@liudos.us>
@ti-chi-bot ti-chi-bot bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Jun 30, 2023
ti-chi-bot bot added a commit that referenced this pull request Jun 30, 2023
close #6719, ref #6720

Signed-off-by: lhy1024 <admin@liudos.us>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
@JmPotato
Copy link
Member

JmPotato commented Jul 3, 2023

/merge

@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Jul 3, 2023

@JmPotato: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Jul 3, 2023

This pull request has been accepted and is ready to merge.

Commit hash: a40b03a

@ti-chi-bot ti-chi-bot bot added the status/can-merge Indicates a PR has been approved by a committer. label Jul 3, 2023
@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Jul 3, 2023

@lhy1024: Your PR was out of date, I have automatically updated it for you.

If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot ti-chi-bot bot merged commit ff67696 into tikv:master Jul 3, 2023
rleungx pushed a commit to rleungx/pd that referenced this pull request Aug 2, 2023
close tikv#6719, ref tikv#6720

Signed-off-by: lhy1024 <admin@liudos.us>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
rleungx pushed a commit to rleungx/pd that referenced this pull request Aug 2, 2023
close tikv#6719

Signed-off-by: lhy1024 <admin@liudos.us>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
rleungx pushed a commit to rleungx/pd that referenced this pull request Aug 2, 2023
close tikv#6719

Signed-off-by: lhy1024 <admin@liudos.us>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
rleungx pushed a commit to rleungx/pd that referenced this pull request Aug 2, 2023
close tikv#6719, ref tikv#6720

Signed-off-by: lhy1024 <admin@liudos.us>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
rleungx pushed a commit to rleungx/pd that referenced this pull request Aug 2, 2023
close tikv#6719

Signed-off-by: lhy1024 <admin@liudos.us>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
JmPotato added a commit to JmPotato/pd that referenced this pull request Sep 25, 2023
* mcs: use patch method in keyspace group (tikv#6713)

ref tikv#6233

Signed-off-by: lhy1024 <admin@liudos.us>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* *: fix memory leak introduced by timer.After (tikv#6720)

close tikv#6719

Signed-off-by: lhy1024 <admin@liudos.us>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* tso: implement groupSplitPatroller to speed up the split process (tikv#6736)

ref tikv#5895, close tikv#6696

Implement `groupSplitPatroller` to speed up the split process.

Signed-off-by: JmPotato <ghzpotato@gmail.com>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* client: fix tso service discovery at the first time for NewClientWithAPIContext (tikv#6749)

close tikv#6748

After NewClientWithAPIContextV2 returns, the keyspace group should be discovered by the passed keyspace name immediately

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

* *: add test for misusing keyspace ID when creating the client (tikv#6754)

ref tikv#6747, ref tikv#6748, ref tikv#6749

Signed-off-by: Ryan Leung <rleungx@gmail.com>

* tso: support multi-keyspace, fault injection and keyspace-name in pd-tso-bench (tikv#6608)

ref tikv#5895

support multi-keyspace, fault injection and keyspace-name in pd-tso-bench

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* *: make test great again (tikv#6767)

close tikv#6761

Signed-off-by: Ryan Leung <rleungx@gmail.com>

* tso: implement deletedGroupCleaner to clean up the legacy TSO key (tikv#6745)

close tikv#6589

- Implement `deletedGroupCleaner` to clean up the legacy TSO key.
- Extract the timestamp key path constructor.

Signed-off-by: JmPotato <ghzpotato@gmail.com>

* Add TestUpgradingAPIandTSOClusters (tikv#6534)

ref tikv#5895

Add TestUpgradingAPIandTSOClusters to test the scenario that after we restart the API cluster
then restart the TSO cluster, the TSO service can still serve TSO requests normally.

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

* client, tests: allow TSO fallback happens in TestMixedTSODeployment (tikv#6740)

close tikv#6634

Introduce `WithAllowTSOFallback` client option to bypass the panic in `TestMixedTSODeployment`.

Signed-off-by: JmPotato <ghzpotato@gmail.com>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
Signed-off-by: lhy1024 <admin@liudos.us>

* tso: allow mergedTS to be zero in mergingChecker (tikv#6758)

ref tikv#6589

Since it's possible that a keyspace group is to be deleted and merged before its TSO is initialized,
we should allow `mergedTS` to be zero in `mergingChecker`. This PR allows this case and only
block the merging when loading the TSO meets the error.

Signed-off-by: JmPotato <ghzpotato@gmail.com>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* *: move keyspace group primary path code to key_path.go (tikv#6755)

ref tikv#5895

Signed-off-by: lhy1024 <admin@liudos.us>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
Signed-off-by: lhy1024 <admin@liudos.us>

* mcs: add log flags (tikv#6777)

ref tikv#5766

Signed-off-by: Ryan Leung <rleungx@gmail.com>
Signed-off-by: lhy1024 <admin@liudos.us>

* keyspace, tso, apiv2: impl the interface to merge all keyspace groups into the default (tikv#6757)

ref tikv#6756

Impl the interface to merge all keyspace groups into the default.

Signed-off-by: JmPotato <ghzpotato@gmail.com>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* pdctl: support show keyspace group primary (tikv#6747)

close tikv#6746

Signed-off-by: lhy1024 <admin@liudos.us>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* pd-ctl, tests: impl the merge all keyspace groups command (tikv#6782)

close tikv#6756

- Impl the merge all keyspace groups command.
- Further reuse of TSO cluster-related test code.

Signed-off-by: JmPotato <ghzpotato@gmail.com>
Signed-off-by: lhy1024 <admin@liudos.us>

* pd-ctl, tests: impl the merge all keyspace groups command (tikv#6782)

close tikv#6756

- Impl the merge all keyspace groups command.
- Further reuse of TSO cluster-related test code.

Signed-off-by: JmPotato <ghzpotato@gmail.com>
Signed-off-by: lhy1024 <admin@liudos.us>

* *: fix test suite race (tikv#6784)

close tikv#6772

Signed-off-by: Ryan Leung <rleungx@gmail.com>

* keyspace: fix data race (tikv#6797)

Signed-off-by: lhy1024 <admin@liudos.us>
Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* pdctl: support show keyspace meta with refresh group id (tikv#6751)

close tikv#6746

Signed-off-by: lhy1024 <admin@liudos.us>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* mcs: Refactor ServicePath to make caller's life easier (tikv#6799)

close tikv#6800

Signed-off-by: Xiaoguang Sun <sunxiaoguang@gmail.com>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* Remove the lastPhysical check in dispatchClient (tikv#6812)

* *: fix `TestGetTSOImmediately` (tikv#6811)

close tikv#6795

Signed-off-by: Ryan Leung <rleungx@gmail.com>

* fix test

Signed-off-by: lhy1024 <admin@liudos.us>

---------

Signed-off-by: Bin Shi <binshi.bing@gmail.com>
Signed-off-by: Ryan Leung <rleungx@gmail.com>
Signed-off-by: JmPotato <ghzpotato@gmail.com>
Signed-off-by: lhy1024 <admin@liudos.us>
Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
Co-authored-by: JmPotato <ghzpotato@gmail.com>
Co-authored-by: Bin Shi <39923490+binshi-bing@users.noreply.github.com>
Co-authored-by: Ryan Leung <rleungx@gmail.com>
Co-authored-by: Xiaoguang Sun <sunxiaoguang@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Frequently new a timer may cost too much memory
4 participants