Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

owner: fix memory accumulated when owner consume etcd update slow #1224

Merged
merged 3 commits into from
Dec 18, 2020

Conversation

amyangfei
Copy link
Contributor

@amyangfei amyangfei commented Dec 18, 2020

What problem does this PR solve?

Fix for a memory accumulated issue in owner

https://github.com/pingcap/ticdc/blob/eed57c94491fd140f0a8a0528acc574bc542bb0b/cdc/owner.go#L1024-L1039

The output channel consume speed (which relays on how often L1035 can be run) may be much slower than the etcd key update frequency, which will lead to data accumulated in etcd watch client

What is changed and how it works?

Ignore position update event if output channel is full, it is safe enough as the owner has a builtin ticker to trigger o.run

NOTE: this is a release-4.0 quick fix, I will update master/release-5.0 and add test cases later

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below) (Have tested in an internal test cluster)
    • setup a ticdc cluster, create one or more changefeeds
    • running cdc for 10mins, get memory profile of cdc owner by http://<ip>:<port>/debug/pprof/heap
    • find the memory usage of etcd.clientv3, should be less than 5MB

Release note

  • Fix a bug that too much memory may be consumed in etcd watch client in cdc owner.

@amyangfei amyangfei added this to the v4.0.9 milestone Dec 18, 2020
@amyangfei
Copy link
Contributor Author

/run-all-tests

cdc/owner.go Outdated Show resolved Hide resolved
Co-authored-by: leoppro <i@leop.pro>
Copy link
Contributor

@zier-one zier-one left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ti-srebot ti-srebot added the status/LGT1 Indicates that a PR has LGTM 1. label Dec 18, 2020
@amyangfei
Copy link
Contributor Author

/run-all-tests

@zier-one zier-one added needs-cherry-pick-release-4.0 Should cherry pick this PR to release-4.0 branch. needs-cherry-pick-5.0 labels Dec 18, 2020
@amyangfei amyangfei removed needs-cherry-pick-release-4.0 Should cherry pick this PR to release-4.0 branch. needs-cherry-pick-5.0 labels Dec 18, 2020
@amyangfei
Copy link
Contributor Author

/run-all-tests

@liuzix
Copy link
Contributor

liuzix commented Dec 18, 2020

LGTM

@ti-srebot ti-srebot removed the status/LGT1 Indicates that a PR has LGTM 1. label Dec 18, 2020
@ti-srebot ti-srebot added the status/LGT2 Indicates that a PR has LGTM 2. label Dec 18, 2020
@amyangfei amyangfei merged commit 04e0284 into pingcap:release-4.0 Dec 18, 2020
@amyangfei amyangfei deleted the fix-memory-leak branch December 18, 2020 07:27
@amyangfei amyangfei added type/bug The issue is confirmed as a bug. priority/P0 The issue has P0 priority. labels Dec 24, 2020
@AkiraXie AkiraXie added the area/ticdc Issues or PRs related to TiCDC. label Mar 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ticdc Issues or PRs related to TiCDC. priority/P0 The issue has P0 priority. severity/critical status/LGT2 Indicates that a PR has LGTM 2. type/bug The issue is confirmed as a bug. type/bugfix This PR fixes a bug.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants