Fix autoRecover memory leak. #3361

Merged on Jul 26, 2022 (2 commits)

Conversation

horizonzy
Member

Descriptions of the changes in this PR:
Fixes #3360

@horizonzy horizonzy force-pushed the fix-auto-recover-memory-leak branch from c727699 to 4cd2d4e Compare June 25, 2022 08:30
Contributor

@hangc0276 hangc0276 left a comment

We need to close the ledger handle in Auditor#checkAllLedgers() to unregister the listener.

    localAdmin.asyncOpenLedgerNoRecovery(ledgerId, (rc, lh, ctx) -> {
        openLedgerNoRecoverySemaphore.release();
        if (Code.OK == rc) {
            checker.checkLedger(lh,
                    // the ledger handle will be closed after checkLedger is done.
                    new ProcessLostFragmentsCb(lh, callback),
                    conf.getAuditorLedgerVerificationPercentage());
            // we collect the following stats to get a measure of the
            // distribution of a single ledger within the bk cluster
            // the higher the number of fragments/bookies, the more distributed it is
            numFragmentsPerLedger.registerSuccessfulValue(lh.getNumFragments());
            numBookiesPerLedger.registerSuccessfulValue(lh.getNumBookies());
            numLedgersChecked.inc();

@horizonzy
Member Author

> We need to close the ledger handle in Auditor#checkAllLedgers() to unregister the listener.

The ledger handle is closed in ProcessLostFragmentsCb, after checkLedger is done.
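
To make the exchange above concrete, here is a minimal Java sketch of the pattern being discussed: a handle registers a listener when it is opened, unregisters it when it is closed, and the check callback is responsible for closing the handle once the check completes. Every name in it (HandleCloseSketch, ListenerRegistry, SketchLedgerHandle, checkThenClose) is a hypothetical stand-in chosen for illustration; it is not BookKeeper's API and not the code in this PR.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins for illustration only; not BookKeeper's real classes.
public class HandleCloseSketch {

    /** Tracks which listeners are registered per ledger id. */
    static class ListenerRegistry {
        private final Map<Long, Set<Object>> listeners = new ConcurrentHashMap<>();

        void register(long ledgerId, Object listener) {
            listeners.computeIfAbsent(ledgerId, id -> ConcurrentHashMap.newKeySet()).add(listener);
        }

        void unregister(long ledgerId, Object listener) {
            // Drop the whole map entry once the last listener is gone,
            // so entries for finished ledgers do not accumulate.
            listeners.computeIfPresent(ledgerId, (id, set) -> {
                set.remove(listener);
                return set.isEmpty() ? null : set;
            });
        }

        int trackedLedgers() {
            return listeners.size();
        }
    }

    /** A handle that registers itself on open and unregisters on close. */
    static class SketchLedgerHandle implements AutoCloseable {
        private final long ledgerId;
        private final ListenerRegistry registry;

        SketchLedgerHandle(long ledgerId, ListenerRegistry registry) {
            this.ledgerId = ledgerId;
            this.registry = registry;
            registry.register(ledgerId, this);
        }

        @Override
        public void close() {
            registry.unregister(ledgerId, this);
        }
    }

    /** Runs a check and always closes the handle afterwards. */
    static void checkThenClose(SketchLedgerHandle handle, Runnable check) {
        try {
            check.run();
        } finally {
            handle.close();
        }
    }

    public static void main(String[] args) {
        ListenerRegistry registry = new ListenerRegistry();
        for (long id = 0; id < 1000; id++) {
            checkThenClose(new SketchLedgerHandle(id, registry), () -> { });
        }
        // Every handle was closed, so no ledger is tracked any more.
        System.out.println("tracked ledgers after closing: " + registry.trackedLedgers());
    }
}
```

If the `finally` block were dropped, each opened handle would leave one entry behind in the registry, which is the shape of leak the reviewer is asking about.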

// unregister the listener
ledgerManager.unregisterLedgerMetadataListener(ledgerId, listener);
assertFalse(ledgerManager.listeners.containsKey(ledgerId));
assertFalse(watchers.containsKey(ledgerStr));
verify(mockZk, times(1)).removeWatches(any(String.class),
Member

Suggested change
- verify(mockZk, times(1)).removeWatches(any(String.class),
+ verify(mockZk, times(1)).removeWatches(eq(getLedgerPath(ledgerId)),

Member Author

Good suggestion. 👍

Contributor

Good idea. 👍
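
The reasoning behind the suggested change above can be sketched without a mocking library. The snippet below uses a hand-rolled recording fake in place of the mocked ZooKeeper client and compares a loose check (any path) with a strict one (the expected ledger path). The class, the helper names, and the example paths are all assumptions made for illustration; the real test uses Mockito's `verify` with the `any` and `eq` matchers.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-rolled stand-in for a mocked ZooKeeper client; all names are hypothetical.
public class PathMatchSketch {

    /** Records every path passed to removeWatches. */
    static class RecordingZk {
        final List<String> removedPaths = new ArrayList<>();

        void removeWatches(String path) {
            removedPaths.add(path);
        }
    }

    /** Loose check: passes if removeWatches was called exactly once, with any path. */
    static boolean calledOnceWithAnyPath(RecordingZk zk) {
        return zk.removedPaths.size() == 1;
    }

    /** Strict check: passes only if exactly one call used the expected path. */
    static boolean calledOnceWithPath(RecordingZk zk, String expectedPath) {
        return zk.removedPaths.stream().filter(expectedPath::equals).count() == 1;
    }

    public static void main(String[] args) {
        String expected = "/ledgers/L0001"; // illustrative path, not the real layout
        RecordingZk zk = new RecordingZk();
        zk.removeWatches("/ledgers/L0002"); // a buggy caller removes the wrong watch

        System.out.println("loose check passes:  " + calledOnceWithAnyPath(zk));
        System.out.println("strict check passes: " + calledOnceWithPath(zk, expected));
    }
}
```

The loose check accepts the buggy call, while the strict check rejects it, which is why matching on the exact ledger path makes the test more meaningful.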

Contributor

@hangc0276 hangc0276 left a comment

Good job!

Member

@zymap zymap left a comment

LGTM

@zymap
Member

zymap commented Jul 6, 2022

@eolivelli @dlg99 Could you help review this PR? Thanks!

Contributor

@eolivelli eolivelli left a comment

Good catch!

@dlg99 This may explain some of the problems you have seen in production.


    @Override
    public void run() {
        Set<LedgerMetadataListener> listeners = AbstractZkLedgerManager.this.listeners.get(ledgerId);
Contributor

This runs in the same thread as unregisterLedgerMetadataListener.

Why do you need an inner class? Can't we simply write this as a regular Java method,

    private void cancelLedgerWatchers(long ledgerId) {
        ....
    }

and call it from unregisterLedgerMetadataListener?

Member Author

This follows the existing code style. See ReadLedgerMetadataTask.
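
The two shapes weighed in the review exchange above can be put side by side in a small sketch: an inner `Runnable` task class versus a plain private method, both doing the same cleanup once no listener remains for a ledger. This is only an illustration under simplifying assumptions: the class name, the `watched` set standing in for ZooKeeper watches, and the `unregisterViaTask` / `unregisterViaMethod` helpers are hypothetical, and the sketch is single-threaded rather than thread-safe.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; names echo the discussion but this is not the PR's code.
// Not thread-safe: it only illustrates the two code shapes.
public class CancelWatchSketch {
    private final Map<Long, Set<Object>> listeners = new HashMap<>();
    private final Set<Long> watched = new HashSet<>(); // stands in for ZooKeeper watches

    void register(long ledgerId, Object listener) {
        listeners.computeIfAbsent(ledgerId, id -> new HashSet<>()).add(listener);
        watched.add(ledgerId);
    }

    boolean isWatched(long ledgerId) {
        return watched.contains(ledgerId);
    }

    private void removeListener(long ledgerId, Object listener) {
        Set<Object> set = listeners.get(ledgerId);
        if (set != null) {
            set.remove(listener);
        }
    }

    /** Shape 1: an inner task class, which could also be handed to an executor. */
    class CancelWatchTask implements Runnable {
        private final long ledgerId;

        CancelWatchTask(long ledgerId) {
            this.ledgerId = ledgerId;
        }

        @Override
        public void run() {
            Set<Object> set = listeners.get(ledgerId);
            if (set == null || set.isEmpty()) {
                listeners.remove(ledgerId);
                watched.remove(ledgerId);
            }
        }
    }

    /** Shape 2: the same cleanup written as a regular method. */
    private void cancelLedgerWatchers(long ledgerId) {
        Set<Object> set = listeners.get(ledgerId);
        if (set == null || set.isEmpty()) {
            listeners.remove(ledgerId);
            watched.remove(ledgerId);
        }
    }

    void unregisterViaTask(long ledgerId, Object listener) {
        removeListener(ledgerId, listener);
        new CancelWatchTask(ledgerId).run(); // runs inline, in the caller's thread
    }

    void unregisterViaMethod(long ledgerId, Object listener) {
        removeListener(ledgerId, listener);
        cancelLedgerWatchers(ledgerId);
    }

    public static void main(String[] args) {
        CancelWatchSketch sketch = new CancelWatchSketch();
        Object listener = new Object();
        sketch.register(1L, listener);
        sketch.unregisterViaTask(1L, listener);
        sketch.register(2L, listener);
        sketch.unregisterViaMethod(2L, listener);
        System.out.println("ledger 1 watched: " + sketch.isWatched(1L));
        System.out.println("ledger 2 watched: " + sketch.isWatched(2L));
    }
}
```

When the task is run inline, as here, both shapes behave identically; the task form differs only in that it is an object that can be passed around, for example to an executor.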

@hangc0276 hangc0276 mentioned this pull request Jul 20, 2022
11 tasks
@zymap
Member

zymap commented Jul 26, 2022

@eolivelli Could you please take another look?

Contributor

@eolivelli eolivelli left a comment

LGTM

@eolivelli eolivelli merged commit da1b29a into apache:master Jul 26, 2022
@eolivelli
Contributor

@hangc0276 Please cherry-pick this to the active branches (4.14, 4.15).

@hangc0276
Contributor

> @hangc0276 please cherry pick to active branches (4.14, 4.15)

@eolivelli Sure.

zymap pushed a commit that referenced this pull request Aug 2, 2022
hangc0276 pushed a commit to hangc0276/bookkeeper that referenced this pull request Nov 5, 2022
hangc0276 pushed a commit to hangc0276/bookkeeper that referenced this pull request Nov 7, 2022
nicoloboschi pushed a commit to datastax/bookkeeper that referenced this pull request Jan 11, 2023
(cherry picked from commit da1b29a)
(cherry picked from commit 2d7c0a5)
Ghatage pushed a commit to sijie/bookkeeper that referenced this pull request Jul 12, 2024
6 participants