Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ISSUE 3220: Autorecovery does not process underreplicated empty ledgers #3239

Merged
merged 1 commit into from
Jun 21, 2022

Conversation

dlg99
Copy link
Contributor

@dlg99 dlg99 commented Apr 22, 2022

Descriptions of the changes in this PR:

Motivation

Currently decomm malfunctions on an empty ledger:

Autorecovery ends up removing such ledger from the list of underreplicated ledgers but does not update the ensemble leaving the bookie being decommed there and Auditor will add these ledger back on the next run.

In greater details:

Currently the ReplicationWorker/LedgerChecker end up skipping fragment where ES > WQ and failed bookie is not in the writeset (or a ledger is empty).

E.g. with ES = 3, WQ = 2
Ensemble: (bk1, bk2, bk3)
bk3 is dead.
if ledger is empty (and e.g. closed) it is added to the list of underreplicated ledgers but autorecovery skips it (no failed fragments, dead bookie remains in the metadata) and simply removes it from UR ledgers (until next Auditor's run adds it back).

This becomes a problem when a few bookies with large number of empty ledgers get decommed.
Every run of the Auditor ends up in increase of znodes (added UR ledgers) and a lot of busy work for the autorecovery (overall increased load on zookeeper).

Changes

Added test for the scenario I encountered.
Tests expanded with post-autorecovery validation of ledger ensembles to make sure the decomemd bookie no longer used there.

Ledger checker now does not skip fragments with firstEntryId == -1 but checks if the bookie is actually available.

This changes current behavior.
The fragments for an empty ledger will end up "rereplicated" (metadata updated).

Master Issue: #3220

@dlg99 dlg99 marked this pull request as draft April 22, 2022 04:32
Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the fix (in LedgerChecker) makes sense to me.

I left some comments.
I am happy to see that we are re-enabling a test that was lost because it was too flaky

@@ -383,6 +383,7 @@ public void testEmptyLedgerLosesQuorumEventually() throws Exception {
LOG.info("Killing last bookie, {}, in ensemble {}", replicaToKill,
lh.getLedgerMetadata().getAllEnsembles().get(0L));
killBookie(replicaToKill);
startNewBookie();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we changing this existing test ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The behavior has changed now.
basically autorecovery used to ignore failed bookie for empty ledgers if it is not in the writeset for entryId 0 when WQ < ES.
e.g. ensemble = (bl1, bk2, bk3), WQ = 2, ledger is empty.
bk3 is down, it is not in the writeset for entryId=0.

Auditor would add it to underreplicated, autorecovery would skip it and simply remove from underreplicated, go back to the start of the sentence.
This is nice until you end up with an empty ledger like that. or a few thousands of them.

@@ -130,11 +144,16 @@ public void testDecommissionForLedgersWithMultipleSegmentsAndNotWriteClosed() th
lh4.addEntry(j, "data".getBytes());
}

// avoiding autorecovery fencing the ledger
servers.forEach(srv -> srv.stopAutoRecovery());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure that "stopAutoRecovery" waits for autoRecovery to be totally stopped, maybe there is still some task running?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it does wait.
I reduced openLedgerRereplicationGracePeriod, thus autorecovery may fence the ledger and now the test has to stop/start it to avoid flakiness in this case (add to the ledger to force ensemble change)

@@ -88,7 +98,7 @@ public void testDecommissionBookie() throws Exception {
*/
bkAdmin.decommissionBookie(BookieImpl.getBookieId(killedBookieConf));
bkAdmin.triggerAudit();
Thread.sleep(500);
Thread.sleep(5000);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out of the scope of this PR, but in the future it would be better to wait for a specific condition, in order to reduce flakyness

setAutoRecoveryEnabled(true);
}

@FlakyTest("https://github.com/apache/bookkeeper/issues/502")
@Test
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so basically we were not running this test anymore.

great to see this running again

@dlg99 dlg99 requested a review from Ghatage April 25, 2022 18:54
@dlg99 dlg99 marked this pull request as ready for review April 25, 2022 19:06
Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

Awesome work

Copy link
Contributor

@nicoloboschi nicoloboschi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@eolivelli eolivelli merged commit eadbdd4 into apache:master Jun 21, 2022
@hangc0276 hangc0276 added this to the 4.16.0 milestone Jul 25, 2022
zymap pushed a commit that referenced this pull request Aug 1, 2022
hangc0276 pushed a commit to hangc0276/bookkeeper that referenced this pull request Nov 5, 2022
hangc0276 pushed a commit to hangc0276/bookkeeper that referenced this pull request Nov 7, 2022
Ghatage pushed a commit to sijie/bookkeeper that referenced this pull request Jul 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants