ManagedLedger only closes ledger on error if current ledger (#240) #2573

ivankelly · 2018-09-13T14:48:48Z

If we have a managed ledger, ml, and write 2 entries to it, and both
entries fail, both failures will end up calling
ManagedLedgerImpl#ledgerClosed with the ledger the write failed on as a
parameter.

However, depending on timing, the second call to ledgerClosed could
end up adding a new ledger to the ledger list, even though the current
ledger is not failing (as the failing ledger was replaced by the
first call).

This was the cause of a flake in
ManagedLedgerErrorsTest#recoverLongTimeAfterMultipleWriteErrors, as
reported in #240. However, it's not possible to write a deterministic
test for this, as the timing needs to be very precise. The failing
addComplete needs to run before the first error handling completes, but
the runnable with ledgerClosed for the second failure needs to run
after the first error handling completes and before the writes are
resent by the first error handling.
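
To illustrate the idea in the title (only act on a close if the ledger that failed is still the current one), here is a minimal, self-contained sketch. It is not the actual ManagedLedgerImpl code: the class LedgerRolloverSketch, its fields, and the guardEnabled flag are hypothetical names introduced only for this example, and ledgers are modelled as plain long ids.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a managed ledger's ledger list; not Pulsar code.
public class LedgerRolloverSketch {
    private final List<Long> ledgers = new ArrayList<>();
    private final boolean guardEnabled;
    private long currentLedgerId = 0;
    private long nextLedgerId = 1;

    public LedgerRolloverSketch(boolean guardEnabled) {
        this.guardEnabled = guardEnabled;
        ledgers.add(currentLedgerId);
    }

    // Called once per failed write, with the id of the ledger the write failed on.
    public synchronized void ledgerClosed(long failedLedgerId) {
        if (guardEnabled && failedLedgerId != currentLedgerId) {
            // The failing ledger was already replaced by an earlier call,
            // so the current ledger is healthy and must not be rolled over.
            return;
        }
        currentLedgerId = nextLedgerId++;
        ledgers.add(currentLedgerId);
    }

    public synchronized int ledgerCount() {
        return ledgers.size();
    }

    public synchronized long currentLedgerId() {
        return currentLedgerId;
    }

    public static void main(String[] args) {
        // Two writes fail on ledger 0, so ledgerClosed(0) is called twice.
        LedgerRolloverSketch unguarded = new LedgerRolloverSketch(false);
        unguarded.ledgerClosed(0);
        unguarded.ledgerClosed(0);

        LedgerRolloverSketch guarded = new LedgerRolloverSketch(true);
        guarded.ledgerClosed(0);
        guarded.ledgerClosed(0);

        System.out.println("unguarded ledger count: " + unguarded.ledgerCount());
        System.out.println("guarded ledger count: " + guarded.ledgerCount());
    }
}
```

Without the guard, the second call adds a spurious extra ledger; with it, the second call is a no-op because ledger 0 is no longer current. The sketch calls ledgerClosed sequentially, so it shows the effect of the stale call but does not reproduce the thread timing described above.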


merlimat

👍

ivankelly added the type/bug label Sep 13, 2018

ivankelly self-assigned this Sep 13, 2018

ivankelly requested review from merlimat, maskit and sijie September 13, 2018 14:48

merlimat added this to the 2.2.0-incubating milestone Sep 13, 2018

merlimat approved these changes Sep 13, 2018


merlimat merged commit 7c62ecf into apache:master Sep 13, 2018

ivankelly mentioned this pull request Sep 14, 2018

Flaky-test: recoverLongTimeAfterMultipleWriteErrors #240

Closed
