Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ManagedLedger only closes ledger on error if current ledger (#240) #2573

Merged
merged 1 commit into from
Sep 13, 2018

Conversation

ivankelly
Copy link
Contributor

If we have a managed ledger, ml and we write 2 entries to it, if both
entries fail, both will end up calling ManagedLedgerImpl#ledgerClosed
with the ledger the write failed on as a parameter.

However, depending on timing, the second call to ledgerClosed could
end up adding a new ledger to the ledger list, even though the current
ledger is not failing (as the failing ledger was replaced by the
first call).

This was the cause of a flake in
ManagedLedgerErrorsTest#recoverLongTimeAfterMultipleWriteErrors as
reported in (#240). However, it's not possible to get a deterministic
test for this as the timings need to be very precise. The failing
addComplete needs to run before first error handling completes, but
the runnable with ledgerClosed for the second failure needs to run
after the first error handling completes, but before the write resends
from the first error handling complete.

If we have a managed ledger, ml and we write 2 entries to it, if both
entries fail, both will end up calling ManagedLedgerImpl#ledgerClosed
with the ledger the write failed on as a parameter.

However, depending on timing, the second call to ledgerClosed could
end up adding a new ledger to the ledger list, even though the current
ledger is _not_ failing (as the failing ledger was replaced by the
first call).

This was the cause of a flake in
ManagedLedgerErrorsTest#recoverLongTimeAfterMultipleWriteErrors as
reported in (apache#240). However, it's not possible to get a deterministic
test for this as the timings need to be very precise. The failing
addComplete needs to run before first error handling completes, but
the runnable with ledgerClosed for the second failure needs to run
after the first error handling completes, but before the write resends
from the first error handling complete.
@ivankelly ivankelly added the type/bug The PR fixed a bug or issue reported a bug label Sep 13, 2018
@ivankelly ivankelly self-assigned this Sep 13, 2018
@merlimat merlimat added this to the 2.2.0-incubating milestone Sep 13, 2018
Copy link
Contributor

@merlimat merlimat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/bug The PR fixed a bug or issue reported a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants