[fix][broker] Duplicate ByteBuffer when Caching Backlogged Consumers #17105

Conversation

michaeljmarshall (Member)

Fixes #16979

Motivation

#12258 introduced caching for backlogged consumers. When caching an entry, it is important to duplicate the `ByteBuffer` so that the reader index is not shared. The current code has a race condition: the `ByteBuffer` reference in the cache is shared with the dispatcher. When another consumer reads from the cache, the cache calls `duplicate()` on the shared `ByteBuffer`, which copies the current reader index; that index might not be 0 if the original dispatcher has already read data from the `ByteBuffer`.
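The race can be illustrated with plain `java.nio.ByteBuffer`, whose `duplicate()` likewise copies the current position (the analog of the reader index). This is a minimal sketch of the sharing problem, not Pulsar's actual `ByteBuf` code:

```java
import java.nio.ByteBuffer;

public class SharedReaderIndexBug {
    public static void main(String[] args) {
        ByteBuffer entryData = ByteBuffer.wrap(new byte[]{10, 20, 30, 40});

        // Buggy insert: the cache stores the same reference the dispatcher
        // keeps reading from.
        ByteBuffer cached = entryData;

        // The dispatcher reads two bytes, advancing the shared position.
        entryData.get();
        entryData.get();

        // A second consumer hits the cache; duplicate() copies the CURRENT
        // position, so the copy starts at 2, not 0, and the consumer would
        // silently miss the first two bytes.
        ByteBuffer copy = cached.duplicate();
        System.out.println(copy.position()); // prints 2, not the expected 0
    }
}
```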

Note: the caching `insert` method seems to create (or recycle) more `EntryImpl` instances than is really necessary. Changing that is outside this PR's scope, so I am leaving it as is.

Modifications

  • Create a new `Entry` before inserting it into the cache.
  • Add a new test to `EntryCacheTest`. The test fails before this change and passes after it.
  • Update the `EntryCacheTest` mocking so that it returns unique entries when mocking reads from BookKeeper. Before, all returned `LedgerEntry` objects had `ledgerId` 0 and `entryId` 0, which interfered with caching in the new test.
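The first modification (duplicating at insert time) can be sketched the same way, again with `java.nio.ByteBuffer` rather than Pulsar's real classes: caching an independent view at insert time keeps the cached reader index at 0 regardless of what the dispatcher does afterwards.

```java
import java.nio.ByteBuffer;

public class DuplicateOnInsertFix {
    public static void main(String[] args) {
        ByteBuffer entryData = ByteBuffer.wrap(new byte[]{10, 20, 30, 40});

        // Fixed insert: cache an independent view (analogous to creating a
        // new Entry / duplicating the buffer before caching it).
        ByteBuffer cached = entryData.duplicate();

        // The dispatcher reads from its own buffer, advancing only its
        // own position; the cached view is unaffected.
        entryData.get();
        entryData.get();

        // A later consumer now gets a copy whose position is still 0.
        ByteBuffer copy = cached.duplicate();
        System.out.println(copy.position()); // prints 0
    }
}
```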

Verifying this change

This change includes a test that failed before the PR and passes after it.

Documentation

  • doc-not-needed

@michaeljmarshall michaeljmarshall added area/broker release/blocker Indicate the PR or issue that should block the release until it gets resolved doc-not-needed Your PR changes do not impact docs labels Aug 15, 2022
@michaeljmarshall michaeljmarshall added this to the 2.11.0 milestone Aug 15, 2022
@michaeljmarshall michaeljmarshall self-assigned this Aug 15, 2022
@mattisonchao (Member) left a comment

Nice catch!

mattisonchao (Member) commented Aug 16, 2022

Small suggestion:

Could we make the `insert` method ensure this? That would avoid the same problem whenever anyone inserts an entry in the future.
Maybe we can change `cachedData = entry.getDataBuffer().retain();` to use `retainedDuplicate()`.
Or change the method to `insert(EntryImpl entry, boolean duplicateDataBuffer)`, with `duplicateDataBuffer` defaulting to true; you can set it to false when you are sure you will never change the original `ByteBuffer`. (Just a quick thought; it still needs deeper thinking.)

The advantage is that we make this method more fault-tolerant and don't need to create another `EntryImpl`.
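The `insert(EntryImpl entry, boolean duplicateDataBuffer)` idea could be sketched as follows. This is a hypothetical minimal cache using `java.nio.ByteBuffer`; the class name, `read` method, and map-based storage are illustrative assumptions, not Pulsar's actual `EntryCache` API:

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the suggested API: insert() takes a flag saying
// whether to duplicate the data buffer defensively before caching it.
class SketchEntryCache {
    private final Map<Long, ByteBuffer> cache = new HashMap<>();

    void insert(long entryId, ByteBuffer data, boolean duplicateDataBuffer) {
        // Defaulting to a defensive duplicate makes the cache tolerant of
        // callers that keep reading from the original buffer afterwards.
        cache.put(entryId, duplicateDataBuffer ? data.duplicate() : data);
    }

    ByteBuffer read(long entryId) {
        // Hand out an independent view so each reader has its own position.
        return cache.get(entryId).duplicate();
    }
}

public class InsertFlagDemo {
    public static void main(String[] args) {
        SketchEntryCache cache = new SketchEntryCache();
        ByteBuffer data = ByteBuffer.wrap(new byte[]{1, 2, 3, 4});
        cache.insert(1L, data, true);  // safe default: duplicate on insert
        data.get();                    // dispatcher keeps reading
        System.out.println(cache.read(1L).position()); // prints 0
    }
}
```

The false path would be reserved for callers that can guarantee they never touch the original buffer again, which is exactly the invariant that is easy to break and motivates the defensive default.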

michaeljmarshall (Member, Author)

Small suggestion:

Could we make the `insert` method ensure this? That would avoid the same problem whenever anyone inserts an entry in the future. Maybe we can change `cachedData = entry.getDataBuffer().retain();` to use `retainedDuplicate()`. Or change the method to `insert(EntryImpl entry, boolean duplicateDataBuffer)`, with `duplicateDataBuffer` defaulting to true; you can set it to false when you are sure you will never change the original `ByteBuffer`. (Just a quick thought; it still needs deeper thinking.)

The advantage is that we make this method more fault-tolerant and don't need to create another `EntryImpl`.

@mattisonchao - I didn't do it this way because the write path for tail-read caching does not need the duplicated `ByteBuffer`. That path technically doesn't even need an `Entry`, but it creates one because `insert` only takes an `Entry`. It seems like we should consider expanding or improving the `EntryCache` interface to meet these two different use cases. I agree that we should think about simplifying the entry reference management. I don't have benchmarks showing how much it costs to create an `Entry` unnecessarily, but this part of the code base is highly optimized, so I would prefer not to introduce unnecessary duplication if we can avoid it.

I am going to merge this as is so that it does not hold up the 2.11 release at all. We can definitely continue this discussion, especially because we have the PIP on improved caching getting implemented right now.

@michaeljmarshall michaeljmarshall merged commit 76f4195 into apache:master Aug 16, 2022
@michaeljmarshall michaeljmarshall deleted the duplicate-bytebuff-when-caching branch August 16, 2022 04:19
michaeljmarshall added a commit that referenced this pull request Aug 16, 2022
(cherry picked from commit 76f4195)
@michaeljmarshall michaeljmarshall added cherry-picked/branch-2.11 and removed release/blocker Indicate the PR or issue that should block the release until it gets resolved labels Aug 16, 2022
michaeljmarshall added a commit to datastax/pulsar that referenced this pull request Aug 16, 2022
(cherry picked from commit 76f4195)
Technoboy- pushed a commit to merlimat/pulsar that referenced this pull request Aug 16, 2022
nodece pushed a commit to nodece/pulsar that referenced this pull request Sep 10, 2024
(cherry picked from commit 76f4195)
Development

Successfully merging this pull request may close these issues.

Flaky-test: MessageDispatchThrottlingTest.testBacklogConsumerCacheReads
3 participants