Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[improve] [doc] Add doc for customized util class #21110

Merged
merged 5 commits into from
Sep 9, 2023

Conversation

thetumbled
Copy link
Member

@thetumbled thetumbled commented Sep 2, 2023

Fixes #21109
Fixes #21113

Motivation

There is bug in forEach method in util class like ConcurrentLongLongPairHashMap.
When it detect there is another thread acquire the write lock, it will fall back to acquire the read lock, which is right. But, it will continue to iterate the array from the last time.

  1. As another thread may write the new value before the current index for iterate, it may miss the new value.
    There are posibilities that it may read the new value or can't read the new value, which may volilate the consistence of data. This is the problem discribed in [Bug] [broker] concurrent write and iterate with ConcurrentLongLongPairHashMap. #21109.
  2. As the time spent on
    https://github.com/apache/pulsar/blob/master/pulsar-common/src/main/java/org/apache/pulsar/common/util/collections/ConcurrentLongLongPairHashMap.java#561
processor.accept(storedKey1, storedKey2, storedValue1, storedValue2);

is unpredictable. There may be lots of write/remove/clear operations executed in other threads before foreach method fall back to acquire the read lock. After the update from other threads, the foreach method will still iterate the array from the index(bucket) last time, whose result is completly unpredictable.
For example in #21113.
In the case of capacity=4, the bucket calculated for key (9,9) and (10,10) is 1. So, if we put into pair (9,9,99,99) before (10,10,1010,1010), the array content:

-1 -1 -1 -1 9 9 99 99 10 10 1010 1010 ....

if we put into pair (10,10,1010,1010) before (9,9,99,99), the array content:

-1 -1 -1 -1 10 10 1010 1010 9 9 99 99 ....

Given above, i construct a test: writer thread put into pair (9,9,99,99), (10,10,1010,1010) first. When the foreach thread read to bucket 1(that is 9 9 99 99), writer thread clear all the content and put into pair (10,10,1010,1010), (9,9,99,99). Then the foreach thread continue to read the array from bucket 2, so it will read 9 9 99 99, which is the case where key and value are both same.

Modifications

We can not allow update while we iterate the array, or we can not provide any consistence gurantee. We have to remove the optimistic read logic from forEach method in all util class.
Add warning in javadocs for these util class.

ConcurrentLongHashMap
ConcurrentLongLongPairHashMap
ConcurrentLongPairSet
ConcurrentOpenHashMap
ConcurrentOpenHashSet

Verifying this change

  • Make sure that the change passes the CI checks.

This change added tests and can be verified as follows:
(example:)

  • Added integration tests for end-to-end deployment with large payloads (10MB)
  • Extended integration test for recovery after broker failure

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: thetumbled#29

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Sep 2, 2023
@thetumbled
Copy link
Member Author

PTAL, thanks!
@merlimat @lhotari @eolivelli @codelipenghui @hangc0276

@eolivelli
Copy link
Contributor

I am not sure that adding this constraints to forEach helps much.

It is required that if from one thread you modify the map then you see the the effects on the change, in the same thread (or in another thread in case you have some concurrency management utility that guarantees the 'happens before' semantics)

If you are scanning the map in a thread and something else is modifying it (from another thread) then it is legit/expected that you see "some values" of the map, before or after the write.

@thetumbled
Copy link
Member Author

I am not sure that adding this constraints to forEach helps much.

It is required that if from one thread you modify the map then you see the the effects on the change, in the same thread (or in another thread in case you have some concurrency management utility that guarantees the 'happens before' semantics)

If you are scanning the map in a thread and something else is modifying it (from another thread) then it is legit/expected that you see "some values" of the map, before or after the write.

I will change the implementation as adding a warning in the javadocs, in case of any developers use it incorrectly.
thanks for you review.

@codelipenghui
Copy link
Contributor

Given the benchmark results provided in the link, it seems prudent to consider replacing the customized ConcurrentMap with ConcurrentHashMap. The superior performance demonstrated by ConcurrentHashMap in put and remove operations surpasses the benefits derived from garbage collection (GC) optimization.

@BewareMyPower
Copy link
Contributor

BewareMyPower commented Sep 6, 2023

I agree with @codelipenghui. The reason to implement customized concurrent hash map might be historical and limited to the low Java version (8, or even earlier before Pulsar was open sourced).

 * <li>No boxing/unboxing from long -> Long
 * <li>Open hash map with linear probing, no node allocations to store the values

These two reasons are not convincing given the benchmark result above. I think we can open a discussion in the mail list to remove these implementations.

@thetumbled
Copy link
Member Author

I agree with @codelipenghui. The reason to implement customized concurrent hash map might be historical and limited to the low Java version (8, or even earlier before Pulsar was open sourced).

 * <li>No boxing/unboxing from long -> Long
 * <li>Open hash map with linear probing, no node allocations to store the values

These two reasons are not convincing given the benchmark result above. I think we can open a discussion in the mail list to remove these implementations.

Create a discussion here: https://lists.apache.org/thread/482q36czdprlchj26f8sqw0k3bghmcgb.
Thanks.

@thetumbled
Copy link
Member Author

As the replacement of these customized util class will take a long time, we could add the warning to javadocs first.
Could you help to review? @eolivelli @codelipenghui @BewareMyPower @lhotari
Thanks.

@codelipenghui codelipenghui added this to the 3.2.0 milestone Sep 7, 2023
Copy link
Contributor

@BewareMyPower BewareMyPower left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please modify the PR title. I don't think it should be a "fix". It just adds more explanations in the API doc.

@thetumbled thetumbled changed the title [fix] [broker] concurrent write and iterate with ConcurrentLongLongPairHashMap. [improve] [doc] Add doc for customized util class Sep 7, 2023
@thetumbled
Copy link
Member Author

Please modify the PR title. I don't think it should be a "fix". It just adds more explanations in the API doc.

Fixed.

@github-actions github-actions bot added doc Your PR contains doc changes, no matter whether the changes are in markdown or code files. and removed doc-not-needed Your PR changes do not impact docs labels Sep 7, 2023
@thetumbled thetumbled closed this Sep 8, 2023
@thetumbled thetumbled reopened this Sep 8, 2023
@thetumbled
Copy link
Member Author

/pulsarbot rerun-failure-checks

@BewareMyPower BewareMyPower merged commit e9d1d99 into apache:master Sep 9, 2023
51 of 52 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc Your PR contains doc changes, no matter whether the changes are in markdown or code files.
Projects
None yet
4 participants