Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix][test] Fix ConcurrentModificationException in testConcurrentWriteBrokerData #18418

Merged

Conversation

AnonHxy
Copy link
Contributor

@AnonHxy AnonHxy commented Nov 10, 2022

Fixes #18417

Motivation

Fixes #18417

Modifications

Change bundles, lastBundleGains and lastBundleLosses from HashSet to ConcurrentHashSet, and remove the setter methods

Verifying this change

  • Make sure that the change passes the CI checks.

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: AnonHxy#16

@AnonHxy AnonHxy self-assigned this Nov 10, 2022
@AnonHxy AnonHxy added this to the 2.12.0 milestone Nov 10, 2022
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Nov 10, 2022
@AnonHxy AnonHxy changed the title [fix][broker] Fix ConcurrentModificationException on LocalBrokerData [fix][broker] Fix ConcurrentModificationException in LocalBrokerData Nov 10, 2022
@AnonHxy AnonHxy added the type/bug The PR fixed a bug or issue reported a bug label Nov 10, 2022
Copy link
Contributor

@poorbarcode poorbarcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally, all three variables run in a lock block. I think the root cause is here:

data.cleanDeltas();
data.getBundles().clear();


Hi @AnonHxy Answer for #18418 (comment)

It's interesting. The 'bundles' has clear before submitting to the thread pool, it maybe that other threads didn't see the clear action immediately?

In this for each block below: cleaning the collection, and creating new tasks to update the collection. I think this is the root cause of concurrent access.

for (int i = 0; i < 1000; i++) {
LocalBrokerData data = loadManager.getLoadManager().updateLocalBrokerData();
data.cleanDeltas();
data.getBundles().clear();
list.add(executor.submit(() -> {
try {
assertNotNull(loadManager.generateLoadReport());
} catch (Exception e) {
throw new RuntimeException(e);
}
}));
list.add(executor.submit(() -> {
try {
loadManager.writeLoadReportOnZookeeper();
} catch (Exception e) {
throw new RuntimeException(e);
}
}));

@AnonHxy
Copy link
Contributor Author

AnonHxy commented Nov 11, 2022

Normally, all three variables run in a lock block. I think the root cause is here:

data.cleanDeltas();
data.getBundles().clear();

It's interesting. The 'bundles' has clear before submitting to the thread pool, it maybe that other threads didn't see the clear action immediately?

@AnonHxy
Copy link
Contributor Author

AnonHxy commented Nov 11, 2022

In this for each block below: cleaning the collection, and creating new tasks to update the collection. I think this is the root cause of concurrent access.

It make sense to me. It seems that we could just remove the following from the test case, and keep others unchanged

 LocalBrokerData data = loadManager.getLoadManager().updateLocalBrokerData();
 data.cleanDeltas();
 data.getBundles().clear();

WDYT @poorbarcode

@AnonHxy
Copy link
Contributor Author

AnonHxy commented Nov 17, 2022

@poorbarcode do you have any other comments?:)

@nodece nodece requested a review from poorbarcode November 17, 2022 03:25
@poorbarcode
Copy link
Contributor

Hi @AnonHxy

It make sense to me. It seems that we could just remove the following from the test case, and keep others unchanged

 LocalBrokerData data = loadManager.getLoadManager().updateLocalBrokerData();
 data.cleanDeltas();
 data.getBundles().clear();

WDYT @poorbarcode

I don't know why data.clear should executed in this place (-_-)

@poorbarcode do you have any other comments?:)

I think the concurrent operation was caused by the test case, so there is no need to change the collection to be thread-safe.

@AnonHxy AnonHxy force-pushed the fix_ConcurrentModificationException_bundles branch from bad27a2 to e1ffdeb Compare November 17, 2022 06:14
@AnonHxy AnonHxy changed the title [fix][broker] Fix ConcurrentModificationException in LocalBrokerData [fix][test] Fix ConcurrentModificationException in LocalBrokerData Nov 17, 2022
@AnonHxy AnonHxy changed the title [fix][test] Fix ConcurrentModificationException in LocalBrokerData [fix][test] Fix ConcurrentModificationException in testConcurrentWriteBrokerData Nov 17, 2022
@AnonHxy AnonHxy added type/flaky-tests and removed type/bug The PR fixed a bug or issue reported a bug labels Nov 17, 2022
@AnonHxy
Copy link
Contributor Author

AnonHxy commented Nov 17, 2022

Updated. PTAL @poorbarcode

@codecov-commenter
Copy link

codecov-commenter commented Nov 17, 2022

Codecov Report

Merging #18418 (e1ffdeb) into master (27186a1) will increase coverage by 1.71%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #18418      +/-   ##
============================================
+ Coverage     45.61%   47.33%   +1.71%     
+ Complexity    10728    10462     -266     
============================================
  Files           752      697      -55     
  Lines         72521    68015    -4506     
  Branches       7791     7285     -506     
============================================
- Hits          33083    32193     -890     
+ Misses        35769    32226    -3543     
+ Partials       3669     3596      -73     
Flag Coverage Δ
unittests 47.33% <ø> (+1.71%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...pache/pulsar/client/impl/schema/ByteBufSchema.java 68.75% <0.00%> (-31.25%) ⬇️
...pulsar/client/impl/schema/LocalDateTimeSchema.java 76.92% <0.00%> (-23.08%) ⬇️
...g/apache/pulsar/client/impl/schema/ByteSchema.java 50.00% <0.00%> (-19.24%) ⬇️
...he/pulsar/client/impl/schema/ByteBufferSchema.java 43.33% <0.00%> (-16.67%) ⬇️
...apache/pulsar/client/impl/schema/StringSchema.java 78.26% <0.00%> (-13.05%) ⬇️
.../apache/pulsar/client/impl/schema/BytesSchema.java 88.23% <0.00%> (-11.77%) ⬇️
.../org/apache/bookkeeper/mledger/impl/EntryImpl.java 69.62% <0.00%> (-11.27%) ⬇️
...e/pulsar/broker/service/EntryBatchIndexesAcks.java 82.14% <0.00%> (-10.72%) ⬇️
...ce/schema/validator/StructSchemaDataValidator.java 52.38% <0.00%> (-9.53%) ⬇️
.../pulsar/broker/service/SharedConsumerAssignor.java 68.51% <0.00%> (-9.26%) ⬇️
... and 130 more

@AnonHxy AnonHxy merged commit 007129b into apache:master Nov 18, 2022
@poorbarcode
Copy link
Contributor

@AnonHxy

Updated. PTAL @poorbarcode

Sorry, I've been so busy these days.

LGTM, Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc-not-needed Your PR changes do not impact docs type/flaky-tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Flaky-test: BrokerServiceLookupTest.testConcurrentWriteBrokerData
4 participants