[ISSUE #4090]Fail faster to keep consistent state #4094

Closed
wants to merge 3 commits into from

dugenkui03
Contributor

Make sure to set the target branch to develop

What is the purpose of the change

#4090

Brief changelog

XX

Verifying this change

XXXX

Follow this checklist to help us incorporate your contribution quickly and easily. Note that it would be helpful if you could finish the following checklist items (the last one is not necessary) before requesting the community to review your PR.

  • Make sure there is a Github issue filed for the change (usually before you start working on it). Trivial changes like typos do not require a Github issue. Your pull request should address just this issue, without pulling in other changes - one PR resolves one issue.
  • Format the pull request title like [ISSUE #123] Fix UnknownException when host config not exist. Each commit in the pull request should have a meaningful subject line and body.
  • Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
  • Write the necessary unit tests (over 80% coverage) to verify your logic correction; prefer mocking when cross-module dependencies exist. If a new feature or significant change is committed, please remember to add integration tests in the test module.
  • Run mvn -B clean apache-rat:check findbugs:findbugs checkstyle:checkstyle to make sure basic checks pass. Run mvn clean install -DskipITs to make sure unit tests pass. Run mvn clean test-compile failsafe:integration-test to make sure integration tests pass.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

dugenkui03 changed the title from "Fail faster to keep consistent state." to "[ISSUE #4090]Fail faster to keep consistent state" on Apr 1, 2022

codecov-commenter commented Apr 1, 2022

Codecov Report

Merging #4094 (052ca2c) into develop (cc478bd) will increase coverage by 0.48%.
The diff coverage is 40.00%.

@@              Coverage Diff              @@
##             develop    #4094      +/-   ##
=============================================
+ Coverage      47.46%   47.94%   +0.48%     
- Complexity      4909     5008      +99     
=============================================
  Files            633      634       +1     
  Lines          42317    42537     +220     
  Branches        5544     5573      +29     
=============================================
+ Hits           20086    20395     +309     
+ Misses         19744    19639     -105     
- Partials        2487     2503      +16     
Impacted Files Coverage Δ
...ent/impl/consumer/DefaultLitePullConsumerImpl.java 69.07% <0.00%> (+0.27%) ⬆️
...ketmq/client/consumer/DefaultLitePullConsumer.java 74.55% <50.00%> (-1.52%) ⬇️
...in/java/org/apache/rocketmq/test/util/MQAdmin.java 38.88% <0.00%> (-5.56%) ⬇️
.../java/org/apache/rocketmq/acl/common/AclUtils.java 76.25% <0.00%> (-3.60%) ⬇️
...g/apache/rocketmq/broker/util/ServiceProvider.java 46.42% <0.00%> (-3.58%) ⬇️
...etmq/client/latency/LatencyFaultToleranceImpl.java 49.35% <0.00%> (-3.22%) ⬇️
...n/java/org/apache/rocketmq/store/RunningFlags.java 31.11% <0.00%> (-2.23%) ⬇️
...pache/rocketmq/store/MultiPathMappedFileQueue.java 92.30% <0.00%> (-1.93%) ⬇️
...and/acl/ClusterAclConfigVersionListSubCommand.java 20.40% <0.00%> (-1.82%) ⬇️
...e/rocketmq/client/impl/consumer/RebalanceImpl.java 43.75% <0.00%> (-1.57%) ⬇️
... and 58 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update cc478bd...052ca2c.

@coveralls

Coverage decreased (-0.04%) to 51.915% when pulling 052ca2c on dugenkui03:patch-05 into e3b6748 on apache:develop.

coveralls commented Apr 1, 2022

Coverage increased (+0.09%) to 52.044% when pulling 7fc4a6e on dugenkui03:patch-05 into e3b6748 on apache:develop.

dugenkui03 changed the title from "[ISSUE #4090]Fail faster to keep consistent state" to "[ISSUE #4090]Fail faster to keep consistent state [Fix the inconsistent state caused by calling DefaultLitePullConsumer#start repeatedly]" on Apr 6, 2022
RongtongJin changed the title back to "[ISSUE #4090]Fail faster to keep consistent state" on Apr 7, 2022
Comment on lines +274 to +281
    public synchronized boolean isCreateJust() {
        return this.serviceState == ServiceState.CREATE_JUST;
    }

    public synchronized ServiceState getServiceState() {
        return this.serviceState;
    }

Contributor

Is it necessary to add synchronized here?

Contributor Author

serviceState is in an incorrect/incomplete state until the synchronized #start() returns, because #start() modifies serviceState several times. Making isCreateJust and getServiceState synchronized prevents callers from reading those incorrect/incomplete values.

I also noticed that before this PR, both the writes and the reads of serviceState were already guarded by synchronized. That already guarantees the visibility of serviceState, so the volatile modifier on serviceState seems redundant.
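
To make the argument above concrete, here is a minimal sketch of a lifecycle object whose start() passes through an intermediate state. The names LifecycleSketch, LifecycleSketchDemo, and initResources, and the enum values other than CREATE_JUST, are assumptions for illustration; this is not the DefaultLitePullConsumerImpl source.

```java
// Hypothetical stand-in for the ServiceState enum referenced in this thread.
enum ServiceState { CREATE_JUST, START_FAILED, RUNNING }

// Simplified sketch; not the DefaultLitePullConsumerImpl source.
class LifecycleSketch {
    // Plain field (no volatile): every read and write below holds this object's monitor.
    private ServiceState serviceState = ServiceState.CREATE_JUST;

    public synchronized void start() {
        if (serviceState != ServiceState.CREATE_JUST) {
            throw new IllegalStateException("start() called in state " + serviceState);
        }
        // Intermediate value: while start() is running, only code holding the lock can observe it.
        serviceState = ServiceState.START_FAILED;
        initResources();
        serviceState = ServiceState.RUNNING;
    }

    // Synchronized readers block while start() is running, so they see either
    // the state before start() or the state that start() leaves behind.
    public synchronized boolean isCreateJust() {
        return serviceState == ServiceState.CREATE_JUST;
    }

    public synchronized ServiceState getServiceState() {
        return serviceState;
    }

    // Placeholder for real start-up work (assumption for illustration).
    private void initResources() { }
}

class LifecycleSketchDemo {
    public static void main(String[] args) {
        LifecycleSketch consumer = new LifecycleSketch();
        System.out.println(consumer.isCreateJust());
        consumer.start();
        System.out.println(consumer.getServiceState());
    }
}
```

Because the unlock at the end of one synchronized method happens-before the next lock of the same monitor, the plain field is visible to every synchronized reader, which is the basis for calling volatile redundant here.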

@lwclover
Contributor

It's not a shared object. If there's a thread safety issue, it's probably not being used in the right way.

@dugenkui03
Contributor Author

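@lwclover Thank you very much for your review.

First, serviceState in DefaultLitePullConsumerImpl can be accessed concurrently through DefaultLitePullConsumer, even if DefaultLitePullConsumerImpl itself is "not shared".

Second, the simplest and most important reason for locking #isCreateJust() and #getServiceState() is that every access to serviceState in DefaultLitePullConsumerImpl, both reads and writes, was already made under the object lock. I suspect this is because #start() and #shutdown() contain check-then-act operations, and the original designer wanted to keep the program from reading the invalid state that exists before the action completes.

In summary, I believe DefaultLitePullConsumerImpl makes a promise about safe access to serviceState in a concurrent environment: developers (including source maintainers) can safely read serviceState without adding their own synchronization to avoid seeing an invalid state.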
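
The check-then-act concern can be illustrated with a small, self-contained sketch. OneShotStarter and CheckThenActDemo are hypothetical names, and the boolean flag stands in for serviceState. Several threads race to call start(), and because the check and the update run under one lock, exactly one call performs the initialization.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical one-shot component; the flag stands in for serviceState.
class OneShotStarter {
    private boolean started = false;
    private int initCount = 0;

    // The check (started?) and the act (initialize, set flag) run under one lock.
    public synchronized void start() {
        if (started) {
            throw new IllegalStateException("already started");
        }
        initCount++;
        started = true;
    }

    public synchronized int getInitCount() {
        return initCount;
    }
}

class CheckThenActDemo {
    // Returns {number of initializations, number of rejected calls}.
    static int[] race(int threads) {
        OneShotStarter starter = new OneShotStarter();
        CountDownLatch go = new CountDownLatch(1);
        AtomicInteger rejected = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    go.await(); // line all threads up before racing
                    starter.start();
                } catch (IllegalStateException e) {
                    rejected.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        go.countDown();
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
        }
        return new int[] { starter.getInitCount(), rejected.get() };
    }

    public static void main(String[] args) {
        int[] result = race(8);
        System.out.println("initialized " + result[0] + " time(s), rejected " + result[1] + " call(s)");
    }
}
```

Without synchronized on start(), two threads could both pass the check before either sets the flag, and the initialization could run more than once.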

@dugenkui03
Contributor Author

dugenkui03 commented Apr 25, 2022

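@RongtongJin @tianliuliu Please help review this again. The PR now has two changes:

  1. Lock DefaultLitePullConsumer#start() to avoid the inconsistent state caused by multiple threads calling the method concurrently. It is a rarely called method, so this will not hurt performance.
  2. Remove the volatile modifier from DefaultLitePullConsumerImpl.serviceState, because every path that currently accesses this state already holds the object lock, which is enough to guarantee the visibility of serviceState.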
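
As a rough illustration of the first change, the sketch below shows a facade whose synchronized start() checks the implementation's state and fails fast before it mutates any of its own fields. PullConsumerFacadeSketch, PullConsumerImplSketch, and the "ns%" prefix are made up for this example and are not taken from the PR diff.

```java
// Hypothetical implementation class; only the state check matters here.
class PullConsumerImplSketch {
    private boolean createJust = true;

    public synchronized boolean isCreateJust() {
        return createJust;
    }

    public synchronized void start() {
        if (!createJust) {
            throw new IllegalStateException("already started");
        }
        createJust = false;
    }
}

// Hypothetical facade that mutates its own fields as part of start().
class PullConsumerFacadeSketch {
    private final PullConsumerImplSketch impl = new PullConsumerImplSketch();
    private String consumerGroup;

    PullConsumerFacadeSketch(String consumerGroup) {
        this.consumerGroup = consumerGroup;
    }

    public synchronized void start() {
        // Fail fast: reject a repeated start before touching any facade state.
        if (!impl.isCreateJust()) {
            throw new IllegalStateException("start() may only be called once");
        }
        // One-time mutation (made-up prefix); applying it twice would corrupt the group name.
        consumerGroup = "ns%" + consumerGroup;
        impl.start();
    }

    public synchronized String getConsumerGroup() {
        return consumerGroup;
    }
}

class FacadeSketchDemo {
    public static void main(String[] args) {
        PullConsumerFacadeSketch consumer = new PullConsumerFacadeSketch("demo-group");
        consumer.start();
        try {
            consumer.start();
        } catch (IllegalStateException e) {
            System.out.println("second start rejected: " + e.getMessage());
        }
        System.out.println(consumer.getConsumerGroup());
    }
}
```

Locking the facade method keeps the state check and the field mutation together as one step, so a second caller cannot slip in between them.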

@dugenkui03 dugenkui03 requested a review from RongtongJin April 25, 2022 17:27
This pull request was closed.
7 participants