[fix][client] Use dedicated executor for requests in BinaryProtoLookupService #23378
Thanks for bringing this up. This seems to be a severe bug, and it would justify reporting it separately. It's easy to miss that this PR addresses a critical issue.
Is this the only case where this can be reproduced?
@nodece I think that the correct solution would be to call
The correct solution for keeping a reference in the `onPulsarCommand` broker interceptor implementation would be to use `new BaseCommand().copyFrom(command)`. That's why I don't think that the solution in the PR is directly related to addressing the problem.
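The hazard discussed here can be illustrated with a minimal, self-contained sketch. It does not use Pulsar classes: `Command`, `newCommand`, and the type strings are hypothetical stand-ins, assuming only that commands are built by mutating one reusable instance per thread. A reference kept without copying is overwritten by the next command built on the same thread, while a copy taken up front is not.

```java
final class ThreadLocalReuseSketch {
    // Hypothetical stand-in for a mutable protocol command (not Pulsar's BaseCommand).
    static final class Command {
        String type;

        // Copies the other command's state into this instance and returns this.
        Command copyFrom(Command other) {
            this.type = other.type;
            return this;
        }
    }

    // One reusable instance per thread (assumed reuse pattern for this sketch).
    private static final ThreadLocal<Command> LOCAL = ThreadLocal.withInitial(Command::new);

    // Builds a command by mutating the calling thread's shared instance.
    static Command newCommand(String type) {
        Command cmd = LOCAL.get();
        cmd.type = type;
        return cmd;
    }

    // Keeps the shared instance itself, so the next command overwrites it.
    static String retainedTypeWithoutCopy() {
        Command retained = newCommand("SEND");
        newCommand("PARTITION_METADATA"); // same thread, same instance
        return retained.type;
    }

    // Keeps a private copy, so later commands cannot change it.
    static String retainedTypeWithCopy() {
        Command retained = new Command().copyFrom(newCommand("SEND"));
        newCommand("PARTITION_METADATA");
        return retained.type;
    }

    public static void main(String[] args) {
        System.out.println(retainedTypeWithoutCopy()); // PARTITION_METADATA
        System.out.println(retainedTypeWithCopy());    // SEND
    }
}
```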
At present, it looks like this.
My interceptor doesn't change the command. Please note items 1 and 2 in my picture: the broker and the client run in the same thread.
Use the
Producer creation needs to do the lookup and then get the partition metadata, which requests the broker by […]. We are using the […]. The current version of the solution uses a dedicated executor to create commands and make requests in the client. Pulsar uses implicit executors in many places, so it is often unclear which executor runs a given piece of code, and that is how this incident occurred. Making the executor explicit is not a bad thing.
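A rough sketch of why a dedicated executor avoids the problem, under stated assumptions: `Command`, `newCommand`, the executor name, and the type strings are hypothetical and do not reflect the actual `BinaryProtoLookupService` code. Because each thread has its own `ThreadLocal` instance, a command built on the dedicated executor's thread cannot overwrite one the calling thread is still holding.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class DedicatedExecutorSketch {
    // Hypothetical stand-in for a mutable protocol command.
    static final class Command {
        String type;
    }

    // One reusable instance per thread (assumed reuse pattern for this sketch).
    private static final ThreadLocal<Command> LOCAL = ThreadLocal.withInitial(Command::new);

    static Command newCommand(String type) {
        Command cmd = LOCAL.get();
        cmd.type = type;
        return cmd;
    }

    // The caller keeps a command while the "lookup" command is built on another thread.
    static String retainedTypeAfterLookupOnDedicatedExecutor() {
        ExecutorService lookupExecutor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "lookup-sketch"); // hypothetical thread name
            t.setDaemon(true);
            return t;
        });
        try {
            Command retained = newCommand("SEND");
            // Runs on the executor's thread, which has its own ThreadLocal instance.
            CompletableFuture
                    .supplyAsync(() -> newCommand("PARTITION_METADATA").type, lookupExecutor)
                    .join();
            return retained.type;
        } finally {
            lookupExecutor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(retainedTypeAfterLookupOnDedicatedExecutor()); // SEND
    }
}
```

The trade-off raised in the discussion still applies: handing work to another executor adds queuing and a thread switch, which matters little for infrequent operations such as lookups.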
It's very hard to be sure whether it has performance impacts. Defining an executor adds the overhead of queuing work to the other executor, plus the extra thread switching. Whether this causes performance regressions depends a lot on the situation.
Do you have any suggestions or recommendations for resolving this issue?
@nodece The changes in this PR LGTM. There's no risk of additional overhead from using the separate executor, since the number of ops is relatively low for the operations that it applies to. I'll approve this PR. Thanks for the contribution!
LGTM
A lot of tests fail. This is causing problems.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

Coverage Diff

| | master | #23378 | +/- |
|---|---|---|---|
| Coverage | 73.57% | 74.32% | +0.75% |
| Complexity | 32624 | 34402 | +1778 |
| Files | 1877 | 1943 | +66 |
| Lines | 139502 | 146973 | +7471 |
| Branches | 15299 | 16191 | +892 |
| Hits | 102638 | 109239 | +6601 |
| Misses | 28908 | 29302 | +394 |
| Partials | 7956 | 8432 | +476 |
pulsar-client/src/main/java/org/apache/pulsar/client/impl/BinaryProtoLookupService.java
pulsar-client/src/main/java/org/apache/pulsar/client/impl/PulsarClientImpl.java
…pService Signed-off-by: Zixuan Liu <nodeces@gmail.com>
Ping @lhotari
lgtm
Signed-off-by: Zixuan Liu <nodeces@gmail.com>
LGTM, good work @nodece
…pService (apache#23378) Signed-off-by: Zixuan Liu <nodeces@gmail.com> (cherry picked from commit f98297f) Signed-off-by: Zixuan Liu <nodeces@gmail.com>
It was cherry-picked to the
@nodece Please notify the dev mailing list about this change. This PR could potentially cause regressions so it's better to keep others informed about the change in branch-3.0. I'm not against changing this, but it's hard to know if there are regressions before it has been extensively tested. Are you going to test branch-3.0 with this change?
…pService (apache#23378) (apache#23461) Signed-off-by: Zixuan Liu <nodeces@gmail.com> (cherry picked from commit 7006381)
Motivation
I'm testing Pulsar 3.0.6. We have a Pulsar interceptor that uses a producer to record data (web requests and binary commands) to a topic, but I got this error:
This indicates that the broker sends a `PartitionMetadataRequest` command to the client, which is bizarre behavior.
How to reproduce this issue?
`org.apache.pulsar.broker.intercept.BrokerInterceptor#onPulsarCommand`
`PARTITION_METADATA`
Due to Pulsar's use of ThreadLocal for command creation, the command instances are being unexpectedly modified.
Modifications
Documentation
doc
doc-required
doc-not-needed
doc-complete