[fix][broker] Deprecated aggregatePublisherStatsByProducerName config #18254
… stat aggregation (deprecated aggregatePublisherStatsByProducerName)
public boolean supportsPartialProducer;
/** Whether partial producer is supported at client. supportsPartialProducer is true always. */
@Deprecated
public boolean supportsPartialProducer = true;
Why still keep this?
The getter of this member var is already exposed to the client interface. Are we allowed to make a breaking change?
@@ -44,6 +44,7 @@ public interface PublisherStats {
long getProducerId();

/** Whether partial producer is supported at client. */
@Deprecated
boolean isSupportsPartialProducer();
Why not remove this?
This is already exposed to the client interface. Are we allowed to make a breaking change?
I think this is the correct change. If it has already been released, we cannot remove it without first deprecating it. I don't know that we have documented guidance on how long to keep methods like this marked as deprecated.
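For illustration, the deprecate-before-removing approach discussed in this thread could look roughly like the sketch below. The names `PublisherStatsSketch` and `PublisherStatsSketchImpl` are hypothetical stand-ins, not the actual Pulsar classes; the only behavior taken from the diff is that the getter stays in the interface, is marked `@Deprecated`, and always reports `true`.

```java
// Hypothetical, trimmed-down stand-in for the client-facing stats interface.
interface PublisherStatsSketch {
    long getProducerId();

    /**
     * Whether partial producer is supported at client.
     *
     * @deprecated kept so that already-released callers keep compiling; always true.
     */
    @Deprecated
    boolean isSupportsPartialProducer();
}

// Hypothetical implementation: the deprecated getter remains but no longer varies.
class PublisherStatsSketchImpl implements PublisherStatsSketch {
    private final long producerId;

    PublisherStatsSketchImpl(long producerId) {
        this.producerId = producerId;
    }

    @Override
    public long getProducerId() {
        return producerId;
    }

    @Deprecated
    @Override
    public boolean isSupportsPartialProducer() {
        // The value is fixed; the method exists only for compatibility.
        return true;
    }
}
```

Existing callers still compile and run, and only see a deprecation warning, so the method can be removed in a later release once a deprecation period has passed.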
Codecov Report
@@ Coverage Diff @@
## master #18254 +/- ##
============================================
+ Coverage 34.91% 38.88% +3.96%
- Complexity 5707 8307 +2600
============================================
Files 607 685 +78
Lines 53396 67330 +13934
Branches 5712 7215 +1503
============================================
+ Hits 18644 26180 +7536
- Misses 32119 38132 +6013
- Partials 2633 3018 +385
/pulsarbot rerun-failure-checks
In my understanding, stats from old partitioned producer clients are not fully aggregated by producerName. So, this PR introduces a breaking change for them. In addition, I think this approach doesn't deprecate a feature but removes it.
Yes, this PR is not backward-compatible; that is the cost of deprecating the broken feature.
For example, when enabling
and creating a partitioned producer on the partitioned topic with the v2.8.4 Java client,
then the broker returns partitioned topic stats like the one below.
However, when running with your fixes on the server side and creating a producer on a partitioned topic with the v2.8.4 Java client,
then the broker returns partitioned topic stats like the one below. As you can see, publishers can't be aggregated by producerName even if the partitioned producer is the only one on the partitioned topic. This is a breaking change.
I understand that the current index-based aggregation has the issue you mentioned in this PR, but are these breaking changes allowed in this project?
Thank you for sharing the example. Although it is not desirable, I think we could be lenient on this change in the stat API response(assuming this If this change is not acceptable, I guess we could push this fix to the next major version. Also, when there are thousands of producers(consumers) for a (partitioned-)topic, it is expensive to aggregate each publisher(subscriptions)'s stats on-the-fly across the brokers. So, alternatively for the next major version, I think we could further define producers(subscriptions)' API like the below and drop the
To fix the issue, do you have any suggestions other than what I mentioned above?
I have no other suggestions. Why don't you discuss or vote on the dev@ ML before introducing this feature?
Opened a Pulsar community discussion thread here: https://lists.apache.org/thread/vofv1oz0wvzlwk4x9vk067rhkscn8bqo
Motivation
The index-based publisher stat aggregation (configured by aggregatePublisherStatsByProducerName=false, the default) can blow up memory usage or wrongly aggregate publisher metrics if the partition stats return publisher stat lists of different sizes or in different orders.
In the worst case, if there are many partitions and publishers created and closed concurrently, the current code can create PublisherStatsImpl objects exponentially, and this can cause high GC time or OOM.
Issue Code reference:
2c428f7#diff-02e50674125a597f8ae3405a884590759f2fdaa10104cea511d5ea44b6ff6490R224-R247
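The mis-aggregation described above can be illustrated with a minimal sketch. This is not the broker's actual code: `PublisherAggregationSketch`, `Stat`, `aggregateByIndex`, and `aggregateByName` are hypothetical names, and the example only shows how merging by list position mixes producers when two partitions report the same publishers in a different order, while merging by producer name does not. It does not reproduce the object growth mentioned above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PublisherAggregationSketch {
    /** Hypothetical stand-in for one publisher entry in a partition's stats. */
    static final class Stat {
        final String producerName;
        final double msgRateIn;

        Stat(String producerName, double msgRateIn) {
            this.producerName = producerName;
            this.msgRateIn = msgRateIn;
        }
    }

    /** Index-based merge: entry i of every partition is summed into slot i. */
    static List<Double> aggregateByIndex(List<List<Stat>> partitions) {
        List<Double> totals = new ArrayList<>();
        for (List<Stat> partition : partitions) {
            for (int i = 0; i < partition.size(); i++) {
                if (i == totals.size()) {
                    totals.add(0.0); // grow the result when a longer list shows up
                }
                totals.set(i, totals.get(i) + partition.get(i).msgRateIn);
            }
        }
        return totals;
    }

    /** Name-based merge: entries are summed under their producer name. */
    static Map<String, Double> aggregateByName(List<List<Stat>> partitions) {
        Map<String, Double> totals = new LinkedHashMap<>();
        for (List<Stat> partition : partitions) {
            for (Stat stat : partition) {
                totals.merge(stat.producerName, stat.msgRateIn, Double::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Same two producers on both partitions, reported in a different order.
        List<List<Stat>> partitions = List.of(
                List.of(new Stat("producer-a", 1.0), new Stat("producer-b", 10.0)),
                List.of(new Stat("producer-b", 10.0), new Stat("producer-a", 1.0)));

        System.out.println(aggregateByIndex(partitions)); // [11.0, 11.0]
        System.out.println(aggregateByName(partitions));  // {producer-a=2.0, producer-b=20.0}
    }
}
```

With index-based merging, each slot ends up holding a mix of both producers' rates, so neither total is attributable to a producer. Keying by producer name gives the per-producer totals regardless of list order or size.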
Modifications
Deprecated the aggregatePublisherStatsByProducerName broker config because the default, index-based aggregation is inherently wrong in a highly concurrent producer environment (where the order and size of the publisher stat list are not guaranteed to be the same). The publisher stats need to be aggregated by a unique key, the producer name (aggregatePublisherStatsByProducerName=true).
Verifying this change
This change added unit tests.
Does this pull request potentially affect one of the following parts:
If the box was checked, please highlight the changes
Documentation
doc
doc-required
doc-not-needed
doc-complete
Matching PR in forked repository
PR in forked repository: heesung-sn#13