-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Draft] [fix] [broker] Do not record a bundle-unloading into the topic load failed metrics #23334
base: master
Are you sure you want to change the base?
[Draft] [fix] [broker] Do not record a bundle-unloading into the topic load failed metrics #23334
Conversation
MutableInt failedLoadTopic1 = new MutableInt(0); | ||
MutableInt concurrencyLoadTopicAndUnloadBundle1 = new MutableInt(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since multiple threads are involved, shouldn't these be AtomicIntegers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't currently understand the reason to use MutableInt
s.
If the reason is to that a final variable is needed for an inner lambda, it might be better to use a final variable.
One trick that could be useful is to isolate variable scopes with a {....}
block. That would allow to introduce the same variable name multiple times in a single unit test without variable name collisions since the variables would be in different scopes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
Don't we need to count this case as failure as well? To me if the topic load failed on the broker and client lookup failed, it should be counted as a failure. I think this is one area that Extensible Load Manager can shine as it handles lookups more gracefully during unloading -- the topic lookups are deferred until the unloading completes in the service unit state channel. Hence, the client won't see failure, and the lookup response should be slightly delayed instead of returning error code(no retry as well). |
Sorry, I misunderstood the comment. I reply again:
If users assumed it should be combined into failed to load topic, they can get the counter this way |
I still think this is a still failure case. Can we add a "reason" label instead of adding a new metric? |
I will start a PIP for this metric, it is useful to detect the following issue:
Update on 2024-09-26 02:05 UTC Since the PIP is in-progress, I mark this PR as |
Motivation & Modifications
Pulsar has a metric that indicates load topic failed:
topic_load_failed_total
, it reflects something is not expected, such asBut the topic loaded failed due to a bundle in unloading is expected, we should not record it into this metrics.
Documentation
doc
doc-required
doc-not-needed
doc-complete
Matching PR in forked repository
PR in forked repository: x