[fix][broker] Revert 5895: fix redeliveryCount #17060
+1
I support this revert.
I have checked the server-side filtering feature code (PIP-105).
In the case of filtered messages (REJECT outcome from the filter), we call PersistentSubscription#acknowledgeMessage(List<Position> positions, AckType ackType, Map<String, Long> properties):
Line 375 in c1c3c86
public void acknowledgeMessage(List<Position> positions, AckType ackType, Map<String, Long> properties) {
In the case of RESCHEDULED messages, we call PersistentSubscription#redeliverUnacknowledgedMessages:
Line 1174 in c1c3c86
public void redeliverUnacknowledgedMessages(Consumer consumer, List<PositionImpl> positions) {
That goes down to redeliveryTracker.addIfAbsent(position), which (IIUC) does not increment the delivery count:
pulsar/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/InMemoryRedeliveryTracker.java
Line 74 in cc64889
trackerCache.putIfAbsent(positionImpl.getLedgerId(), positionImpl.getEntryId(), 0, 0L);
We could add more testing on the delivery count for RESCHEDULED messages (we can improve the existing tests).
By the way, RESCHEDULED should not increment the delivery count, because the message is not sent to the Consumer.
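To make the point about `addIfAbsent` concrete, here is a minimal sketch of a tracker in which registering a position never changes its count. It is not the real `InMemoryRedeliveryTracker`: the class name, the string key, and the `incrementAndGet`/`get` helpers are assumptions made for illustration. Only the "insert with an initial count of 0 if absent" behavior comes from the line quoted above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a redelivery tracker; NOT the real Pulsar class.
final class RedeliveryTrackerSketch {
    // Maps "ledgerId:entryId" to the redelivery count recorded for that position.
    private final Map<String, Integer> counts = new HashMap<>();

    private static String key(long ledgerId, long entryId) {
        return ledgerId + ":" + entryId;
    }

    // Mirrors the quoted putIfAbsent(..., 0, 0L) call: a missing position is
    // registered with count 0, and an existing position is left untouched.
    void addIfAbsent(long ledgerId, long entryId) {
        counts.putIfAbsent(key(ledgerId, entryId), 0);
    }

    // Hypothetical explicit increment path, shown only for contrast.
    int incrementAndGet(long ledgerId, long entryId) {
        return counts.merge(key(ledgerId, entryId), 1, Integer::sum);
    }

    // Returns the recorded count, or 0 for an untracked position.
    int get(long ledgerId, long entryId) {
        return counts.getOrDefault(key(ledgerId, entryId), 0);
    }
}
```

Under these assumptions, calling `addIfAbsent` any number of times for a RESCHEDULED position leaves its count where it was, which is the behavior the analysis above relies on.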
@eolivelli - your analysis matches my understanding as well. I agree that incrementing the redelivery count is not the right design for the message filters. I am going to merge this as is and then we can work on fixing that design in a future PR.
Reverts: apache#5881

### Motivation

The `redeliveryCount` was introduced in [PIP 22](https://github.com/apache/pulsar/wiki/PIP-22%3A-Pulsar-Dead-Letter-Topic) with this PR apache#2508. It is an extra field on a message that indicates how many times a message has been redelivered. In the original design, it was only incremented for shared subscriptions when the consumer sent `REDELIVER_UNACKNOWLEDGED_MESSAGES` to the broker.

In apache#5881, this field's logic changed so that it is incremented each time a broker delivers a message to a consumer (after the initial delivery). The problem with this logic is that it counts messages that are sent to a consumer's `receiveQueue`, but not actually received by the client application, as "delivered" messages. This is especially problematic for the DLQ implementation because it relies on the counter to track deliveries, and this eager incrementing of the `redeliveryCount` could lead to fewer retries than an application would like.

This PR returns the broker's behavior to the original state before apache#5881.

Note that the DLQ logic is only triggered by messages that hit their ack timeout or are negatively acknowledged. This means that in some cases, a message could be delivered many times to a `receiveQueue` and once to the application and then sent to the DLQ. Given that our DLQ implementation has an intentional preference towards over delivery instead of under delivery, I think this logic should be fixed.

One of the consequences of this PR is that the message filter logic for redelivering messages triggers this logic for incrementing `redeliveryCount`. See this code here: https://github.com/apache/pulsar/blob/b1a29b520d34d60e60160e3a7b9b0e26926063ee/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/AbstractBaseDispatcher.java#L198-L206

I'll need feedback from someone more familiar with message filtering to understand if this is a problematic change. If it is, I think we might need to revisit the logic in `filterEntriesForConsumer`.

### Modifications

* Revert the relevant changes from apache#5895. I kept the test that was added in the PR and modified the assertion.
* Fix test assertion ordering and modify expected value to align with new paradigm.

### Verifying this change

This change includes modifications to tests as well as existing test coverage.

### Does this pull request potentially affect one of the following parts:

This change is a break in current behavior, so I will send an email to the dev mailing list: https://lists.apache.org/thread/ts9d6zbtlz3y5xtv7p0c3dslk0vljpj2.

### Documentation

- [x] `doc-not-needed`

(cherry picked from commit 2fd3509)
I'm "riding" on this topic to ask a tangential question. We have the "redeliveryCount" meta field in the message, as written in this PR description, and we also have RECONSUMETIMES, as written here. What's the relationship between those?
Reverts: #5881

### Motivation

The `redeliveryCount` was introduced in PIP 22 with this PR #2508. It is an extra field on a message that indicates how many times a message has been redelivered. In the original design, it was only incremented for shared subscriptions when the consumer sent `REDELIVER_UNACKNOWLEDGED_MESSAGES` to the broker.

In #5881, this field's logic changed so that it is incremented each time a broker delivers a message to a consumer (after the initial delivery). The problem with this logic is that it counts messages that are sent to a consumer's `receiveQueue`, but not actually received by the client application, as "delivered" messages. This is especially problematic for the DLQ implementation because it relies on the counter to track deliveries, and this eager incrementing of the `redeliveryCount` could lead to fewer retries than an application would like.

This PR returns the broker's behavior to the original state before #5881.

Note that the DLQ logic is only triggered by messages that hit their ack timeout or are negatively acknowledged. This means that in some cases, a message could be delivered many times to a `receiveQueue` and once to the application and then sent to the DLQ. Given that our DLQ implementation has an intentional preference towards over delivery instead of under delivery, I think this logic should be fixed.

One of the consequences of this PR is that the message filter logic for redelivering messages triggers this logic for incrementing `redeliveryCount`. See this code here:
pulsar/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/AbstractBaseDispatcher.java
Lines 198 to 206 in b1a29b5

I'll need feedback from someone more familiar with message filtering to understand if this is a problematic change. If it is, I think we might need to revisit the logic in `filterEntriesForConsumer`.

### Modifications

* Revert the relevant changes from #5895. I kept the test that was added in the PR and modified the assertion.
* Fix test assertion ordering and modify expected value to align with new paradigm.

### Verifying this change

This change includes modifications to tests as well as existing test coverage.

### Does this pull request potentially affect one of the following parts:

This change is a break in current behavior, so I will send an email to the dev mailing list: https://lists.apache.org/thread/ts9d6zbtlz3y5xtv7p0c3dslk0vljpj2.

### Documentation

- [x] `doc-not-needed`
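
The difference between the two counting policies described in the Motivation can be sketched with a small simulation. This is an illustrative model only, not broker code: the `Event` enum, the class name, and both method names are hypothetical, and the model assumes the eager policy counts every dispatch after the first while the original policy counts only explicit redelivery requests from the consumer.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the two counting policies; NOT Pulsar broker code.
final class RedeliveryCountPolicies {
    // DISPATCH: the broker sends the message to a consumer's receiveQueue.
    // REDELIVER_REQUEST: the consumer asks for redelivery (nack or ack timeout).
    enum Event { DISPATCH, REDELIVER_REQUEST }

    // Eager policy: every dispatch after the initial one increments the count,
    // whether or not the application ever saw the message.
    static int eagerCount(List<Event> events) {
        int dispatches = 0;
        for (Event e : events) {
            if (e == Event.DISPATCH) {
                dispatches++;
            }
        }
        return Math.max(0, dispatches - 1);
    }

    // Original policy: only explicit redelivery requests increment the count.
    static int onRequestCount(List<Event> events) {
        int requests = 0;
        for (Event e : events) {
            if (e == Event.REDELIVER_REQUEST) {
                requests++;
            }
        }
        return requests;
    }

    public static void main(String[] args) {
        // The message reaches a receiveQueue three times before the application
        // processes it once, negatively acknowledges it, and it is dispatched again.
        List<Event> events = Arrays.asList(
                Event.DISPATCH, Event.DISPATCH, Event.DISPATCH,
                Event.REDELIVER_REQUEST, Event.DISPATCH);
        // prints "eager=3 onRequest=1"
        System.out.println("eager=" + eagerCount(events)
                + " onRequest=" + onRequestCount(events));
    }
}
```

In this model, a consumer with a maximum redelivery count of 3 would see the eager counter reach that limit after the application handled the message only once, which is the "fewer retries than an application would like" outcome the description warns about.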