Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix wrong redelivery count while redeliver when consumer disconnected. #5895

Merged
merged 2 commits into from
Dec 24, 2019

Conversation

codelipenghui
Copy link
Contributor

Fixes #5881

Motivation

Currently implementation only increase redelivery count when consumer call redeliver un-ack messages, since a consumer disconnect also let messages redeliver to other consumers in Shared subscription mode, but the redelivery count does not increase.

Modifications

  1. When consumer send redelivery un-ack message request or consumer disconnect, add the position to redelivery message tracker.
  2. When send messages to consumer will check if the tracker contains this message, if yes, increase redelivery count, otherwise use 0 as redelivery count.

Verifying this change

Added unit test for this change

Does this pull request potentially affect one of the following parts:

If yes was chosen, please highlight the changes

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API: (no)
  • The schema: (no)
  • The default values of configurations: (no)
  • The wire protocol: (no)
  • The rest endpoints: (no)
  • The admin cli options: (no)
  • Anything that affects deployment: (no)

Documentation

  • Does this pull request introduce a new feature? (no)

@codelipenghui
Copy link
Contributor Author

run java8 tests
run cpp tests

@sijie
Copy link
Member

sijie commented Dec 20, 2019

run cpp tests
run java8 tests

@sijie sijie added this to the 2.5.0 milestone Dec 20, 2019
@codelipenghui
Copy link
Contributor Author

run cpp tests
run java8 tests

@tuteng
Copy link
Member

tuteng commented Dec 21, 2019

run cpp tests

1 similar comment
@tuteng
Copy link
Member

tuteng commented Dec 21, 2019

run cpp tests

@Humbedooh
Copy link
Member

run cpp tests
(testing for the third time whether this does anything, ignore)

@Humbedooh
Copy link
Member

run cpp tests

@codelipenghui
Copy link
Contributor Author

run cpp tests
run java8 tests

@codelipenghui
Copy link
Contributor Author

run java8 tests

@sijie sijie merged commit d3f45ed into apache:master Dec 24, 2019
huangdx0726 pushed a commit to huangdx0726/pulsar that referenced this pull request Aug 24, 2020
apache#5895)

Fixes apache#5881

### Motivation

Currently implementation only increase redelivery count when consumer call redeliver un-ack messages, since a consumer disconnect also let messages redeliver to other consumers in Shared subscription mode, but the redelivery count does not increase. 

### Modifications

1. When consumer send redelivery un-ack message request or consumer disconnect, add the position to redelivery message tracker.
2. When send messages to consumer will check if the tracker contains this message, if yes, increase redelivery count, otherwise use 0 as redelivery count.

### Verifying this change

Added unit test for this change
@codelipenghui codelipenghui deleted the fix_redelivery_count branch May 19, 2021 05:31
michaeljmarshall added a commit that referenced this pull request Aug 24, 2022
Reverts: #5881

### Motivation

The `redeliveryCount` was introduced in [PIP 22](https://github.com/apache/pulsar/wiki/PIP-22%3A-Pulsar-Dead-Letter-Topic) with this PR #2508. It is an extra field on a message that indicates how many times a message has been redelivered. In the original design, it was only incremented for shared subscriptions when the consumer sent `REDELIVER_UNACKNOWLEDGED_MESSAGES` to the broker.

In #5881, this field's logic changed so that it is incremented each time a broker delivers a message to a consumer (after the initial delivery). The problem with this logic is that it counts messages that are sent to a consumer's `receiveQueue`, but not actually received by the client application, as "delivered" messages. This is especially problematic for the DLQ implementation because it relies on the counter to track deliveries, and this eager incrementing of the `redeliveryCount` could lead to fewer retries than an application would like.

This PR returns the broker's behavior to the original state before #5881.

Note that the DLQ logic is only triggered by messages that hit their ack timeout or are negatively acknowledged. This means that in some cases, a message could be delivered many times to a `receiveQueue` and once to the application and then sent to the DLQ. Given that our DLQ implementation has an intentional preference towards over delivery instead of under delivery, I think this logic should be fixed.

One of the consequences of this PR is that the message filter logic for redelivering messages triggers this logic for incrementing `redeliveryCount`. See this code here: https://github.com/apache/pulsar/blob/b1a29b520d34d60e60160e3a7b9b0e26926063ee/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/AbstractBaseDispatcher.java#L198-L206

I'll need feedback from someone more familiar with message filtering to understand if this is a problematic change. If it is, I think we might need to revisit the logic in `filterEntriesForConsumer`.

### Modifications

* Revert the relevant changes from #5895. I kept the test that was added in the PR and modified the assertion.
* Fix test assertion ordering and modify expected value to align with new paradigm.

### Verifying this change

This change includes modifications to tests as well as existing test coverage.

### Does this pull request potentially affect one of the following parts:

This change is a break in current behavior, so I will send an email to the dev mailing list: https://lists.apache.org/thread/ts9d6zbtlz3y5xtv7p0c3dslk0vljpj2.

### Documentation
  
- [x] `doc-not-needed`
michaeljmarshall added a commit to datastax/pulsar that referenced this pull request Aug 24, 2022
Reverts: apache#5881

### Motivation

The `redeliveryCount` was introduced in [PIP 22](https://github.com/apache/pulsar/wiki/PIP-22%3A-Pulsar-Dead-Letter-Topic) with this PR apache#2508. It is an extra field on a message that indicates how many times a message has been redelivered. In the original design, it was only incremented for shared subscriptions when the consumer sent `REDELIVER_UNACKNOWLEDGED_MESSAGES` to the broker.

In apache#5881, this field's logic changed so that it is incremented each time a broker delivers a message to a consumer (after the initial delivery). The problem with this logic is that it counts messages that are sent to a consumer's `receiveQueue`, but not actually received by the client application, as "delivered" messages. This is especially problematic for the DLQ implementation because it relies on the counter to track deliveries, and this eager incrementing of the `redeliveryCount` could lead to fewer retries than an application would like.

This PR returns the broker's behavior to the original state before apache#5881.

Note that the DLQ logic is only triggered by messages that hit their ack timeout or are negatively acknowledged. This means that in some cases, a message could be delivered many times to a `receiveQueue` and once to the application and then sent to the DLQ. Given that our DLQ implementation has an intentional preference towards over delivery instead of under delivery, I think this logic should be fixed.

One of the consequences of this PR is that the message filter logic for redelivering messages triggers this logic for incrementing `redeliveryCount`. See this code here: https://github.com/apache/pulsar/blob/b1a29b520d34d60e60160e3a7b9b0e26926063ee/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/AbstractBaseDispatcher.java#L198-L206

I'll need feedback from someone more familiar with message filtering to understand if this is a problematic change. If it is, I think we might need to revisit the logic in `filterEntriesForConsumer`.

### Modifications

* Revert the relevant changes from apache#5895. I kept the test that was added in the PR and modified the assertion.
* Fix test assertion ordering and modify expected value to align with new paradigm.

### Verifying this change

This change includes modifications to tests as well as existing test coverage.

### Does this pull request potentially affect one of the following parts:

This change is a break in current behavior, so I will send an email to the dev mailing list: https://lists.apache.org/thread/ts9d6zbtlz3y5xtv7p0c3dslk0vljpj2.

### Documentation

- [x] `doc-not-needed`

(cherry picked from commit 2fd3509)
Nicklee007 pushed a commit to Nicklee007/pulsar that referenced this pull request Aug 29, 2022
Reverts: apache#5881

### Motivation

The `redeliveryCount` was introduced in [PIP 22](https://github.com/apache/pulsar/wiki/PIP-22%3A-Pulsar-Dead-Letter-Topic) with this PR apache#2508. It is an extra field on a message that indicates how many times a message has been redelivered. In the original design, it was only incremented for shared subscriptions when the consumer sent `REDELIVER_UNACKNOWLEDGED_MESSAGES` to the broker.

In apache#5881, this field's logic changed so that it is incremented each time a broker delivers a message to a consumer (after the initial delivery). The problem with this logic is that it counts messages that are sent to a consumer's `receiveQueue`, but not actually received by the client application, as "delivered" messages. This is especially problematic for the DLQ implementation because it relies on the counter to track deliveries, and this eager incrementing of the `redeliveryCount` could lead to fewer retries than an application would like.

This PR returns the broker's behavior to the original state before apache#5881.

Note that the DLQ logic is only triggered by messages that hit their ack timeout or are negatively acknowledged. This means that in some cases, a message could be delivered many times to a `receiveQueue` and once to the application and then sent to the DLQ. Given that our DLQ implementation has an intentional preference towards over delivery instead of under delivery, I think this logic should be fixed.

One of the consequences of this PR is that the message filter logic for redelivering messages triggers this logic for incrementing `redeliveryCount`. See this code here: https://github.com/apache/pulsar/blob/b1a29b520d34d60e60160e3a7b9b0e26926063ee/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/AbstractBaseDispatcher.java#L198-L206

I'll need feedback from someone more familiar with message filtering to understand if this is a problematic change. If it is, I think we might need to revisit the logic in `filterEntriesForConsumer`.

### Modifications

* Revert the relevant changes from apache#5895. I kept the test that was added in the PR and modified the assertion.
* Fix test assertion ordering and modify expected value to align with new paradigm.

### Verifying this change

This change includes modifications to tests as well as existing test coverage.

### Does this pull request potentially affect one of the following parts:

This change is a break in current behavior, so I will send an email to the dev mailing list: https://lists.apache.org/thread/ts9d6zbtlz3y5xtv7p0c3dslk0vljpj2.

### Documentation
  
- [x] `doc-not-needed`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Redelivery count implementation can bring down all the consumers
4 participants