Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix][client] Fix failover/exclusive consumer with batch cumulate ack issue. #18454

Merged
merged 4 commits into from
Nov 15, 2022

Conversation

Technoboy-
Copy link
Contributor

@Technoboy- Technoboy- commented Nov 14, 2022

Motivation

When the consumer redeliverUnacknowledgedMessages, the MultiTopicsConsumerImpl doesn't filter the duplicate message. Also, the ConsumerImpl doesn't check the batched msg-id, this will cause missing msg in the below case :

  • Send one msg with batch size of 10.
  • Cumulate ack the first one.
  • Redeliver the msg.
  • The left 9 msg will be filtered(missing).

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

@Technoboy- Technoboy- self-assigned this Nov 14, 2022
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Nov 14, 2022
@Technoboy- Technoboy- added type/bug The PR fixed a bug or issue reported a bug area/client ready-to-test labels Nov 14, 2022
@Technoboy- Technoboy- added this to the 2.12.0 milestone Nov 14, 2022
@Technoboy- Technoboy- added release/blocker Indicate the PR or issue that should block the release until it gets resolved and removed release/blocker Indicate the PR or issue that should block the release until it gets resolved labels Nov 14, 2022
@Technoboy- Technoboy- closed this Nov 14, 2022
@Technoboy- Technoboy- reopened this Nov 14, 2022
@codecov-commenter
Copy link

codecov-commenter commented Nov 14, 2022

Codecov Report

Merging #18454 (dd28dad) into master (b31c5a6) will increase coverage by 0.15%.
The diff coverage is 39.04%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #18454      +/-   ##
============================================
+ Coverage     46.98%   47.14%   +0.15%     
- Complexity    10343    10413      +70     
============================================
  Files           692      697       +5     
  Lines         67766    67981     +215     
  Branches       7259     7278      +19     
============================================
+ Hits          31842    32048     +206     
- Misses        32344    32366      +22     
+ Partials       3580     3567      -13     
Flag Coverage Δ
unittests 47.14% <39.04%> (+0.15%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
.../org/apache/bookkeeper/mledger/impl/EntryImpl.java 69.62% <0.00%> (-11.27%) ⬇️
...lsar/broker/service/RedeliveryTrackerDisabled.java 50.00% <ø> (ø)
...va/org/apache/pulsar/broker/service/ServerCnx.java 48.57% <ø> (-0.12%) ⬇️
...ersistentStreamingDispatcherMultipleConsumers.java 0.00% <0.00%> (ø)
.../java/org/apache/pulsar/client/impl/ClientCnx.java 30.16% <ø> (ø)
...a/org/apache/pulsar/client/impl/TableViewImpl.java 0.00% <0.00%> (ø)
...ar/client/impl/conf/ProducerConfigurationData.java 84.70% <ø> (-0.18%) ⬇️
...va/org/apache/pulsar/client/impl/ConsumerImpl.java 15.03% <7.69%> (-0.02%) ⬇️
...keeper/mledger/impl/cache/RangeEntryCacheImpl.java 45.79% <22.44%> (-6.31%) ⬇️
...he/pulsar/client/impl/MultiTopicsConsumerImpl.java 22.86% <25.00%> (-0.09%) ⬇️
... and 76 more

Copy link
Member

@mattisonchao mattisonchao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, 👍🏼
It looks like the problems are not only duplicate messages but also missing messages,
I'm wondering if It's better to add a declaration in the motivation part.

Copy link
Contributor

@poorbarcode poorbarcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch!

@Technoboy- Technoboy- merged commit 7712aa3 into apache:master Nov 15, 2022
BewareMyPower added a commit to BewareMyPower/pulsar that referenced this pull request Nov 15, 2022
### Motivation

apache#18454 fixed the potential message
loss when a batched message is redelivered and one single message of the
batch is added to the ACK tracker. However, it also leads to a
potential message duplication, see the `testConsumerDedup` test modified
by apache#18454.

The root cause is that single messages will still be passed into the
`isDuplicated` method in `receiveIndividualMessagesFromBatch`. However,
in this case, the `MessageId` is a `BatchedMessageIdImpl`, while the
`MessageId` in `lastCumulativeAck` or `pendingIndividualAcks` are
`MessageIdImpl` implementations.

### Modifications

Validate the class type in `isDuplicated` and convert a
`BatchedMessageIdImpl` to `MessageIdImpl`. Then revert the unnecessary
changes in apache#18454.

`ConsumerRedeliveryTest#testAckNotSent` is added to verify it works.

### TODO

The duplication could still happen when batch index ACK is enabled.
Because even after the ACK tracker is flushed, if only parts of a
batched message are not acknowledged, the whole batched message would
still be redelivered. I will open another PR to fix it.
congbobo184 pushed a commit that referenced this pull request Nov 18, 2022
@congbobo184 congbobo184 added the cherry-picked/branch-2.9 Archived: 2.9 is end of life label Nov 18, 2022
congbobo184 pushed a commit that referenced this pull request Nov 21, 2022
…#18486)

### Motivation

#18454 fixed the potential message loss when a batched message is redelivered and one single message of the batch is added to the ACK tracker. However, it also leads to a potential message duplication, see the `testConsumerDedup` test modified by #18454.

The root cause is that single messages will still be passed into the `isDuplicated` method in `receiveIndividualMessagesFromBatch`. However, in this case, the `MessageId` is a `BatchedMessageIdImpl`, while the `MessageId` in `lastCumulativeAck` or `pendingIndividualAcks` are `MessageIdImpl` implementations.

### Modifications

Validate the class type in `isDuplicated` and convert a `BatchedMessageIdImpl` to `MessageIdImpl`. Then revert the unnecessary changes in #18454.

`ConsumerRedeliveryTest#testAckNotSent` is added to verify it works.

### TODO

The duplication could still happen when batch index ACK is enabled. Because even after the ACK tracker is flushed, if only parts of a batched message are not acknowledged, the whole batched message would still be redelivered. I will open another PR to fix it.

### Documentation

<!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->

- [ ] `doc` <!-- Your PR contains doc changes. Please attach the local preview screenshots (run `sh start.sh` at `pulsar/site2/website`) to your PR description, or else your PR might not get merged. -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

### Matching PR in forked repository

PR in forked repository: BewareMyPower#8
congbobo184 pushed a commit that referenced this pull request Nov 21, 2022
…#18486)

#18454 fixed the potential message loss when a batched message is redelivered and one single message of the batch is added to the ACK tracker. However, it also leads to a potential message duplication, see the `testConsumerDedup` test modified by #18454.

The root cause is that single messages will still be passed into the `isDuplicated` method in `receiveIndividualMessagesFromBatch`. However, in this case, the `MessageId` is a `BatchedMessageIdImpl`, while the `MessageId` in `lastCumulativeAck` or `pendingIndividualAcks` are `MessageIdImpl` implementations.

Validate the class type in `isDuplicated` and convert a `BatchedMessageIdImpl` to `MessageIdImpl`. Then revert the unnecessary changes in #18454.

`ConsumerRedeliveryTest#testAckNotSent` is added to verify it works.

The duplication could still happen when batch index ACK is enabled. Because even after the ACK tracker is flushed, if only parts of a batched message are not acknowledged, the whole batched message would still be redelivered. I will open another PR to fix it.

<!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->

- [ ] `doc` <!-- Your PR contains doc changes. Please attach the local preview screenshots (run `sh start.sh` at `pulsar/site2/website`) to your PR description, or else your PR might not get merged. -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

PR in forked repository: BewareMyPower#8

(cherry picked from commit be1d07e)
congbobo184 pushed a commit that referenced this pull request Nov 26, 2022
liangyepianzhou pushed a commit that referenced this pull request Dec 5, 2022
…#18486)

#18454 fixed the potential message loss when a batched message is redelivered and one single message of the batch is added to the ACK tracker. However, it also leads to a potential message duplication, see the `testConsumerDedup` test modified by #18454.

The root cause is that single messages will still be passed into the `isDuplicated` method in `receiveIndividualMessagesFromBatch`. However, in this case, the `MessageId` is a `BatchedMessageIdImpl`, while the `MessageId` in `lastCumulativeAck` or `pendingIndividualAcks` are `MessageIdImpl` implementations.

Validate the class type in `isDuplicated` and convert a `BatchedMessageIdImpl` to `MessageIdImpl`. Then revert the unnecessary changes in #18454.

`ConsumerRedeliveryTest#testAckNotSent` is added to verify it works.

The duplication could still happen when batch index ACK is enabled. Because even after the ACK tracker is flushed, if only parts of a batched message are not acknowledged, the whole batched message would still be redelivered. I will open another PR to fix it.

<!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->

- [ ] `doc` <!-- Your PR contains doc changes. Please attach the local preview screenshots (run `sh start.sh` at `pulsar/site2/website`) to your PR description, or else your PR might not get merged. -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

PR in forked repository: BewareMyPower#8

(cherry picked from commit be1d07e)
nicoloboschi pushed a commit to datastax/pulsar that referenced this pull request Dec 6, 2022
…apache#18486)

apache#18454 fixed the potential message loss when a batched message is redelivered and one single message of the batch is added to the ACK tracker. However, it also leads to a potential message duplication, see the `testConsumerDedup` test modified by apache#18454.

The root cause is that single messages will still be passed into the `isDuplicated` method in `receiveIndividualMessagesFromBatch`. However, in this case, the `MessageId` is a `BatchedMessageIdImpl`, while the `MessageId` in `lastCumulativeAck` or `pendingIndividualAcks` are `MessageIdImpl` implementations.

Validate the class type in `isDuplicated` and convert a `BatchedMessageIdImpl` to `MessageIdImpl`. Then revert the unnecessary changes in apache#18454.

`ConsumerRedeliveryTest#testAckNotSent` is added to verify it works.

The duplication could still happen when batch index ACK is enabled. Because even after the ACK tracker is flushed, if only parts of a batched message are not acknowledged, the whole batched message would still be redelivered. I will open another PR to fix it.

<!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->

- [ ] `doc` <!-- Your PR contains doc changes. Please attach the local preview screenshots (run `sh start.sh` at `pulsar/site2/website`) to your PR description, or else your PR might not get merged. -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

PR in forked repository: BewareMyPower#8

(cherry picked from commit be1d07e)
(cherry picked from commit 870b060)
congbobo184 pushed a commit that referenced this pull request Dec 8, 2022
…#18486)

#18454 fixed the potential message loss when a batched message is redelivered and one single message of the batch is added to the ACK tracker. However, it also leads to a potential message duplication, see the `testConsumerDedup` test modified by #18454.

The root cause is that single messages will still be passed into the `isDuplicated` method in `receiveIndividualMessagesFromBatch`. However, in this case, the `MessageId` is a `BatchedMessageIdImpl`, while the `MessageId` in `lastCumulativeAck` or `pendingIndividualAcks` are `MessageIdImpl` implementations.

Validate the class type in `isDuplicated` and convert a `BatchedMessageIdImpl` to `MessageIdImpl`. Then revert the unnecessary changes in #18454.

`ConsumerRedeliveryTest#testAckNotSent` is added to verify it works.

The duplication could still happen when batch index ACK is enabled. Because even after the ACK tracker is flushed, if only parts of a batched message are not acknowledged, the whole batched message would still be redelivered. I will open another PR to fix it.

<!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->

- [ ] `doc` <!-- Your PR contains doc changes. Please attach the local preview screenshots (run `sh start.sh` at `pulsar/site2/website`) to your PR description, or else your PR might not get merged. -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

PR in forked repository: BewareMyPower#8

(cherry picked from commit be1d07e)
lifepuzzlefun pushed a commit to lifepuzzlefun/pulsar that referenced this pull request Dec 9, 2022
…apache#18486)

### Motivation

apache#18454 fixed the potential message loss when a batched message is redelivered and one single message of the batch is added to the ACK tracker. However, it also leads to a potential message duplication, see the `testConsumerDedup` test modified by apache#18454.

The root cause is that single messages will still be passed into the `isDuplicated` method in `receiveIndividualMessagesFromBatch`. However, in this case, the `MessageId` is a `BatchedMessageIdImpl`, while the `MessageId` in `lastCumulativeAck` or `pendingIndividualAcks` are `MessageIdImpl` implementations.

### Modifications

Validate the class type in `isDuplicated` and convert a `BatchedMessageIdImpl` to `MessageIdImpl`. Then revert the unnecessary changes in apache#18454.

`ConsumerRedeliveryTest#testAckNotSent` is added to verify it works.

### TODO

The duplication could still happen when batch index ACK is enabled. Because even after the ACK tracker is flushed, if only parts of a batched message are not acknowledged, the whole batched message would still be redelivered. I will open another PR to fix it.

### Documentation

<!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->

- [ ] `doc` <!-- Your PR contains doc changes. Please attach the local preview screenshots (run `sh start.sh` at `pulsar/site2/website`) to your PR description, or else your PR might not get merged. -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

### Matching PR in forked repository

PR in forked repository: BewareMyPower#8
liangyepianzhou pushed a commit that referenced this pull request Dec 14, 2022
nicoloboschi pushed a commit to datastax/pulsar that referenced this pull request Jan 10, 2023
… issue. (apache#18454)

(cherry picked from commit 7712aa3)
(cherry picked from commit 6f69aa4)
lifepuzzlefun pushed a commit to lifepuzzlefun/pulsar that referenced this pull request Jan 10, 2023
…apache#18486)

### Motivation

apache#18454 fixed the potential message loss when a batched message is redelivered and one single message of the batch is added to the ACK tracker. However, it also leads to a potential message duplication, see the `testConsumerDedup` test modified by apache#18454.

The root cause is that single messages will still be passed into the `isDuplicated` method in `receiveIndividualMessagesFromBatch`. However, in this case, the `MessageId` is a `BatchedMessageIdImpl`, while the `MessageId` in `lastCumulativeAck` or `pendingIndividualAcks` are `MessageIdImpl` implementations.

### Modifications

Validate the class type in `isDuplicated` and convert a `BatchedMessageIdImpl` to `MessageIdImpl`. Then revert the unnecessary changes in apache#18454.

`ConsumerRedeliveryTest#testAckNotSent` is added to verify it works.

### TODO

The duplication could still happen when batch index ACK is enabled. Because even after the ACK tracker is flushed, if only parts of a batched message are not acknowledged, the whole batched message would still be redelivered. I will open another PR to fix it.

### Documentation

<!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->

- [ ] `doc` <!-- Your PR contains doc changes. Please attach the local preview screenshots (run `sh start.sh` at `pulsar/site2/website`) to your PR description, or else your PR might not get merged. -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

### Matching PR in forked repository

PR in forked repository: BewareMyPower#8
nicoloboschi pushed a commit to datastax/pulsar that referenced this pull request Jan 11, 2023
… issue. (apache#18454)

(cherry picked from commit 7712aa3)
(cherry picked from commit 6f69aa4)
Technoboy- pushed a commit that referenced this pull request Feb 8, 2023
…#18486)

#18454 fixed the potential message loss when a batched message is redelivered and one single message of the batch is added to the ACK tracker. However, it also leads to a potential message duplication, see the `testConsumerDedup` test modified by #18454.

The root cause is that single messages will still be passed into the `isDuplicated` method in `receiveIndividualMessagesFromBatch`. However, in this case, the `MessageId` is a `BatchedMessageIdImpl`, while the `MessageId` in `lastCumulativeAck` or `pendingIndividualAcks` are `MessageIdImpl` implementations.

Validate the class type in `isDuplicated` and convert a `BatchedMessageIdImpl` to `MessageIdImpl`. Then revert the unnecessary changes in #18454.

`ConsumerRedeliveryTest#testAckNotSent` is added to verify it works.

The duplication could still happen when batch index ACK is enabled. Because even after the ACK tracker is flushed, if only parts of a batched message are not acknowledged, the whole batched message would still be redelivered. I will open another PR to fix it.

<!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->

- [ ] `doc` <!-- Your PR contains doc changes. Please attach the local preview screenshots (run `sh start.sh` at `pulsar/site2/website`) to your PR description, or else your PR might not get merged. -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

PR in forked repository: BewareMyPower#8
@Technoboy- Technoboy- deleted the fix-failover-cumu branch November 11, 2023 07:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants