-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix flake in C++ negative acknowledgement tests #7099
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Negative acknowledgement runs in the background on a consumer and triggers redelivery of messages. The tests verify a that messages do indeed get redelivered, and which messages they are, for the base case, batching and partitioned consumer. There's a fundamental dependency on timing in the base case. If 100ms pass between consumer creation and receiving the last message in first receive loop, redelivery will be triggered and the order of messages, as asserted by the test will fail. This first case can be fixed by moving the negative ack to run after all messages have been received. However, this can also then fail for the batch case. If the negative ack tracker kicks off during the loop to negatively ack the messages, then the redelivery will happen twice (and possibly more times depending on how many time it manages to run). For this reason, if we want the test to be deterministic, we need to disable the tracker from kicking off redelivery while we send mark the messages as negatively acked.
merlimat
assigned ivankelly, sijie, srkukarni, jerrypeng, aahmed-se and codelipenghui and unassigned ivankelly, sijie, srkukarni, jerrypeng, aahmed-se and codelipenghui
May 29, 2020
merlimat
requested review from
aahmed-se,
codelipenghui,
jerrypeng,
sijie and
srkukarni
May 29, 2020 19:46
aahmed-se
approved these changes
May 31, 2020
Huanli-Meng
pushed a commit
to Huanli-Meng/pulsar
that referenced
this pull request
Jun 1, 2020
Negative acknowledgement runs in the background on a consumer and triggers redelivery of messages. The tests verify a that messages do indeed get redelivered, and which messages they are, for the base case, batching and partitioned consumer. There's a fundamental dependency on timing in the base case. If 100ms pass between consumer creation and receiving the last message in first receive loop, redelivery will be triggered and the order of messages, as asserted by the test will fail. This first case can be fixed by moving the negative ack to run after all messages have been received. However, this can also then fail for the batch case. If the negative ack tracker kicks off during the loop to negatively ack the messages, then the redelivery will happen twice (and possibly more times depending on how many time it manages to run). For this reason, if we want the test to be deterministic, we need to disable the tracker from kicking off redelivery while we send mark the messages as negatively acked. Co-authored-by: Ivan Kelly <ikelly@splunk.com>
Huanli-Meng
pushed a commit
to Huanli-Meng/pulsar
that referenced
this pull request
Jun 1, 2020
Negative acknowledgement runs in the background on a consumer and triggers redelivery of messages. The tests verify a that messages do indeed get redelivered, and which messages they are, for the base case, batching and partitioned consumer. There's a fundamental dependency on timing in the base case. If 100ms pass between consumer creation and receiving the last message in first receive loop, redelivery will be triggered and the order of messages, as asserted by the test will fail. This first case can be fixed by moving the negative ack to run after all messages have been received. However, this can also then fail for the batch case. If the negative ack tracker kicks off during the loop to negatively ack the messages, then the redelivery will happen twice (and possibly more times depending on how many time it manages to run). For this reason, if we want the test to be deterministic, we need to disable the tracker from kicking off redelivery while we send mark the messages as negatively acked. Co-authored-by: Ivan Kelly <ikelly@splunk.com>
Huanli-Meng
pushed a commit
to Huanli-Meng/pulsar
that referenced
this pull request
Jun 12, 2020
Negative acknowledgement runs in the background on a consumer and triggers redelivery of messages. The tests verify a that messages do indeed get redelivered, and which messages they are, for the base case, batching and partitioned consumer. There's a fundamental dependency on timing in the base case. If 100ms pass between consumer creation and receiving the last message in first receive loop, redelivery will be triggered and the order of messages, as asserted by the test will fail. This first case can be fixed by moving the negative ack to run after all messages have been received. However, this can also then fail for the batch case. If the negative ack tracker kicks off during the loop to negatively ack the messages, then the redelivery will happen twice (and possibly more times depending on how many time it manages to run). For this reason, if we want the test to be deterministic, we need to disable the tracker from kicking off redelivery while we send mark the messages as negatively acked. Co-authored-by: Ivan Kelly <ikelly@splunk.com>
huangdx0726
pushed a commit
to huangdx0726/pulsar
that referenced
this pull request
Aug 24, 2020
Negative acknowledgement runs in the background on a consumer and triggers redelivery of messages. The tests verify a that messages do indeed get redelivered, and which messages they are, for the base case, batching and partitioned consumer. There's a fundamental dependency on timing in the base case. If 100ms pass between consumer creation and receiving the last message in first receive loop, redelivery will be triggered and the order of messages, as asserted by the test will fail. This first case can be fixed by moving the negative ack to run after all messages have been received. However, this can also then fail for the batch case. If the negative ack tracker kicks off during the loop to negatively ack the messages, then the redelivery will happen twice (and possibly more times depending on how many time it manages to run). For this reason, if we want the test to be deterministic, we need to disable the tracker from kicking off redelivery while we send mark the messages as negatively acked. Co-authored-by: Ivan Kelly <ikelly@splunk.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Motivation
Negative acknowledgement runs in the background on a consumer and
triggers redelivery of messages. The tests verify a that messages do
indeed get redelivered, and which messages they are, for the base
case, batching and partitioned consumer.
There's a fundamental dependency on timing in the base case. If 100ms
pass between consumer creation and receiving the last message in first
receive loop, redelivery will be triggered and the order of messages,
as asserted by the test will fail.
This first case can be fixed by moving the negative ack to run after
all messages have been received. However, this can also then fail for
the batch case.
If the negative ack tracker kicks off during the loop to negatively
ack the messages, then the redelivery will happen twice (and possibly
more times depending on how many time it manages to run).
For this reason, if we want the test to be deterministic, we need to
disable the tracker from kicking off redelivery while we send mark the
messages as negatively acked.