We're seeing thousands of duplicated messages when we stop a Broker #150
Comments
This happens when the consumer doesn't acknowledge the consumed message back to the broker using the api: acknowledge(message). At a high level, the broker persists the subscription's acknowledged (mark-delete) position only after the consumer acks. So, in your case: if the consumer doesn't ack the message, the broker will not persist the position, and a broker restart will send all the messages to the consumer again. Hence, the consumer needs to confirm message consumption by sending an acknowledgement to the broker.
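For reference, a minimal sketch of that receive-then-acknowledge pattern with the Java client (service URL, topic and subscription names are placeholders, and the builder-style API shown here is from recent client versions, not necessarily the one in use when this issue was filed):

```java
import org.apache.pulsar.client.api.Consumer;
import org.apache.pulsar.client.api.Message;
import org.apache.pulsar.client.api.PulsarClient;

public class AckingConsumer {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")          // placeholder broker URL
                .build();

        Consumer<byte[]> consumer = client.newConsumer()
                .topic("persistent://public/default/my-topic")  // placeholder topic
                .subscriptionName("my-subscription")            // placeholder subscription
                .subscribe();

        while (true) {
            Message<byte[]> msg = consumer.receive();
            try {
                handle(msg);                 // application logic (placeholder)
                consumer.acknowledge(msg);   // lets the broker advance the mark-delete position
            } catch (Exception e) {
                // No ack here: the message stays unacknowledged (an "ack-hole"),
                // and the broker will redeliver it, e.g. after a restart.
            }
        }
    }

    private static void handle(Message<byte[]> msg) {
        // process the payload; throwing here simulates a failed consumption
    }
}
```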
Hi @rdhabalia thanks for your response. I don't think this is the case we are facing: we have dummy consumers that are receiving and acknowledging messages all the time (like a stress test with the consume perf tool), but with an SQS-like API in front of Pulsar. We know messages are being repeated because the consumed messageIds are being stored in a separate data store. Thanks.
It seems a client is failing to process that exact message due to an unexpected exception. That creates an ack-hole, which keeps the subscription's mark-delete position from moving forward, so on restart the broker redelivers everything from that position onwards. You can verify this behavior before restarting the broker by looking at the subscription stats for the topic. You can also use the REST API to get the same stats.
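A hedged sketch of that check with the Java admin client (admin URL and topic are placeholders, the getter names assume a reasonably recent client, and the equivalent REST call in current versions is roughly GET /admin/v2/persistent/{tenant}/{namespace}/{topic}/stats):

```java
import org.apache.pulsar.client.admin.PulsarAdmin;
import org.apache.pulsar.common.policies.data.TopicStats;

public class CheckSubscriptionStats {
    public static void main(String[] args) throws Exception {
        PulsarAdmin admin = PulsarAdmin.builder()
                .serviceHttpUrl("http://localhost:8080")            // placeholder admin URL
                .build();

        TopicStats stats = admin.topics()
                .getStats("persistent://public/default/my-topic");  // placeholder topic

        // A backlog / unacked count that never drains while consumers are connected
        // is a hint that some message is never being acknowledged (an ack-hole).
        stats.getSubscriptions().forEach((name, sub) ->
                System.out.printf("subscription=%s backlog=%d unacked=%d%n",
                        name, sub.getMsgBacklog(), sub.getUnackedMessages()));

        admin.close();
    }
}
```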
@estebangarcia The other thing is that the frequency at which the acknowledged (mark-delete) position is persisted is rate-limited by this broker setting:

# Rate limit the amount of writes generated by consumer acking the messages
managedLedgerDefaultMarkDeleteRateLimit=0.1

That means the default (per broker) is to save it every 10 secs. This frequency can also be configured individually in the namespace policies. Set this configuration to something higher (eg: 1.0, i.e. once per second). I'll update the docs on that setting because it's not very clear at the moment. Also, the default should probably be something like 1.0.
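The namespace-level override mentioned above can be set through the admin API. A sketch, assuming the standard persistence-policies call where the last argument is the max mark-delete rate; the namespace name and BookKeeper quorum values are placeholders:

```java
import org.apache.pulsar.client.admin.PulsarAdmin;
import org.apache.pulsar.common.policies.data.PersistencePolicies;

public class SetMarkDeleteRate {
    public static void main(String[] args) throws Exception {
        PulsarAdmin admin = PulsarAdmin.builder()
                .serviceHttpUrl("http://localhost:8080")  // placeholder admin URL
                .build();

        // PersistencePolicies(ensembleSize, writeQuorum, ackQuorum, markDeleteMaxRate):
        // the last value caps how often the mark-delete position is persisted (per second).
        // The quorum values and namespace name here are illustrative only.
        admin.namespaces().setPersistence("public/default",
                new PersistencePolicies(2, 2, 1, 1.0));

        admin.close();
    }
}
```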
@merlimat @rdhabalia thanks for your responses. We will make sure to check these things and get back to you.
We tried setting managedLedgerDefaultMarkDeleteRateLimit to 1 but we had the same issue, so we set it to 0 and now we see just a few duplicates. I'm wondering about the implications of disabling the rate limiter and whether it can have any negative impact in the long term.
Setting that to 0 means there will be one write to the bookies for each message you acknowledge. The write is very small, but it effectively multiplies the write rate by the number of subscriptions. Maybe you can try limiting it to 100/s or 1000/s to get a better tradeoff between additional writes and the number of duplicates.
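For example, the 100/s figure suggested above would look like this in broker.conf (the exact value is just an illustration; tune it for your workload):

```properties
# Rate limit the amount of writes generated by consumer acking the messages.
# 100.0 means the mark-delete position is persisted at most 100 times per second,
# trading some extra bookie writes for fewer redelivered messages after a restart.
managedLedgerDefaultMarkDeleteRateLimit=100.0
```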
I think we'll keep "playing" with this parameter until we find a tradeoff that suits us, just like you suggested. Thanks for your help!
We set up a Redis instance to store the consumed message IDs; each time a new message is consumed we check whether its ID is already in Redis. If it is, that means the message is a duplicate.
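A rough sketch of that duplicate check, assuming a Jedis client and a Redis set keyed by message ID (the actual client and data structure used by the reporter aren't stated):

```java
import redis.clients.jedis.Jedis;

public class DuplicateDetector {
    private final Jedis jedis = new Jedis("localhost", 6379);  // placeholder Redis address

    // SADD returns 1 only when the id was newly added to the set,
    // so a return of 0 means this messageId has been consumed before.
    public boolean isDuplicate(String messageId) {
        return jedis.sadd("consumed-message-ids", messageId) == 0;
    }
}
```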
Expected behavior
Shouldn't see duplicated messages
Actual behavior
We're seeing thousands of duplicated messages after a broker goes down
Steps to reproduce
Don't know if it helps, but in the logs we see that all the duplicated messages have the same ledgerId and partitionIndex. Example duplicated messageId -> MessageIdImpl{ledgerId=9, entryId=510587, partitionIndex=2}
System configuration
Pulsar version: built from master
If you need any further information, please let us know.