Quorum queue consumer count incorrect after node failure #2420
-
Environment: Initial situation: one queue on node A with one consumer on the queue. Problem: if node A is disconnected from the network, the queue master moves to another node, but the consumer count stays at 1 when in reality there are no more consumers. The management web UI shows a count of 1 consumer for the queue, but when you click on the queue for details, no consumers are listed. The problem does not exist with classic queues.
Replies: 7 comments
-
Please remove the |
-
This could be something about how quorum queues keep track of consumers or emit stats. It matters a great deal here which node the consumer is connected to. @lpn-ch can you please clarify that or provide a reasonably detailed set of specific steps we can use to reproduce?
-
I can reproduce this with some nuances worth mentioning. Here's the initial state of a three-node cluster with a QQ: And a consumer which is connected to node A: I then stop node C. The leader moves to node A: I add one more consumer to node B: Then I bring node C back online: and shut down node A: The individual queue page lists the expected consumers, but the consumer count is indeed off on both pages:
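The mismatch described above could also be checked outside the UI by comparing the count with the consumer list in the management HTTP API response for the queue. This is a minimal sketch, not a confirmed diagnostic: it assumes the queue object exposes a `consumers` count and a `consumer_details` list, and the host, credentials, and queue name in the usage comment are hypothetical.

```python
import base64
import json
import urllib.parse
import urllib.request


def fetch_queue(base_url, vhost, queue, user, password):
    """Fetch one queue's JSON document from the management HTTP API."""
    # The vhost and queue name must be percent-encoded ("/" becomes "%2F").
    path = "/api/queues/{}/{}".format(
        urllib.parse.quote(vhost, safe=""),
        urllib.parse.quote(queue, safe=""),
    )
    request = urllib.request.Request(base_url.rstrip("/") + path)
    token = base64.b64encode("{}:{}".format(user, password).encode()).decode()
    request.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def consumer_count_mismatch(queue_info):
    """Return (reported, listed, mismatch) for a queue JSON document.

    Assumption: `consumers` is the reported count and `consumer_details`
    is the list shown on the individual queue page.
    """
    reported = queue_info.get("consumers", 0)
    listed = len(queue_info.get("consumer_details", []))
    return reported, listed, reported != listed


# Hypothetical usage against a surviving node (names are placeholders):
# info = fetch_queue("http://node-b:15672", "/", "qq.1", "guest", "guest")
# print(consumer_count_mismatch(info))
```

Running this against a node that is still up after the leader's node goes down would show whether the stale count comes from the API itself or only from how the UI renders it.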
-
When I bring node A back up, its connection recovers and the state of the system is as expected again.
-
I will file a new issue about this.
-
Sorry for the delay, I was out for the weekend. Here is how to reproduce my specific issue, in case it helps. As requested, I removed the HA policies. I created a three-node cluster. Here is the initial state: one queue with one consumer on node vs-tw-ws2019-p1. I then STOP the virtual machine vs-tw-ws2019-p1 (not shut down), which hosts the queue master. This is what I get afterwards: Consumer count is still 1. But in fact there are no consumers.
-
See #2421.