Fix Intermittent Kafka Quorum Formation Issue #122
Merged
This PR addresses an intermittent issue where Kafka brokers fail to form a quorum due to mismatched voter keys in the Raft consensus algorithm. In my testing, the problem occurred in approximately 1 out of every 5 test runs...
Brokers occasionally fail to form a quorum, with logs showing repeated messages like:
The advertised listeners for the CONTROLLER listener are set to localhost (using the `getHost()` method), while `controller.quorum.voters` specifies hostnames like broker-0, broker-1, etc. Because of this mismatch, each broker advertises its controller endpoint as localhost. Since localhost resolves to 127.0.0.1, other brokers that follow the advertised endpoint end up connecting to themselves rather than to the intended broker.
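To make the mismatch concrete, here is a minimal sketch of a consistency check between the two settings. It is not part of this PR or of Kafka: the class `QuorumVoterCheck`, its method names, and the port numbers used with it are hypothetical. It assumes the usual `id@host:port` format for `controller.quorum.voters` and `NAME://host:port` entries for the advertised listeners.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not from this PR): compares a node's advertised CONTROLLER endpoint with its voter entry. */
final class QuorumVoterCheck {

    /** Parses a voters string such as "0@broker-0:9094,1@broker-1:9094" into nodeId -> "host:port". */
    static Map<Integer, String> parseVoters(String voters) {
        Map<Integer, String> result = new LinkedHashMap<>();
        for (String entry : voters.split(",")) {
            String[] parts = entry.trim().split("@", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Bad voter entry: " + entry);
            }
            result.put(Integer.parseInt(parts[0]), parts[1]);
        }
        return result;
    }

    /** Returns the "host:port" advertised for the given listener name, or null if that listener is absent. */
    static String advertisedEndpoint(String advertisedListeners, String listenerName) {
        String prefix = listenerName + "://";
        for (String entry : advertisedListeners.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.startsWith(prefix)) {
                return trimmed.substring(prefix.length());
            }
        }
        return null;
    }

    /** True only if the node's CONTROLLER endpoint equals the endpoint listed for it in the voters string. */
    static boolean controllerEndpointMatchesVoter(int nodeId, String voters, String advertisedListeners) {
        String expected = parseVoters(voters).get(nodeId);
        String actual = advertisedEndpoint(advertisedListeners, "CONTROLLER");
        return expected != null && expected.equals(actual);
    }
}
```

With voters `0@broker-0:9094,1@broker-1:9094` (ports assumed for illustration), node 1 advertising `CONTROLLER://localhost:9094` fails the check, while `CONTROLLER://broker-1:9094` passes.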
I verified this change with a test case that previously failed about once in 5 runs; it now passes consistently (20 runs without any issue):