Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: use least loaded broker to refresh metadata #2645

Merged
merged 1 commit into from
Sep 12, 2023

Conversation

HaoSunUber
Copy link
Contributor

@HaoSunUber HaoSunUber commented Sep 7, 2023

Seed brokers never change after client initialization. If the first seed
broker became stale (still online, but moved to other Kafka cluster),
Sarama client may use this stale broker to get the wrong metadata. To
avoid using the stale broker to do metadata refresh, we will choose the
least loaded broker in the cached broker list which is the similar to
how the Java client implementation works:

https://github.com/apache/kafka/blob/7483991a/clients/src/main/java/org/apache/kafka/clients/NetworkClient.java#L671-L736

Contributes-to: #2637

@HaoSunUber HaoSunUber force-pushed the fix-stale-broker-issue branch from 8d06ab9 to ede5508 Compare September 8, 2023 02:24
@@ -228,7 +228,7 @@ mainLoop:
}
for _, broker := range brokers {
err := broker.Open(client.Config())
if err != nil {
if err != nil && err != ErrAlreadyConnected {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the functional tests to skip the ErrAlreadyConnected error because this change will use one broker from the broker list to refresh metadata, which is already connected when reaching the following validation.

@HaoSunUber HaoSunUber closed this Sep 8, 2023
@HaoSunUber HaoSunUber deleted the fix-stale-broker-issue branch September 8, 2023 02:27
@HaoSunUber HaoSunUber restored the fix-stale-broker-issue branch September 8, 2023 02:28
@HaoSunUber HaoSunUber reopened this Sep 8, 2023
@HaoSunUber
Copy link
Contributor Author

@dnwe Could you please review my change? Thanks

@dnwe dnwe force-pushed the fix-stale-broker-issue branch from 19b1221 to f5c9e47 Compare September 12, 2023 08:17
@dnwe dnwe changed the title Fix Sarama client using a stale broker to refresh metadata #2637 fix: use least loaded broker to refresh metadata Sep 12, 2023
Seed brokers never change after client initialization. If the first seed
broker became stale (still online, but moved to other Kafka cluster),
Sarama client may use this stale broker to get the wrong metadata. To
avoid using the stale broker to do metadata refresh, we will choose the
least loaded broker in the cached broker list which is the similar to
how the Java client implementation works:

https://github.com/apache/kafka/blob/7483991a/clients/src/main/java/org/apache/kafka/clients/NetworkClient.java#L671-L736

Contributes-to: IBM#2637

Signed-off-by: Hao Sun <haos@uber.com>
@dnwe dnwe force-pushed the fix-stale-broker-issue branch from f5c9e47 to 98ec384 Compare September 12, 2023 08:18
Copy link
Collaborator

@dnwe dnwe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good to me. One minor query on the existing deregisterBroker behaviour for the seedBrokers list, but happy to approve and merge as-is

Comment on lines +766 to 780
// deregisterBroker removes a broker from the broker list, and if it's
// not in the broker list, removes it from seedBrokers.
func (client *client) deregisterBroker(broker *Broker) {
client.lock.Lock()
defer client.lock.Unlock()

_, ok := client.brokers[broker.ID()]
if ok {
Logger.Printf("client/brokers deregistered broker #%d at %s", broker.ID(), broker.Addr())
delete(client.brokers, broker.ID())
return
}
if len(client.seedBrokers) > 0 && broker == client.seedBrokers[0] {
client.deadSeeds = append(client.deadSeeds, broker)
client.seedBrokers = client.seedBrokers[1:]
Copy link
Collaborator

@dnwe dnwe Sep 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently we only seem to deregister the broker from the seedBrokers list if it's the first element in the list (after the most recent shuffle) — is that still the desired behaviour?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think it is expected. seed brokers only shuffle during client initialization or hard RefreshBrokers. After that, the cached broker list is empty, and the client will use the first seed broker to fetch metadata. deregisterBroker method deregisters the first seed broker will apply that moment when this first seed broker is unavailable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know when we have a new release including this change so that our side can use it? @dnwe

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants