
regression: enabling rackawareness causes severe throughput drops #2071

Closed
lizthegrey opened this issue Nov 26, 2021 · 8 comments

Comments

lizthegrey (Contributor) commented Nov 26, 2021

Versions
Sarama: v1.29.0
Kafka: v3.0.0 (Confluent 7)
Go: v1.17.1

Regression has been bisected to 1aac8e5

See #1927 (comment) for another report besides mine.

Configuration

Pertinent config variables:

Client (Go):

	c.conf = conf
	if c.LD != nil && c.LD.BoolVariationCtx(
		ctx,
		launchdarkly.FlagKafkaRackAwareFollowerFetch,
		types.UserAndTeam{User: nil, Team: nil}) {
		c.rackID = os.Getenv("AZ")
	}

Broker (Chef/ERB template and attribute):

broker.rack=<%= node["ec2"]["availability_zone"] %>
replica.selector.class=<%= node["kafka"]["replica"]["selector"]["class"] %>
default['kafka']['replica']['selector']['class'] = "org.apache.kafka.common.replica.RackAwareReplicaSelector"
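For reference, the client-side wiring boils down to giving Sarama a RackID that matches the brokers' broker.rack so the RackAwareReplicaSelector can steer fetches to a same-AZ follower. A minimal sketch, assuming the AZ environment variable from the snippet above; the broker address and group name are placeholders, not values from this issue:

package main

import (
	"log"
	"os"

	"github.com/Shopify/sarama"
)

func newRackAwareConfig() *sarama.Config {
	cfg := sarama.NewConfig()
	// KIP-392 follower fetching needs FetchRequest v11+, i.e. Kafka 2.3 or newer.
	cfg.Version = sarama.V2_3_0_0
	// Should match broker.rack on the broker side; an empty value effectively
	// disables rack-aware follower fetching.
	cfg.RackID = os.Getenv("AZ")
	return cfg
}

func main() {
	// Placeholder broker address and group name.
	group, err := sarama.NewConsumerGroup([]string{"kafka:9092"}, "example-group", newRackAwareConfig())
	if err != nil {
		log.Fatal(err)
	}
	defer group.Close()
}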
Problem Description

When AZ is not populated (due to a bug on our side, hah), or when FlagKafkaRackAwareFollowerFetch is false, things behave normally. But throughput drops by 75% if the flag is set and the Sarama library is at or past 1aac8e5: https://share.getcloudapp.com/nOu54vNq
There is a corresponding lag in timestamps between producer and consumer: https://share.getcloudapp.com/DOu6LBAQ

lizthegrey (Contributor, Author) commented:

@dnwe opened as per your request

lizthegrey referenced this issue Nov 26, 2021
Historically (before protocol version 11) if we attempted to consume
from a follower, we would get a NotLeaderForPartition response and move
our consumer to the new leader. However, since v11 the Kafka broker
treats us just like any other follower and permits us to consume from
any replica and it is up to us to monitor metadata to determine when the
leadership has changed.

Modify the handleResponse func to check the topic partition leadership
against the current broker (in the absence of a preferredReadReplica)
and trigger a re-create of the consumer for that partition.

Contributes-to: #1927
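
For illustration only, the check described in that commit amounts to something like the following; the partitionConsumer type and helper names here are hypothetical stand-ins, not Sarama's actual internals:

package sketch

import "github.com/Shopify/sarama"

// Hypothetical stand-in for the library's internal partition consumer; the
// real type and helper names differ. This only sketches the check that the
// commit message above describes.
type partitionConsumer struct {
	client    sarama.Client
	broker    *sarama.Broker
	topic     string
	partition int32
}

func (c *partitionConsumer) checkLeadership(block *sarama.FetchResponseBlock) {
	// A preferred read replica handed out by the broker takes precedence.
	if block.PreferredReadReplica >= 0 {
		return
	}
	// No preference expressed: if the broker we are fetching from is no longer
	// the partition leader, tear down and re-create the consumer for this
	// partition so it reconnects to the current leader.
	leader, err := c.client.Leader(c.topic, c.partition)
	if err != nil || leader.ID() != c.broker.ID() {
		c.recreate()
	}
}

func (c *partitionConsumer) recreate() {
	// In the real library this re-dials the current leader; omitted here.
}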
dnwe (Collaborator) commented Nov 26, 2021

Thank you!

lizthegrey (Contributor, Author) commented:

Anything I can do to assist here? Rolling back the commit might help, but it would also break that fix, so...

dnwe (Collaborator) commented Dec 1, 2021

@lizthegrey apologies, I've actually been off sick this week so hadn't had a chance to code up a fix for this yet, but I do aim to look at it soon.

Essentially I believe the issue is that Kafka only ever computes a "preferred read replica" when your FetchRequest has gone to the leader of the partition and you've provided a client RackID. In that case the FetchResponse contains the preferred replica (if it differs from the leader) and omits any data; a well-behaved client should then disconnect and start fetching from that preferred replica instead. However, when you then send FetchRequests to the follower, the "preferred read replica" field is omitted from its responses, and so #1936 kicks in: it thinks no preference exists and that you're consuming from a non-leader, so it forces the consumer back to the real leader. Hence the massive drop in throughput you see, as the consumer flip-flops between brokers for its FetchResponses.
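
A small sketch of the two response shapes described above, using Sarama's exported FetchResponseBlock type (the handling is simplified and the function is illustrative, not library code):

package sketch

import (
	"fmt"

	"github.com/Shopify/sarama"
)

// describeBlock shows the two response shapes a rack-aware consumer sees.
// The leader signals a redirect by setting PreferredReadReplica (and sending
// no records); a follower's response leaves the field at -1, which must not
// be read as "no preference exists, go back to the leader".
func describeBlock(block *sarama.FetchResponseBlock) {
	if block.PreferredReadReplica >= 0 {
		fmt.Printf("leader redirect: fetch from replica %d instead\n", block.PreferredReadReplica)
		return
	}
	fmt.Println("no preferred read replica in this response (normal for a follower)")
}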

dnwe added a commit that referenced this issue Dec 1, 2021
FetchResponse from a follower will _not_ contain a PreferredReadReplica.
It seems like the partitionConsumer would overwrite the previously
assigned value from the leader with -1, which would then trigger the
"reconnect to the current leader" changes from #1936, causing a
flip-flop effect.

Contributes-to: #2071

Signed-off-by: Dominic Evans <dominic.evans@uk.ibm.com>
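
In other words, the fix guards the assignment so that a follower's response (which always reports -1 here) cannot clobber the preference the leader handed out earlier. A rough sketch of the idea; the names are illustrative rather than the exact ones in #2076:

package sketch

// The broker uses -1 to mean "no preferred replica" in a FetchResponse.
const invalidPreferredReplicaID int32 = -1

// Before the fix the preference was effectively overwritten on every response,
// so a follower's -1 clobbered the leader's earlier redirect and #1936 pulled
// the consumer back to the leader. The guarded form keeps the previously
// assigned replica unless the response actually carries a new one.
func updatePreferredReadReplica(current, fromResponse int32) int32 {
	if fromResponse != invalidPreferredReplicaID {
		return fromResponse
	}
	return current
}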
dnwe (Collaborator) commented Dec 1, 2021

@lizthegrey so I've pushed up #2076 as a tentative fix, but (full disclaimer) I haven't yet actually tried this out against a real cluster, so I'll need to run through that and code up some test cases before I can merge. If you'd like to try it out in the meantime, that would be really helpful :)

lizthegrey (Contributor, Author) commented:

We have test clusters for exactly this reason. I will flip it on and let you know.

lizthegrey (Contributor, Author) commented:

Fixed by #2076 and confirmed working in Honeycomb production.

mtj075 commented Apr 13, 2023

(quoting dnwe's comment from Dec 1, 2021 above)

Sorry, I think this Sarama code change made the consumer client ignore Kafka's leader replica (the preferred replica). Once the ISR changes, the consumer client can no longer fetch from the leader replica. Kafka has since fixed the preferred-replica selection from the ISR: https://github.com/apache/kafka/pull/12877/files#diff-78812e247ffeae6f8c49b1b22506434701b1e1bafe7f92ef8f8708059e292bf0
