kafka_producer: deadlock when closing after failure #2978
Labels: affects-4.0, affects-5.0, affects-5.1, affects-5.2, affects-5.3, area/ticdc, priority/P1, severity/major, type/bug
What did you do?
The Kafka producer encounters two messages, close to each other in time, that both lead to errors such as `ErrMessageSizeTooLarge`.
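For context, the sketch below shows one way a sarama `AsyncProducer` can be driven into this error path. It is only an illustration, not TiCDC's configuration: the broker address `127.0.0.1:9092`, the topic name, and the size limits are assumptions, and it needs a reachable Kafka broker to run. Messages larger than `Producer.MaxMessageBytes` should come back on `Errors()` as failed messages.

```go
package main

import (
	"fmt"
	"log"

	"github.com/Shopify/sarama"
)

func main() {
	cfg := sarama.NewConfig()
	cfg.Producer.MaxMessageBytes = 1024 // deliberately small, so the 4 KiB payloads below are rejected
	cfg.Producer.Return.Errors = true   // failed messages are delivered on producer.Errors()

	producer, err := sarama.NewAsyncProducer([]string{"127.0.0.1:9092"}, cfg)
	if err != nil {
		log.Fatal(err)
	}
	defer producer.Close()

	// Two oversized messages close to each other in time, as in the report.
	for i := 0; i < 2; i++ {
		producer.Input() <- &sarama.ProducerMessage{
			Topic: "test-topic", // hypothetical topic
			Value: sarama.ByteEncoder(make([]byte, 4096)),
		}
	}

	// Each rejected message is reported here; if the reader of this channel
	// stops, the producer's internal goroutines eventually back up.
	for i := 0; i < 2; i++ {
		perr := <-producer.Errors()
		fmt.Println("producer error:", perr.Err)
	}
}
```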
What did you expect to see?
The changefeed should close as expected
What did you see instead?
The sink is deadlocked for the following reasons:

- `(*kafkaSaramaProducer).run` is exiting, and no longer selecting from the `k.asyncClient.Errors()` channel.
- The `k.asyncClient.Errors()` channel is blocking, so the asyncClient cannot process messages anymore, which means it cannot consume from `k.asyncClient.Input()`.
- Sending to `k.asyncClient.Input()` blocks, because the asyncClient itself is blocked writing to `k.asyncClient.Errors()`.
- The `SendMessage` method has taken `k.clientLock.RLock()` and continues to hold it, since it is stuck sending to `k.asyncClient.Input()`.
- `(*kafkaSaramaProducer).run` tries to take `k.clientLock.Lock()` (in `(*kafkaSaramaProducer).stop`), which is blocked by `SendMessage`'s read lock.
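This cycle can be reproduced in isolation with a few goroutines and a `sync.RWMutex`. The following is a minimal, hypothetical model, not the TiCDC code or the real sarama API: `fakeAsyncClient`, `producer`, and the message names are made up, and only the channel/lock pattern mirrors the steps above. Run as-is, it ends with the Go runtime reporting `all goroutines are asleep - deadlock!`.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// fakeAsyncClient stands in for sarama's AsyncProducer: one goroutine moves
// messages from input to errors, modeling a broker that rejects everything.
type fakeAsyncClient struct {
	input  chan string
	errors chan error
}

func newFakeAsyncClient() *fakeAsyncClient {
	c := &fakeAsyncClient{input: make(chan string), errors: make(chan error)}
	go func() {
		for msg := range c.input {
			// The error must be drained by the caller; otherwise this
			// goroutine blocks here and stops consuming from input.
			c.errors <- fmt.Errorf("ErrMessageSizeTooLarge: %s", msg)
		}
	}()
	return c
}

type producer struct {
	clientLock sync.RWMutex
	client     *fakeAsyncClient
}

// SendMessage mirrors steps 3-4: it takes the read lock and holds it while
// it is blocked writing to the input channel.
func (k *producer) SendMessage(msg string) {
	k.clientLock.RLock()
	defer k.clientLock.RUnlock()
	k.client.input <- msg
}

// run mirrors steps 1 and 5: after the first error it stops draining
// errors and tries to take the write lock in stop().
func (k *producer) run() {
	err := <-k.client.errors
	fmt.Println("first error received, shutting down:", err)
	time.Sleep(200 * time.Millisecond) // let in-flight SendMessage calls pile up
	k.stop()
}

func (k *producer) stop() {
	fmt.Println("stop(): waiting for clientLock.Lock() ...")
	k.clientLock.Lock() // never acquired: a SendMessage holds RLock forever
	defer k.clientLock.Unlock()
}

func main() {
	k := &producer{client: newFakeAsyncClient()}
	go k.run()

	k.SendMessage("msg-1")    // its error is consumed once by run()
	go k.SendMessage("msg-2") // its error blocks the client goroutine on errors
	time.Sleep(50 * time.Millisecond)
	go k.SendMessage("msg-3") // blocks on input while holding RLock

	// All parties now wait on each other; once run() blocks in stop(), the Go
	// runtime reports "all goroutines are asleep - deadlock!".
	select {}
}
```

In this model, either draining `Errors()` until the client goroutine exits, or not holding the read lock across the blocking send to `Input()`, breaks the cycle and lets `stop()` acquire the write lock.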
Versions of the cluster
v5.0.1