producer: fail topics that repeatedly fail to load after 5 tries
If a record is produced to a topic that repeatedly fails to load with
UNKNOWN_TOPIC_OR_PARTITION, it makes no sense to keep retrying the load.
Rather than retrying until the RecordDeliveryTimeout or the RecordRetries
limit is hit, we now fail after 5 load attempts. Given that we round-robin
the brokers we load metadata from, and given that this unknown-topic load
only happens on the first record for a topic, 5 tries should be a safe
default for any produce.
twmb committed Sep 21, 2021
1 parent a6a6ba9 commit ee8b12d
Showing 2 changed files with 7 additions and 0 deletions.
4 changes: 4 additions & 0 deletions pkg/kgo/config.go
@@ -875,6 +875,10 @@ func ProduceRequestTimeout(limit time.Duration) ProducerOpt {
// one record only to produce a later one successfully. This also allows for
// easier sequence number ordering internally.
//
// If a topic repeatedly fails to load with UNKNOWN_TOPIC_OR_PARTITION, it has
// a different, internal retry limit. All records for a topic that repeatedly
// cannot be loaded are failed when the internal limit is hit.
//
// This option is different from RequestRetries to allow finer grained control
// of when to fail when producing records.
func RecordRetries(n int) ProducerOpt {
3 changes: 3 additions & 0 deletions pkg/kgo/producer.go
@@ -716,6 +716,9 @@ func (cl *Client) waitUnknownTopic(
			if int64(tries) >= cl.cfg.recordRetries {
				err = fmt.Errorf("no partitions available after attempting to refresh metadata %d times, last err: %w", tries, retriableErr)
			}
			if tries > 5 && errors.Is(retriableErr, kerr.UnknownTopicOrPartition) {
				err = retriableErr
			}
		}
	}
