Dont require all consumers drained #485

eapache · 2015-07-16T14:01:26Z

If a partitionConsumer fills up and is not being drained (or is taking a long time), remove its subscription until it can proceed again, so that it does not block other partitions that may still be making progress.

@Shopify/kafka @horkhe

wvanbergen · 2015-07-27T15:01:48Z

consumer.go

+						child.messages <- msg
+					}
+					child.broker.input <- child
+					continue feederLoop


Can you explain this section a bit more? Can the second child.messages <- msg block as well, or am I understanding this incorrectly?

This is happening in its own goroutine. If the timeout triggers, we send a message back to the brokerConsumer telling it to take us out of the pool, and then we feed the rest of the messages to the user at our leisure (so yes, the second message send probably blocks until the user catches up, but it doesn't block any other partitions).

Once we've flushed all our messages, we put ourselves back in the pool.

So, if the messages never get drained by the consumer (because their consumer goroutine shut down for some reason), we will end up with a stuck goroutine?

I think that's OK in that scenario, just want to understand the code.

Yup, it will still shut down if/when Close is called though since that drains any outstanding messages.
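To illustrate why draining on Close lets a stuck feeder exit, here is a small sketch. It assumes a simplified shape that is not taken from this PR: the `consumer` type, its `input` channel, and the `run` method are hypothetical, and Close is shown returning a count of discarded messages purely so the effect is visible.

```go
package main

import "fmt"

// consumer is a hypothetical, simplified partition consumer.
type consumer struct {
	input    chan []string // batches handed over by a broker loop (assumed)
	messages chan string   // read by the user
}

// run forwards every batch to the user and closes messages when the input
// is closed. It blocks on each send until someone receives.
func (c *consumer) run() {
	for batch := range c.input {
		for _, msg := range batch {
			c.messages <- msg
		}
	}
	close(c.messages)
}

// Close stops new input and then drains whatever the feeder is still trying
// to deliver, which unblocks the feeder goroutine so it can finish.
func (c *consumer) Close() int {
	close(c.input)
	discarded := 0
	for range c.messages {
		discarded++
	}
	return discarded
}

func main() {
	c := &consumer{input: make(chan []string, 1), messages: make(chan string)}
	go c.run()

	c.input <- []string{"a", "b", "c"}
	fmt.Println("got:", <-c.messages) // the user reads one message, then stops

	// Without Close the feeder would stay blocked on the second send.
	fmt.Println("discarded on close:", c.Close())
}
```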

wvanbergen · 2015-07-27T15:14:55Z

I think it's doable to add a test for this: consume 2 partitions, drain only one of them.

horkhe · 2015-07-27T18:40:06Z

@eapache @wvanbergen There is such a test case. It is about to be introduced in #492. It is skipped there.

eapache · 2015-07-27T21:32:18Z

OK, please re-review this PR as it should now also fix all the race conditions that the previous changes introduced.

If it looks good, I'll merge it and then you can rebase #492 without the skipped test.

wvanbergen · 2015-07-29T13:55:33Z

consumer.go


 	fetchSize           int32
 	offset              int64
 	highWaterMarkOffset int64
 }

+var errTimedOut = errors.New("timed out feeding messages to the user") // not user-facing


Capitalize error message?

https://github.com/golang/go/wiki/CodeReviewComments#error-strings

Hmmm, all the configuration errors (see config.go above) are capitalized.

good point, I've submitted a PR to fix them :)

wvanbergen · 2015-07-29T14:03:33Z

I think this looks OK. @horkhe any final comments?

Take the previous refactor to its logical conclusion by handling *all* the error logic in the brokerConsumer, not the responseFeeder. This fixes the race to close the dying channel (since the brokerConsumer can just close the trigger instead, as it has ownership). At the same time, refactor `updateSubscriptionCache` into `handleResponses`, and inline the "new subscriptions" bit into the main loop; otherwise we end up processing the previous iteration's results at the very beginning of the next iteration, rather than at the very end of the current one.

eapache force-pushed the dont-require-all-consumers-drained branch from 6216a3f to cdb278c Compare July 27, 2015 13:44

wvanbergen reviewed Jul 27, 2015

eapache force-pushed the dont-require-all-consumers-drained branch from cdb278c to a53ff72 Compare July 27, 2015 21:16

eapache mentioned this pull request Jul 27, 2015

consumer: fix another race pointed out by Maxim #494

Closed

wvanbergen reviewed Jul 29, 2015

eapache mentioned this pull request Jul 29, 2015

Cleanup error formatting #495

Merged

eapache added 3 commits July 29, 2015 14:55

Move the consumer's channel send slightly

7e4b74b

Prep for unblocking consumers that are not being drained

consumer: don't block on undrained partitions

292f3b0

If a partitionConsumer fills up and is not being drained (or is taking a long time), remove its subscription until it can proceed again, so that it does not block other partitions that may still be making progress.

eapache force-pushed the dont-require-all-consumers-drained branch from f7da387 to 292f3b0 Compare July 29, 2015 18:57

eapache added a commit that referenced this pull request Aug 4, 2015

Merge pull request #485 from Shopify/dont-require-all-consumers-drained

e1729d6

Dont require all consumers drained

eapache merged commit e1729d6 into master Aug 4, 2015

eapache deleted the dont-require-all-consumers-drained branch August 4, 2015 13:24

eapache mentioned this pull request Jan 12, 2016

Sarama Snappy Compression #593

Closed
