Rebalance doesn't work reliably fix #533
Merged
This is a fix for #532
Some words about expected behavior. fs2 is a pull-based library: a stream will not produce the next step until the previous step has completed. So, if we have a `flatMap` on a stream, the next step will be executed only when the current inner stream completes. Projecting this onto our situation, if we have code that runs an `innerStream` for each assignment inside a `flatMap`, the next assignment will be received only when the `innerStream` has finished. If `innerStream` is based on partition streams from the `assignment`, it will work automatically: on rebalance, the old streams are finished, and `innerStream` is finished too. And this is the expected behavior.

But sometimes not all old partition streams finish, and that's why the rebalance doesn't work. I added a test case for this, which fails on the current stable branch.
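The pull-based behavior described above can be illustrated with a minimal sketch. It assumes fs2 3.x and cats-effect 3 (which may not be the versions this PR targets), and everything in it is hypothetical: assignments are plain `Int`s, and `innerStream` only records when it starts and finishes. It is not the consumer code from this PR.

```scala
import cats.effect.{IO, Ref}
import fs2.Stream

object PullBasedFlatMap {
  // Hypothetical stand-ins: assignments are plain Ints, and each inner
  // stream only records when it starts and when it finishes.
  val run: IO[Vector[String]] =
    Ref.of[IO, Vector[String]](Vector.empty).flatMap { log =>
      def record(msg: String): IO[Unit] = log.update(_ :+ msg)

      // Records the moment an assignment is pulled from the outer stream.
      val assignments: Stream[IO, Int] =
        Stream(1, 2).covary[IO].evalTap(a => record(s"assignment $a received"))

      def innerStream(a: Int): Stream[IO, Unit] =
        Stream.eval(record(s"inner stream $a started")) ++
          Stream.eval(record(s"inner stream $a finished"))

      // flatMap pulls the next assignment only after the current
      // inner stream has completed.
      assignments.flatMap(innerStream).compile.drain *> log.get
    }
}
```

In the recorded log, the second assignment appears only after the first inner stream has finished. An inner stream that never finishes would therefore block every later assignment, which is the failure mode described here.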
Actually, I didn't find the root cause of this issue, mostly because the current logic for finishing old partition streams is pretty complex and is spread across multiple files (`KafkaConsumer` and `KafkaConsumerActor`). I decided to try changing and simplifying the completion logic for old partition streams. And it worked.

In my implementation I don't try to track all partition streams with `partitionStreamId`. I just interrupt old partition streams on rebalance. Actually, this is a lot closer to the Kafka ideology: when partitions are revoked, I interrupt the old streams; when partitions are assigned, I create new streams for each partition. It turns out that having a `Ref[F, Deferred[F, Unit]]` is enough for this. On each rebalance I create a new `Deferred` and save it in the `Ref`, and at the same time I complete the old `Deferred`. Each partition stream in the current assignment subscribes to the `Deferred` from that assignment, and when it completes, the streams are interrupted. All the logic is in one method, `partitionsMapStream`.

Because of the new logic for closing old partition streams, the functionality for tracking `partitionStreamId` is now redundant, so I deleted it. I think this removes a bit of the library's internal complexity.
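The `Ref[F, Deferred[F, Unit]]` idea can be sketched roughly as follows. This is a simplified illustration, not the actual body of `partitionsMapStream`: it assumes fs2 3.x and cats-effect 3 (which may not be the versions this PR targets), and `RebalanceSignal`, `rebalance`, and `untilRebalance` are hypothetical names introduced only for this example.

```scala
import cats.effect.{Concurrent, Deferred, Ref}
import cats.syntax.all._
import fs2.Stream

// Hypothetical helper: holds the Deferred that belongs to the current assignment.
final class RebalanceSignal[F[_]: Concurrent](
    current: Ref[F, Deferred[F, Unit]]
) {

  // On rebalance: store a fresh Deferred for the new assignment and
  // complete the old one, which interrupts every stream subscribed to it.
  def rebalance: F[Unit] =
    for {
      fresh <- Deferred[F, Unit]
      old   <- current.getAndSet(fresh)
      _     <- old.complete(()).void
    } yield ()

  // Ties a partition stream to the Deferred that is current at the moment
  // the stream starts; the stream is interrupted once that Deferred completes.
  def untilRebalance[A](partitionStream: Stream[F, A]): Stream[F, A] =
    Stream.eval(current.get).flatMap { deferred =>
      partitionStream.interruptWhen(deferred.get.attempt)
    }
}

object RebalanceSignal {
  // Starts with an uncompleted Deferred for the initial assignment.
  def create[F[_]: Concurrent]: F[RebalanceSignal[F]] =
    Deferred[F, Unit]
      .flatMap(d => Ref.of[F, Deferred[F, Unit]](d))
      .map(ref => new RebalanceSignal[F](ref))
}
```

With this shape, no per-stream identifier is needed: a stream started before a call to `rebalance` is interrupted by it, while a stream started afterwards reads the fresh `Deferred` and keeps running until the next rebalance.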