Obey offsets stored in Kafka for existing groups #42
d22922a to e1bdf43
I don't think this is the right method to call. get_watermark_offsets gets the lowest and highest offset values for the topic-partition pair. It doesn't know anything about the stored position for the consumer group. Different consumers could be at different positions in the stream, but this would always set consumers to read from the last message.
I would recommend adding an integration test to make sure the behavior is right - this is pretty subtle, tricky stuff.
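To make the distinction concrete, here is a minimal in-memory sketch. It does not talk to Kafka; `FakePartition`, `watermarks`, and `committed_offset` are hypothetical names invented for illustration and are not part of this project or of any Kafka client. It only models the point above: watermarks describe the partition, while stored offsets belong to each consumer group.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class FakePartition:
    """In-memory stand-in for one topic-partition (hypothetical, not a Kafka API)."""

    messages: List[str]
    # Offset of the next message each group should read, keyed by group id.
    committed: Dict[str, int] = field(default_factory=dict)

    def watermarks(self) -> Tuple[int, int]:
        # Lowest and highest offsets of the partition; the same for every group.
        # (A real partition's low watermark need not be 0; this model assumes it is.)
        return 0, len(self.messages)

    def committed_offset(self, group: str) -> Optional[int]:
        # Per-group stored position, or None if the group never committed.
        return self.committed.get(group)


partition = FakePartition(
    messages=["m0", "m1", "m2", "m3", "m4"],
    committed={"group-a": 1, "group-b": 4},
)
groups = ("group-a", "group-b")

low, high = partition.watermarks()

# Starting from the high watermark ignores what each group has stored.
from_watermark = {g: high for g in groups}

# Starting from the stored offset resumes each group where it left off.
from_committed = {g: partition.committed_offset(g) for g in groups}
```

In this model both groups end up at the same position when the high watermark is used, even though they had stored different positions, which is the behavior an integration test with two consumer groups should catch.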
You're absolutely right. The tests I was doing before didn't capture that correctly and running it again showed that this just points to the latest offset. So looking at this a bit more, I have a question about this. Why are we setting the offset at all? Shouldn't the offset already stored be the correct one for a given consumer group? At least by not setting the offset here when calling
That sounds good. I'll take a crack at this later.
We set them explicitly to handle initial offsets, when a consumer group hasn't committed anything. But this might not be necessary; we set …
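The fallback being discussed can be sketched as a small pure function. This is a simplified model under stated assumptions, not this project's code: `resolve_start_offset` and its parameters are hypothetical, and `reset_policy` stands in for the "earliest" and "latest" values of Kafka's "auto.offset.reset" setting. A stored offset wins when it is present and still within the partition's range; otherwise the reset policy decides.

```python
from typing import Optional


def resolve_start_offset(
    committed: Optional[int],
    low: int,
    high: int,
    reset_policy: str = "latest",
) -> int:
    """Pick the offset to start reading from (illustrative helper, not library code)."""
    # A stored offset is used as long as it still falls inside the partition.
    if committed is not None and low <= committed <= high:
        return committed

    # No usable stored offset: fall back to the configured reset policy.
    if reset_policy == "earliest":
        return low
    if reset_policy == "latest":
        return high
    raise ValueError(f"unsupported reset policy: {reset_policy!r}")


# A group with a stored position resumes from it.
resumed = resolve_start_offset(committed=3, low=0, high=10)

# A brand-new group has nothing stored, so the policy applies.
fresh_earliest = resolve_start_offset(None, low=2, high=10, reset_policy="earliest")
fresh_latest = resolve_start_offset(None, low=2, high=10, reset_policy="latest")
```

If the client already applies the reset policy on its own when nothing is stored, explicit assignment only matters for the initial-offset case, which is the question raised above.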
…ored offsets test
@spenczar, I added integration tests for testing stored offsets with different consumer groups. I also removed the CONSUMER value in … One other thing I did, and wanted to check with you on (I made it a single last commit that would be easy to revert), is that I removed the option to allow an arbitrary …
I really like the test code here. Looks good.
This is in the …
This PR should address #41. In Consumer.subscribe(), instead of making an assignment based on the start_at position in the configuration, it'll call get_watermark_offsets() to grab the current offset in the consumer to make that assignment. In the case there is no offset stored, it'll pull the value stored via "auto.offset.reset" in the configured consumer.

I have also moved where the warning about using the LATEST offset is raised since this check isn't being done in the same location anymore.
Closes #41.