Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dont commit on consumer seek #1012

Merged
merged 3 commits into from
Feb 1, 2021
Merged

Dont commit on consumer seek #1012

merged 3 commits into from
Feb 1, 2021

Conversation

Nevon
Copy link
Collaborator

@Nevon Nevon commented Jan 28, 2021

This changes the behavior of consumer.seek so that when the consumer is running with autoCommit: false, the offsets that are seeked to are not committed. Instead, the previous offset is just resolved.

I opted for using autoCommit instead of an explicit flag in consumer.seek because I figured if you are using autoCommit: false it's because you don't want to us to commit offsets for you. I'm open to feedback on this, as I first did implement the flag, which wasn't any harder, it just seemed more elegant with less chances of accidental offset commits this way.

It's late and I'm very sleepy, so I'm not sure yet if I did this properly. One thing I know I need to check is: do we actually change the resolved offset if we seek to an offset that is lower than the previously resolved one? If not, I need to special case this and "force" the resolved offset to change.

Fixes #395

@t-d-d
Copy link
Contributor

t-d-d commented Jan 29, 2021

I think this change is good and fixes what amounts to a bug in the current behaviour. But the consumer.seek() behaviour is still rather awkward when autoCommit=true. E.g. the synchronous consumer.seek() call can succeed without throwing an error but the actual commit may still fail later in the fetch loop. It also silently just ignores partitions that aren't assigned to this member.

Now that consumer.commitOffset() exists, maybe take the opportunity to clean up the api, either:

  • introduce a new seekTo() function that never commits and mark the current one as deprecated
  • just break the current behaviour more and never commit the new offset in seek()

@Nevon Nevon mentioned this pull request Jan 29, 2021
10 tasks
@Nevon
Copy link
Collaborator Author

Nevon commented Jan 29, 2021

It also silently just ignores partitions that aren't assigned to this member.

That is actually intentional, albeit as a side-effect of the way seek was implemented. If you are not assigned the partition, you cannot commit offsets for it, and as a developer it's not that easy to keep track of what partitions you are assigned (and why should you, when we already know it?), so the easiest thing for the user would be to just provide all the partition-offsets they want to seek to, and we make sure that we seek on all the assigned partitions.

But the consumer.seek() behaviour is still rather awkward when autoCommit=true. E.g. the synchronous consumer.seek() call can succeed without throwing an error but the actual commit may still fail later in the fetch loop.

This I very much agree with though. Seek is kind of an ugly duckling. Out of your suggestions, I'm leaning more towards:

  • introduce a new seekTo() function that never commits and mark the current one as deprecated

And then changing seek's default behavior in a future version (#1013)

@ankon
Copy link
Contributor

ankon commented Jan 29, 2021

When I read the name autoCommit for the setting I would assume it works as some kind of master switch, i.e. the other two autoCommit-y settings would not ever be looked at when autoCommit is false. I think that that isn't true, and the ...IfNecessary functions do consider these values.

(Just a note to consider, please disregard if I missed something somewhere)

Copy link
Collaborator

@JaapRood JaapRood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! Tying this to autoCommit is such a nice find, makes total sense. When you have auto commits enabled, it shouldn't be surprising a seek causes a commit. The use cases I described in the original issue do all require autoCommit to be disabled.

Implementation seems to make sense as well, the test a reasonable way to coverage. Haven't really thought about all the states it could get into yet, and thus unwanted side effects, but from the way it's implemented I wouldn't really expect any.

@Nevon
Copy link
Collaborator Author

Nevon commented Jan 29, 2021

When I read the name autoCommit for the setting I would assume it works as some kind of master switch, i.e. the other two autoCommit-y settings would not ever be looked at when autoCommit is false. I think that that isn't true, and the ...IfNecessary functions do consider these values.

(Just a note to consider, please disregard if I missed something somewhere)

I agree. I had the exact same thought when I was looking at the function, but I decided to keep it out of my mind for the time being as to not bloat the scope. I was thinking that maybe the two thresholds were only set if autoCommit was true, but I didn't look to see if that was actually the case. Good thing to follow up on.

@Nevon
Copy link
Collaborator Author

Nevon commented Jan 29, 2021

Now I've had a nights sleep (and a days work), and the implementation still looks okay to me. I've verified that resolving a lower offset than an already resolved offset allows you to seek backwards, so it all seems fine.

As soon as @JaapRood or @ankon has had a chance to review and hopefully ✅, I'll go ahead and merge.

Copy link
Collaborator

@JaapRood JaapRood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@Nevon Nevon merged commit 76ec563 into master Feb 1, 2021
@Nevon Nevon deleted the dont-commit-on-consumer-seek branch February 1, 2021 09:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Consumer.seek commits an offset, causing possible data and metadata loss
4 participants