RedshiftBatcher: on restart should consider loader offset for robustness #20

Open
alok87 opened this issue Aug 15, 2020 · 0 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

alok87 commented Aug 15, 2020

The S3 batcher shuts down gracefully, but on every start it should also consider the batcher's last uploaded offset, as recorded in the last message of the loader topic.

This ensures that, regardless of what goes wrong, the system never loses data and never reprocesses the same data, which makes it idempotent.

Idea credits: Yelp experience https://engineeringblog.yelp.com/2016/10/redshift-connector.html
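
A minimal sketch of the restart logic proposed above, in Go. Everything named here is hypothetical and not taken from the project: the `LoaderMessage` type, its `StartOffset`/`EndOffset` fields, and the `ResumeOffset` function are illustrative only. The sketch assumes that each loader-topic message records the range of source offsets contained in the batch uploaded to S3, and that the last loader message is treated as the source of truth on restart, with the consumer group's committed offset used only when the loader topic is empty.

```go
package main

import "fmt"

// LoaderMessage is a hypothetical shape for a message on the loader topic.
// It is assumed to carry the inclusive range of source-topic offsets that
// were written to S3 in one batch.
type LoaderMessage struct {
	StartOffset int64
	EndOffset   int64
}

// ResumeOffset returns the source-topic offset the batcher should start
// consuming from after a restart.
//
// committed is the consumer group's committed offset, i.e. the next offset
// to read according to Kafka. last is the final message found on the loader
// topic, or nil if that topic has no messages yet.
//
// Assumption: the loader topic is authoritative. If a batch was uploaded but
// the offset commit was lost, resuming at EndOffset+1 avoids reprocessing.
// If an offset was committed but the upload never reached the loader topic,
// resuming at EndOffset+1 avoids skipping data.
func ResumeOffset(committed int64, last *LoaderMessage) int64 {
	if last == nil {
		return committed
	}
	return last.EndOffset + 1
}

func main() {
	// Crash after upload but before commit: skip the already uploaded range.
	fmt.Println(ResumeOffset(30, &LoaderMessage{StartOffset: 10, EndOffset: 42})) // 43

	// Commit ran ahead of the last recorded upload: rewind to avoid data loss.
	fmt.Println(ResumeOffset(50, &LoaderMessage{StartOffset: 10, EndOffset: 42})) // 43

	// Nothing on the loader topic yet: fall back to the committed offset.
	fmt.Println(ResumeOffset(30, nil)) // 30
}
```

Whether the loader offset should always override the committed offset, or only when it is ahead of it, is a design choice this issue leaves open; the sketch picks the first option because it covers both failure modes named in the comment.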

alok87 changed the title from "S3 batcher on restart should consider loader offset for robustness" to "RedshiftSink: Batcher on restart should consider loader offset for robustness" on Aug 15, 2020
alok87 changed the title from "RedshiftSink: Batcher on restart should consider loader offset for robustness" to "RedshiftBatcher: on restart should consider loader offset for robustness" on Aug 17, 2020
alok87 added the "enhancement" and "good first issue" labels on Sep 3, 2020