Pods does not recover from failure: abandoned subscription: was taking too long #252

Open
alok87 opened this issue Sep 21, 2021 · 1 comment
Labels
bug Something isn't working p2 intermittent issue, not very urgent, but need to be done soon

Comments

alok87 commented Sep 21, 2021

The batcher has to be restarted manually when the following error occurs. When Kafka has downtime, some batchers get stuck with this error and do not recover until they are restarted.

[sarama] 2021/09/20 06:41:38 consumer/broker/2 abandoned subscription to ts.db.table/0 because consuming was taking too long

They should either recover without a restart, or fail fatally so that they restart on their own.

@alok87 alok87 added the bug Something isn't working label Sep 21, 2021

alok87 commented Nov 23, 2021

The first short-term fix should be to make the error fatal, so that the pod restarts instead of staying stuck in this state. This is especially needed in the Main Sink Group, where multiple topics are loaded together in one pod to save connections to Redshift.

This happens for the loader as well.

@alok87 alok87 changed the title from "Batchers does not recover from failure" to "Pods does not recover from failure: abandoned subscription: was taking too long" Nov 23, 2021
@alok87 alok87 added the p2 intermittent issue, not very urgent, but need to be done soon label Nov 23, 2021