BB-624 Retry connection to kafka if backbeat processes started before #2574
This was missed in S3C-9338 because this usage of the SDK doesn't use the env variable.
392ffe9 to 42642ee
Suggestions:
- add tests for "Prevent ProbeServer crash on startup by delaying it after the queue components are started".
- add tests for the Consumer service crashing in the event of a consumer error?
46b33ab to 13bf58c
Prevent hanging indefinitely if the replication status BackbeatConsumer succeeds in connecting to Kafka but FailedCRRProducer or ReplayProducer then fails
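One way to avoid this kind of hang, as a minimal sketch and not the code from this PR: run all init steps under a single deadline and report the first failure exactly once. `initAllWithDeadline` and its parameters are hypothetical names; each step is assumed to take a Node-style callback.

```javascript
'use strict';

// Hypothetical helper (not from the PR): start every init step and call
// `done` exactly once, when all steps succeed, on the first error, or
// when the deadline expires, whichever comes first.
function initAllWithDeadline(steps, timeoutMs, done) {
    let remaining = steps.length;
    let finished = false;
    let timer = null;

    function finish(err) {
        if (finished) {
            return; // ignore late or duplicate results
        }
        finished = true;
        clearTimeout(timer);
        done(err || null);
    }

    // Arm the deadline before starting the steps so a synchronous
    // failure can still clear it.
    timer = setTimeout(
        () => finish(new Error(`init timed out after ${timeoutMs}ms`)),
        timeoutMs);

    if (remaining === 0) {
        finish();
        return;
    }
    steps.forEach(step => step(err => {
        if (err) {
            finish(err);
        } else if (--remaining === 0) {
            finish();
        }
    }));
}
```

With a helper like this, the caller can log the error and exit with a non-zero code so that the process supervisor restarts the component, instead of waiting forever on a step that never completes.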
13bf58c to dbe3cc5
Probing too soon on startup can crash some CRR components. The metric function now checks that the consumer exists before using it. The probe server is also started after the queue components (as in Zenko).
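The consumer check can be pictured as follows. This is a sketch under assumptions, not the handler from this PR: `handleMetricsProbe`, `getConsumer`, `respond`, and `getMetrics` are all hypothetical names.

```javascript
'use strict';

// Hypothetical probe handler: `getConsumer` returns the consumer once the
// queue components have started, or undefined before that.
function handleMetricsProbe(getConsumer, respond) {
    const consumer = getConsumer();
    if (!consumer) {
        // Not started yet: answer "unavailable" instead of
        // dereferencing undefined and crashing the process.
        respond(503, 'consumer not ready');
        return;
    }
    // getMetrics() is an assumed method name used for illustration.
    respond(200, consumer.getMetrics());
}
```

Starting the probe server only after the queue components narrows the window in which the guard is needed, but keeping the guard makes the handler safe regardless of startup order.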
Waiting for approval
The following approvals are needed before I can proceed with the merge:
The following options are set: create_pull_requests, create_integration_branches
@nicolas2bert
// if connection to destination fails, process will stop & restart
next => this._destination.init(next),
About this: it looks like on Zenko this could crash with a "callback already called" error if the destination Kafka is not up on startup.
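An error of this kind typically appears when an init step invokes its callback more than once, for example once from an error event and once from a timeout. A minimal sketch of a guard follows; `callOnce` and `flakyInit` are hypothetical names and nothing here is taken from the Backbeat code.

```javascript
'use strict';

// Wrap a Node-style callback so that only the first invocation goes through.
function callOnce(callback) {
    let called = false;
    return (...args) => {
        if (called) {
            return; // swallow the duplicate call
        }
        called = true;
        callback(...args);
    };
}

// Stand-in for a destination whose init reports a failure twice.
function flakyInit(next) {
    next(new Error('kafka not reachable'));
    next(new Error('connect timeout'));
}
```

Usage would look like `flakyInit(callOnce(next))`. Swallowing the second call hides the duplicate rather than fixing its cause, so logging it may be preferable in practice.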
Waiting for approval
The following approvals are needed before I can proceed with the merge:
The following options are set: create_pull_requests
Here is an Integration e2e run: https://github.com/scality/Integration/actions/runs/11699734386/job/32582300296
It should not trigger any of the exits implemented in this PR, as all Kafka dependencies are already up in those tests.
/create_integration_branches
/approve
I have successfully merged the changeset of this pull request
The following branches have NOT changed:
Please check the status of the associated issue BB-624.
Goodbye bourgoismickael.
The following options are set: approve, create_integration_branches
crash and exit after a timeout on connection error with the Kafka client (for CRR, Lifecycle, Bucket notification), even for the bucket notification destination.
New behavior: components will try to connect for 60s instead of 30s, then crash, exit, and restart if the Kafka client can't connect, with an error like: