Send Kafka a TERM signal at pod stop and wait for shutdown #207

Merged: 5 commits, Nov 18, 2018

solsson (Contributor) commented Sep 29, 2018

Fixes #206.

Based on https://github.com/apache/kafka/blob/trunk/bin/kafka-server-stop.sh and https://github.com/apache/kafka/blob/trunk/bin/zookeeper-server-stop.sh. Those scripts only send the signal; they don't wait for shutdown to complete.

Got the wait loop from https://stackoverflow.com/questions/17894720/kill-a-process-and-wait-for-the-process-to-exit
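
For readers who don't want to follow the links, here is a minimal sketch of the TERM-then-wait idea. It is not necessarily the exact command used in this PR: the function name `term_and_wait` and the optional timeout argument are assumptions added for illustration.

```shell
#!/bin/sh
# Hypothetical helper (name and timeout are illustrative, not from the PR).
# Usage: term_and_wait PID [TIMEOUT_SECONDS]
term_and_wait() {
  pid="$1"
  timeout="${2:-30}"   # assumed default; tune to the pod's grace period

  # Ask the process to shut down. If it is already gone, there is nothing to wait for.
  kill -s TERM "$pid" 2>/dev/null || return 0

  waited=0
  # kill -0 sends no signal; it only reports whether the process still exists.
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1         # still running after the timeout
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

In a pod stop hook the target would typically be the broker's process ID inside the container, and the timeout should stay below the pod's termination grace period so the wait can actually finish.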

So far I've only tested the PR locally, under light load. @stigok, can you confirm that you no longer get corrupted indices?

solsson (Contributor, Author) commented Sep 29, 2018

The last log entry I see is `INFO [KafkaServer id=0] shut down completed (kafka.server.KafkaServer)`

solsson (Contributor, Author) commented Sep 29, 2018

Maybe Zookeeper doesn't need controlled shutdown. I see no effect in the logs from invoking the script.

stigok commented Sep 30, 2018

I'm unable to reproduce the bad indices. I don't know how I ended up with them in the first place. We've had a lot of pod restarts and failed probes while running in AKS, so any number of factors could have caused it.

stigok commented Sep 30, 2018

But this PR is certainly a step in the right direction 👍

solsson merged commit 198666d into master on Nov 18, 2018

stigok commented Nov 18, 2018

I had bad indexes again after my disks filled up. Maybe that is a "good way" to simulate broken indexes:

  • Configure `log.retention.bytes` to a value greater than the available disk space
  • Produce enough messages to fill the disk
  • Watch Kafka die
  • Expand disk and expect to see bad indexes
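
A rough sketch of the "produce enough messages" step, in case it helps anyone trying to reproduce this. The helper `records_to_fill` is hypothetical, and the data path, topic name, and broker address in the commented usage are assumptions. It ignores per-record overhead, compression, and replication, so the disk will likely fill somewhat earlier than the estimate.

```shell
#!/bin/sh
# Hypothetical helper: how many fixed-size records exceed a given number of free bytes.
# Usage: records_to_fill FREE_BYTES RECORD_SIZE_BYTES
records_to_fill() {
  free_bytes="$1"
  record_size="$2"
  # Integer division, plus one record to go past the limit.
  echo $(( free_bytes / record_size + 1 ))
}

# Example usage (not executed here; path, topic, and broker address are assumptions):
#   free=$(( $(df -Pk /var/lib/kafka/data | awk 'NR==2 {print $4}') * 1024 ))
#   kafka-producer-perf-test.sh --topic fill-disk \
#     --num-records "$(records_to_fill "$free" 1024)" \
#     --record-size 1024 --throughput -1 \
#     --producer-props bootstrap.servers=localhost:9092
```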
