Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Libbeat] Gracefully shut down on SIGHUP #10704

Merged
merged 7 commits into from
Mar 14, 2019
Merged

Conversation

cwurm
Copy link
Contributor

@cwurm cwurm commented Feb 12, 2019

SIGHUP is sent to a process if the user's terminal is disconnected. It is one of the five termination signals. Currently, Beats is going to terminate suddenly, i.e. the log will cut off, no shutdown actions are going to be run, etc.

As we handle more and more state (e.g. writing state to beat.db files) it becomes more important to gracefully shutdown in "less than ideal" situations. Beats should probably not be run directly from a terminal or a terminal multiplexer, but it probably happens, and if so the consequences of a network disconnect or clumsy fingers (I forget the number of times I've hit the wrong key when using shortcuts in tmux/screen) should not be severe if we can help it.

This change catches SIGHUP as it already does SIGINT/SIGTERM and runs the usual shutdown actions. This can include waiting for a period of time specified using the shutdown_timeout option Filebeat and Packetbeat support.

I thought we might want to exit with a non-zero exit code if we receive this unexpected signal, but I don't see a good way to specifiy the exit code in Libbeat at the moment.

@cwurm cwurm requested a review from urso February 12, 2019 19:26
@cwurm cwurm requested a review from a team as a code owner February 12, 2019 19:26
Copy link
Member

@andrewkroh andrewkroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend putting some tests in place from the system tests to send in a SIGHUP. This way it gets tested on multiple OSes and we can make sure nothing unusual is happening upstream the Go runtimes signal handling (it has some builtin handlers).

@@ -44,10 +44,17 @@ func HandleSignals(stopFunction func(), cancel context.CancelFunc) {

// On ^C or SIGTERM, gracefully stop the sniffer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor, but this comment is now out of date (and has been for a while). Perhaps: // On termination signals, gracefully stop the Beat

@jsoriano
Copy link
Member

jsoriano commented Mar 1, 2019

Even if it was not intended for that, SIGHUP is commonly used to request a configuration reload in services/daemons in unix systems. One of the reasons to use this signal for that is that daemons are usually not run in terminals, so they are not going to receive it by this reason. I think that Beats should be included in this category.

In any case I think it is fine to merge this by now, I agree that we should handle this signal in some way.

@cwurm
Copy link
Contributor Author

cwurm commented Mar 1, 2019

I recommend putting some tests in place from the system tests to send in a SIGHUP. This way it gets tested on multiple OSes and we can make sure nothing unusually is happening upstream the Go runtimes signal handling (it has some builtin handlers).

Good call, I've added a system test that sends SIGHUP and checks if it shut down correctly. Let me know what you think @andrewkroh.

@cwurm cwurm merged commit e9537aa into elastic:master Mar 14, 2019
@cwurm cwurm deleted the libbeat_sighup branch March 14, 2019 14:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants