supervise.8
supervise - start and monitor a service.
supervise dir [parent_ident]
supervise(8) switches to the directory named dir and checks for the file down. If this file exists, supervise doesn't start the service. If the directory run/svscan exists, supervise creates the directory dir/supervise in run/svscan, where run is either the /run or the /var/run tmpfs filesystem (depending on your operating system). The directory run/svscan is created by svscan(8) when the DISABLE_RUN environment variable isn't set. supervise also opens dir/supervise/lock in exclusive mode to prevent multiple copies of supervise from running for the same service. It exits 100 if it cannot open dir/supervise/lock. The directory dir must be writable by supervise. This directory is used to maintain status information in binary format and to create a few named pipes. The status information can be read by svstat(8). If DISABLE_RUN is set, or if your system doesn't have the /run or /var/run filesystem, dir/supervise is created in the orig/dir directory, where orig is the directory in which dir is present. orig is usually /service on most systems.
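The exclusive lock described above can be sketched as follows. This is a minimal illustration, not supervise's actual code: it assumes a non-blocking flock(2) on the lock file, and the function names acquire_service_lock and main are hypothetical.

```python
import fcntl
import os
import sys


def acquire_service_lock(supervise_dir):
    """Open supervise_dir/lock and try to take a non-blocking exclusive lock.

    Returns the open file descriptor on success, or None when another
    open file description already holds the lock.
    """
    lock_path = os.path.join(supervise_dir, "lock")
    fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        # Assumption: flock(2) is the locking primitive; the text only
        # says the file is opened "in exclusive mode".
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        return None
    return fd


def main(supervise_dir):
    fd = acquire_service_lock(supervise_dir)
    if fd is None:
        print("unable to acquire lock", file=sys.stderr)
        sys.exit(100)  # the exit status named in the text above
    return fd  # keep the descriptor open for the life of the process
```

The lock is released when the descriptor is closed or the process exits, so a crashed supervisor does not leave a stale lock behind.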
supervise then executes ./init if it exists. If ./init exits with a non-zero status, supervise pauses for 60 seconds before restarting ./init. The pause is required so that supervise doesn't loop too quickly and consume excessive CPU. dir has to be relative to the current working directory and cannot start with the dot (.) or the slash (/) character. parent_ident is passed as a command line argument by svscan(8) when it starts the supervise log process, which happens when dir/log exists. This is useful when listing supervised log processes using the ps(1) command.
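The retry behaviour for ./init can be sketched as a simple loop. This is only an illustration of the pause-and-retry idea; the function name run_init_until_success and its parameters are hypothetical, and the sleep function is injectable purely so the loop can be exercised without waiting.

```python
import subprocess
import time


def run_init_until_success(command, pause_seconds=60, sleep=time.sleep):
    """Run command repeatedly until it exits with status 0.

    Returns the number of attempts that were needed.
    """
    attempts = 0
    while True:
        attempts += 1
        status = subprocess.call(command)
        if status == 0:
            return attempts
        # Pause so a persistently failing ./init cannot spin the CPU.
        sleep(pause_seconds)
```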
After ./init exits with zero exit status, supervise starts ./run. It restarts ./run if ./run exits. If ./run exits with a non-zero status, supervise pauses for a second after restarting ./run. The sleep prevents supervise from looping quickly when ./run has a problem. supervise expects ./run to remain in the foreground. Some daemons fork themselves into the background, which some consider bad software design. If you want to monitor such a daemon, set the sticky bit on ./run. This makes supervise go into subreaper mode using prctl(2) PR_SET_CHILD_SUBREAPER on Linux or procctl(2) PROC_REAP_ACQUIRE on FreeBSD. In subreaper mode, or when the environment variable SETPGID is set, ./run will have its process group ID set to the value of its PID. Setting the process group ID is required to monitor ./run reliably when ./run has a command which double forks (forks into the background). It is also required in such cases for the svc(8) command to operate and control supervise reliably for such double-forked daemons or commands in ./run. ./run is passed two command line arguments, with dir as argv[1] and how as argv[2]. supervise uses the self-pipe trick to handle all SIGCHLD events reliably. This requires the use of two file descriptors for the self-pipe.
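The self-pipe trick mentioned above can be sketched in a few lines. This is a generic illustration of the technique, not supervise's implementation; the names make_self_pipe and wait_for_signal are hypothetical. The signal handler writes one byte to a pipe, and the main loop waits on the read end with select(2), so a signal that arrives just before the wait is not lost.

```python
import os
import select
import signal


def make_self_pipe(signum=signal.SIGCHLD):
    """Install a handler for signum that writes one byte to a pipe.

    Returns (read_fd, write_fd): the two descriptors the trick needs.
    """
    read_fd, write_fd = os.pipe()
    os.set_blocking(read_fd, False)
    os.set_blocking(write_fd, False)

    def handler(_signum, _frame):
        try:
            os.write(write_fd, b"\0")
        except BlockingIOError:
            pass  # pipe already full: a wakeup is pending anyway

    signal.signal(signum, handler)
    return read_fd, write_fd


def wait_for_signal(read_fd, timeout):
    """Return True if the handler fired, False if timeout seconds passed."""
    ready, _, _ = select.select([read_fd], [], [], timeout)
    if not ready:
        return False
    try:
        while os.read(read_fd, 64):  # drain every pending wakeup byte
            pass
    except BlockingIOError:
        pass  # pipe is empty again
    return True
```

After wait_for_signal returns True for SIGCHLD, a supervisor would typically call waitpid(2) in a loop with WNOHANG to collect every child that has exited.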
supervise can set environment variables for ./run using envdir(8) if the directory ./variables exists. Files in the ./variables directory must be usable as environment variables by envdir(8). If the directory ./variables doesn't have execute permission for others, all existing environment variables are cleared before the environment variables for ./run are set.
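How a ./variables directory might translate into an environment can be sketched as below. The file-to-variable rules are an assumption modelled on the classic envdir conventions (file name is the variable name, the first line is the value, trailing spaces and tabs are dropped, an empty file unsets the variable); the function name build_run_environment is hypothetical, and envdir(8) remains the authority on the exact rules.

```python
import os
import stat


def build_run_environment(variables_dir, base_env):
    """Return the environment for ./run as a dict."""
    mode = os.stat(variables_dir).st_mode
    # No execute bit for "others": start from an empty environment.
    env = dict(base_env) if mode & stat.S_IXOTH else {}
    for name in sorted(os.listdir(variables_dir)):
        path = os.path.join(variables_dir, name)
        if name.startswith(".") or not os.path.isfile(path):
            continue
        with open(path, "rb") as f:
            first_line = f.readline()
        if not first_line:
            env.pop(name, None)  # assumption: empty file unsets the variable
            continue
        # Assumption: trailing blanks are dropped, NUL bytes become newlines.
        value = first_line.rstrip(b" \t\n").replace(b"\0", b"\n")
        env[name] = value.decode()
    return env
```

For example, a file ./variables/PORT containing `8080` would yield PORT=8080, and running `chmod o-x variables` would make ./run start with only the variables defined in that directory.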
| how | Description |
|---|---|
| abnormal startup | When ./run exits on its own |
| system failure | When supervise is unable to fork to execute ./run |
| manual restart | When svc -u or -r is used to start the service |
| one-time startup | When svc -o is used to start the service |
| auto startup | Normal startup after supervise is run by svscan or manually |
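A ./run script can inspect the two arguments it receives. The sketch below assumes that how arrives as a single argument spelled exactly as in the table above; the helper describe_start and the service name in the usage note are hypothetical.

```python
import sys

# Assumption: these are the literal strings passed as argv[2].
KNOWN_REASONS = {
    "abnormal startup",
    "system failure",
    "manual restart",
    "one-time startup",
    "auto startup",
}


def describe_start(argv):
    """Build a log line from the arguments supervise passes to ./run."""
    service_dir = argv[1] if len(argv) > 1 else "(unknown)"
    how = argv[2] if len(argv) > 2 else "(unknown)"
    note = "" if how in KNOWN_REASONS else " [unrecognised reason]"
    return f"starting {service_dir}: {how}{note}"


if __name__ == "__main__":
    print(describe_start(sys.argv), file=sys.stderr)
    # A real ./run would now exec the daemon so that it stays in the
    # foreground, for example with os.execvp().
```

Called with `["./run", "myservice", "auto startup"]`, describe_start returns `starting myservice: auto startup`.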
If the file dir/down exists, supervise(8) does not start ./run immediately. You can use svc(8) to start ./run or to give other commands to supervise(8). supervise uses dir/supervise/control fifo to read these commands.
On receipt of SIGTERM, supervise sends SIGTERM followed by SIGCONT to its child. It uses killpg(3) to send the signals if running in subreaper mode or when the SETPGID environment variable is set. It uses kill(2) to send the signals when not running in subreaper mode and the SETPGID environment variable is not set.
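The choice between kill(2) and killpg(3) can be sketched as follows. This is an illustration only; the function name terminate_child and its use_process_group flag are hypothetical stand-ins for "subreaper mode or SETPGID set".

```python
import os
import signal


def terminate_child(pid, use_process_group):
    """Send SIGTERM, then SIGCONT, to the child or to its process group."""
    # With use_process_group, pid doubles as the process group ID,
    # because ./run had its process group ID set to its own PID.
    send = os.killpg if use_process_group else os.kill
    send(pid, signal.SIGTERM)
    # SIGCONT lets a stopped child resume so it can act on the SIGTERM.
    send(pid, signal.SIGCONT)
```

Signalling the whole group is what reaches a daemon that has forked into the background, since its descendants stay in the process group of ./run unless they change it themselves.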
If the file dir/shutdown exists, supervise(8) executes shutdown when asked to exit. dir is passed as the first argument and the pid of the process that exited is passed as the second argument to shutdown.
If the file dir/alert exists, supervise(8) executes alert whenever run exits. dir is passed as the first argument, the pid of the process that exited is passed as the second argument, and the exit value or signal (if killed by a signal) is passed as the third argument to alert. The fourth argument is either of the strings exited or stopped / signalled.
supervise(8) may exit immediately after startup if it cannot find the files it needs in dir or if another copy of supervise(8) is already running in dir. Once supervise(8) is successfully running, it will not exit unless it is killed or specifically asked to exit. On a successful startup supervise(8) opens the fifo dir/supervise/ok in O_RDONLY|O_NDELAY mode. You can use svok(8) to check whether supervise(8) is successfully running. You can use svscan(8) to reliably start a collection of supervise(8) processes. svscan mirrors the service directory in the /run or /var/run directory (whichever is found first). So /run/dir will be analogous to /service/dir. When started by svscan, error messages printed by supervise(8) go to the standard error output of the svscan(8) process.
supervise(8) can wait for another service by having a file named dir/wait. This file has two lines. The first line is a time t in seconds and the second line is a service name w. Service dir waits until t seconds after service w has started up. The amount of time t is limited to a maximum of 32767 seconds. Any value above this will be limited to 60 seconds. The wait for another service is implemented by opening dir/supervise/up in write mode.
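Reading a dir/wait file can be sketched as below. This is a hedged illustration of the two-line format and the cap described above; the function name parse_wait_file is hypothetical, and behaviour for negative or non-numeric values is not covered by the text, so the sketch simply lets int() raise.

```python
def parse_wait_file(text):
    """Parse the contents of dir/wait into (seconds, service_name)."""
    lines = text.splitlines()
    if len(lines) < 2:
        raise ValueError("expected two lines: seconds, then service name")
    seconds = int(lines[0].strip())
    if seconds > 32767:
        seconds = 60  # values above the cap fall back to 60, per the text
    return seconds, lines[1].strip()
```

For a file containing `10` on the first line and a (hypothetical) service name `db` on the second, the function returns `(10, "db")`.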
supervise(8) opens dir/supervise/up in read mode just after it executes ./run. Hence, if service w is up, a write on w/supervise/up returns immediately. If service w is down, the write will block until w is up and running. If service w doesn't have supervise running, supervise will wait for 60 seconds before opening the file w/supervise/up again in read mode. The default value of 60 seconds is overridden by the SCANINTERVAL environment variable used by svscan(8). If service w doesn't exist, dir/wait will be ignored.
supervise(8) opens dir/supervise/dn in read mode when asked to bring down a service using svc (-d or -r option). It opens this named pipe after issuing the TERM and CONT signals to the service. Hence, if service w is down, a write on w/supervise/dn returns immediately. If service w is up, the write will block until w is down.
supervise(8) creates the following FIFOs and opens them in O_RDONLY|O_NDELAY mode. See open(2) for a description of O_RDONLY and O_NDELAY.
- dir/supervise/control - for reading commands from clients like svc(8).
- dir/supervise/ok - clients can open this in write mode (O_WRONLY) to test if supervise(8) is running in dir. If write returns, it means supervise(8) is running.
- dir/supervise/up - clients can open this in write mode (O_WRONLY) to test if service dir is up. Any client that opens this fifo in O_WRONLY mode will block until the service dir is up. If write returns, it means service dir has executed dir/run. svc(8) is one such client (-w option) that can be used to check if a service is up.
- dir/supervise/dn - clients can open this in write mode to test if service dir is down. This works exactly like dir/supervise/up. svc(8) (-W option) can be used to check if a service is down.
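A client-side check on the ok FIFO can be sketched as follows. It relies on standard FIFO behaviour: opening a FIFO write-only and non-blocking fails with ENXIO when no process has it open for reading. The function name supervise_is_running is hypothetical, and the sketch assumes the FIFO is reachable as dir/supervise/ok; svok(8) is the supported tool for this check.

```python
import errno
import os


def supervise_is_running(service_dir):
    """Return True if some process holds service_dir/supervise/ok open for reading."""
    path = os.path.join(service_dir, "supervise", "ok")
    try:
        fd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as e:
        if e.errno in (errno.ENXIO, errno.ENOENT):
            return False  # no reader on the FIFO, or no FIFO at all
        raise
    os.close(fd)
    return True
```

Dropping O_NONBLOCK turns the same open call into a blocking wait, which is the behaviour described for the up and dn FIFOs.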
supervise logs informational, warning and error messages to descriptor 2. Informational messages can be turned on by setting the environment variable VERBOSE. Warning messages can be turned off by setting the environment variable SILENT. If you are using svscan for service startup (as set up for indimail-mta), you can set environment variables for supervise in the /service/.svscan/variables directory.
supervise is designed to run forever; however, it can exit 100 during startup if it fails to open dir/supervise/lock, or exit 111 if it encounters system errors.
svc(8), svok(8), svstat(8), svps(1), svscanboot(8), svscan(8), svctool(8), minisvc(8), readproctitle(8), fghack(8), pgrphack(8), multilog(8), tai64n(8), tai64nlocal(8), setuidgid(8), envuidgid(8), envdir(8), softlimit(8), setlock(8), open(2)