bernardd edited this page Feb 7, 2014 · 8 revisions

erld is a small program designed to solve the problem of running Erlang programs as UNIX daemons.

The UNIX daemons most of us are used to have the following properties (amongst others):

  • Can be started/stopped from an init script.
  • On startup, control does not return to the console until the program has successfully started (or failed to do so).
  • Startup diagnostic information can be printed to the console to indicate progress, but output ceases once the daemon is running.
  • On returning to the console, the return code indicates success (0) or failure (some other number).
  • Log rotation can be triggered by sending a SIGHUP.

Each of these items is hard (or impossible) to do in a "pure" Erlang application. We'll look at each problem in turn and how erld solves it.

Init Script

This one isn't actually hard in itself - you just need to write a script. A sample init script is included with erld to get you started. The script handles things like writing a pid file so that the correct pid can later be stopped, and a lock file to prevent multiple instances of the daemon being started.

The erld_app behaviour also defines a stop callback for your app, which erld calls at shutdown so that your cleanup code can provide a graceful shutdown. Erlang's embedded heritage means that a graceful shutdown isn't always easy to achieve: the application callback stop/1 is only called after most of your processes are already dead. The application callback prep_stop/1 is more helpful, but it runs in the context of the application controller, so you can't use it to, for example, shut down other apps. erld_app:stop/0, by contrast, runs in the context of the RPC system, so it can happily call any other code.

Deferring return to the console and managing output

Erlang's normal startup operates in one of two modes: interactive or detached. Unfortunately, there's no neat way to detach an interactive session (i.e., you have to have an open shell sitting somewhere, even if it's hidden behind 'screen' or similar), and a detached session detaches as soon as the VM starts up, which is not very useful for getting immediate feedback from your application's startup process.

erld solves this problem in a similar manner to GNU screen. It wraps Erlang's stdout and stderr, feeding them back to the console until the Erlang application itself indicates that it has successfully started and is ready to be detached. This is achieved in the code with a single simple call:

erld:detach()

Upon receipt of this message, erld detaches Erlang's stdout and stderr from the console, redirects them to a log file, and returns control to the console with a success code (0).
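The hand-over from console to log file can be pictured with a small C sketch. This is not erld's actual source; the function name relay_chunk and its parameters are made up for illustration, and it assumes the wrapper already holds a pipe to the child's output and knows (via the detached flag) whether the detach request has arrived.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy one chunk of the child's output (in_fd) to
 * the console before detachment, or to the log file afterwards.
 * Returns the number of bytes forwarded, 0 on EOF, -1 on error. */
ssize_t relay_chunk(int in_fd, int console_fd, int log_fd, int detached)
{
    char buf[4096];
    ssize_t n = read(in_fd, buf, sizeof buf);
    if (n <= 0)
        return n;                       /* EOF or read error */

    int out_fd = detached ? log_fd : console_fd;
    ssize_t off = 0;
    while (off < n) {                   /* write() may be partial */
        ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
        if (w < 0)
            return -1;
        off += w;
    }
    return n;
}
```

A real wrapper would run this in a loop for both stdout and stderr and would also need to cope with interrupted system calls; the sketch only shows the switch of destination.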

Returning a useful error code

The standard Erlang detached behaviour always returns 0 to the shell, no matter how badly things might have gone wrong. Even if your program calls halt(1), a detached startup will return 0 before that is executed. erld, by contrast, catches any return code generated before it is detached and returns that to the shell.
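For readers unfamiliar with how a wrapper can pass on a child's exit code, here is a minimal C sketch of the general POSIX technique (fork, exec, waitpid). It is an illustration only, not erld's implementation; run_and_wait is a hypothetical name.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with the given arguments and return
 * its exit status, or -1 if it could not be started or was killed by a
 * signal. */
int run_and_wait(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                      /* fork failed */
    if (pid == 0) {
        execvp(argv[0], argv);          /* only returns on failure */
        _exit(127);                     /* conventional "cannot exec" status */
    }

    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;                  /* unexpected waitpid failure */
    }
    if (WIFEXITED(status))
        return WEXITSTATUS(status);     /* the code the child exited with */
    return -1;                          /* terminated by a signal */
}
```

A wrapper's main() can then return this value, so that a child which exits with status 1 during startup leaves 1 as the status seen by the shell or init script.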

Log Rotation

Erlang (being mostly OS-agnostic) provides no way to catch signals such as SIGHUP, which is commonly used to trigger log rotation. erld captures SIGHUPs sent to it and uses them as a cue to call a function you specify. This was designed for log rotation, but in practice it could be used to trigger any behaviour you like.
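Catching SIGHUP in a C wrapper usually looks something like the sketch below. It shows the common pattern of setting a flag in the handler and acting on it from the main loop; the function names are hypothetical and this is not taken from erld's source.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <string.h>

/* Set by the handler, cleared by the main loop. */
static volatile sig_atomic_t hup_pending = 0;

static void on_sighup(int signo)
{
    (void)signo;
    hup_pending = 1;    /* do no real work inside a signal handler */
}

/* Install the SIGHUP handler. Returns 0 on success, -1 on failure. */
int install_sighup_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sighup;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;           /* restart interrupted system calls */
    return sigaction(SIGHUP, &sa, NULL);
}

/* Returns 1 (and clears the flag) if a SIGHUP arrived since the last
 * call, otherwise 0. */
int take_pending_sighup(void)
{
    if (!hup_pending)
        return 0;
    hup_pending = 0;
    return 1;
}
```

The main loop would call take_pending_sighup() periodically and, when it returns 1, ask the Erlang node to run the configured rotation function.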

Restarting

In addition to providing the above daemon-style functionality, erld also replaces and extends Erlang's built-in heartbeat functionality. While the built-in system is useful, it can be problematic when trying to shut down a misbehaving system: you kill the process and heartbeat restarts it; you kill heartbeat and the process restarts it. Even so, we might still want to automatically restart a system that has exited abnormally from its running state.

erld provides very similar behaviour to Erlang's heartbeat system, but it does not start watching the Erlang app until it is detached from the console (so that startup failures won't keep looping), and will not restart it once you have issued a shutdown command.

To add the heartbeat to your app, you simply add an erld_heartbeat process to your top-level supervisor. The supervisor specification can be obtained by calling

erld_heartbeat:erld_heartbeat_spec()

Command line options

The following command line options may be used (run with -h to see default values).

  • -f Run in the foreground (never detach).
  • -l <log file> The log file to which output will be written after detachment.
  • -p <pid file name> The pid file name.
  • -c <cookie file> The cookie file name.
  • -n <node name> The node name for the erld process
  • -t <heartbeat timeout in seconds> The heartbeat timeout after which erld will deem the process to have stopped responding.
  • -T <heartbeat warning in seconds> The heartbeat timeout after which erld will log a warning (but perform no other action).
  • -r <restart limit> The maximum number of short restarts (see below) erld will perform on the application before giving up.
  • -i <time defining a short restart> The time between restarts that erld will deem "short", triggering an increment in the restart count as used by -r.
  • -g <kill grace period in seconds> The period of time erld will wait after a shutdown command is issued before harshly killing the process.
  • -M <log rotation module name> The module name containing the log rotation function.
  • -M <log rotation function name> The log rotation function name (must be arity 0).
  • -d Turn on debugging.
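One way to read the timing options (-t, -T, -r and -i) is sketched below in C. This is an interpretation of the descriptions above, not erld's code: the type and function names are invented, and in particular the sketch ASSUMES that a restart which is not "short" resets the short-restart count, which the option descriptions do not actually say.

```c
#include <stdbool.h>

/* Heartbeat health as seen by the watcher. */
enum hb_state { HB_OK, HB_WARN, HB_DEAD };

/* Classify `silence` (seconds since the last heartbeat) against the
 * warning threshold (-T) and the timeout (-t). */
enum hb_state classify_heartbeat(double silence, double warn_after,
                                 double timeout)
{
    if (silence >= timeout)
        return HB_DEAD;     /* deemed to have stopped responding */
    if (silence >= warn_after)
        return HB_WARN;     /* log a warning, take no other action */
    return HB_OK;
}

struct restart_policy {
    int    limit;           /* maximum short restarts, as given by -r */
    double short_interval;  /* seconds defining "short", as given by -i */
    int    short_count;     /* short restarts counted so far */
};

/* Decide whether to restart, given the seconds since the previous
 * restart. Returns false once the short-restart limit is exceeded. */
bool should_restart(struct restart_policy *p, double since_last_restart)
{
    if (since_last_restart < p->short_interval)
        p->short_count++;
    else
        p->short_count = 0; /* ASSUMPTION: a long run clears the count */
    return p->short_count <= p->limit;
}
```

With a limit of 2 and a short interval of 10 seconds, for example, two crashes in quick succession are each followed by a restart, and a third makes should_restart() return false.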

Licensing

The C code in erld is licensed under the GPL. The Erlang parts, because they need to be included directly in the code of the application using erld, are licensed under the MIT license.