Monit in AppScale
Monit is a process management tool used throughout AppScale to start and stop processes.
To see a summary of processes running on a node, run:
monit summary
or, alternatively:
monit status
The latter is more verbose.
AppScale generates configuration files and places them in /etc/monit/conf.d/. Each file contains the string Monit looks for to verify that a process is still running. If no match is found, Monit tries to restart the process. Previous versions of AppScale used a Ruby tool called "god", but it tied each process to a pid. Erlang processes would change pid over time, which caused them to be restarted erroneously.
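To illustrate why matching on a command-line string is more robust than tracking a pid, here is a minimal sketch in Python. It is not how Monit is implemented; the function name `matching_pids` and the sample process tables are hypothetical and exist only for illustration.

```python
import re


def matching_pids(pattern, processes):
    """Return the sorted pids whose command line matches the pattern.

    `processes` maps pid -> full command line (hypothetical input,
    for example collected from `ps`).
    """
    regex = re.compile(pattern)
    return sorted(pid for pid, cmdline in processes.items()
                  if regex.search(cmdline))


# Illustrative data: the same command keeps running but its pid changes.
pattern = "python /root/appscale/AppTaskQueue/taskqueue_server.py"
before = {
    101: "python /root/appscale/AppTaskQueue/taskqueue_server.py",
    202: "/usr/sbin/sshd -D",
}
after = {
    999: "python /root/appscale/AppTaskQueue/taskqueue_server.py",
    202: "/usr/sbin/sshd -D",
}

# A pid-based check would lose track of pid 101; the pattern still matches.
running_before = bool(matching_pids(pattern, before))
running_after = bool(matching_pids(pattern, after))
```

In both snapshots the pattern finds a match, so a string-based check would not trigger a restart even though the pid changed.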
An example configuration file is "taskqueue-64839.cfg" in /etc/monit/conf.d/:
check process taskqueue-64839 matching "python /root/appscale/AppTaskQueue/taskqueue_server.py"
group taskqueue
start program = "/bin/bash -c ' python /root/appscale/AppTaskQueue/taskqueue_server.py 1>>/var/log/appscale/taskqueue-64839.log 2>>/var/log/appscale/taskqueue-64839.log'"
stop program = "/bin/kill -9 `ps aux | grep taskqueue_server.py | awk {'print $2'}`"
Here we see the matching string used to check that the process is still up, followed by the start and stop commands. You'll see a similar pattern in all other configuration files in this directory.
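The layout of such a file can be sketched as a small Python function that renders the text. This is an illustration only, not the code in AppScale's monit_interface.py; the function name `render_monit_config` and its parameters are assumptions made for this example.

```python
def render_monit_config(watch, match_cmd, start_cmd, stop_cmd, group=None):
    """Return monit configuration text for one process.

    Hypothetical helper: all parameter names are illustrative.
    """
    lines = ['check process {} matching "{}"'.format(watch, match_cmd)]
    if group:
        lines.append('  group {}'.format(group))
    lines.append('  start program = "{}"'.format(start_cmd))
    lines.append('  stop program = "{}"'.format(stop_cmd))
    return '\n'.join(lines) + '\n'


# Rebuild a file shaped like the taskqueue example from its parts.
server = 'python /root/appscale/AppTaskQueue/taskqueue_server.py'
log = '/var/log/appscale/taskqueue-64839.log'
config = render_monit_config(
    watch='taskqueue-64839',
    match_cmd=server,
    start_cmd="/bin/bash -c '{} 1>>{} 2>>{}'".format(server, log, log),
    stop_cmd="/bin/kill -9 `ps aux | grep taskqueue_server.py | awk {'print $2'}`",
    group='taskqueue',
)
print(config)
```

Writing the returned text to a file under /etc/monit/conf.d/ and reloading monit would make it take effect; that step is left out here so the sketch has no side effects.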
Monit is installed and run as a service. To reload monit configurations, run:
service monit reload
Similarly, you can run these other commands:
service monit start
service monit stop
service monit restart
For applications you'll see an additional statement:
if totalmem > 400 MB for 10 cycles then restart
This makes sure applications do not use too much memory. In this statement, if a process stays over 400 MB for 10 consecutive cycles, Monit restarts the application. The limit can be set in your AppScalefile with the "max_memory" parameter.
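As a rough sketch of how a memory limit could be turned into that statement, the Python function below formats the rule from a limit in megabytes. The function name `memory_limit_rule` and the `cycles` default are assumptions for illustration, and this is not taken from AppScale's source.

```python
def memory_limit_rule(max_memory_mb, cycles=10):
    """Format a monit restart rule for a memory limit given in MB.

    Hypothetical helper: `max_memory_mb` stands in for the value of the
    "max_memory" parameter.
    """
    if max_memory_mb <= 0 or cycles <= 0:
        raise ValueError('max_memory_mb and cycles must be positive')
    return 'if totalmem > {} MB for {} cycles then restart'.format(
        max_memory_mb, cycles)


rule = memory_limit_rule(400)
print(rule)
```

With a limit of 400 this reproduces the statement shown above.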
Monit management and configuration file generation happen in two places:
- appscale/AppController/lib/monit_interface.rb
- appscale/lib/monit_interface.py
Monit can do much more than AppScale uses it for. You can have it send you emails when a failure is detected, or set different resource limits. Learn more from the official monit website.