Why Monit?
One morning, I went online to check my WordPress website. Lo and behold, I saw this error: 'Error establishing a database connection.' My website had been down for 4 hours, luckily in the middle of the night.
I used a free website monitoring service called StatusCake. Sure enough, it did send me an email alerting me to the problem. But an email sent at 2 a.m. did nothing to solve it. What I really needed was a tool that not only detected when the database process went down, but also restarted the process without human intervention. Monit is such a tool.
For the rest of this post, I assume you want Monit to monitor a LAMP server (Linux, Apache2, MySQL, PHP).
Install Monit
To install Monit on Debian or Ubuntu, execute this command:
$ sudo apt-get install monit
As part of the installation, a monit service is created:
$ sudo chkconfig --list | grep -i monit
monit 0:off 1:off 2:on 3:on 4:on 5:on 6:off
Configure Monit
The main Monit configuration file is /etc/monit/monitrc. To edit it, you need sudo privileges.
$ sudo vi /etc/monit/monitrc
After you make a change to the file, follow these steps to bring it into effect:
Validate configuration file syntax.
$ sudo monit -t
If no error is returned, proceed to the next step.
Restart Monit.
$ sudo service monit restart
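The two steps above can be chained so that Monit is restarted only when the syntax check passes. The following is a minimal sketch, not part of Monit itself: the function name apply_monit_config and the override variables MONIT_CMD and RESTART_CMD are hypothetical, introduced only for this example. By default they fall back to the commands shown above.

```shell
# Validate the Monit configuration, then restart only if validation succeeds.
# MONIT_CMD and RESTART_CMD are hypothetical overrides for this sketch;
# their defaults are the commands used in this post.
apply_monit_config() {
    # Left unquoted on purpose so "sudo monit" splits into two words.
    if ${MONIT_CMD:-sudo monit} -t; then
        ${RESTART_CMD:-sudo service monit restart}
    else
        echo "Configuration check failed; Monit was not restarted." >&2
        return 1
    fi
}

# Usage (uncomment to run on a server with Monit installed):
# apply_monit_config
```

The point of the design is that a broken monitrc never reaches the restart step, so the running Monit instance keeps its last working configuration.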
Global settings
The key global settings to customize are:
Test interval
By default, Monit checks your system at 2-minute intervals. To customize the interval, change the value (from 120) in the following statement. The unit of measure is seconds.
set daemon 120
Log file location
You can specify whether Monit logs to syslog or a log file of your choice.
# set logfile syslog facility log_daemon
set logfile /var/log/monit.log
Mail server
Specify a mail server for Monit to send email alerts. I set up exim4 as an SMTP server on the localhost. For instructions, refer to my previous post.
set mailserver localhost
Email format
Hopefully, you won't receive many alert emails, but when you do, you want the maximum information about the potential problem. The default email format contains all the information known to Monit, but you may customize the format in which the information is delivered. To customize, use the set mail-format statement.
set mail-format {
from: monit@$HOST
subject: monit alert -- $EVENT $SERVICE
message: $EVENT Service $SERVICE
at $DATE
on $HOST
$ACTION
$DESCRIPTION
Your faithful employee,
Monit
}
For a description of the set mail-format statement, see the Monit manual.
Global alerts
If any actionable event occurs, Monit sends an email alert to a predefined address list. Each email address is defined using the set alert statement.
set alert root@localhost not on { instance, action }
In the above example, root@localhost is the email recipient. Please refer to my earlier post about redirecting local emails to a remote email account.
Note that an event filter is defined (not on { instance, action }). root@localhost will receive an email alert on every event unless it is of the instance or action type. An instance event is triggered by the starting or stopping of the Monit process. An action event is triggered by certain explicit user commands, e.g., to unmonitor or monitor a service. The Monit manual has the complete list of event types that you can use for filtering.
By default, Monit sends an email alert when a service fails and another when it recovers. It does not repeat failure alerts after the initial detection. You can change this default behavior by specifying the reminder option in the set alert statement. The following example sends a reminder email on every fifth test cycle if the target service remains failed:
set alert root@localhost with reminder on 5 cycles
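Because reminders, timeouts, and "for N cycles" conditions are all counted in test cycles, it helps to translate cycles into wall-clock time. Here is a small sketch of that arithmetic. The helper name cycles_to_minutes is hypothetical and is not a Monit command. It takes the number of cycles and the set daemon interval in seconds.

```shell
# Convert a number of Monit test cycles into minutes.
# $1 = number of cycles, $2 = "set daemon" interval in seconds.
# cycles_to_minutes is a hypothetical helper, not part of Monit.
cycles_to_minutes() {
    # Integer arithmetic: any fraction of a minute is truncated.
    echo $(( $1 * $2 / 60 ))
}

# "reminder on 5 cycles" with "set daemon 120": 5 * 120 s = 600 s
cycles_to_minutes 5 120   # prints 10
```

With the 120-second interval used in this post, a reminder on 5 cycles therefore arrives roughly every 10 minutes while the service stays failed.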
Enabling reporting and service management
You can dynamically manage Monit service monitors, and request status reports. These capabilities are delivered by an embedded web server. By default, this web server is disabled. To enable it, include the set httpd statement.
set httpd port 2812 and
    use address localhost
    allow localhost
Note: I've only allowed local access to the embedded web server. The Useful Commands section below explains the commands to request reporting and management services.
Resource monitor settings
The following are the key resources to monitor on a LAMP server.
System performance
You can configure Monit to send an alert when system resource usage exceeds a given threshold. The system resources that can be monitored are load averages, memory, swap, and CPU usage.
check system example.com
    if loadavg (1min) > 4 then alert
    if loadavg (5min) > 2 then alert
    if memory usage > 75% then alert
    if swap usage > 25% then alert
    if cpu usage (user) > 70% then alert
    if cpu usage (system) > 30% then alert
    if cpu usage (wait) > 20% then alert
Filesystem usage
You can create a monitor which is triggered when the percentage of disk space used is greater than an upper threshold.
check filesystem rootfs with path /
    if space usage > 90% then alert
You may have more than 1 filesystem created on your server. Run the df command to identify the filesystem name (rootfs) and the path it was mounted on (/).
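As a quick way to pull those two values out of the df output, something like the following can be used. This is only a sketch: the helper name fs_for_path is hypothetical, and it assumes the mount point contains no spaces.

```shell
# Print the filesystem name and mount point for the filesystem holding a path.
# fs_for_path is a hypothetical helper name used only in this sketch.
fs_for_path() {
    # -P forces the POSIX format: one output line per filesystem.
    # Line 1 is the header; line 2 is the filesystem containing "$1".
    df -P "$1" | awk 'NR==2 {print $1, $NF}'
}

fs_for_path /
```

The first field printed is the filesystem name as reported by df, and the second is the mount point to use after "with path" in the check filesystem statement.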
MySQL
Instead of putting the MySQL-specific statements in the main configuration file, I elect to put them in /etc/monit/conf.d/mysql.conf. This is a personal preference. I like a more compact main configuration file. All files inside the /etc/monit/conf.d/ directory are automatically included in the Monit configuration.
The following statements should be inserted into the mysql.conf file.
check process mysql with pidfile /var/run/mysqld/mysqld.pid
    start program = "/etc/init.d/mysql start"
    stop program = "/etc/init.d/mysql stop"
    if failed unixsocket /var/run/mysqld/mysqld.sock then restart
    if 5 restarts within 5 cycles then timeout
If the MySQL process dies, Monit needs to know how to restart it. The command to start the MySQL process is specified by the start program clause. The command to stop MySQL is specified by the stop program clause.
A timeout event is triggered if MySQL is restarted 5 times in a span of 5 consecutive test cycles. In the event of a timeout, an alert email is sent, and the MySQL process will no longer be monitored. To resume monitoring, execute this command:
$ sudo monit monitor mysql
Apache
I put the following Apache-specific statements in the file /etc/monit/conf.d/apache.conf.
check process apache2 with pidfile /var/run/apache2.pid
    start program = "/etc/init.d/apache2 start"
    stop program = "/etc/init.d/apache2 stop"
    if failed host example.com port 80 protocol http request "/monit/token" then restart
    if 3 restarts within 5 cycles then timeout
    if children > 250 then restart
    if loadavg(5min) greater than 10 for 8 cycles then stop
At every test cycle, Monit attempts to retrieve http://example.com/monit/token. This URL points to a dummy file created on the webserver specifically for this test. You need to create the file by executing the following commands:
$ sudo mkdir /var/www/monit
$ sudo touch /var/www/monit/token
$ sudo chown -R www-data:www-data /var/www/monit
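Before relying on Monit's test, it may be worth confirming by hand that the token URL is actually served. The sketch below assumes curl is installed. The function name token_status is hypothetical, and the URL in the usage line is the example.com placeholder used throughout this post.

```shell
# Print the HTTP status code returned for a URL.
# token_status is a hypothetical helper name; it assumes curl is installed.
token_status() {
    # -s: no progress output; -o /dev/null: discard the response body;
    # -w '%{http_code}\n': print only the HTTP status code.
    curl -s -o /dev/null -w '%{http_code}\n' "$1"
}

# Usage (uncomment and substitute your own host name):
# token_status http://example.com/monit/token
```

A status of 200 suggests that the same request made by Monit should succeed. Any other code points to a problem with the file location or permissions that is easier to fix before Monit starts restarting Apache over it.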
Besides testing web access, the above configuration also monitors resource usage. The Apache process is restarted if it spawns more than 250 child processes. Apache is stopped, not restarted, if the server's 5-minute load average is greater than 10 for 8 cycles.
Useful commands
To print a status summary of all services being monitored, execute the command below:
$ sudo monit summary
The Monit daemon 5.4 uptime: 3h 48m
System 'example.com' Running
Filesystem 'rootfs' Accessible
Process 'mysql' Running
Process 'apache2' Running
To print detailed status information of all services being monitored, execute the following:
$ sudo monit status
The Monit daemon 5.4 uptime: 3h 52m
System 'example.com'
status Running
monitoring status Monitored
load average [0.00] [0.01] [0.05]
cpu 0.0%us 0.0%sy 0.0%wa
memory usage 377092 kB [74.0%]
swap usage 53132 kB [10.3%]
data collected Wed, 22 Apr 2015 13:21:47
...
Process 'apache2'
status Running
monitoring status Monitored
pid 12909
parent pid 1
uptime 6d 15h 18m
children 10
memory kilobytes 2228
memory kilobytes total 335420
memory percent 0.4%
memory percent total 65.9%
cpu percent 0.0%
cpu percent total 0.0%
port response time 0.001s to example.com:80/my-monit-dir/token [HTTP via TCP]
data collected Wed, 22 Apr 2015 13:21:47
To unmonitor a particular service (e.g., apache2):
$ sudo monit unmonitor apache2
To unmonitor all services:
$ sudo monit unmonitor all
To monitor a service:
$ sudo monit monitor apache2
To monitor all services:
$ sudo monit monitor all
Conclusion
I'd recommend that you run Monit on your server in addition to signing up for a remote website monitoring service such as StatusCake. While the 2 services do overlap, they also complement each other. Monit runs locally on your server, and can restart processes when a problem is detected. However, a networking problem may go undetected by Monit. That is where a remote monitoring service shines. In the event of a network failure, the remote monitor fails to connect to your server, and will therefore report a problem that may otherwise go unnoticed.