Monitoring Dirvish backup servers using Nagios

published Feb 01, 2008, last modified Jun 26, 2013

Dirvish is an excellent disk-based rotating backup application. Nagios is a fabulous service monitor. Combine the two using this Nagios plugin and you will know, at all times, the status of your latest backup run.

The script

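Save it as check_dirvish in /usr/lib/nagios/plugins on the Dirvish backup machine and make it executable. The script assumes that your Dirvish vaults are in /mnt/backup, so adjust the path if yours live elsewhere. It reports CRITICAL as soon as it finds a vault whose newest image contains an rsync_error file: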

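#!/bin/bash
# Nagios plugin: CRITICAL if the newest image of any Dirvish vault
# contains an rsync_error file, OK otherwise.

for vault in /mnt/backup/* ; do
        # Newest image directory in this vault, ignoring the dirvish config directory.
        latest=$(ls -d "$vault"/* 2> /dev/null | grep -v /dirvish | sort -g | tail -1)
        # A vault without any image yet has nothing to check.
        [ -n "$latest" ] || continue
        if [ -f "$latest/rsync_error" ] ; then
                echo "CRITICAL: latest backup in vault $vault failed"
                exit 2
        fi
done
echo "OK: All backups OK"
exit 0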
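
To try the check without touching real vaults, a scratch tree can stand in for /mnt/backup. What follows is only a sketch: it assumes the layout the script relies on (one directory per image inside each vault, with rsync_error present in a failed image), and the check_vaults function, the vault names and the image names are all made up for the illustration.

```shell
#!/bin/bash
# Hypothetical helper: the rsync_error test from check_dirvish, with the
# vault root passed as an argument so it can point at a scratch directory.
check_vaults() {
        local root=$1 vault latest
        for vault in "$root"/* ; do
                # Image names here are dates, so a plain sort puts the newest last.
                latest=$(ls -d "$vault"/* 2> /dev/null | grep -v /dirvish | sort | tail -1)
                [ -n "$latest" ] || continue
                if [ -f "$latest/rsync_error" ] ; then
                        echo "CRITICAL: latest backup in vault $vault failed"
                        return 2
                fi
        done
        echo "OK: All backups OK"
        return 0
}

# Build a fake tree: two vaults, each with a config directory and two images.
scratch=$(mktemp -d)
mkdir -p "$scratch/home/dirvish" "$scratch/home/20080130" "$scratch/home/20080131"
mkdir -p "$scratch/etc/dirvish" "$scratch/etc/20080130" "$scratch/etc/20080131"

status=0
check_vaults "$scratch" || status=$?
echo "exit status: $status"

# Simulate a failed run in the newest image of one vault.
touch "$scratch/etc/20080131/rsync_error"
status=0
check_vaults "$scratch" || status=$?
echo "exit status: $status"

rm -rf "$scratch"
```

The first call should report OK with exit status 0, the second CRITICAL with exit status 2. An rsync_error file in an older image is ignored, since only the newest image of each vault is examined.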

The security setup

Create a nagios user on your Dirvish backup machine, and set up passwordless (key-based) SSH authentication so that the nagios user on the Nagios server can log in to it without being prompted.

Now, if your Dirvish vaults are accessible only to root, set up sudo to allow the nagios user to run this script as root, by adding this line to the sudoers file (edit it with visudo):

nagios ALL = NOPASSWD: /usr/lib/nagios/plugins/check_dirvish

The Nagios setup

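Finally, on the Nagios server, define a command that runs the plugin over SSH and a service that uses it (gabriela here is the Dirvish backup machine):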

define command{
        command_name    ssh_dirvish_sudo
        command_line    /usr/lib/nagios/plugins/check_by_ssh -t 29 -H $HOSTADDRESS$ -C 'sudo /usr/lib/nagios/plugins/check_dirvish'
        }
define service{
        use                             generic-service
        host_name                       gabriela
        service_description             Backups
        check_command                   ssh_dirvish_sudo
        }

Of course, the sudo call is only needed if the nagios user cannot read the Dirvish vaults; otherwise, drop sudo from the command line.
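
Nagios decides the service state from the plugin's exit status, not from its text: by plugin convention 0 is OK, 1 is WARNING, 2 is CRITICAL and 3 is UNKNOWN. Before relying on the service, it can help to run the command by hand and look at that status. A small sketch follows; the run_check helper is hypothetical, and the commented-out real invocation assumes that the name gabriela resolves from the Nagios server.

```shell
#!/bin/bash
# Hypothetical helper: run a plugin command, let its output through, and
# print the state that corresponds to its exit status.
run_check() {
        local rc=0
        "$@" || rc=$?
        case $rc in
                0) echo "state: OK" ;;
                1) echo "state: WARNING" ;;
                2) echo "state: CRITICAL" ;;
                3) echo "state: UNKNOWN" ;;
                *) echo "state: out of range (exit status $rc)" ;;
        esac
}

# Stand-in commands, to show the mapping without needing a backup host.
run_check true
run_check sh -c 'echo "CRITICAL: latest backup in vault /mnt/backup/etc failed"; exit 2'

# The real check, to be run on the Nagios server as the nagios user:
# run_check /usr/lib/nagios/plugins/check_by_ssh -t 29 -H gabriela \
#         -C 'sudo /usr/lib/nagios/plugins/check_dirvish'
```

If the real invocation asks for a password or a host key confirmation, the SSH or sudo setup still needs work, and the service would fail in the same way when Nagios runs it.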