Introduction

In my previous post, Some Icinga Core basics – Part Two, I discussed plugins and the default ones that get installed with Icinga.  We also set up a basic plugin (SSH) and added a new host to Icinga (with a basic up/down check and the SSH service check).  In this post I’ll cover monitoring remote Linux servers in more depth, specifically using NRPE.

NOTE: while a lot of this is stated as Raspberry Pi specific, it’s really just general Icinga configuration on a Debian-based system (e.g. Ubuntu).

NRPE – Nagios Remote Plugin Executor

A remote Linux server can have the NRPE agent installed.  This allows the same plugins you previously ran on the localhost to be run remotely.  The difference is that the Icinga server will run the NRPE plugin (check_nrpe) which then calls a remote command (with or without arguments).  The remote command will run the plugin locally on the remote server and return the result back to check_nrpe.  The excellent Icinga documentation has a section on NRPE here.

Installation

There are two parts (server and client) that need to be installed for correct connectivity:

First – The nagios-nrpe-plugin must be installed on the Icinga Server.

If you installed Icinga as per one of my previous posts then you would have run:

foo@raspberrypi ~ $ sudo apt-get --no-install-recommends install nagios-nrpe-plugin

This installs the NRPE plugin and no more.  Install it now if it hasn’t been done previously.

If you’re not sure if it’s installed run:

foo@raspberrypi ~ $ dpkg -s nagios-nrpe-plugin

Look for the Status line (e.g. Status: install ok installed).

Second – The NRPE server (the daemon) must be installed on the remote Linux server.

foo@raspberrypi ~ $ sudo apt-get install nagios-nrpe-server

This will also install nagios-plugins-basic and nagios-plugins-standard.

Configuration

With these components installed we need to configure a couple of things to get up and running:

On the remote server add the Icinga monitoring server IP address to the allowed_hosts property.  Edit file /etc/nagios/nrpe.cfg and browse to the allowed_hosts line.

foo@raspberrypi ~ $ sudo nano /etc/nagios/nrpe.cfg

Change the entry to the Icinga server IP address.
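As a rough illustration of what that edit does, here is a small sketch that makes the same change on a scratch copy rather than the real file.  The address 192.168.0.10 is a made-up stand-in for your Icinga server, and keeping 127.0.0.1 in the list is my own choice so that local tests on the remote server still work.

```shell
#!/bin/sh
# Hypothetical value: 192.168.0.10 stands in for the Icinga server address.
ICINGA_IP="192.168.0.10"

# Work on a scratch copy so this is safe to run anywhere.
cfg="$(mktemp)"
printf '%s\n' '# scratch copy of an nrpe.cfg fragment' 'allowed_hosts=127.0.0.1' > "$cfg"

# Rewrite the whole allowed_hosts line as a comma-separated list.
sed "s/^allowed_hosts=.*/allowed_hosts=127.0.0.1,${ICINGA_IP}/" "$cfg" > "$cfg.new"

result="$(grep '^allowed_hosts=' "$cfg.new")"
echo "$result"

rm -f "$cfg" "$cfg.new"
```

On the real server you would make the equivalent change to /etc/nagios/nrpe.cfg with sudo, as described above.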

Add a new command to help us check the NRPE daemon is up.  Edit the file /etc/nagios/nrpe_local.cfg and add the following to the end.

foo@raspberrypi ~ $ sudo nano /etc/nagios/nrpe_local.cfg

command[check_nrpe_daemon]=/bin/echo "NRPE OK"

Save and Exit.  Restart the NRPE service:

foo@raspberrypi ~ $ sudo service nagios-nrpe-server restart

From your Icinga server check it’s all working as expected:

foo@raspberrypi ~ $ sudo /usr/lib/nagios/plugins/check_nrpe -H <ip_address_of_remote_server> -c check_nrpe_daemon

NOTE: The -c indicates the command you want to call at the remote end.

This should return NRPE OK.

Blame NRPE?

There are typically two methods to run remote checks:

  1. You define custom commands on the remote server and call these from NRPE.
  2. You call the base commands from NRPE and pass the required arguments.

Method 1

This involves defining custom commands with the required arguments on the remote server (in a .cfg file).  This in turn means there is more configuration done on the remote server(s).  By default there are a handful of commands enabled this way.  These can be seen in /etc/nagios/nrpe.cfg:

command[check_users]=/usr/lib/nagios/plugins/check_users -w 5 -c 10
command[check_load]=/usr/lib/nagios/plugins/check_load -w 15,10,5 -c 30,25,20
command[check_hda1]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/hda1
command[check_zombie_procs]=/usr/lib/nagios/plugins/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/lib/nagios/plugins/check_procs -w 150 -c 200

From the Icinga server we can invoke any of these by running check_nrpe and calling the command name after the -c argument.  Let’s quickly check the load on our remote server:

foo@raspberrypi ~ $ sudo /usr/lib/nagios/plugins/check_nrpe -H <ip_address_of_remote_server> -c check_load

This should return something similar to:

OK - load average: 0.00, 0.01, 0.05|load1=0.000;15.000;30.000;0; load5=0.010;10.000;25.000;0; load15=0.050;5.000;20.000;0;

What this has really done is run /usr/lib/nagios/plugins/check_load -w 15,10,5 -c 30,25,20 (taken from the command list above) on the remote server and passed the output back to Icinga via the NRPE plugin.
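If you’re curious how that returned line is put together, the part before the pipe is the human-readable status and the part after it is performance data in the form label=value;warn;crit;min;max.  The sketch below splits the sample line accordingly.  The function name parse_perfdata is my own, and it assumes labels contain no spaces, which holds for this sample but not for every plugin.

```shell
#!/bin/sh
# Sample line adapted from the check_load output above.
output='OK - load average: 0.00, 0.01, 0.05|load1=0.000;15.000;30.000;0; load5=0.010;10.000;25.000;0; load15=0.050;5.000;20.000;0;'

status_text="${output%%|*}"   # everything before the first pipe
perfdata="${output#*|}"       # everything after it

# Hypothetical helper: print value, warning and critical for each metric.
parse_perfdata() {
    for metric in $1; do
        label="${metric%%=*}"
        # Split "value;warn;crit;min;max" on semicolons.
        IFS=';' read -r value warn crit _rest <<EOF
${metric#*=}
EOF
        echo "$label value=$value warn=$warn crit=$crit"
    done
}

parsed="$(parse_perfdata "$perfdata")"
echo "$status_text"
echo "$parsed"
```

The warn and crit figures printed for each metric line up with the -w 15,10,5 and -c 30,25,20 thresholds in the command definition.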

Method 2

This method is flagged as a security risk because it allows arguments to be passed remotely.  It is enabled with the dont_blame_nrpe option on the remote host (in /etc/nagios/nrpe.cfg): change its value to 1 and restart the NRPE service.

If you decide to do this you should comment out the commands above (in /etc/nagios/nrpe.cfg) and uncomment (remove the leading #) the following commands in the same file:

#command[check_users]=/usr/lib/nagios/plugins/check_users -w $ARG1$ -c $ARG2$
#command[check_load]=/usr/lib/nagios/plugins/check_load -w $ARG1$ -c $ARG2$
#command[check_disk]=/usr/lib/nagios/plugins/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
#command[check_procs]=/usr/lib/nagios/plugins/check_procs -w $ARG1$ -c $ARG2$ -s $ARG3$

Notice these expect additional arguments to be passed to the command.  Let’s quickly look at the check_disk command and try it out.  Ensure you’ve changed dont_blame_nrpe to 1 and commented/uncommented the commands as stated above.

We’ll call this with NRPE and the arguments for warning % (-w), critical % (-c) and the disk to check (-p):

foo@raspberrypi ~ $ sudo /usr/lib/nagios/plugins/check_nrpe -H <ip_address_of_remote_server> -c check_disk -a 20% 40% /dev/sda1

In this command -a passes the arguments with a space separating each one.  The return should be something like:

DISK OK - free space: / 16833 MB (93% inode=95%);| /=1215MB;15216;11412;0;19021
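To picture what happens to the values given after -a, the sketch below fills the $ARG1$, $ARG2$ and $ARG3$ placeholders in a command template in order.  This is only an illustration of the substitution idea and not NRPE’s actual code; expand_command is a name I made up.

```shell
#!/bin/sh
# Hypothetical helper: replace $ARG1$, $ARG2$, ... with the given values in order.
expand_command() {
    template="$1"
    shift
    i=1
    for arg in "$@"; do
        # \$ matches a literal dollar sign; | is the sed delimiter so paths with / are safe.
        template="$(printf '%s' "$template" | sed 's|\$ARG'"$i"'\$|'"$arg"'|g')"
        i=$((i + 1))
    done
    printf '%s\n' "$template"
}

expanded="$(expand_command '/usr/lib/nagios/plugins/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$' '20%' '40%' '/dev/sda1')"
echo "$expanded"
```

The result is the full check_disk command line that ends up being run on the remote server for the call shown above.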

Extend the New Host monitoring

Hopefully you’ve seen from the configuration above how we build up remote checks.  In this example we’re going to use the existing and new custom commands (Method 1) to extend the checks for our example remote server from part two, vmub01 (192.168.0.20).

On the remote server edit the nrpe_local.cfg file (this is a good place to put your custom commands, but you can create your own .cfg file if required).

foo@vmub01 ~ $ sudo nano /etc/nagios/nrpe_local.cfg

Add the following custom command to check our local disk /dev/sda1:

command[check_sda1]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/sda1

Save and exit.  Restart the NRPE service:

foo@vmub01 ~ $ sudo service nagios-nrpe-server restart

On the Icinga server edit the host definition we created in part two (/etc/icinga/objects/vmub01.cfg):

foo@raspberrypi ~ $ sudo nano /etc/icinga/objects/vmub01.cfg

Add the following service definitions then save and exit:

define service {
        service_description             Disk Space sda1
        host_name                       vmub01
        check_command                   check_nrpe_1arg!check_sda1
        use                             generic-service
}

define service {
        service_description             Current Load
        host_name                       vmub01
        check_command                   check_nrpe_1arg!check_load
        use                             generic-service
}

define service {
        service_description             Current Users
        host_name                       vmub01
        check_command                   check_nrpe_1arg!check_users
        use                             generic-service
}

These service definitions add the custom command from above and two taken from /etc/nagios/nrpe.cfg.  Notice we’re using check_nrpe_1arg rather than check_nrpe.  I found that if you’re not passing arguments, check_nrpe can return a null value because its command definition expects a second argument to pass with -a.  Both definitions can be seen in /etc/nagios-plugins/config/check_nrpe.cfg.
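For reference, on a Debian-style install the two command definitions in that file look roughly like the fragment below.  I’m reproducing it from memory as an approximation, so check your own copy of /etc/nagios-plugins/config/check_nrpe.cfg rather than relying on this.

```
# Approximate contents; your packaged file may differ.
define command {
        command_name    check_nrpe
        command_line    /usr/lib/nagios/plugins/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -a $ARG2$
}

define command {
        command_name    check_nrpe_1arg
        command_line    /usr/lib/nagios/plugins/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
```

The only difference is the trailing -a $ARG2$, which is why the one-argument variant suits commands whose arguments are already fixed on the remote server.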

Restart the Icinga service:

foo@raspberrypi ~ $ sudo service icinga restart

Browse to your Icinga page (http://ip_address/icinga), select Host Detail and you should see vmub01 green and UP.

Click the small document icon beside the host name (labelled Service Details).  You should see the new services here.  They will likely be PENDING at first, with the scheduled check time shown under the Status Information heading.

[Screenshot: vmub01 new services in the PENDING state]

When the checks complete they should be green and OK.

[Screenshot: vmub01 new services after the checks have run]

OK, Warning, Critical, Unknown

It’s worth mentioning a little more about the states that are returned to Icinga and how they’re determined.  Certain plugins require a warning and a critical value.  This might be a number for a users check or ping check, or a percentage for a disk space check.  Running the plugin with --help will tell you which is required (sometimes either form is accepted).  If you’ve supplied warning and critical values and one of them is breached when the check is made, a WARNING or CRITICAL state is passed back to Icinga instead of OK.  Lastly, there is the UNKNOWN state, which can be triggered for a variety of reasons, e.g. misconfiguration.  You can see examples above of warning and critical values being supplied to the load, users, disk and process checks.
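Under the hood a plugin reports its state through its exit code: 0 for OK, 1 for WARNING, 2 for CRITICAL and 3 for UNKNOWN.  The sketch below maps a code to its name.  The function state_name is my own, and the sh -c 'exit 2' line simply stands in for a real plugin run.

```shell
#!/bin/sh
# Hypothetical helper: translate a plugin exit code into a state name.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        # 3 is the UNKNOWN code; this sketch lumps any other value in with it.
        *) echo "UNKNOWN" ;;
    esac
}

# Stand-in for a plugin that exits with a CRITICAL status.
rc=0
sh -c 'exit 2' || rc=$?
demo_state="$(state_name "$rc")"
echo "exit code $rc means $demo_state"
```

You can try the same thing with any of the checks above by running the plugin and then echo $? straight afterwards.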

Wrap up

In the last three posts I’ve covered some basics of using Icinga Core to get your monitoring up and running.  They should have given you a glimpse into the configuration files and how plugins are used.  I intend to cover more topics in the future (like host dependencies, using icons/images and managing remote Windows servers).

Some Icinga Core basics – Part One

Some Icinga Core basics – Part Two

Some Icinga Core basics – Part Four

Some Icinga Core basics – Part Five
