Do Nagios NCPA Memory Stats Match the Output of the Linux Utility Free?
Many administrators like sanity checks when investigating new tools. We sometimes hear the objection that NCPA memory stats don’t match the output of the Linux utility, a statement that happens (wonderfully) to be both true and not true at the same time. The discrepancies mostly have to do with reporting units, conversions, and how everything other than total memory is determined.
Methodology
On the Linux host we are monitoring, with the NCPA agent installed, we’ll run free. Then, from our XI box, we’ll run the NCPA memory check both as a regular check and as a manual check from the command line. Then we’ll compare the results.
Total Memory
Let’s start with the one place where NCPA memory metrics do match the output of free: total memory. Running free on a test box, we get:
[root@localhost memory]# free
             total       used       free     shared    buffers     cached
Mem:       8193024    6984832    1208192          0     202888     974112
-/+ buffers/cache:    5807832    2385192
Swap:       262136          0     262136
It is important to know that free by default returns memory stats in kibibytes. Yes, it is true that if you run man free on some distributions, it will say memory stats are given in kilobytes, but that is an old version of the man page.
NCPA by default returns memory stats in gibibytes, so in the XI interface, after running the NCPA Wizard against the host, we are going to see a result like this:
So, how do we compare total memory between free and NCPA output and the XI interface?
The simplest thing to do is go to your browser and look up a converter, BUT be sure to specify kibibytes and gibibytes for units. I only point this out because I used incorrect units at least twice.
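If you would rather script the conversion than trust an online converter, here is a minimal Python sketch. The helper name kib_to_gib is ours, chosen for illustration; the input is the total reported by free above.

```python
# Convert the total reported by free (KiB) to GiB.
# Binary units: 1 GiB = 1024 * 1024 KiB.
KIB_PER_GIB = 1024 ** 2


def kib_to_gib(kib):
    """Illustrative helper: kibibytes to gibibytes."""
    return kib / KIB_PER_GIB


free_total_kib = 8193024  # "total" column from the free output above
total_gib = kib_to_gib(free_total_kib)
print(f"{total_gib:.2f} GiB")

# For contrast: the same amount of memory expressed in decimal gigabytes.
# Mixing these two unit systems is the easy mistake to make.
decimal_gb = free_total_kib * 1024 / 1000 ** 3
print(f"{decimal_gb:.2f} GB")
```

The total works out to roughly 7.81 GiB, against about 8.39 when the same memory is expressed in decimal gigabytes, which is why choosing the right units in the converter matters.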
Alternately, you can run check_ncpa.py from the command line and specify output in kibibytes like this:
[root@centos7x64 ~]# /usr/local/nagios/libexec/check_ncpa.py -H 192.168.3.33 -t 'a' -P 5693 -M memory/virtual -u Ki -w 80 -c 90
where the -u flag specifies units in Ki for kibibytes.
We get:
OK: Used memory was 67.40 % (Available: 2670920.00 KiB, Total:
8193024.00 KiB, Free: 1202984.00 KiB, Used: 5148448.00 KiB) |
'available'=2670920.00KiB;80;90; 'total'=8193024.00KiB;80;90;
'free'=1202984.00KiB;80;90; 'used'=5148448.00KiB;80;90;
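To compare these numbers with free in a script rather than by eye, the performance data after the pipe character can be parsed. The following is a rough sketch, not part of NCPA: the function name parse_perfdata and the regular expression are our own, and the input is simply the output shown above pasted in as a string.

```python
import re

# The check output from above, joined back into a single line.
PLUGIN_OUTPUT = (
    "OK: Used memory was 67.40 % (Available: 2670920.00 KiB, Total: "
    "8193024.00 KiB, Free: 1202984.00 KiB, Used: 5148448.00 KiB) | "
    "'available'=2670920.00KiB;80;90; 'total'=8193024.00KiB;80;90; "
    "'free'=1202984.00KiB;80;90; 'used'=5148448.00KiB;80;90;"
)


def parse_perfdata(output):
    """Illustrative helper: return {label: value} from the text after '|'."""
    _, _, perfdata = output.partition("|")
    pattern = re.compile(r"'([^']+)'=([0-9.]+)")
    return {label: float(value) for label, value in pattern.findall(perfdata)}


metrics = parse_perfdata(PLUGIN_OUTPUT)

FREE_TOTAL_KIB = 8193024  # "total" column from the free output earlier
print("total matches free:", metrics["total"] == FREE_TOTAL_KIB)
```

With both tools reporting in kibibytes, the total from NCPA and the total from free are the same number.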
But What About the Free/Used/Available Metrics?
I will concede the point that on these metrics, free and NCPA do not entirely agree, but there are simple reasons. The “free” memory value differs only a little between the two (1208192 KiB from free versus 1202984 KiB from NCPA, a gap of 5208 KiB), and the difference is at least partially attributable to the small amount of memory load from NCPA checking memory.
free and NCPA calculate memory metrics differently. Why? That’s an interesting rabbit hole to go down, but it suffices to say that NCPA uses psutil, which defines available memory as “the memory that can be given instantly to processes without the system going into swap.”
That’s a handy metric. NCPA calculates the percentage of memory used as (total - available) / total * 100, which gives administrators a solid idea of how host memory is performing.
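As a quick check, the percentage formula documented by psutil, (total - available) / total * 100, can be applied to the figures from the check output above. This is a small sketch with the values hard-coded; the variable names are ours.

```python
# Values in KiB, taken from the check_ncpa.py output above.
total_kib = 8193024
available_kib = 2670920

# psutil's documented formula for the "percent" field of virtual memory.
percent_used = (total_kib - available_kib) / total_kib * 100
print(f"Used memory: {percent_used:.2f} %")
```

The result rounds to 67.40 %, the same figure the check reported.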
The very clever will notice that for NCPA, none of (used + free), (used + available), or (used + free + available) sum to total memory in our example. Again, the psutil documentation is helpful here. Basically, these values are not meant to sum to the total.
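The arithmetic is easy to confirm with the values from our example. A short sketch, again with the figures from the check output hard-coded:

```python
# Values in KiB, taken from the check_ncpa.py output above.
total, available, free, used = 8193024, 2670920, 1202984, 5148448

sums = {
    "used + free": used + free,
    "used + available": used + available,
    "used + free + available": used + free + available,
}

# Show each candidate sum and how far it lands from total memory.
for label, value in sums.items():
    print(f"{label:<25} = {value:>8} KiB (total - sum = {total - value:+d})")
```

Two of the sums fall short of total memory and one overshoots it, so none of them is a valid cross-check against the total.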
Conclusion
Administrators applying sanity checks to their NCPA results may indeed initially question the sanity of NCPA output. With a solid understanding of the units of measure in question as well as what is actually being measured, administrators can see that NCPA memory stats check out.