[collectd] Collectd and nagios

Hanik, Filip Filip.Hanik at laquinta.com
Fri Nov 18 01:05:15 CET 2005


I take the data from collectd (type, inst, val) and run it through a simple parser that invokes the send_nsca program.
The send_nsca program opens a socket to the nagios server, where I have configured the nsca daemon.
The daemon then feeds the data to nagios.
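
For illustration, here is a minimal sketch of the formatting step such a parser needs. send_nsca reads passive service check results from stdin as tab-separated lines (host, service description, return code, plugin output). The function name, the "type-inst" service naming and the "type=value" output text are assumptions made for this example, not the actual parser described above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build one send_nsca input line of the form
 *   host<TAB>service<TAB>return code<TAB>plugin output<NEWLINE>
 * from a collectd-style (type, inst, val) triple.
 * The service is named "type" or "type-inst" (an assumed convention).
 * Returns the number of characters written, or -1 if buf is too small. */
int format_nsca_line(char *buf, size_t len, const char *host,
                     const char *type, const char *inst,
                     double val, int code)
{
    assert(buf != NULL && host != NULL && type != NULL);

    /* An absent or empty instance means the service is just the type. */
    int has_inst = (inst != NULL && strlen(inst) > 0);

    int n = snprintf(buf, len, "%s\t%s%s%s\t%d\t%s=%.2f\n",
                     host, type,
                     has_inst ? "-" : "", has_inst ? inst : "",
                     code, type, val);

    /* snprintf reports the length it wanted; treat truncation as an error. */
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

The resulting line would then be written to the stdin of something like `send_nsca -H <nagios host> -c <config file>`, for example through popen(); the host and config file are placeholders.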

The goal is to have all our nodes report to one monitoring server, so that we can easily see their status from there.
I'm using this project to learn C (I come from Pascal and tons of Java), so low-level C is quite the challenge :)

Filip


Filip Hanik
Sr Software Engineer
La Quinta Corporation 
http://www.lq.com/

 

-----Original Message-----
From: collectd-bounces at verplant.org [mailto:collectd-bounces at verplant.org] On Behalf Of Florian Forster
Sent: Thursday, November 17, 2005 4:24 PM
To: The system statistics collection daemon "collectd" list
Subject: Re: [collectd] memory.c and /proc/meminfo

Hi again :)

On Thu, Nov 17, 2005 at 03:53:02PM -0600, Hanik, Filip wrote:
> On fedora 4, you get the following output
> 
> [root at filip sbin]# free -m
>              total       used       free     shared    buffers     cached
> Mem:          2026       1969         56          0         43       1270
> -/+ buffers/cache:        655       1370
> Swap:         1983          0       1983
> 
> [root at filip sbin]# cat /proc/meminfo
> MemTotal:      2074728 kB
> MemFree:         57524 kB
> Buffers:         44148 kB
> Cached:        1301648 kB
> 
> So as you can see, the buffers and cached are not necessarily part of 
> the memory that is "used".  To get the true usage from the stats (ie 
> percentage free RAM) I had to do the modification.

Okay, this is what collectd currently does:
    MemTotal == MemFree + Buffers + Cached + MemUsed
    =>  MemUsed  := MemTotal - (MemFree + Buffers + Cached)

In numbers:
    MemUsed  := 2074728 - (57524 + 44148 + 1301648) == 671408

    MemFree  =~  2.8%
    Buffers  =~  2.1%
    Cached   =~ 62.7%
    MemUsed  =~ 32.4%
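
The same computation as a minimal C sketch, using the /proc/meminfo values quoted above. The struct and function names are made up for this example and are not taken from collectd's memory.c.

```c
#include <assert.h>
#include <stdio.h>

/* Fields mirror the /proc/meminfo entries; all values are in kB. */
struct meminfo {
    unsigned long long total, free, buffers, cached;
};

/* MemUsed := MemTotal - (MemFree + Buffers + Cached).
 * Returns 0 if the parts add up to more than the total. */
unsigned long long mem_used_kb(const struct meminfo *m)
{
    assert(m != NULL);
    unsigned long long rest = m->free + m->buffers + m->cached;
    return (m->total >= rest) ? m->total - rest : 0;
}

/* Share of MemTotal taken by `part`, in percent. */
double percent_of_total(unsigned long long part, const struct meminfo *m)
{
    assert(m != NULL);
    return (m->total == 0) ? 0.0
                           : 100.0 * (double)part / (double)m->total;
}
```

With total = 2074728, free = 57524, buffers = 44148 and cached = 1301648, mem_used_kb() yields 671408, and the four shares add up to 100%.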

On Thu, Nov 17, 2005 at 02:17:37PM -0600, Hanik, Filip wrote:
> 	if (mem_used >= (mem_free + mem_buffered + mem_cached))
> 	{
> 		//mem_used is the total, if mem_used is the biggest number
> 		mem_free = mem_free + mem_buffered + mem_cached;
> 		//mem_used -= mem_free + mem_buffered + mem_cached;
> 		mem_used -= mem_free;
> 		memory_submit (mem_used, mem_buffered, mem_cached, mem_free);
> 	}

What you do here is:
    MyFree = MemFree + Buffers + Cached
    MyUsed = MemTotal - MyFree

In numbers:
    MyFree   := 57524 + 44148 + 1301648 == 1403320
    MyUsed   := 2074728 - 1403320 == 671408

    MyFree   =~ 67.6%
    MyUsed   =~ 32.4%
    Buffers  =~  2.1%
    Cached   =~ 62.7%

Clearly, for these numbers to make sense you need to set `Buffers' and `Cached' to zero, since MyFree already contains both.

> Before we made this modification, we would have numbers like 97% 
> memory used, because collectd didn't take into consideration that 
> cached/buffered memory might be available for us.

collectd uses four numbers. If you add MemUsed, Buffers and Cached you'll get a very large number that is far from practical (this is the kind of information that e.g. Net-SNMPd will give you).
Adding MemFree, Buffers and Cached will give you `anything not used by programs', which may be interesting, e.g. for monitoring purposes.

I guess my point is: all four numbers are available right now. To get percentages you need to take (at least) two numbers and do some computation anyway, so adding MemFree, Buffers and Cached is just part of that computation. So I don't see why you'd need to change anything within collectd.
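
As a rough sketch of what that computation could look like on the consumer side: the function names and the warning/critical thresholds below are assumptions for illustration. Only the return codes (0 = OK, 1 = WARNING, 2 = CRITICAL) follow the usual Nagios convention.

```c
#include <assert.h>

/* Share of memory held by programs, in percent, computed from the four
 * values collectd stores (all in the same unit). */
double app_used_percent(double used, double buffered,
                        double cached, double free_mem)
{
    double total = used + buffered + cached + free_mem;
    return (total > 0.0) ? 100.0 * used / total : 0.0;
}

/* Map a percentage to a Nagios return code using caller-supplied
 * thresholds: 0 = OK, 1 = WARNING, 2 = CRITICAL. */
int nagios_state(double pct, double warn, double crit)
{
    assert(warn <= crit);
    if (pct >= crit)
        return 2;
    if (pct >= warn)
        return 1;
    return 0;
}
```

For the numbers in this thread, app_used_percent(671408, 44148, 1301648, 57524) is about 32.4%. Counting buffers and cache as used instead gives about 97%, which would trip almost any sensible threshold.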

> We are using collectd to report back the stats to nagios, so a few 
> bytes here and there won't make a difference, but we need to report 
> correctly on memory utilization (i.e. memory taken up by apps) in case a 
> process starts leaking.

Interesting :) How do you feed the data to nagios? You're not using the RRD files, are you?!

Regards,
-octo
--
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/
