[collectd] irregular data

Franklin, Dave Dave.Franklin at arrisi.com
Mon Nov 25 22:14:50 CET 2013


I’d like to keep an eye on potential Java leaks, and I’ve also got a browser running on the platform, so I need to monitor JavaScript heap usage as well. I think the answer for me is simply to configure collection intervals for these particular plugins that are short enough to capture any flurries of user activity that cause the heap to change significantly. With short enough intervals, I should be able to correlate jumps in heap usage with specific user activities. (These units are driven by automated tests, so there’s plenty of data on the automation side about what is being done to the device.)

= D


From: bill [mailto:bilsch at gmail.com]
Sent: Monday, November 25, 2013 3:39 PM
To: Franklin, Dave
Cc: collectd at verplant.org
Subject: Re: [collectd] irregular data

I have found that simply tracing GC time per collector and alerting (Nagios) when it goes above a threshold is good enough. What are you trying to accomplish?

Bill Schwanitz

If A is a success in life, then A equals x plus y plus z. Work is x; y is play; and z is keeping your mouth shut. - Albert Einstein.

On Nov 25, 2013, at 3:02 PM, "Franklin, Dave" <Dave.Franklin at arrisi.com> wrote:
Folks,

I’ve been trying to tackle the measurement of an irregular statistic on an embedded platform: embedded JVM garbage collection. During any given collectd “interval”, I may have no GC activity, or I might have a dozen instances where the JVM performed garbage collection. I have a file in which the GC numbers are stored (time of occurrence, JVM heap before, JVM heap after, time required to garbage-collect), so I can write a read plugin that simply reads the file. It should be easy enough to keep track of the last time it ran, so that I know exactly where in the file to start reading (and then read to the end). But the fact that I may have multiple values in any one interval is throwing me off.
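The incremental read described above might look roughly like the following. This is only a sketch under assumptions: the record layout (four whitespace-separated numbers per line, each line under 256 characters), the type gc_record_t, and the function read_new_gc_records() are all hypothetical, and it tracks a byte offset into the file rather than the time of the last run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* One GC event; the field layout is an assumption about the log format. */
typedef struct {
  double timestamp;   /* time of occurrence, seconds since the epoch */
  double heap_before; /* JVM heap before the collection */
  double heap_after;  /* JVM heap after the collection */
  double gc_ms;       /* time required to garbage-collect, milliseconds */
} gc_record_t;

/* Reads records appended to `path` since `*offset`, stores up to `max` of
 * them in `out`, and advances `*offset` past the last complete line consumed.
 * Returns the number of records stored, or -1 if the file cannot be read. */
static int read_new_gc_records(const char *path, long *offset,
                               gc_record_t *out, size_t max)
{
  assert(path != NULL && offset != NULL && out != NULL);

  FILE *fh = fopen(path, "r");
  if (fh == NULL)
    return -1;

  /* If the file shrank (rotated or truncated), start over from the top. */
  if (fseek(fh, 0, SEEK_END) != 0 || ftell(fh) < *offset)
    *offset = 0;
  if (fseek(fh, *offset, SEEK_SET) != 0) {
    fclose(fh);
    return -1;
  }

  char line[256];
  size_t n = 0;
  while (n < max && fgets(line, sizeof(line), fh) != NULL) {
    gc_record_t r;
    if (strchr(line, '\n') == NULL)
      break; /* incomplete line: leave it for the next interval */
    if (sscanf(line, "%lf %lf %lf %lf", &r.timestamp, &r.heap_before,
               &r.heap_after, &r.gc_ms) == 4)
      out[n++] = r; /* malformed lines are skipped but still consumed */
    *offset = ftell(fh);
  }

  fclose(fh);
  return (int)n;
}
```

A read callback would keep the offset in a static variable and call this once per interval; a return value of 0 simply means no GC happened since the last read.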

So if I have a set of N data points (each with a timestamp), can I simply iterate through the list, calling plugin_dispatch_values(&vl) for each one, where I’ve set not only the “standard” vl data elements but also vl.time to the appropriate timestamp?

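E.g. (very pseudocode-ish):
// Iterate through the N heap values recorded during this interval
value_list_t vl = VALUE_LIST_INIT;
value_t value;
for (size_t iter = 0; iter < N; iter++)
{
  gcdata = dataArray[iter];
  value.gauge = gcdata.heap;    // heap size, dispatched as a gauge
  vl.values = &value;
  vl.values_len = 1;
  vl.time = gcdata.timestamp;   // the sample's own timestamp (a cdtime_t in collectd 5)
  sstrncpy( … host, plugin, type, type_instance, etc ...);
  plugin_dispatch_values(&vl);
}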

Other obvious alternatives would be (1) to write the plugin so that it averages all values of interest and reports just ONE set of data (and perhaps a metric for the number of GCs that occurred during that interval); (2) to report only the most RECENT set of data; or (3) to make the read plugin's interval much shorter than the expected time between GCs. But if it’s possible, I’d rather get all of the instances recorded. Has anyone had to deal with such irregular values before?
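For comparison, alternative (1) could be sketched as below: collapse the interval's samples into a count, a mean, and a maximum, then dispatch those three values once. The names gc_summary_t and summarize_heap() are hypothetical, and reporting the maximum alongside the mean is an added assumption, not something the plugin API requires.

```c
#include <stddef.h>

/* Per-interval summary of the post-GC heap sizes. */
typedef struct {
  size_t count; /* number of GC events in the interval */
  double mean;  /* mean heap size; 0 when count == 0 */
  double max;   /* largest heap size; 0 when count == 0 */
} gc_summary_t;

/* Summarizes `n` heap samples; `heap` may be NULL when `n` is 0. */
static gc_summary_t summarize_heap(const double *heap, size_t n)
{
  gc_summary_t s = {0, 0.0, 0.0};
  double sum = 0.0;

  for (size_t i = 0; i < n; i++) {
    sum += heap[i];
    if (i == 0 || heap[i] > s.max)
      s.max = heap[i];
  }

  s.count = n;
  if (n > 0)
    s.mean = sum / (double)n;
  return s;
}
```

The trade-off is that each interval yields exactly one value list regardless of how many GCs occurred, at the cost of losing the individual timestamps.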

Thanks,
Dave
