[collectd] collectd + YaketyStats

Fri Jan 9 22:48:42 CET 2009

Hi Mark,

On Fri, Jan 09, 2009 at 09:30:04AM -0500, Mark Plaksin wrote:
> I'm one of the authors of YaketyStats (yaketystats.org).  I'd like to
> talk about two things:

First off, nice job on YaketyStats :) If we could get both projects to
play nicely with one another, that'd be a huge win for both, I think :)

> 1)  Help me see the value of MIN and AVERAGE RRD CFs :)  
[...]
> The other day I came across a case where both AVERAGE and MIN would be
> useful to have!  To help debug an apparent entropy problem I started
> collecting entropy stats once a second.  Entropy is almost always fine
> (around 4k) but sometimes it drops near 0.  With MAX as the only CF,
> the only way to see the dips is to look at a small window of time.
> Being able to graph MIN would definitely help!

The minimum is especially interesting for metrics where low values are
a bad thing. Take, for example, the predicted time a UPS will be able
to provide energy for the current load.

The average is always interesting if values can change rapidly. If, for
example, the number of established TCP connections on your mail server
is next to nothing, then suddenly jumps to a couple hundred or thousand,
then jumps back to nothing again, this is usually not very alarming. In
fact, it shows that your mail system can handle the occasional peak. If
consolidated, the average of this scenario will be low, because
generally there were not that many connections open. If, on the other
hand, your system is at its limits, there will be many open connections
for a long time, thus the average will be high. In both scenarios, the
maximum consolidation *may* show nearly identical graphs, although the
situation is vastly different.
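
A minimal sketch of that effect in Python. The connection counts are
made up, and `consolidate` is a hypothetical helper that only mimics
what an RRD consolidation function does to the samples of one row:

```python
def consolidate(samples, cf):
    """Reduce a list of samples to one value, roughly like an RRD CF."""
    if cf == "MIN":
        return min(samples)
    if cf == "MAX":
        return max(samples)
    if cf == "AVERAGE":
        return sum(samples) / len(samples)
    raise ValueError(f"unknown CF: {cf}")


# Hypothetical numbers of established TCP connections, one per step.
bursty = [2, 3, 800, 1, 2, 4, 3, 1, 2, 2]                       # one short peak
saturated = [780, 800, 790, 800, 795, 800, 785, 800, 790, 800]  # at the limit

for name, series in (("bursty", bursty), ("saturated", saturated)):
    row = {cf: consolidate(series, cf) for cf in ("MIN", "AVERAGE", "MAX")}
    print(name, row)
```

Both series consolidate to the same MAX (800), while the AVERAGE is
82.0 for the bursty series and 794.0 for the saturated one.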

Last, there are ``binary stats'' where neither MIN nor MAX is very
interesting at all: The graphs of air conditioning systems, for example,
may show zero if the compressor is not running and one when it is. Of
course, for basically all sensible time spans, the minimum will be zero
and the maximum will be one. But the average gives one a very good idea
of how long/often the compressor actually was running.
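
For such a 0/1 series the average is simply the fraction of time spent
in the "on" state. A small Python sketch, assuming invented samples
taken once per minute (`duty_cycle` is a name chosen for illustration):

```python
# Hypothetical compressor state, one sample per minute: 1 = running, 0 = off.
states = [0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0]


def duty_cycle(samples):
    """Fraction of samples in which the compressor was running."""
    return sum(samples) / len(samples)


print("min:", min(states), "max:", max(states))  # the uninformative extremes
print("running {:.0%} of the time".format(duty_cycle(states)))
```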

If you graph all three consolidation functions, you probably get the
best overall picture. If you look at [0], you will see the minimum and
maximum values as a light blue area and the average as a dark blue line.
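
One way to get such a graph is sketched below in Python: it only builds
the argument list for `rrdtool graph`. The file name, the DS name
`entropy`, the output name and the colours are assumptions and not
taken from the graph at [0]. The band is drawn by stacking the
difference max - min on top of an invisible line at the minimum.

```python
def minmax_avg_graph_args(rrd_file, ds_name, out_file="entropy.png"):
    """Build an `rrdtool graph` command line showing MIN, AVERAGE and MAX.

    rrd_file and ds_name are inserted as-is; a path containing ':' would
    need escaping, which this sketch does not attempt.
    """
    return [
        "rrdtool", "graph", out_file,
        f"DEF:min={rrd_file}:{ds_name}:MIN",
        f"DEF:avg={rrd_file}:{ds_name}:AVERAGE",
        f"DEF:max={rrd_file}:{ds_name}:MAX",
        "CDEF:range=max,min,-",        # height of the min..max band
        "LINE1:min",                   # no colour: invisible baseline
        "AREA:range#B7B7F7::STACK",    # light blue band stacked on the minimum
        "LINE1:avg#0000FF:average",    # dark blue average line
    ]


# Hypothetical file and DS name; pass the list to subprocess.run() to draw it.
args = minmax_avg_graph_args("entropy.rrd", "entropy")
print(" ".join(args))
```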

> Do you think AVERAGE and MIN are useful in general or only in certain
> specific situations?

Well, generally I find it most intuitive to look at average graphs, but
for finding problems, either MIN or MAX, or the relation between all
three, is usually the most helpful information to have.

> 2)  The possibility of using the YaketyStats UI (aka Jart) to view
> collectd RRD files.
>
> Regarding (2), I don't think it would be too hard to adjust Jart
> and/or collectd RRD files so Jart can be used with collectd data.

That'd be awesome :)

> To use Jart with collectd there are just two or three issues that
> would need to be resolved:
> 
> A)  The CF issue.  Right now Jart assumes all you've got is MAX.  Jart
> would need to handle the others, possibly via a per-user and/or
> per-graph preference.

I don't think that's a very pressing issue: Jart can ignore the other
CFs for now - they won't get in the way. Ultimately, it'd be best if
that were user-configurable.

> B)  Jart assumes you have one DS per RRD file.  There's a
> proof-of-concept implementation of handling more than one DS per file,
> but it's not the prettiest.  Unless there are compelling reasons to
> keep multiple DSes per file, it would be nice to break them up.

Most of collectd's ``data sets'' have only one ``data source'' (which
basically directly translates to: Most RRD files have only one DS). Some
have two, especially interface statistics (RX and TX) and other
IO-stats.

I think splitting the IO-stats up into two data sets / RRD files doesn't
make much sense - IO, as the name implies, will always consist of input
and output, so putting those two into one file is the reasonable thing
to do.

Oh, there's the `system load' type, which has three data sources, and
there are very, very few other types with more than two data sources.
If you ask me, they can be ignored, but other people may disagree ;)

> C)  DS names.  Jart assumes you have one DS named 'yabba' :)  Doesn't
> seem too hard to have Jart look in the RRD file and grab the DS name.
> Of course, it's more complex if there are multiple DSes per RRD.

Hm, I think it'd be sensible to save that information in your
``Playlists'', so that the DS names don't have to be queried again and
again, but that's probably cosmetic.
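
A rough Python sketch of both halves, reading the DS names and
remembering them. It assumes the text output of `rrdtool info`, where
each data source shows up in lines such as `ds[rx].type = "COUNTER"`.
The sample text is abbreviated and invented, and `ds_names_from_info`,
`cached_ds_names` and `load_info` are hypothetical names, not part of
Jart or collectd.

```python
import re

# Matches the start of lines like: ds[rx].type = "COUNTER"
_DS_LINE = re.compile(r"^ds\[([^\]]+)\]\.")


def ds_names_from_info(info_text):
    """Return the DS names found in `rrdtool info` output, in file order."""
    names = []
    for line in info_text.splitlines():
        match = _DS_LINE.match(line.strip())
        if match and match.group(1) not in names:
            names.append(match.group(1))
    return names


_cache = {}


def cached_ds_names(rrd_path, load_info):
    """Look up DS names once per file; load_info(path) returns the info text."""
    if rrd_path not in _cache:
        _cache[rrd_path] = ds_names_from_info(load_info(rrd_path))
    return _cache[rrd_path]


# Abbreviated, made-up sample of `rrdtool info` output for an interface file.
SAMPLE_INFO = """\
filename = "if_octets-eth0.rrd"
step = 10
ds[rx].type = "COUNTER"
ds[rx].minimal_heartbeat = 20
ds[tx].type = "COUNTER"
ds[tx].minimal_heartbeat = 20
rra[0].cf = "AVERAGE"
"""

print(cached_ds_names("if_octets-eth0.rrd", lambda path: SAMPLE_INFO))
```

In real use `load_info` could run `rrdtool info <file>` and return its
standard output; storing the result in the playlist instead of an
in-memory dictionary would follow the same pattern.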

You could also interpret the DS name to be part of the full path to the
information. So that instead of using `foo/bar.rrd' you could use
`foo/bar.rrd/rx' meaning `use the rx DS from the file foo/bar.rrd'. But
since I don't really know anything about your internal code structure,
that could be total gibberish.
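
To make the path idea concrete, here is a small Python sketch.
`split_rrd_path` is a hypothetical helper, and it assumes that file
names always end in `.rrd` and that DS names never contain a slash:

```python
def split_rrd_path(path, default_ds=None):
    """Split 'foo/bar.rrd/rx' into ('foo/bar.rrd', 'rx').

    A path without a DS component is returned together with default_ds.
    """
    marker = ".rrd/"
    idx = path.rfind(marker)
    if idx == -1:
        return path, default_ds
    file_part = path[: idx + len(".rrd")]
    ds_name = path[idx + len(marker):]
    return file_part, ds_name or default_ds


print(split_rrd_path("foo/bar.rrd/rx"))         # ('foo/bar.rrd', 'rx')
print(split_rrd_path("foo/bar.rrd", "yabba"))   # ('foo/bar.rrd', 'yabba')
```

With a default such as 'yabba', existing single-DS paths would keep
working unchanged.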

Regards,
-octo

[0] <http://verplant.org/temp/entropy.png>
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/