[collectd] collectd rrdtool performance

Florian Forster octo at verplant.org
Thu Dec 20 11:50:09 CET 2007

Hi Thorsten,

have you read the `Tuning RRDtool for performance' document in the
RRDtool wiki? If not, it may be of interest to you.

On Wed, Dec 19, 2007 at 08:29:37AM -0800, Thorsten von Eicken wrote:
> Even if I set the RRDTool plugin cache to 60 seconds the situation is
> not much better.

Really? I've heard and experienced that this setting reduces the I/O
load a lot: since the disks always write whole blocks of 512 bytes, it
makes little difference whether you write one value or 64.
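
A rough sketch of the arithmetic behind that claim. It assumes each
stored value is an 8-byte double (which is what makes 64 values fit
into one 512-byte block) and a 10-second collection interval, which
this thread does not state. The function names are made up for
illustration and the numbers are back-of-the-envelope estimates, not
measurements.

```python
BLOCK_SIZE = 512  # bytes per disk block, as stated in the mail
VALUE_SIZE = 8    # assumption: one stored value is an 8-byte double


def values_per_block(block_size=BLOCK_SIZE, value_size=VALUE_SIZE):
    """Number of values that fit into a single disk block."""
    return block_size // value_size


def writes_per_second(num_rrds, interval, cache_timeout=0):
    """Approximate number of RRD file updates per second.

    Without caching, every file is updated once per interval. With
    caching, each file is assumed to be updated about once per
    cache_timeout seconds, with all cached values written together.
    """
    period = max(interval, cache_timeout)
    return num_rrds / period


# 10500 RRDs is the figure from the thread; the 10 s interval is assumed.
print(values_per_block())
print(writes_per_second(10500, interval=10))
print(writes_per_second(10500, interval=10, cache_timeout=60))
```

Under these assumptions a 60-second cache cuts the number of file
updates to a sixth, while each update still touches about one block.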

> The biggest issue I see is that 150 hosts = 10500 RRDs. I'm planning
> to go ahead and reorganize a little how the RRD data is stored by
> placing all related variables of a plugin into a single RRD as opposed
> to the current scheme where almost every variable is in its own RRD.

Hm, I'd do some benchmarks first. The effect is that the DSes are stored
next to each other on disk, which is basically the same effect the
`CacheTimeout' option exploits.

> The reason I'm writing is to get some feedback on how to do this so it
> can be accepted into the collectd source. Here are the options I see:

To be honest: I'm not a big fan. Pulling together values, e.g. the CPU
states (idle, user, system, ...) or memory usage (used, cached, ...)
results in absolute inflexibility: Hosts queried over SNMP only have
free and used memory, Mac OS X has `wired' memory, older Linux systems
don't know `steal' CPU time and so on. To cover all that you need to
build the superset of all possible values, resulting in huge files.
Also, half a year after you ``covered all possible uses'' someone will
come up with a new idea that now needs to be pressed into that scheme
somehow. Those are exactly the reasons why I split up all files in
version 4.
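
To illustrate the superset problem, here is a minimal sketch. The
per-platform sets are hypothetical and only loosely follow the examples
above; they are not the actual lists of data sources any plugin uses.

```python
# Hypothetical memory data sources per platform (illustration only).
MEMORY_SOURCES = {
    "snmp": {"free", "used"},
    "macosx": {"free", "used", "wired"},
    "linux": {"free", "used", "cached", "buffered"},
}


def combined_layout(sources):
    """Data sources a single combined RRD would need: the superset."""
    return sorted(set().union(*sources.values()))


def unused_slots(sources):
    """Per platform, how many DSes in the combined file stay empty."""
    layout = set(combined_layout(sources))
    return {name: len(layout - have) for name, have in sources.items()}


print(combined_layout(MEMORY_SOURCES))
print(unused_slots(MEMORY_SOURCES))
```

Every platform that gets added can only grow the combined layout, and
every existing file would have to be converted to match it.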

> 2_ add a CompactRRD option to each relevant plugin to switch between
>    the standard layout and the new compact layout

If this brings a significant performance gain, we can talk about that.
An assumption will not suffice, though.

> - try rrdtool 1.2.24, which has disk I/O vadvise optimizations

Yes, it does perform a lot better.

> By the way, I'm very impressed how well collectd has handled the
> current overload situation. It looks like there is enough buffering
> between the network input and the rrdtool updates such that apparently
> no data gets lost. Nice!

Thanks :) I had situations such as yours in mind when I wrote that
caching / updating mechanism.

Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D