[collectd] Collectd performance issue

Florian Forster octo at verplant.org
Tue Sep 25 13:40:51 CEST 2007


Hi Giuseppe,

On Tue, Sep 25, 2007 at 10:57:31AM +0200, Giuseppe Fiameni wrote:
> After some tests, I noticed that collectd demon "collects" nearly 50MB
> of data per year (basic configuration).

That's about 12-13 Bit/s, which seems a bit low.
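
As a rough check of that figure, here is a minimal sketch of the conversion.
It assumes decimal megabytes (1 MB = 10^6 bytes) and a 365-day year; the
function name is made up for illustration only.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: non-leap year


def bytes_per_year_to_bits_per_second(bytes_per_year):
    """Convert a yearly data volume into an average bit rate."""
    return bytes_per_year * 8 / SECONDS_PER_YEAR


# 50 MB per year, using decimal megabytes (assumption)
rate = bytes_per_year_to_bits_per_second(50 * 10**6)
print(f"{rate:.1f} bit/s")  # prints "12.7 bit/s"
```

With binary megabytes (2^20 bytes) the result is a little over 13 Bit/s,
hence the range given above.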

> Thus, it seems that collecting data, let me say for 200 servers (200 x
> 50MB ~ 10 GB), could be quite difficult if we'll decide to get them in
> one single server (Forward=true).

Hm, how do you measure the collected data?

If you write it to RRD-files, the files are created with their final
size, i.e. the amount of disk space doesn't change with time, unless
you add more plugins/servers.
If you write to CSV-files you'll likely end up with a lot more than 50
MByte per host and year.
If you count the network traffic you'll be well below 500 kBit/s per
host, so that a 100 MBit/s link will take 200 hosts easily.
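
To put numbers on the network case, the following sketch computes how much
of a link would be used if every host really sent at the 500 kBit/s upper
bound. The helper name and the 50 kBit/s figure are hypothetical and only
there for illustration; real traffic depends on the plugins and interval
you configure.

```python
def link_utilization(hosts, kbit_per_host, link_mbit):
    """Fraction of the link used if every host sends at the given rate."""
    return hosts * kbit_per_host / (link_mbit * 1000)


# Worst case: all 200 hosts at the 500 kBit/s upper bound.
worst_case = link_utilization(hosts=200, kbit_per_host=500, link_mbit=100)
print(worst_case)  # prints 1.0, i.e. the link would be exactly full

# Hypothetical host sending a tenth of the bound.
print(link_utilization(hosts=200, kbit_per_host=50, link_mbit=100))
```

So the bound itself would just fill the link; the margin comes from hosts
staying well below it.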

> What is your opinion about? Could you please suggest me which the best
> way to use collectd is?

Okay, a usual setup is that you have a central server which receives
performance data over the network and writes it to RRD-files. The usual
bottleneck is the storage system, because RRDTool accesses the RRD-files
in a very cache-unfriendly manner. There are basically three things you
can tweak to get the most out of your system:
- Use a very new version of RRDTool. That is, use 1.2.24 (or later) or
  the 1.3.* line (which is beta right now).
- Read <http://oss.oetiker.ch/rrdtool-trac/wiki/TuningRRD>, tune your
  system and see what works for you.
- The easiest, yet very effective point: Tell collectd to update the
  RRD-files less frequently. It will combine many values into one update
  operation, which is a lot easier on the system. See the `CacheTimeout'
  option of the `rrdtool' plugin in the collectd.conf(5) manpage.
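
To illustrate the idea behind the last point, here is a small sketch of
combining many values into one update. This is not collectd's actual code;
the class, the file name and the 120 second timeout are made up for
illustration only.

```python
class UpdateCache:
    """Buffer values per file and write them out in one batch."""

    def __init__(self, timeout, write):
        self.timeout = timeout  # seconds to hold values before writing
        self.write = write      # callable(filename, list of (time, value))
        self.pending = {}       # filename -> list of (time, value)

    def add(self, filename, timestamp, value):
        values = self.pending.setdefault(filename, [])
        values.append((timestamp, value))
        # Flush once the oldest buffered value is at least `timeout` old.
        if timestamp - values[0][0] >= self.timeout:
            self.write(filename, values)
            del self.pending[filename]


writes = []  # stands in for the disk: one entry per update operation
cache = UpdateCache(timeout=120, write=lambda f, v: writes.append((f, list(v))))
for t in range(0, 130, 10):  # one value every 10 seconds, t = 0 .. 120
    cache.add("cpu-0.rrd", t, 0.5)
print(len(writes), len(writes[0][1]))  # prints "1 13"
```

Thirteen values end up in a single write instead of thirteen separate
ones, which is the effect a larger timeout is meant to have on the disk.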

Hardware-wise, you're basically okay as long as all your RRD-files fit
into memory. I've heard that a RAID-5 over a couple of 10k or 15k RPM
disks works best, but I don't have any personal experience there.

Hope I didn't misunderstand you. If you were concerned about collectd's
performance, let me know. Usually, RRDTool's performance is more of an
issue though.

Regards,
-octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/