[collectd] cpu wait time on collectd server
eric fauser
ef_cd at apa.at
Wed Sep 12 17:36:39 CEST 2007
Hi
> Athlon 64 3200, 1Gb ram running gentoo In the moment it
> receives stats from 30 hosts. In total there are 1426 rrd files.
Our setup: 70 hosts (3556 rrd files) reporting to a server with
2x dual-core 3.2GHz CPUs, a disk backend of 6x 36GB SAS disks (raid5)
and 8GB of memory for the page cache. Note that we are currently
running collectd3.
> large cpu wait times averaging about 70%. it must be waiting
> on network IO because disk write throughput is only ~1Mb/sec,
While we were using 2GB of RAM, we first ran into a UDP kernel buffer
problem (visible with netstat -su) and then into a disk bottleneck:
physical read IO increased to 50MByte/sec and cpu wait-io went
from 20% to 60%.
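A rough sketch of how one might watch for the UDP buffer problem without eyeballing netstat -su output: on Linux the same counters are exposed in /proc/net/snmp. The function name parse_udp_counters is made up for this example, and the RcvbufErrors field is only present on kernels that export it, so treat this as an assumption-laden illustration rather than what we actually ran.

```python
# Sketch: read the UDP counters that `netstat -su` summarises (Linux only).
from pathlib import Path


def parse_udp_counters(snmp_text):
    """Return the 'Udp:' counters from /proc/net/snmp content as a dict.

    The file holds a header line ("Udp: InDatagrams NoPorts ...") followed
    by a value line ("Udp: 100 2 ..."); we pair the two up.
    """
    udp_lines = [line.split() for line in snmp_text.splitlines()
                 if line.startswith("Udp:")]
    if len(udp_lines) < 2:
        return {}
    names, values = udp_lines[0][1:], udp_lines[1][1:]
    return {name: int(value) for name, value in zip(names, values)}


if __name__ == "__main__":
    snmp = Path("/proc/net/snmp")
    if snmp.exists():
        counters = parse_udp_counters(snmp.read_text())
        # A steadily growing InErrors/RcvbufErrors count suggests that
        # packets are being dropped because the receive buffer is full.
        for key in ("InDatagrams", "InErrors", "RcvbufErrors"):
            print(key, counters.get(key, "n/a"))
```

Running it twice a few minutes apart and comparing the error counters shows whether datagrams are still being dropped.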
The only fix was to dramatically increase the memory available
for the page cache (in our case to 8GB).
So, imho, with collectd4 the solution should be:
1.) use rrd-cachetimeout
2.) very fast disk-backend
3.) as much ram as possible ;)
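
To illustrate why point 1 should help, here is a back-of-the-envelope calculation of how many rrd file updates per second reach the disk layer with and without a cache timeout. The 3556 files are from our setup; the 10 second interval and the 120 second timeout are assumptions picked for the example, and the model is deliberately crude (one write per file per period, ignoring how many values each write carries).

```python
def rrd_updates_per_second(rrd_files, interval_s, cache_timeout_s=0):
    """Approximate number of rrd file updates per second.

    Without caching, every file is updated once per collection interval.
    With a cache timeout, values are held in memory and each file is
    written roughly once per timeout period instead.
    """
    period = max(interval_s, cache_timeout_s)
    return rrd_files / period


# 3556 files is from the post; interval and timeout are assumed values.
uncached = rrd_updates_per_second(3556, interval_s=10)
cached = rrd_updates_per_second(3556, interval_s=10, cache_timeout_s=120)
print(f"uncached: {uncached:.1f} updates/s, cached: {cached:.1f} updates/s")
```

Under these assumptions the number of file updates drops by about a factor of twelve, which is the kind of relief the disks need when the page cache cannot hold all rrd files.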
eric