[collectd] Strange behaviour of collectd with rrdcached

Lindsay Holmwood lindsay at holmwood.id.au
Fri Apr 24 15:03:26 CEST 2015


On 24 April 2015 at 18:32, Sergey <a_s_y at sama.ru> wrote:
>
> nothing new:
>
> [pid 10586] poll([{fd=3, events=POLLIN|POLLPRI}], 1, 500) = 1 ([{fd=3, revents=POLLIN}])
> [pid 10586] read(3, "update /var/lib/collectd/rrd/hostname/cpu-0/cpu-user.rrd 1429861837:34507652\n", 8192) = 87
> [pid 10586] write(3, "-1 illegal attempt to update using time 1429861837.000000 when last update time is 1429861847.000000 (minimum one second step)\n", 127) = 127
>
> Attempt to write timestamp 1429861837 after 1429861847.

The possibilities that come to mind here are:

 - rrdcached is doing an out-of-order update to the RRD file due to a bug
 - collectd is passing the values to rrdcached with out-of-order timestamps
 - collectd is receiving the data out of order from the collectd agent

To determine if it's an rrdcached bug, try connecting over the
rrdcached socket and issuing a bunch of UPDATE commands, then issue a
FLUSH, then check if the RRD files on disk have been updated. The
rrdcached man page has extensive documentation that covers the
protocol syntax to do this.
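
As a rough illustration of that first test, here is a minimal Python
sketch that speaks the text protocol over a Unix socket. The command
and response format follows the strace output quoted above ("update
<file> <timestamp>:<value>", answered by a status line with a negative
code on error). The function names, the socket path and the sample
values are placeholders of mine, not anything taken from your setup.

```python
import socket


def build_update(rrd_path, samples):
    """Format an UPDATE command; samples is a list of (timestamp, value)."""
    values = " ".join(f"{ts}:{value}" for ts, value in samples)
    return f"UPDATE {rrd_path} {values}\n"


def build_flush(rrd_path):
    """Format a FLUSH command for a single RRD file."""
    return f"FLUSH {rrd_path}\n"


def parse_status(line):
    """Split a response status line into (code, message)."""
    code, _, message = line.strip().partition(" ")
    return int(code), message


def send_command(sock_file, command):
    """Send one command and read back the status line plus any extra lines."""
    sock_file.write(command.encode())
    sock_file.flush()
    code, message = parse_status(sock_file.readline().decode())
    # A non-negative code is the number of additional lines that follow;
    # a negative code signals an error and carries no extra lines.
    extra = [sock_file.readline().decode().rstrip("\n")
             for _ in range(max(code, 0))]
    return code, message, extra


def update_then_flush(socket_path, rrd_path, samples):
    """Issue one UPDATE per sample, then a FLUSH, and return all replies."""
    replies = []
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        with sock.makefile("rwb") as sock_file:
            for sample in samples:
                command = build_update(rrd_path, [sample])
                replies.append(send_command(sock_file, command))
            replies.append(send_command(sock_file, build_flush(rrd_path)))
    return replies
```

Calling update_then_flush() with the daemon's socket path, a scratch
RRD file and a few strictly increasing (timestamp, value) pairs should
return only non-negative status codes. After that, something like
"rrdtool lastupdate" on the file should show the last timestamp you
sent if rrdcached flushed it correctly.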

To determine if collectd is passing out-of-order values to rrdcached,
stop rrdcached, use something like nc to listen on the same socket
path, and check (with grep, for example) whether the timestamps sent
for each RRD file are monotonically increasing.
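
Eyeballing timestamps gets tedious quickly, so here is a small Python
sketch of the check. It assumes you have saved the captured commands
as text lines in the "update <file> <timestamp>:<value>" form seen in
the strace output; find_out_of_order is just an illustrative name.

```python
def find_out_of_order(lines):
    """Return (path, previous_ts, offending_ts) for each non-increasing update.

    Timestamps are tracked per RRD file, since each file has its own
    "last update time".
    """
    last_seen = {}
    problems = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3 or parts[0].lower() != "update":
            continue  # not an update command
        path = parts[1]
        for sample in parts[2:]:
            ts_text = sample.split(":", 1)[0]
            try:
                ts = float(ts_text)
            except ValueError:
                continue  # skip non-numeric timestamps such as "N"
            previous = last_seen.get(path)
            if previous is not None and ts <= previous:
                # Same or older timestamp than one already seen for this file.
                problems.append((path, previous, ts))
            else:
                last_seen[path] = ts
    return problems
```

Feeding it the lines of the capture file (for example via
open(...).readlines()) gives an empty list if collectd is sending
everything in order. Any entries it returns name the RRD file, the
newest timestamp seen so far, and the older timestamp that followed.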

To determine if collectd is receiving the data out of order, use
tshark to capture the traffic, and use the display filters to show a
single sample metric, and check if the timestamps are monotonically
increasing.

Out of those three possibilities, my bet would be on collectd
receiving the data from agents out of order.

Cheers,
Lindsay


