[collectd] long sleeps when using collectd and ntpd
octo at collectd.org
Tue Sep 24 14:47:23 CEST 2013
On Tue, Sep 24, 2013 at 10:53:02AM +0200, danta wrote:
> Whenever ntpd sets the clock in the past (because of a clock drift),
> collectd sleeps until it is back at its 'normal' time.
> When trying to debug the problem, we found that the do_loop function
> in collectd.c determines the amount of time to sleep based on times
> obtained from the "gettimeofday" function. Was this a design choice?
> Wouldn't it be better to use a monotonic clock?
It would be possible to use a monotonic clock _on Linux_, but according
to POSIX (i.e. to be portable) one must use a real time clock. Also, I
don't think this is causing the behavior; the problem is a bit more
involved.
Each callback, after it returns, is put into a heap which is sorted by
the absolute, real time when it should be called next. I.e. the loop in
src/collectd.c is still waking up periodically, but next to no work is
done inside this loop, especially not the reads.
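The scheduling described above can be sketched roughly as follows. This
is a simplified Python illustration, not collectd's actual C code: the
names `ReadHeap`, `schedule` and `seconds_until_next` are made up for
this example, and the current time is passed in explicitly so that a
backward clock step can be simulated.

```python
import heapq


class ReadHeap:
    """Toy model: read callbacks ordered by absolute wall-clock due time."""

    def __init__(self):
        self._heap = []  # entries are (due_time, name)

    def schedule(self, name, now, interval):
        # The next call time is stored as an absolute wall-clock timestamp.
        heapq.heappush(self._heap, (now + interval, name))

    def seconds_until_next(self, now):
        # How long a reader would wait before the earliest callback is due.
        due_time, _name = self._heap[0]
        return max(0.0, due_time - now)


heap = ReadHeap()
heap.schedule("cpu", now=1000.0, interval=10.0)  # due at 1010.0

# Without a clock change the wait equals the interval.
normal_wait = heap.seconds_until_next(now=1000.0)

# Simulate the clock being stepped back by 900 seconds (15 minutes):
# the stored due time is now far in the "future".
wait_after_step = heap.seconds_until_next(now=1000.0 - 900.0)
```

In this model the wait grows from the 10 second interval to 910 seconds
after the step, which matches the long sleeps described in the report.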
If you were to "fix" this, you'd for example end up with the metric for
12:00:00 being followed by the metric for 11:45:10, i.e. almost
15 minutes into the "past". This would result in collectd refusing this
measurement because it is "too old". This would continue until the time
has progressed enough.
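To illustrate the "too old" rejection, here is a minimal sketch in
Python. It is an assumption about the general idea, not collectd's real
cache code: `LastSeenCache` and `update` are hypothetical names, times
are given as seconds since midnight, and treating an equal timestamp as
too old is a choice made for this example.

```python
class LastSeenCache:
    """Toy model: remember the newest timestamp seen per metric."""

    def __init__(self):
        self._last = {}  # metric name -> newest accepted timestamp

    def update(self, name, timestamp):
        last = self._last.get(name)
        # Assumption: a value that is not strictly newer is refused.
        if last is not None and timestamp <= last:
            return False
        self._last[name] = timestamp
        return True


def hms(hours, minutes, seconds):
    # Convert a time of day to seconds since midnight.
    return hours * 3600 + minutes * 60 + seconds


cache = LastSeenCache()
results = [
    cache.update("load", hms(12, 0, 0)),    # first value, accepted
    cache.update("load", hms(11, 45, 10)),  # clock stepped back, refused
    cache.update("load", hms(12, 0, 10)),   # time has caught up, accepted
]
```

The second value is refused and values are only accepted again once
their timestamps pass 12:00:00.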
Last, but not least, metrics _have_ to use the wall clock time.
Otherwise you won't be able to correlate between metrics and alerts /
behavior you observe.
Hope this helps, best regards,
collectd – The system statistics collection daemon