[collectd] long sleeps when using collectd and ntpd

danta mertelo1 at axsguard.net
Tue Sep 24 12:12:21 CEST 2013


Hi Dave,

We see the problem on virtual machines where the host machine has its 
clock set to local time. I realize this is a wrong setup, but 
unfortunately we can't change the settings on the host machine. The 
host clock also seems rather unstable, which leads to large drifts if the 
NTP servers can't be reached for long periods of time.

The long sleep problem also occurs if you manually set the clock back to 
a time in the past.
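
To make the arithmetic concrete, here is a small model of what we think 
happens. It is not the collectd code: seconds_to_sleep() is a made-up 
helper, and it assumes the loop keeps an absolute wall-clock deadline and 
sleeps for "deadline minus now". Times are in seconds since midnight.

```c
/* Hypothetical model, not collectd code: how long a loop would sleep if
 * it waits for an absolute wall-clock deadline (seconds since midnight). */
static long seconds_to_sleep(long deadline, long now)
{
    long remaining = deadline - now;

    /* A deadline that has already passed means no sleep at all. */
    return remaining > 0 ? remaining : 0;
}
```

With an assumed 10 second interval and a deadline of 10:00:10 (36010), 
a "now" of 10:00:00 (36000) gives a 10 second sleep. If the clock is 
stepped back to 09:45:00 (35100) before the sleep is computed, the result 
is 910 seconds, i.e. the 15 minutes plus the interval.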

For the moment we patched the collectd code to exit if it detects a 
large time drift (a watchdog will then restart it), but I think this 
isn't a clean solution.
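
What I had in mind with the monotonic clock is roughly the following. 
This is only a sketch under my own assumptions, not a patch against 
collectd: run_loop(), timespec_add_ns() and the read_cb callback are 
made-up names, and it relies on the POSIX calls clock_gettime() and 
clock_nanosleep() with CLOCK_MONOTONIC, so it will not build on platforms 
that lack them.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: add a non-negative number of nanoseconds to *ts
 * and keep tv_nsec in the range [0, 1e9). */
static void timespec_add_ns(struct timespec *ts, long long ns)
{
    ts->tv_sec += (time_t)(ns / 1000000000LL);
    ts->tv_nsec += (long)(ns % 1000000000LL);
    if (ts->tv_nsec >= 1000000000L) {
        ts->tv_sec += 1;
        ts->tv_nsec -= 1000000000L;
    }
}

/* Hypothetical loop: call read_cb (if not NULL) once per interval for a
 * fixed number of iterations. The deadline lives on CLOCK_MONOTONIC, so
 * stepping the wall clock backwards or forwards does not change how long
 * the loop sleeps. Returns 0 on success, -1 on error. */
static int run_loop(long long interval_ns, int iterations,
                    void (*read_cb)(void))
{
    struct timespec next;

    if (clock_gettime(CLOCK_MONOTONIC, &next) != 0)
        return -1;

    for (int i = 0; i < iterations; i++) {
        if (read_cb != NULL)
            read_cb();

        /* Absolute deadline for the next iteration. */
        timespec_add_ns(&next, interval_ns);

        int rc;
        do {
            /* clock_nanosleep() returns the error number directly. */
            rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
                                 &next, NULL);
        } while (rc == EINTR);

        if (rc != 0)
            return -1;
    }
    return 0;
}
```

The timestamps attached to the collected values would still have to come 
from the wall clock; only the scheduling of the loop would move to the 
monotonic clock.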

Best regards,
Lode

On 09/24/2013 11:27 AM, Dave Cottlehuber wrote:
> On 24. September 2013 at 10:56:07, danta (mertelo1 at axsguard.net) wrote:
>> We have some problems using collectd and ntpd.
>>
>> Whenever ntpd sets the clock in the past (because of a clock drift),
>> collectd sleeps until it is back at its 'normal' time.
>> To give an example:
>> - Suppose it's 10:00am,
>> - ntpd sees a large clock drift and sets the clock to 09:45am
>> - collectd will sleep till 10:00am
>>
>> When trying to debug the problem, we found that the do_loop function in
>> collectd.c determines the amount of time to sleep based on times
>> obtained from the "gettimeofday" function. Was this a design choice?
>> Wouldn't it be better to use a monotonic clock?
>>
>> Greetz,
>> Lode
> Hi Lode
>
> I can't answer for collectd but this doesn't sound like the correct usage for ntpd.
>
> Is the large drift an example, & the timescale is a few seconds delay?
>
> Your ntpd should use a driftfile which over time will keep things in line most of the time, even if your ntpd servers are temporarily unavailable.
>
> Are these VMs that are being hibernated or similar? If so, use host clock sync and not ntpd. Obviously the hosts will use ntpd though!
>
> A+
> Dave
>
>



