[collectd] failure :( caused by the network malfunction

Fri Nov 24 18:56:32 CET 2006

Hi Sebastian,
nice to hear from another developer.

Sebastian Harl wrote:
> Hi,
> 
> On Fri, Nov 24, 2006 at 04:18:15PM +0100, Luboš Staněk wrote:
>> I do not know where the problem hides.
> 
> Well, this is quite obvious: If the main loop takes more than
> COLLECTD_HEARTBEAT seconds to finish, only a few rrd updates will happen during
> that time and the value is assumed to be "unknown" in most cases. You still
> get a valid value every once in a while, but this is not sufficient for any
> graphs or the like.
> 

Thanks, that helped me understand the matter.
The collectd.log contains 62 'Not sleeping...' records between 15:04:20
and 16:16:26. The interval between rrd_updates (according to the
collectd.log) is in all cases longer than COLLECTD_HEARTBEAT.
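
To make the heartbeat rule concrete, here is a minimal sketch in C of the
check described above. It is only an illustration, not code from collectd or
RRDtool: count_unknown_gaps is a hypothetical helper, and the heartbeat passed
to it is an example parameter, not the real value of COLLECTD_HEARTBEAT.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper (not part of collectd or RRDtool).
 * Counts the gaps between consecutive update timestamps that are longer
 * than `heartbeat` seconds. For such a gap the RRD stores the value as
 * unknown, which is why the graphs end up empty.
 */
size_t count_unknown_gaps(const time_t *updates, size_t n, time_t heartbeat)
{
    size_t unknown = 0;
    size_t i;

    assert(heartbeat > 0);

    /* Compare every update with the one before it. */
    for (i = 1; i < n; i++) {
        if (updates[i] - updates[i - 1] > heartbeat)
            unknown++;
    }
    return unknown;
}
```

For example, with update times 0, 10, 20, 60, 70, 130 and an assumed heartbeat
of 25 seconds, the gaps of 40 and 60 seconds exceed the heartbeat, so the
function returns 2. In my log every gap is of that kind.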

>> I did some investigation too. Although I would like to see kernel
>> threads used because of the multiprocessor support, it seems that
>> the best option will be 'pth' due to its multiplatform support.
> 
> Um... I really do not like userspace thread libraries. pthreads are defined as
> a POSIX standard and should be available on all platforms collectd is
> currently running on. Does anybody know of any specific problems that would
> arise from the use of pthreads?
> 

Sorry, I am new to the topic of threads.
I am pouring ashes on my head... :)
Going to study more...

Best regards,
Lubos

P.S.: "To pour ashes on one's head" means to acknowledge a mistake in my
language, but I guess it is the same in yours.