[collectd] collectd 3.10.1 segfault after several days

Muralito muralito at montevideo.com.uy
Tue Oct 3 02:49:41 CEST 2006


Florian Forster wrote:
> Hi,
> 
> On Mon, Oct 02, 2006 at 05:09:31PM -0200, Muralito wrote:
>> Collectd 3.10.1 segfaults after several (6 or 7) days in use.
>> The following messages are in the system log.
>>
>> Sep 23 13:55:24 sc430 kernel: collectd[28438]: segfault at 00002b8f00000018 rip 00002b8f8bd28490 rsp 00007fff1f3b2e00 error 4
>> Sep 29 11:17:27 sc430 kernel: collectd[3632]: segfault at 00002b6600000018 rip 00002b6694b17490 rsp 00007fff165c4340 error 4
> 
> Sorry, the information is a little sparse for an analysis. What kernel
> and operating system are you using? What plugins are enabled? What
> architecture do you use? Did you compile collectd yourself or use
> precompiled packages? Is the error reproducible?
> 

HW: Dell Poweredge SC430. Processor Pentium D 830. ECC RAM.
OS: OpenSuSE 10.1 x86-64 kernel 2.6.16.21-0.13-smp
Collectd: precompiled RPM package from suse guru. 
collectd-3.10.1-1.guru.suse101 (from 
http://linux01.gwdg.de/~pbleser/rpm-navigation.php?cat=System/collectd/)

/etc/collectd.conf: (only non # lines)
Mode Local
LoadPlugin apache
LoadPlugin battery
LoadPlugin cpu
LoadPlugin cpufreq
LoadPlugin df
LoadPlugin disk
LoadPlugin hddtemp
LoadPlugin load
LoadPlugin memory
LoadPlugin ntpd
LoadPlugin processes
LoadPlugin swap
LoadPlugin tape
LoadPlugin traffic
LoadPlugin users
<Plugin hddtemp>
         Host 127.0.0.1
         Port 7634
</Plugin>

I don't know how to reproduce it yet. It has happened only twice, and I 
could not correlate the crashes with any other event.

The apache plugin is loaded, but Apache is not configured to serve the data.
The tape and battery plugins are also loaded but do not collect data.
I will disable them, but first I want to run for one or two weeks with the 
same configuration and see if it segfaults again.
All other plugins collect data and seem to work well.

*** I was wrong in my first report. It had not been running for several days.
The errors occurred 30 and 150 seconds after starting (as I understand 
the logs below). ***
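
The two intervals can be checked from the timestamps in the log excerpts in this message. The following is only a minimal sketch: syslog does not record the year, so 2006 is assumed from the date of this mail, and the helper name `seconds_between` is made up for illustration.

```python
from datetime import datetime

# (start line, segfault line) timestamps copied from the syslog excerpts
# in this message. Month abbreviations are parsed with the default C/English locale.
RUNS = [
    ("Sep 23 13:52:54", "Sep 23 13:55:24"),  # pid 28438
    ("Sep 29 11:16:57", "Sep 29 11:17:27"),  # pid 3632
]

def seconds_between(start, end, year=2006):
    """Return whole seconds between two syslog timestamps (year is an assumption)."""
    fmt = "%Y %b %d %H:%M:%S"
    t0 = datetime.strptime(f"{year} {start}", fmt)
    t1 = datetime.strptime(f"{year} {end}", fmt)
    return int((t1 - t0).total_seconds())

uptimes = [seconds_between(start, end) for start, end in RUNS]
print(uptimes)  # [150, 30]
```

So the first crash came 150 seconds after startup and the second 30 seconds after startup, measured from the "cpufreq found 2 cpu(s)" line of the same pid.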

Sep 23 13:52:51 sc430 collectd[27686]: Exiting normally
Sep 23 13:52:54 sc430 collectd[28437]: Plugin `tape' doesn't provide a read function.
Sep 23 13:52:54 sc430 collectd[28438]: cpufreq found 2 cpu(s)
Sep 23 13:55:24 sc430 kernel: collectd[28438]: segfault at 00002b8f00000018 rip 00002b8f8bd28490 rsp 00007fff1f3b2e00 error 4

Sep 29 11:12:56 sc430 collectd[3647]: Exiting normally
Sep 29 11:16:57 sc430 collectd[3219]: Plugin `tape' doesn't provide a read function.
Sep 29 11:16:57 sc430 collectd[3632]: cpufreq found 2 cpu(s)
Sep 29 11:17:27 sc430 kernel: collectd[3632]: segfault at 00002b6600000018 rip 00002b6694b17490 rsp 00007fff165c4340 error 4
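
For anyone reading the kernel lines above, here is a rough sketch of how the fields can be decoded. It assumes the usual x86 page-fault error-code bits (bit 0: page was present, bit 1: write access, bit 2: fault came from user mode) and treats the error field as hexadecimal. The function name `decode_segfault` is hypothetical, not part of any tool.

```python
import re

LINE = ("collectd[28438]: segfault at 00002b8f00000018 "
        "rip 00002b8f8bd28490 rsp 00007fff1f3b2e00 error 4")

PATTERN = re.compile(
    r"segfault at ([0-9a-f]+) rip ([0-9a-f]+) rsp ([0-9a-f]+) error ([0-9a-f]+)"
)

def decode_segfault(line):
    """Split a kernel segfault line into addresses and error-code flags."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    addr, rip, rsp, err = (int(g, 16) for g in m.groups())
    return {
        "addr": addr,                    # faulting data address
        "rip": rip,                      # instruction pointer at the fault
        "rsp": rsp,                      # stack pointer at the fault
        "page_present": bool(err & 1),   # 0 means the page was not mapped
        "write": bool(err & 2),          # 0 means the access was a read
        "user_mode": bool(err & 4),      # 1 means the fault came from user space
    }

info = decode_segfault(LINE)
print(info["page_present"], info["write"], info["user_mode"])
print(hex(info["addr"] & 0xFFFFFFFF), info["addr"] >> 32 == info["rip"] >> 32)
```

Under those assumptions, "error 4" is a user-mode read of an unmapped page. In both crashes the low 32 bits of the faulting address are 0x18 and the upper 32 bits match those of rip. That pattern would be consistent with a read at a small offset through a bad pointer, but this is speculation and cannot be confirmed from the log alone.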

Another line that appears several times is: Not sleeping because 
`timeval_sub_timespec' returned non-zero!

Regards,
Muralito.


