[collectd] collectd 4.2.4 network issues with Solaris 8
Eric LeBlanc
eleblanc at taleo.com
Tue Feb 12 17:37:16 CET 2008
Hi,
I have a SPARC Solaris 8 with GCC 3.4.6:
=====================================================================
[xxxxxxx at rootasp]: gcc -v
Reading specs from /usr/local/lib/gcc/sparc-sun-solaris2.8/3.4.6/specs
Configured with: ../configure --with-as=/usr/ccs/bin/as --with-ld=/usr/ccs/bin/ld --enable-shared --enable-languages=c,c++,f77
Thread model: posix
gcc version 3.4.6
=====================================================================
Here is the output of configure:
=====================================================================
Configuration:
Libraries:
libcurl . . . . . . no (curl-config failed)
libiokit . . . . . no
libiptc . . . . . . no (Linux only)
libkstat . . . . . yes
libkvm . . . . . . no
libmysql . . . . . no
libnetlink . . . . no (Linux only library)
libnetsnmp . . . . no (libnetsnmp not found)
liboconfig . . . . yes (shipped version)
liboping . . . . . yes (shipped version)
libpcap . . . . . . no (libpcap not found)
libperl . . . . . . no
libpthread . . . . yes
librrd . . . . . . no (rrd.h not found)
libsensors . . . . no (Linux only library)
libstatgrab . . . . no (libstatgrab not found)
libupsclient . . . no (libupsclient-config failed)
libxmms . . . . . . no
Features:
daemon mode . . . . yes
debug . . . . . . . no
Bindings:
perl . . . . . . . yes
Modules:
apache . . . . . . no
apcups . . . . . . yes
apple_sensors . . . no
battery . . . . . . no
cpu . . . . . . . . yes
cpufreq . . . . . . no
csv . . . . . . . . yes
df . . . . . . . . yes
disk . . . . . . . yes
dns . . . . . . . . no
email . . . . . . . yes
entropy . . . . . . no
exec . . . . . . . yes
hddtemp . . . . . . yes
interface . . . . . yes
iptables . . . . . no
ipvs . . . . . . . no
irq . . . . . . . . no
load . . . . . . . yes
logfile . . . . . . yes
mbmon . . . . . . . yes
memcached . . . . . yes
memory . . . . . . yes
multimeter . . . . yes
mysql . . . . . . . no
netlink . . . . . . no
network . . . . . . yes
nfs . . . . . . . . no
nginx . . . . . . . no
ntpd . . . . . . . yes
nut . . . . . . . . no
perl . . . . . . . no (needs libperl)
ping . . . . . . . yes
processes . . . . . no
rrdtool . . . . . . no
sensors . . . . . . no
serial . . . . . . no
snmp . . . . . . . no
swap . . . . . . . yes
syslog . . . . . . yes
tape . . . . . . . yes
tcpconns . . . . . no
unixsock . . . . . yes
users . . . . . . . yes
vserver . . . . . . no
wireless . . . . . no
xmms . . . . . . . no
=====================================================================
I got three errors that I easily fixed by modifying the source code:
=====================
= FIRST ERROR: mbmon plugin =
=====================
mbmon.c: In function `trim_spaces':
mbmon.c:243: warning: subscript has type `char'
make[3]: *** [mbmon.lo] Error 1
=====
= FIX =
=====
In src/mbmon.c, line 243, I changed this line:
for (l = strlen (s) - 1; (l > 0) && isspace (s[l]); l--)
to
for (l = strlen (s) - 1; (l > 0) && isspace ((int)s[l]); l--)
The argument to isspace() *really* wants to be an integer on Solaris. It
seems that we must cast explicitly.
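A note on the cast: the C standard requires the argument of isspace() to be representable as an unsigned char (or to be EOF). On platforms where plain char is signed, a cast to (int) silences the warning but still passes a negative value for bytes of 0x80 and above, so a cast to (unsigned char) is the usual recommendation. The following is only a minimal sketch of that idea; trim_trailing_spaces() is a hypothetical stand-in and not the actual trim_spaces() from src/mbmon.c.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper (not collectd code): strips trailing whitespace
 * from s in place. The cast to unsigned char keeps the value passed to
 * isspace() non-negative even where plain char is signed. */
static void trim_trailing_spaces(char *s)
{
    size_t l = strlen(s);

    while (l > 0 && isspace((unsigned char)s[l - 1]))
        s[--l] = '\0';
}
```

Applied to the loop in mbmon.c, the same idea would read isspace ((unsigned char)s[l]) in place of isspace ((int)s[l]).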
===================
= SECOND ERROR: unixsock =
===================
unixsock.c: In function `us_handle_client':
unixsock.c:615: warning: control reaches end of non-void function
=====
= FIX =
=====
In src/unixsock.c, line 615, I added this line:
return ((void *) 0);
The function is declared to return a value, so it must return something. I
don't think adding this return changes anything.
===================
= THIRD ERROR: plugin.c =
===================
plugin.c: In function `plugin_read_thread':
plugin.c:223: warning: control reaches end of non-void function
=====
= FIX =
=====
As with the second error, I added this line:
return ((void *) 0);
in src/plugin.c, line 224.
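For context on why the added return should be harmless: a function started with pthread_create() has the signature void *(*)(void *), and its return value is the thread's exit status, which only matters if some other thread collects it with pthread_join(). A minimal sketch, assuming a hypothetical worker (worker_thread() is not a collectd function):

```c
#include <stddef.h>

/* Hypothetical thread start routine with the signature that
 * pthread_create() expects. It increments a counter passed via arg
 * and ends with an explicit return, so no control path falls off
 * the end of a non-void function. */
static void *worker_thread(void *arg)
{
    int *counter = arg;

    if (counter != NULL)
        (*counter)++;

    return NULL; /* exit status; ignored unless pthread_join() reads it */
}
```

Writing return (NULL); is equivalent to the return ((void *) 0); used in the fixes above.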
Then I ran make install.
With the default configuration it works well, but if we modify the configuration like this:
=====================================================================
LoadPlugin cpu
LoadPlugin df
LoadPlugin disk
LoadPlugin interface
LoadPlugin load
LoadPlugin logfile
LoadPlugin memory
LoadPlugin network
LoadPlugin ntpd
LoadPlugin swap
LoadPlugin syslog
LoadPlugin users
<Plugin network>
Server "ourserver" "23826"
TimeToLive "128"
Forward false
CacheFlush 1800
</Plugin>
=====================================================================
the daemon crashes when we try to start it.
Here is the truss output from an attempt to start it (many lines skipped):
=====================================================================
20174: fstat(6, 0xFFFFFFFF78608B50) = 0
20174: ioctl(6, TCGETA, 0xFFFFFFFF78608A8C) Err#22 EINVAL
20174: read(6, " / d e v / m d / d s k /".., 512) = 482
20174: read(6, 0x10018AC04, 512) = 0
20174: lseek(6, 0, SEEK_CUR) = 482
20174: close(6) = 0
20174: statvfs("/", 0xFFFFFFFF78609A20) = 0
20174: time() = 1202832936
20174: time() = 1202832936
20174: ioctl(1, TCGETA, 0xFFFFFFFF78607DAC) Err#6 ENXIO
20174: fstat(1, 0xFFFFFFFF78607E70) = 0
20174: Incurred fault #5, FLTACCESS %pc = 0xFFFFFFFF79D024C4
20174: siginfo: SIGBUS BUS_ADRALN addr=0xFFFFFFFF79E068B5
20174: Received signal #10, SIGBUS [default]
20174: siginfo: SIGBUS BUS_ADRALN addr=0xFFFFFFFF79E068B5
=====================================================================
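A note on reading this output: BUS_ADRALN is the siginfo code for an invalid address alignment, and the reported address ends in 0xB5, so it is odd. The small check below is only an illustration of that arithmetic; is_aligned() is a hypothetical helper, and the second address in the usage note is taken from the first argument of add_to_buffer in the pstack output.

```c
#include <stdint.h>

/* Hypothetical helper: returns non-zero if addr is a multiple of
 * alignment. alignment must be a power of two (2, 4, 8, ...). */
static int is_aligned(uint64_t addr, uint64_t alignment)
{
    return (addr & (alignment - 1)) == 0;
}
```

is_aligned(0xFFFFFFFF79E068B5, 2) is false, while is_aligned(0xFFFFFFFF79E068A8, 8) is true. The faulting address is therefore 13 bytes past an 8-byte aligned address, which SPARC will not accept for a 2-, 4- or 8-byte load or store.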
Here is the pstack output for the core file:
=====================================================================
[xxxxx1 at rootasp]: pstack core
----------------- lwp# 4 / thread# 5 --------------------
ffffffff79d024c4 write_part_number (ffffffff786090c0, ffffffff786090c8, 1,
47b1c628, 8, ff0000) + c8
ffffffff79d04694 add_to_buffer (ffffffff79e068a8, 400, ffffffff79e06680,
ffffffff79e06cb8, 10012bb90, ffffffff786092b0) + 108
ffffffff79d04ba0 network_write (10012bb90, ffffffff786092b0, 0,
ffffffff7aa014d0, 0, 0) + d8
000000010000c9cc ???????? (ffffffff7aa01ab8, ffffffff786092b0,
ffffffffffffffff, 0, 6466, ffffffff7860938c)
ffffffff7aa014d0 df_submit (ffffffff786098f0, 100181520, ffffffffffffffff, 0,
2f000001, ffffffff786098f0) + 1dc
ffffffff7aa01890 df_read (0, 0, ffffffff7d720000, 0, 0, 0) + 3ac
000000010000b1c4 ???????? (0, ffffffff7d5093a1, 0, 0, 0, 1000)
ffffffff7d61ecd8 _thread_start (0, 0, 0, 0, 0, 0) + 40
----------------- lwp# 1 / thread# 1 --------------------
ffffffff7e0a4e48 _libc_nanosleep (0, ffffffff7ffffb20, 0, fffffffffffffff8,
f, ffffffff7ffffb41) + 8
0000000100004fcc ???????? (1, ffffffff7ffffc28, ffffffff7ffffc38, 1001223c0,
100000000, 0)
0000000100003cc4 ???????? (0, 0, 0, 0, 0, 0)
----------------- lwp# 2 / thread# 2 --------------------
ffffffff7e0a74c4 _signotifywait (ffffffff7d720000, 1001218b8, 0, 0, 0, 0) + 8
ffffffff7d614d38 thr_yield (0, 0, 0, 0, 0, 0) + 8c
----------------- lwp# 3 / thread# 4 --------------------
ffffffff7e0a59a4 ioctl (4800, 100198190, ffffffff7af01830, 100123d60,
ffffffffffffffff, 100198190) + c
ffffffff7ae00f30 cpu_read (0, 0, ffffffff7d720000, 0, 0, 0) + 98
000000010000b1c4 ???????? (0, ffffffff7d5093a1, 0, 0, 0, 1000)
ffffffff7d61ecd8 _thread_start (0, 0, 0, 0, 0, 0) + 40
-------------------------- thread# 3 --------------------
ffffffff7d610800 _reap_wait (ffffffff7d7260c8, ffffffff7d720000, 0, 0, 0, 0)
+ 38
ffffffff7d610550 _reaper (ffffffff7d721d18, ffffffff7d7260c8, 1,
ffffffff7d721cf0, 0, ffffffff7d509c50) + 38
ffffffff7d61ecd8 _thread_start (0, 0, 0, 0, 0, 0) + 40
-------------------------- thread# 6 --------------------
ffffffff7d60db80 _mutex_adaptive_lock (1001215f0, ffffffff7d72f5d8, 4c00,
1000, fffeffff, 1) + 164
ffffffff7d60d888 _cmutex_lock (1001215f0, ffffffff7d720000, 0,
ffffffff7d60aea8, 0, 1000) + 84
ffffffff7d60aea8 cond_wait (ffffffff78407c50, 0, 0, 1001215f0, 100121608,
ffffffff7d720000) + 13c
ffffffff7d60ad48 pthread_cond_wait (100121608, 1001215f0, ffffffff7d720000,
0, 0, 0) + 8
000000010000b3a8 ???????? (0, ffffffff7d5093a1, 0, 0, 0, 1000)
ffffffff7d61ecd8 _thread_start (0, 0, 0, 0, 0, 0) + 40
-------------------------- thread# 7 --------------------
ffffffff7d60db80 _mutex_adaptive_lock (1001215f0, ffffffff7d72f5d8, 4c00,
1000, fffeffff, 1) + 164
ffffffff7d60d888 _cmutex_lock (1001215f0, ffffffff7d720000, 0,
ffffffff7d60aea8, 0, 1000) + 84
ffffffff7d60aea8 cond_wait (ffffffff78205c50, 0, 0, 1001215f0, 100121608,
ffffffff7d720000) + 13c
ffffffff7d60ad48 pthread_cond_wait (100121608, 1001215f0, ffffffff7d720000,
0, 0, 0) + 8
000000010000b3a8 ???????? (0, ffffffff7d5093a1, 0, 0, 0, 1000)
ffffffff7d61ecd8 _thread_start (0, 0, 0, 0, 0, 0) + 40
-------------------------- thread# 8 --------------------
ffffffff7d60db80 _mutex_adaptive_lock (1001215f0, ffffffff7d72f5d8, 4c00,
1000, fffeffff, 1) + 164
ffffffff7d60d888 _cmutex_lock (1001215f0, ffffffff7d720000, 0,
ffffffff7d60aea8, 0, 1000) + 84
ffffffff7d60aea8 cond_wait (ffffffff78003c50, 0, 0, 1001215f0, 100121608,
ffffffff7d720000) + 13c
ffffffff7d60ad48 pthread_cond_wait (100121608, 1001215f0, ffffffff7d720000,
0, 0, 0) + 8
000000010000b3a8 ???????? (0, ffffffff7d5093a1, 0, 0, 0, 1000)
ffffffff7d61ecd8 _thread_start (0, 0, 0, 0, 0, 0) + 40
================================================================
I think there's something wrong with the function write_part_number() in
src/network.c, line 329. I think there's an issue with the buffer.
I'm wondering if you could take a look at it?
If needed, I can provide you a core file.
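If the cause is what BUS_ADRALN suggests, namely a multi-byte value stored through a pointer into the packet buffer at an unaligned offset, then the usual portable remedy is to assemble the bytes in a local array and memcpy() them into place. The following is only a sketch under that assumption. put_part_u64() and its 12-byte layout (2-byte type, 2-byte length, 8-byte value, all big-endian) are hypothetical and chosen for illustration; this is not the actual code of write_part_number().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not collectd code): appends a 12-byte part to
 * *buf and advances the cursor. Every store is a single byte or a
 * memcpy(), so it works at any offset, including odd ones, and does
 * not depend on the host byte order. Returns 0 on success, -1 if the
 * remaining space is too small. */
static int put_part_u64(unsigned char **buf, size_t *buf_len,
                        uint16_t type, uint64_t value)
{
    unsigned char part[12];
    size_t i;

    if (*buf_len < sizeof(part))
        return -1;

    part[0] = (unsigned char)(type >> 8);
    part[1] = (unsigned char)(type & 0xff);
    part[2] = 0;
    part[3] = (unsigned char)sizeof(part);
    for (i = 0; i < 8; i++)
        part[4 + i] = (unsigned char)(value >> (56 - 8 * i));

    memcpy(*buf, part, sizeof(part));
    *buf += sizeof(part);
    *buf_len -= sizeof(part);
    return 0;
}
```

By contrast, a store such as *(uint64_t *)ptr = value; works on x86 at any offset but raises SIGBUS on SPARC whenever ptr is not suitably aligned.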
Thank you very much!
E.
--
Eric LeBlanc <eleblanc at taleo.com>
Unix System Administrator
Taleo inc.