[collectd] collectd 4.2.4 network issues with Solaris 8

Eric LeBlanc eleblanc at taleo.com
Tue Feb 12 17:37:16 CET 2008


Hi,

I have a SPARC Solaris 8 machine with GCC 3.4.6:

=====================================================================
[xxxxxxx at rootasp]: gcc -v
Reading specs from /usr/local/lib/gcc/sparc-sun-solaris2.8/3.4.6/specs
Configured with: ../configure --with-as=/usr/ccs/bin/as --with-ld=/usr/ccs/bin/ld --enable-shared --enable-languages=c,c++,f77
Thread model: posix
gcc version 3.4.6
=====================================================================

Here is the output of configure:
=====================================================================
Configuration:
  Libraries:
    libcurl . . . . . . no (curl-config failed)
    libiokit  . . . . . no
    libiptc . . . . . . no (Linux only)
    libkstat  . . . . . yes
    libkvm  . . . . . . no
    libmysql  . . . . . no
    libnetlink  . . . . no (Linux only library)
    libnetsnmp  . . . . no (libnetsnmp not found)
    liboconfig  . . . . yes (shipped version)
    liboping  . . . . . yes (shipped version)
    libpcap . . . . . . no (libpcap not found)
    libperl . . . . . . no
    libpthread  . . . . yes
    librrd  . . . . . . no (rrd.h not found)
    libsensors  . . . . no (Linux only library)
    libstatgrab . . . . no (libstatgrab not found)
    libupsclient  . . . no (libupsclient-config failed)
    libxmms . . . . . . no

  Features:
    daemon mode . . . . yes
    debug . . . . . . . no

  Bindings:
    perl  . . . . . . . yes

  Modules:
    apache  . . . . . . no
    apcups  . . . . . . yes
    apple_sensors . . . no
    battery . . . . . . no
    cpu . . . . . . . . yes
    cpufreq . . . . . . no
    csv . . . . . . . . yes
    df  . . . . . . . . yes
    disk  . . . . . . . yes
    dns . . . . . . . . no
    email . . . . . . . yes
    entropy . . . . . . no
    exec  . . . . . . . yes
    hddtemp . . . . . . yes
    interface . . . . . yes
    iptables  . . . . . no
    ipvs  . . . . . . . no
    irq . . . . . . . . no
    load  . . . . . . . yes
    logfile . . . . . . yes
    mbmon . . . . . . . yes
    memcached . . . . . yes
    memory  . . . . . . yes
    multimeter  . . . . yes
    mysql . . . . . . . no
    netlink . . . . . . no
    network . . . . . . yes
    nfs . . . . . . . . no
    nginx . . . . . . . no
    ntpd  . . . . . . . yes
    nut . . . . . . . . no
    perl  . . . . . . . no (needs libperl)
    ping  . . . . . . . yes
    processes . . . . . no
    rrdtool . . . . . . no
    sensors . . . . . . no
    serial  . . . . . . no
    snmp  . . . . . . . no
    swap  . . . . . . . yes
    syslog  . . . . . . yes
    tape  . . . . . . . yes
    tcpconns  . . . . . no
    unixsock  . . . . . yes
    users . . . . . . . yes
    vserver . . . . . . no
    wireless  . . . . . no
    xmms  . . . . . . . no
=====================================================================


I got three errors that I easily fixed by modifying the source code:

=============================
= FIRST ERROR: mbmon plugin =
=============================
mbmon.c: In function `trim_spaces':
mbmon.c:243: warning: subscript has type `char'
make[3]: *** [mbmon.lo] Error 1

=======
= FIX =
=======
In src/mbmon.c, line 243, I replaced this line:

        for (l = strlen (s) - 1; (l > 0) && isspace (s[l]); l--)
with
        for (l = strlen (s) - 1; (l > 0) && isspace ((int)s[l]); l--)

The isspace() function *really* wants an integer argument on Solaris...  It 
seems that we must cast explicitly.
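
A note on the cast: the C standard requires the argument of isspace() to be
representable as unsigned char (or to be EOF).  A cast to int silences the
warning, but where char is signed it still passes negative values for bytes
>= 0x80.  A cast to unsigned char avoids both problems.  Below is a minimal
sketch, assuming a hypothetical helper named trim_trailing_spaces(); it is
not the code from src/mbmon.c.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper, not the code from src/mbmon.c: strips trailing
 * whitespace in place and returns the new length of the string. */
static size_t trim_trailing_spaces(char *s)
{
    size_t len = strlen(s);

    /* Cast to unsigned char so that bytes >= 0x80 never turn into
     * negative values, which isspace() is not required to handle. */
    while (len > 0 && isspace((unsigned char)s[len - 1]))
        len--;

    s[len] = '\0';
    return len;
}
```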

==========================
= SECOND ERROR: unixsock =
==========================
unixsock.c: In function `us_handle_client':
unixsock.c:615: warning: control reaches end of non-void function

=======
= FIX =
=======
In src/unixsock.c, line 615, I added this line:
          return ((void *) 0);

The function is declared to return a value, so it must return something.  I 
don't think adding this return changes anything.

=========================
= THIRD ERROR: plugin.c =
=========================
plugin.c: In function `plugin_read_thread':
plugin.c:223: warning: control reaches end of non-void function

=======
= FIX =
=======
As with the second error, I added this line:
          return ((void *) 0);

in src/plugin.c, line 224.
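
To illustrate the second and third fixes: a function with a void * return
type, such as a pthread start routine, needs a return statement on every
path, and NULL is the conventional "no result" value.  The sketch below
assumes that both functions are thread start routines, which the
(void *) 0 return value suggests.  The names start_routine_t and
count_once are hypothetical and do not come from the collectd sources.

```c
#include <stddef.h>

/* Same signature as a pthread start routine. */
typedef void *(*start_routine_t)(void *);

/* Hypothetical worker standing in for functions such as
 * us_handle_client() or plugin_read_thread(). */
static void *count_once(void *arg)
{
    int *counter = arg;

    (*counter)++;

    /* Without this line GCC reports "control reaches end of non-void
     * function"; the caller would then read an undefined value. */
    return NULL;
}
```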


Now, I did a make install.


With the default configuration, it works well, but if we modify it like this:
=====================================================================
LoadPlugin cpu
LoadPlugin df
LoadPlugin disk
LoadPlugin interface
LoadPlugin load
LoadPlugin logfile
LoadPlugin memory
LoadPlugin network
LoadPlugin ntpd
LoadPlugin swap
LoadPlugin syslog
LoadPlugin users

<Plugin network>
       Server "ourserver" "23826"
       TimeToLive "128"
       Forward false
       CacheFlush 1800
</Plugin>
=====================================================================

The daemon crashes when we try to start it.

Here is the truss output when we try to run it (many lines skipped):

=====================================================================
20174:  fstat(6, 0xFFFFFFFF78608B50)                    = 0
20174:  ioctl(6, TCGETA, 0xFFFFFFFF78608A8C)            Err#22 EINVAL
20174:  read(6, " / d e v / m d / d s k /".., 512)      = 482
20174:  read(6, 0x10018AC04, 512)                       = 0
20174:  lseek(6, 0, SEEK_CUR)                           = 482
20174:  close(6)                                        = 0
20174:  statvfs("/", 0xFFFFFFFF78609A20)                = 0
20174:  time()                                          = 1202832936
20174:  time()                                          = 1202832936
20174:  ioctl(1, TCGETA, 0xFFFFFFFF78607DAC)            Err#6 ENXIO
20174:  fstat(1, 0xFFFFFFFF78607E70)                    = 0
20174:      Incurred fault #5, FLTACCESS  %pc = 0xFFFFFFFF79D024C4
20174:        siginfo: SIGBUS BUS_ADRALN addr=0xFFFFFFFF79E068B5
20174:      Received signal #10, SIGBUS [default]
20174:        siginfo: SIGBUS BUS_ADRALN addr=0xFFFFFFFF79E068B5
=====================================================================


Here is the pstack output for the core file:
=====================================================================
[xxxxx1 at rootasp]: pstack core
-----------------  lwp# 4 / thread# 5  --------------------
 ffffffff79d024c4 write_part_number (ffffffff786090c0, ffffffff786090c8, 1, 
47b1c628, 8, ff0000) + c8
 ffffffff79d04694 add_to_buffer (ffffffff79e068a8, 400, ffffffff79e06680, 
ffffffff79e06cb8, 10012bb90, ffffffff786092b0) + 108
 ffffffff79d04ba0 network_write (10012bb90, ffffffff786092b0, 0, 
ffffffff7aa014d0, 0, 0) + d8
 000000010000c9cc ???????? (ffffffff7aa01ab8, ffffffff786092b0, 
ffffffffffffffff, 0, 6466, ffffffff7860938c)
 ffffffff7aa014d0 df_submit (ffffffff786098f0, 100181520, ffffffffffffffff, 0, 
2f000001, ffffffff786098f0) + 1dc
 ffffffff7aa01890 df_read (0, 0, ffffffff7d720000, 0, 0, 0) + 3ac
 000000010000b1c4 ???????? (0, ffffffff7d5093a1, 0, 0, 0, 1000)
 ffffffff7d61ecd8 _thread_start (0, 0, 0, 0, 0, 0) + 40
-----------------  lwp# 1 / thread# 1  --------------------
 ffffffff7e0a4e48 _libc_nanosleep (0, ffffffff7ffffb20, 0, fffffffffffffff8, 
f, ffffffff7ffffb41) + 8
 0000000100004fcc ???????? (1, ffffffff7ffffc28, ffffffff7ffffc38, 1001223c0, 
100000000, 0)
 0000000100003cc4 ???????? (0, 0, 0, 0, 0, 0)
-----------------  lwp# 2 / thread# 2  --------------------
 ffffffff7e0a74c4 _signotifywait (ffffffff7d720000, 1001218b8, 0, 0, 0, 0) + 8
 ffffffff7d614d38 thr_yield (0, 0, 0, 0, 0, 0) + 8c
-----------------  lwp# 3 / thread# 4  --------------------
 ffffffff7e0a59a4 ioctl (4800, 100198190, ffffffff7af01830, 100123d60, 
ffffffffffffffff, 100198190) + c
 ffffffff7ae00f30 cpu_read (0, 0, ffffffff7d720000, 0, 0, 0) + 98
 000000010000b1c4 ???????? (0, ffffffff7d5093a1, 0, 0, 0, 1000)
 ffffffff7d61ecd8 _thread_start (0, 0, 0, 0, 0, 0) + 40
--------------------------  thread# 3  --------------------
 ffffffff7d610800 _reap_wait (ffffffff7d7260c8, ffffffff7d720000, 0, 0, 0, 0) 
+ 38
 ffffffff7d610550 _reaper (ffffffff7d721d18, ffffffff7d7260c8, 1, 
ffffffff7d721cf0, 0, ffffffff7d509c50) + 38
 ffffffff7d61ecd8 _thread_start (0, 0, 0, 0, 0, 0) + 40
--------------------------  thread# 6  --------------------
 ffffffff7d60db80 _mutex_adaptive_lock (1001215f0, ffffffff7d72f5d8, 4c00, 
1000, fffeffff, 1) + 164
 ffffffff7d60d888 _cmutex_lock (1001215f0, ffffffff7d720000, 0, 
ffffffff7d60aea8, 0, 1000) + 84
 ffffffff7d60aea8 cond_wait (ffffffff78407c50, 0, 0, 1001215f0, 100121608, 
ffffffff7d720000) + 13c
 ffffffff7d60ad48 pthread_cond_wait (100121608, 1001215f0, ffffffff7d720000, 
0, 0, 0) + 8
 000000010000b3a8 ???????? (0, ffffffff7d5093a1, 0, 0, 0, 1000)
 ffffffff7d61ecd8 _thread_start (0, 0, 0, 0, 0, 0) + 40
--------------------------  thread# 7  --------------------
 ffffffff7d60db80 _mutex_adaptive_lock (1001215f0, ffffffff7d72f5d8, 4c00, 
1000, fffeffff, 1) + 164
 ffffffff7d60d888 _cmutex_lock (1001215f0, ffffffff7d720000, 0, 
ffffffff7d60aea8, 0, 1000) + 84
 ffffffff7d60aea8 cond_wait (ffffffff78205c50, 0, 0, 1001215f0, 100121608, 
ffffffff7d720000) + 13c
 ffffffff7d60ad48 pthread_cond_wait (100121608, 1001215f0, ffffffff7d720000, 
0, 0, 0) + 8
 000000010000b3a8 ???????? (0, ffffffff7d5093a1, 0, 0, 0, 1000)
 ffffffff7d61ecd8 _thread_start (0, 0, 0, 0, 0, 0) + 40
--------------------------  thread# 8  --------------------
 ffffffff7d60db80 _mutex_adaptive_lock (1001215f0, ffffffff7d72f5d8, 4c00, 
1000, fffeffff, 1) + 164
 ffffffff7d60d888 _cmutex_lock (1001215f0, ffffffff7d720000, 0, 
ffffffff7d60aea8, 0, 1000) + 84
 ffffffff7d60aea8 cond_wait (ffffffff78003c50, 0, 0, 1001215f0, 100121608, 
ffffffff7d720000) + 13c
 ffffffff7d60ad48 pthread_cond_wait (100121608, 1001215f0, ffffffff7d720000, 
0, 0, 0) + 8
 000000010000b3a8 ???????? (0, ffffffff7d5093a1, 0, 0, 0, 1000)
 ffffffff7d61ecd8 _thread_start (0, 0, 0, 0, 0, 0) + 40
================================================================

I think there's something wrong with the function write_part_number() in 
src/network.c, line 329.  The truss output reports SIGBUS with BUS_ADRALN at 
an odd address (0xFFFFFFFF79E068B5), so it looks like an unaligned access to 
the buffer.
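
For context, SIGBUS with BUS_ADRALN on SPARC typically means a multi-byte
load or store at an address that is not a multiple of the operand size.  A
common cause is storing an integer through a cast pointer into a byte
buffer.  Below is a minimal sketch of the usual workaround, which copies the
bytes with memcpy() instead.  The helper put_u64_be() is hypothetical, and
whether write_part_number() actually stores values this way is an
assumption, not something checked against the source.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not the code from src/network.c: writes a 64-bit
 * value in big-endian (network) byte order at any offset of a byte
 * buffer and returns the offset just past the written bytes. */
static size_t put_u64_be(unsigned char *buf, size_t offset, uint64_t value)
{
    unsigned char tmp[8];
    int i;

    /* Serialize byte by byte, least significant byte last. */
    for (i = 7; i >= 0; i--) {
        tmp[i] = (unsigned char)(value & 0xff);
        value >>= 8;
    }

    /* memcpy() has no alignment requirement, unlike
     * *(uint64_t *)(buf + offset) = value, which can raise SIGBUS on
     * SPARC when buf + offset is not a multiple of 8. */
    memcpy(buf + offset, tmp, sizeof(tmp));
    return offset + sizeof(tmp);
}
```

The caller is expected to check that offset + 8 does not exceed the buffer
size before calling the helper.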

I'm wondering if you could take a look at it?

If needed, I can provide you with a core file.

Thank you very much!

E.
-- 
Eric LeBlanc <eleblanc at taleo.com>
Unix System Administrator
Taleo inc.


