[collectd] kernel message SIGCHLD set to SIG_IGN
octo at verplant.org
Tue Feb 2 22:48:58 CET 2010
On Fri, Jan 29, 2010 at 11:21:58PM +0100, Mirko Buffoni wrote:
> Jan 29 23:11:48 server kernel: application bug: collectd(7755) has SIGCHLD set to SIG_IGN but calls wait().
> Jan 29 23:11:48 server kernel: (see the NOTES section of 'man 2 wait'). Workaround activated.
> What does it mean?
Hm, interesting. When collectd executes a child process (using the exec
plugin), the termination of that child process causes a signal (SIGCHLD)
to be sent to the parent process. The parent process normally has to
call wait(2) to read back the exit status of the child. If the parent
doesn't do that, you'll end up with "zombies".
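
The normal fork-and-reap sequence described above can be sketched as
follows. This is only an illustration in Python, whose os module wraps
the same system calls; it is not code from the exec plugin, the name
run_and_reap is made up for this example, and it assumes SIGCHLD still
has its default disposition.

```python
import os


def run_and_reap(exit_code):
    """Fork a child that exits with exit_code, then read back its status."""
    pid = os.fork()
    if pid == 0:
        # Child: terminate immediately with the requested exit status.
        os._exit(exit_code)
    # Parent: waitpid() blocks until the child has exited and returns its
    # status. Without this call the dead child would linger as a zombie.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return None  # child was killed by a signal instead of exiting
```

Calling run_and_reap(7) in the parent should hand back the child's exit
status once the child has been reaped.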
What this error message is telling you is that collectd has instructed
the operating system to dismiss that signal, i.e. to *not* send such a
signal but clean up the child processes automatically. At the same time,
collectd *does* call wait(2), which is an error.
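
The conflicting combination can be reproduced in a few lines. Again a
sketch in Python rather than collectd's code, with a made-up function
name, and assuming a system that follows POSIX.1-2001 (current Linux
kernels do): there the kernel reaps the child by itself and the wait
call fails with ECHILD, which Python reports as ChildProcessError. The
old kernel in the log above prints a warning and activates a workaround
instead.

```python
import os
import signal


def wait_with_sigchld_ignored():
    """Ignore SIGCHLD, fork a child, then try to wait for it anyway."""
    previous = signal.signal(signal.SIGCHLD, signal.SIG_IGN)
    try:
        pid = os.fork()
        if pid == 0:
            os._exit(0)  # child: exit right away, skipping the finally block
        try:
            # With SIGCHLD ignored the kernel discards the child's status,
            # so there is nothing left for waitpid() to collect.
            os.waitpid(pid, 0)
        except ChildProcessError:
            return "ECHILD"
        return "reaped"
    finally:
        # Restore the earlier disposition; fall back to the default if the
        # previous handler was not installed from Python.
        if previous is None:
            previous = signal.SIG_DFL
        signal.signal(signal.SIGCHLD, previous)
```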
> I haven't tried to compile 4.9.1, so I don't know if it's something
> that has already been corrected.
Well, the most important issue here is to find out if there actually is
a problem in collectd. Signals and multi-threaded programs are, uhm, not
exactly a great match. It is possible that old versions of Linux and/or
the GNU libc simply don't handle this completely. Since we've changed
some signal stuff in the exec plugin recently it may be possible that we
introduced a bug there, too.
Can you check if the Linux you're using already features
/proc/$PID/status? If so, could you post that file for a running collectd?
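
If the file exists, the interesting part is presumably the SigIgn line,
a hexadecimal mask in which bit (N - 1) is set when signal N is ignored.
A rough sketch of how to decode it, assuming the usual Linux layout of
that file; the function names are made up for this example, and SIGCHLD
is signal 17 on x86 Linux:

```python
import signal


def signal_ignored(status_text, signum=signal.SIGCHLD):
    """Return True if signum is set in the SigIgn mask of a status file.

    Returns None when the text contains no SigIgn line.
    """
    for line in status_text.splitlines():
        if line.startswith("SigIgn:"):
            mask = int(line.split(":", 1)[1].strip(), 16)
            # Bit (signum - 1) of the mask corresponds to signal signum.
            return bool(mask & (1 << (signum - 1)))
    return None


def read_status(pid):
    """Read /proc/<pid>/status; only works where the kernel provides it."""
    with open(f"/proc/{pid}/status") as handle:
        return handle.read()
```

For a running daemon this would be used as
signal_ignored(read_status(pid)), with pid being the process ID of
collectd.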
Florian octo Forster
Hacker in training