[collectd] proposed patch for intermittent exec plugin failure to launch child process, dl_open_worker assertion

Riches Jr, Robert M robert.m.riches.jr at intel.com
Fri Feb 22 17:56:57 CET 2013


Thanks to a suggestion from Florian Forster that dynamic objects are still open when fork() is called by the exec plugin, the attached patch appears to work pretty well.  I would like to submit it to be considered for inclusion into the codebase.

The patch records every dynamic object's lt_dlhandle when the object is opened/loaded.  Then, after the exec plugin has called fork(), the child process calls lt_dlclose() on all of those objects immediately before calling execvp(); the objects are closed only at that point and only in the child.  Access to the list of lt_dlhandles is synchronized, so it should be thread-safe.  The call to execvp() is moved from exec.c into plugin.c to avoid a situation where closing the dynamically loaded exec plugin would saw off the limb on which the execution thread was sitting.
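
For readers without the attachment, here is a rough sketch of the idea in C.  It is not the patch itself.  Every name in it (dl_handle_t, handles_record, fork_with_handles_locked, and so on) is made up for illustration, and the handle type and close callback are stand-ins for lt_dlhandle and lt_dlclose() so that the sketch builds without libltdl.  Holding the lock across fork() is my own assumption about how to keep the list consistent in the child; the patch may do it differently.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-ins for lt_dlhandle and lt_dlclose(). */
typedef void *dl_handle_t;
typedef int (*dl_close_fn)(dl_handle_t); /* returns 0 on success */

typedef struct handle_node {
    dl_handle_t handle;
    struct handle_node *next;
} handle_node_t;

static handle_node_t *handle_list = NULL;
static pthread_mutex_t handle_lock = PTHREAD_MUTEX_INITIALIZER;

/* Record a handle right after its object is opened.  Returns 0 on success. */
int handles_record(dl_handle_t h)
{
    assert(h != NULL);
    handle_node_t *n = malloc(sizeof(*n));
    if (n == NULL)
        return -1;
    n->handle = h;
    pthread_mutex_lock(&handle_lock);
    n->next = handle_list;
    handle_list = n;
    pthread_mutex_unlock(&handle_lock);
    return 0;
}

/* Close and forget every recorded handle.  Returns how many closed cleanly.
 * Intended to be called only in the forked child. */
int handles_close_all(dl_close_fn close_fn)
{
    int closed = 0;
    pthread_mutex_lock(&handle_lock);
    while (handle_list != NULL) {
        handle_node_t *n = handle_list;
        handle_list = n->next;
        if (close_fn(n->handle) == 0)
            closed++;
        free(n);
    }
    pthread_mutex_unlock(&handle_lock);
    return closed;
}

/* Assumption: fork while holding the lock, so that no other thread is
 * halfway through updating the list when the child's copy is taken.
 * Both parent and child release the lock afterwards. */
pid_t fork_with_handles_locked(void)
{
    pthread_mutex_lock(&handle_lock);
    pid_t pid = fork();
    pthread_mutex_unlock(&handle_lock);
    return pid;
}

/* Child side: close everything, then exec.  This function must live in
 * code that is not itself unloaded by the close calls. */
int close_handles_and_exec(dl_close_fn close_fn, const char *file,
                           char *const argv[])
{
    handles_close_all(close_fn);
    execvp(file, argv);
    return -1; /* reached only if execvp() failed; errno is set */
}

/* Counting callback for trying the sketch without a real loader. */
int demo_close_calls = 0;
int demo_close(dl_handle_t h)
{
    (void)h;
    demo_close_calls++;
    return 0;
}
```

In real use the callback would wrap lt_dlclose(), and the child branch after fork_with_handles_locked() would call close_handles_and_exec().  Keeping that last function outside any dynamically loaded plugin is the point of moving execvp() into plugin.c.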

Testing of the patch was done in the same environment in which the original problem (failure of the child process to do its thing) was seen.  Without the patch, failure happened about 30-50% of the time--rarely taking more than 3-5 attempts to see a failure.  With the patch, in over 50 attempts, there was not a single failure of the child process to launch and do its thing.  Oddly, in about 7 of the 50+ attempts, collectd printed the dl_open_worker assertion failure, but the child process survived and did its thing, perfectly as far as I could tell.  No other problematic symptoms were observed during testing.

I made my best attempt to match the indentation and style of each of the files--perhaps modulo some spaces vs. tabs.  Is there some additional process I should take this patch through in order to get it considered for inclusion into the codebase?  (Submission of the patch has been approved by my manager if that is a concern.)

Thanks,

Robert Riches

-------------- next part --------------
A non-text attachment was scrubbed...
Name: close-dlhandles.PATCH
Type: application/octet-stream
Size: 4159 bytes
Desc: close-dlhandles.PATCH
URL: <http://mailman.verplant.org/pipermail/collectd/attachments/20130222/2c86357c/attachment.obj>