[collectd] Memory usage growing in python plugin

James김민석 james.kim at nexr.com
Fri Apr 20 04:45:12 CEST 2012


Hello,

I checked out the collectd source code from the 'collectd-5.0' branch in git
and wrote a patch to relieve the memory leak symptoms.
The fix is all about releasing the reference count of each 'new reference'
Python object in the Python extension code.
There is a wonderful document about reference counting in Python extensions:
http://edcjones.tripod.com/refcount.html
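
For readers unfamiliar with the term: a 'new reference' is one the caller
owns and must release (Py_DECREF in C) when done with it. If the release is
forgotten, the object can never be freed. The sketch below is not the
attached patch, which is C. It is only a small Python illustration, assuming
CPython and its ctypes.pythonapi handle; the function name refcount_demo is
made up for this example.

```python
import ctypes
import sys

# Declare the C signatures of the two CPython helpers used below.
_incref = ctypes.pythonapi.Py_IncRef
_incref.argtypes = [ctypes.py_object]
_incref.restype = None
_decref = ctypes.pythonapi.Py_DecRef
_decref.argtypes = [ctypes.py_object]
_decref.restype = None


def refcount_demo():
    """Return the refcount before, during and after a simulated leak."""
    obj = object()
    before = sys.getrefcount(obj)
    _incref(obj)                      # C code now owns one extra reference
    leaked = sys.getrefcount(obj)     # one higher: the object cannot be freed
    _decref(obj)                      # the release that a leaky path forgets
    released = sys.getrefcount(obj)   # back to the starting count
    return before, leaked, released


before, leaked, released = refcount_demo()
print("before=%d leaked=%d released=%d" % (before, leaked, released))
```

If the _decref call is skipped, the count stays one too high. When that
happens on every read interval, it would show up as the slow, steady growth
described in this thread.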

I ran several tests to see whether the patch resolves the leak.
The results follow, and I believe it works.

# Case: the collectd plugin keeps working normally without any error.

  (memory change after running collectd for 16 hours)
case #1 (based on 'collectd-5.0' branch)                :  8140 KB  =>  17596 KB
case #2 (based on 'collectd-5.0' branch with the patch) :  7672 KB  =>   7728 KB


# Case: the collectd plugin keeps failing in its business logic (see
attachment).

  (memory change after running collectd for 16 hours)
case #3 (based on 'collectd-5.0' branch)                : 10728 KB  =>  68560 KB
case #4 (based on 'collectd-5.0' branch with the patch) :  8060 KB  =>   8184 KB
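
To put the four runs on a common scale, the figures can be turned into total
growth and average growth per hour. This is only arithmetic on the numbers
reported above; the helper name growth and the run labels are made up for
the example.

```python
# (label, KB at start, KB after 16 hours), copied from the results above
RUNS = [
    ("case #1 unpatched, plugin OK", 8140, 17596),
    ("case #2 patched, plugin OK", 7672, 7728),
    ("case #3 unpatched, plugin failing", 10728, 68560),
    ("case #4 patched, plugin failing", 8060, 8184),
]
HOURS = 16


def growth(start_kb, end_kb, hours=HOURS):
    """Return (total growth in KB, average growth in KB per hour)."""
    delta = end_kb - start_kb
    return delta, delta / float(hours)


for label, start_kb, end_kb in RUNS:
    delta, rate = growth(start_kb, end_kb)
    print("%-36s +%6d KB  (%.2f KB/hour)" % (label, delta, rate))
```

On these numbers the unpatched runs grow by 9456 KB and 57832 KB, and the
patched runs by 56 KB and 124 KB over the same 16 hours.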

I would like to contribute my patch and my experience.
Please see the attachments.

Thank you

HTH.
james.kim
NexR corp, The Best Big Data Solution provider.
Korea



On April 16, 2012 at 12:23 PM, James김민석 <james.kim at nexr.com> wrote:

> Hello,
>
> I'm James.
>
> Thank you for releasing collectd.
>
> I developed a Python plugin for collectd and am seeing its memory
> consumption grow: as time goes on, the memory size of the collectd process
> keeps growing.
>
> I searched for similar cases and found this one
> (http://www.mail-archive.com/collectd@verplant.org/msg01688.html),
> but the patched version did not help in my case.
>
>
> This is my environment:
>
> os: Linux, CentOS 5.5, kernel 2.6.18-128.1.6.el5xen x86_64
> collectd: collectd-5.0 branch in git
> (https://github.com/octo/collectd/tree/collectd-5.0)
> python: 2.6.2, 2.7.3
>
> My plugin is simple: it just creates a collectd packet and dispatches it to
> the collectd server (see the bottom of this mail).
> I have also attached a more complex plugin for ZooKeeper monitoring.
>
> My collectd collects every 5 seconds,
> and memory grows by 4 KB to 20 KB every 5 to 60 seconds (at random) in both
> cases (simple and complex plugin).
>
>
>
> I have experimented with several cases.
>
> 1. Without the python plugin, memory usage stays steady (1.5 MB).
>
> 2. If the python plugin is enabled, memory usage begins to grow.
>    With my plugin: 5064 KB (initial) -> 5456 KB (after 3 hours).
>
> 3. The strange thing is that with just an empty python configuration,
>    it still grows.
>
>  LoadPlugin python
>  <Plugin python>
>   ModulePath "/path/to/collectd_agent/share/python/"
>   LogTraces true
>   Interactive false
>  </Plugin>
>
> 4. It is not related to collectd debug mode (recompiled with
>    --enable-debug) => no change.
>
> 5. The more python plugins are enabled, the more memory grows.
>    With 4 python plugins (including the more complex plugins attached), it
>    grows by 17 MB after 5 hours.
>
> 6. If a python plugin raises exceptions, memory grows even faster.
>    With 4 python plugins (including the more complex plugins), it grows by
>    72 MB after 5 hours.
>
>
> I tried guppy (http://pypi.python.org/pypi/guppy/) to track down a Python
> memory leak,
> but it shows no symptom in the Python plugin itself.
>
> Finally, I used valgrind
> (http://www.cprogramming.com/debugging/valgrind.html) to spot problems,
> with the following command:
>
> valgrind --tool=memcheck --suppressions=/python/path/valgrind-python.supp
> --leak-check=full --error-limit=no  --log-file=valog ./sbin/collectd
>
> If the python plugin is disabled in the configuration, the report shows no
> lost memory,
> but if the python plugin is enabled, it reports losses like the ones below.
> (To suppress warnings, I recompiled Python with --without-pymalloc
> and reinstalled collectd.)
>
> I think valgrind only reports on the initial step of collectd and is
> unable to track the multi-threaded workers.
>
> Do you have any idea what causes this problem?
>
> Thank you for reading.
>
>
>
> ------------ valgrind reports ------------
>
>  370 bytes in 1 blocks are possibly lost in loss record 2,847 of 3,541
> ==30213==    at 0x4C2210C: malloc (vg_replace_malloc.c:195)
> ==30213==    by 0x7CED83C: PyString_FromString (stringobject.c:143)
> ==30213==    by 0x7D00A8F: PyType_Ready (typeobject.c:4059)
> ==30213==    by 0x7CE3277: _Py_ReadyTypes (object.c:2132)
> ==30213==    by 0x7D673C0: Py_InitializeEx (pythonrun.c:179)
> ==30213==    by 0x7A4CE70: cpy_config (python.c:989)
> ==30213==    by 0x4099C4: cf_read (configfile.c:361)
> ==30213==    by 0x4062C8: main (collectd.c:477)
>
> ==7443== 44 bytes in 1 blocks are possibly lost in loss record 833 of 3,469
> ==7443==    at 0x4C2276F: malloc (vg_replace_malloc.c:263)
> ==7443==    by 0x7CEEB3E: PyString_FromStringAndSize (stringobject.c:88)
> ==7443==    by 0x7D630CA: do_mkvalue (modsupport.c:427)
> ==7443==    by 0x7D6382C: do_mktuple (modsupport.c:268)
> ==7443==    by 0x7D6321B: do_mkvalue (modsupport.c:298)
> ==7443==    by 0x7D639EB: va_build_value (modsupport.c:537)
> ==7443==    by 0x7D63D6E: Py_BuildValue (modsupport.c:485)
> ==7443==    by 0x7D6F0B2: _PySys_Init (sysmodule.c:1405)
> ==7443==    by 0x7D6845F: Py_InitializeEx (pythonrun.c:215)
> ==7443==    by 0x7A4DEC0: cpy_config (python.c:996)
> ==7443==    by 0x4099C4: cf_read (configfile.c:361)
> ==7443==    by 0x4062C8: main (collectd.c:477)
>
> ==22090== 44 bytes in 1 blocks are possibly lost in loss record 843 of 3,581
> ==22090==    at 0x4C2276F: malloc (vg_replace_malloc.c:263)
> ==22090==    by 0x7CE57C8: PyObject_Malloc (obmalloc.c:943)
> ==22090==    by 0x7CED87E: PyString_FromStringAndSize (stringobject.c:88)
> ==22090==    by 0x7D620FA: do_mkvalue (modsupport.c:427)
> ==22090==    by 0x7D6285C: do_mktuple (modsupport.c:268)
> ==22090==    by 0x7D6224B: do_mkvalue (modsupport.c:298)
> ==22090==    by 0x7D62A1B: va_build_value (modsupport.c:537)
> ==22090==    by 0x7D62D9E: Py_BuildValue (modsupport.c:485)
> ==22090==    by 0x7D6E4D2: _PySys_Init (sysmodule.c:1408)
> ==22090==    by 0x7D6742F: Py_InitializeEx (pythonrun.c:222)
> ==22090==    by 0x7A4BEA0: cpy_config (python.c:999)
> ==22090==    by 0x409924: cf_read (configfile.c:361)
>
>
>
> ---------------- following is a simple python plugin ----------------------
>
> # Simple Python module
> """
> collectd configuration for this module:
>
>   Import python_plugin_test
>   <Module python_plugin_test>
>   </Module>
> """
>
> import gc
>
> import collectd
>
>
> def configer(ObjConfiguration):
>     # Config callback: receives the <Module> block; nothing to configure.
>     collectd.debug('Configuring Stuff')
>
>
> def initiator():
>     # Init callback: turn on gc leak debugging. Note that DEBUG_LEAK
>     # includes DEBUG_SAVEALL, which keeps unreachable objects in gc.garbage.
>     collectd.debug('initing stuff')
>     gc.set_debug(gc.DEBUG_LEAK)
>
>
> def dispatcher(data=None):
>     # Read callback: build one value list and dispatch it.
>     metric = collectd.Values()
>     metric.plugin = 'python_plugin_test_counter'
>     metric.type = 'mysql_threads'
>     metric.values = [5, 10, 15, 10]
>     metric.dispatch()
>     #gcresult=gc.collect()
>     #collectd.info("python_plugin_test:gc:%s"%gcresult)
>
>
> #== Hook Callbacks, Order is important! ==#
> collectd.register_config(configer)
> collectd.register_init(initiator)
> collectd.register_read(dispatcher)
>
>
> ---------------------------
> --
> James.kim(김민석)
> 010 3266 8040
>
>
>


-------------- next part --------------
A non-text attachment was scrubbed...
Name: collectd5.0_leak_fix.patch
Type: application/octet-stream
Size: 4282 bytes
Desc: not available
URL: <http://mailman.verplant.org/pipermail/collectd/attachments/20120420/760816d6/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: zookeeper_stat_plugin.py
Type: application/octet-stream
Size: 7130 bytes
Desc: not available
URL: <http://mailman.verplant.org/pipermail/collectd/attachments/20120420/760816d6/attachment-0003.obj>

