[collectd] rrdc_update failed with status -1

Benjamin DUPUIS benjamin.dupuis at quake.fr
Thu Mar 29 11:46:16 CEST 2012


I've changed the mode passed to mkdir() to 0775 in two places:
* src/collectd.c Line 211 : if (mkdir (orig_dir, 0775) == -1)
* src/common.c   Line 551 : if (mkdir (dir, 0775) == 0)

It's now working.
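
For anyone wondering why the umask alone made no difference: the umask can only remove permission bits from the mode passed to mkdir(), it never adds any. The helper below is a minimal sketch of that rule. The name effective_mode is hypothetical (it is not a collectd function), and the sketch ignores the setgid bit inherited from the parent directory as well as any default ACLs.

```c
#include <assert.h>
#include <sys/types.h>

/*
 * Permission bits a new directory ends up with: the mode requested in
 * mkdir() with every bit that is set in the umask cleared.
 * Illustrative helper only; it is not part of collectd.
 */
mode_t effective_mode(mode_t requested, mode_t mask)
{
    return requested & ~mask & 0777;
}

/* Self-check using the modes and the umask that appear in this thread. */
void check_thread_examples(void)
{
    /* Group write was never requested, so umask 0002 cannot grant it. */
    assert(effective_mode(0755, 0002) == 0755);
    /* Requesting 0775 keeps group write, because 0002 only masks "other". */
    assert(effective_mode(0775, 0002) == 0775);
}
```

This matches the drwxr-sr-x versus drwxrwsr-x listings quoted further down: with a requested mode of 0755, no umask value can produce a group-writable directory.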
I don't know whether this is a good solution. Perhaps it would be better to have a configuration option for this in the configuration of rrdcached / rrdtool?
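
If such an option were added, it would presumably take an octal string such as "0775". The sketch below shows one way such a value could be parsed. Both the option and the helper name parse_dir_mode are assumptions made for illustration; nothing in this thread says collectd has either.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse an octal permission string such as "0775".
 * Returns the mode (0 to 07777), or -1 if the text is not valid octal
 * or is out of range. Hypothetical helper, not part of collectd.
 */
long parse_dir_mode(const char *text)
{
    char *end = NULL;
    long value;

    assert(text != NULL);
    errno = 0;
    value = strtol(text, &end, 8);
    /* Reject empty input, trailing garbage and conversion errors. */
    if (errno != 0 || end == text || *end != '\0')
        return -1;
    if (value < 0 || value > 07777)
        return -1;
    return value;
}
```

The returned value could then be passed to mkdir() in place of the hard-coded 0775.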

Best regards,

----- Original message -----
From: Benjamin DUPUIS <benjamin.dupuis at quake.fr>
Date: Wed, 28 Mar 2012 11:29:42 +0200
Subject: Re: [collectd] rrdc_update failed with status -1
To: collectd at verplant.org

>After debugging: collectdmon ignores the umask and always creates directories with permissions 755.
>
>drwxrwsr-x  4 root rrdcached 4096 mars 28 11:21 rrd2
>cd rrd2  && mkdir opuet && ls
>drwxrwsr-x  2 root rrdcached 4096 mars 28 11:21 opuet
>
>systest3t:/var/lib/collectd/rrd2# umask
>0002
>systest3t:/var/lib/collectd/rrd2# collectdmon -P /var/run/collectdmon.pid -c /usr/sbin/collectd -- -C /etc/collectd.conf
>
>When collectd runs, the directory is created with:
>drwxr-sr-x 12 root rrdcached 4096 mars 28 11:23 systest3t
>
>
>
>----- Original message -----
>From: Benjamin DUPUIS <benjamin.dupuis at quake.fr>
>Date: Wed, 28 Mar 2012 10:25:25 +0200
>Subject: Re: [collectd] rrdc_update failed with status -1
>To: Bruno Prémont <bonbons at linux-vserver.org>, Benjamin DUPUIS <benjamin.dupuis at quake.fr>
>Cc: collectd at verplant.org
>
>>Hi,
>>
>>The umask doesn't seem to be taken into account. Here is the debug output of the init.d script:
>>
>>+ '[' -r /etc/default/collectdmon ']'
>>+ . /etc/default/collectdmon
>>++ umask 002
>>+ case "$1" in
>>+ start
>>+ echo -n 'Starting collectd: '
>>Starting collectd: + '[' -r /etc/collectd.conf ']'
>>+ daemon collectdmon -P /var/run/collectdmon.pid -c /usr/sbin/collectd -- -C /etc/collectd.conf
>>...
>>+ /bin/bash -c 'ulimit -S -c 0 >/dev/null 2>&1 ; collectdmon -P /var/run/collectdmon.pid -c /usr/sbin/collectd -- -C /etc/collectd.conf'
>>
>>but folders are created with mode 2755:
>>
>>/var/lib/collectd/rrd2
>>Access: (2775/drwxrwsr-x)  Uid: (    0/    root)   Gid: (  500/rrdcached)
>>
>>/var/lib/collectd/rrd2/systest3t
>>Access: (2755/drwxr-sr-x)  Uid: (    0/    root)   Gid: (  500/rrdcached)
>>
>>
>>Any ideas?
>>
>>Best regards
>>
>>----- Original message -----
>>From: Bruno Prémont <bonbons at linux-vserver.org>
>>Date: Wed, 28 Mar 2012 09:37:52 +0200
>>Subject: Re: [collectd] rrdc_update failed with status -1
>>To: Benjamin DUPUIS <benjamin.dupuis at quake.fr>
>>Cc: Stian Øvrevåge <sovrevage at gmail.com>, Cyril Feraudet <collectd at feraudet.com>, collectd at verplant.org
>>
>>>On Wed, 28 Mar 2012 09:30:23 Benjamin DUPUIS wrote:
>>>> I had the same problem.
>>>> 
>>>> collectd runs as root.
>>>> rrdcached runs as rrdcached.
>>>> 
>>>> When RRD files are created they are owned by root.root, so rrdcached cannot write to them.
>>>> Running chmod 777 (yes, it's bad) on your RRD files will make it work.
>>>> 
>>>> I'm looking for a better solution.
>>>
>>>One way would be to set UMASK for collectd to allow write access to
>>>group and chmod g+s to the directory(ies) where RRDs get created while
>>>having those assigned to a group of which rrdcached is a member.
>>>
>>>e.g.
>>>  groups rrdcached
>>>     rrdcached rrd
>>>
>>>  find /path/to/rrd/store/ -type d -exec chgrp rrd {} +
>>>  find /path/to/rrd/store/ -type d -exec chmod g+ws {} +
>>>
>>>Bruno
>>>
>>>> Best regards
>>>> 
>>>> 
>>>> ----- Original message -----
>>>> From: Stian Øvrevåge <sovrevage at gmail.com>
>>>> Date: Mon, 26 Mar 2012 23:33:31 +0200
>>>> Subject: Re: [collectd] rrdc_update failed with status -1
>>>> To: Cyril Feraudet <collectd at feraudet.com>
>>>> Cc: collectd at verplant.org
>>>> 
>>>> >Tried now using
>>>> >
>>>> >rrdcached -f 7200 -w 3600 -z 900 -b
>>>> >/opt/collectd/var/lib/collectd/rrd/ -P FLUSH,BATCH,UPDATE,STATS -l
>>>> >127.0.0.1 -l unix:/tmp/rrdcached.sock
>>>> >
>>>> >No change, though. No packets are sent after the connection has been established.
>>>> >
>>>> >Tried simulating using telnet, working just fine:
>>>> >
>>>> >root@collectd-new:/home/kbandusr# telnet localhost 42217
>>>> >Trying 127.0.0.1...
>>>> >Connected to localhost.
>>>> >Escape character is '^]'.
>>>> >STATS
>>>> >9 Statistics follow
>>>> >QueueLength: 0
>>>> >UpdatesReceived: 12
>>>> >FlushesReceived: 0
>>>> >UpdatesWritten: 0
>>>> >DataSetsWritten: 0
>>>> >TreeNodesNumber: 1
>>>> >TreeDepth: 1
>>>> >JournalBytes: 0
>>>> >JournalRotate: 0
>>>> >UPDATE no-kvh020-sw01/snmp/if_drop_discard_err_que-Vlan1328.rrd
>>>> >1332797080:0:0:0:0
>>>> >0 errors, enqueued 1 value(s).
>>>> >BATCH
>>>> >0 Go ahead.  End with dot '.' on its own line.
>>>> >UPDATE no-kvh020-sw01/snmp/if_drop_discard_err_que-Vlan1328.rrd
>>>> >1332797085:0:0:0:0
>>>> >UPDATE no-kvh020-sw01/snmp/if_drop_discard_err_que-Vlan1328.rrd
>>>> >1332797090:0:0:0:0
>>>> >.
>>>> >0 errors
>>>> >quit
>>>> >Connection closed by foreign host.
>>>> >
>>>> >Regards,
>>>> >Stian Øvrevåge
>>>> >
>>>> >On Mon, Mar 26, 2012 at 10:08 PM, Cyril Feraudet <collectd at feraudet.com> wrote:
>>>> >> Have a look at the "-P" option of rrdcached regarding permissions.
>>>> >>
>>>> >> I've debugged some issues with rrdcached by using strace to see the entire error message sent to collectd.
>>>> >>
>>>> >> Cyril
>>>> >>
>>>> >> On 26 March 2012, at 22:01, Stian Øvrevåge wrote:
>>>> >>
>>>> >>> It always works with sockets and never works with network.
>>>> >>>
>>>> >>> So I don't think there is an error with the rrd or polling itself.
>>>> >>>
>>>> >>> Also, tcpdump shows that collectd does connect to rrdcached but does
>>>> >>> NOT send ANY
>>>> >>> packets when updating, while returning thousands of errors. I believe
>>>> >>> that the connection handling in collectd is somewhat faulty... I can
>>>> >>> connect to rrdcached by telnet and issue commands without problem...
>>>> >>>
>>>> >>> Brgds,
>>>> >>> Stian Øvrevåge
>>>> >>>
>>>> >>> On Mon, Mar 26, 2012 at 9:57 PM, Cyril Feraudet <collectd at feraudet.com> wrote:
>>>> >>>> Hi,
>>>> >>>>
>>>> >>>> Several possible issues:
>>>> >>>> - Your rrd was previously updated with a value in the future: run rrdtool info /opt/collectd/var/lib/collectd/rrd/no-kvh020-sw01/snmp/if_drop_discard_err_que-Vlan1309.rrd to check the last update
>>>> >>>> - The host polling no-kvh020-sw01 has its clock in the past.
>>>> >>>> - More than one no-kvh020-sw01 is configured
>>>> >>>> - ...
>>>> >>>>
>>>> >>>> Cyril
>>>> >>>> On 26 March 2012, at 21:00, Stian Øvrevåge wrote:
>>>> >>>>
>>>> >>>>> Hi,
>>>> >>>>>
>>>> >>>>> When trying to use DaemonAddress "127.0.0.1:42217" I'm receiving
>>>> >>>>>
>>>> >>>>> Mar 26 20:52:25 collectd-new collectd[13065]: Filter subsystem:
>>>> >>>>> Built-in target `write': Dispatching value to all write plugins failed
>>>> >>>>> with status -1.
>>>> >>>>> Mar 26 20:52:25 collectd-new collectd[13065]: rrdcached plugin:
>>>> >>>>> rrdc_update (/opt/collectd/var/lib/collectd/rrd/no-kvh020-sw01/snmp/if_drop_discard_err_que-Vlan1309.rrd,
>>>> >>>>> [1332787945:0:0:0:0], 1) failed with status -1.
>>>> >>>>>
>>>> >>>>> For ALL updates.
>>>> >>>>>
>>>> >>>>> This is my rrdcached plugin config:
>>>> >>>>>
>>>> >>>>> <Plugin "rrdcached">
>>>> >>>>>  #DaemonAddress "unix:/tmp/rrdcached.sock"
>>>> >>>>>  DaemonAddress "127.0.0.1:42217"
>>>> >>>>>  DataDir "/opt/collectd/var/lib/collectd/rrd"
>>>> >>>>>  #CreateFiles true
>>>> >>>>>  #CollectStatistics true
>>>> >>>>>  StepSize 30
>>>> >>>>>  HeartBeat 600
>>>> >>>>>  RRaTimespan 3600
>>>> >>>>>  RRaTimespan 86400
>>>> >>>>>  RRaTimespan 604800
>>>> >>>>>  RRaTimespan 2678400
>>>> >>>>>  RRaTimespan 166224000
>>>> >>>>> </Plugin>
>>>> >>>>>
>>>> >>>>> I run rrdcached with "/opt/rrdtool-1.4.7/bin/rrdcached -f 7200 -w 3600
>>>> >>>>> -z 900 -b /opt/collectd/var/lib/collectd/rrd/ -l 127.0.0.1 -l
>>>> >>>>> unix:/tmp/rrdcached.sock"
>>>> >>>>>
>>>> >>>>> The reason I want to test using network sockets is that the unix
>>>> >>>>> sockets seem to cause thousands of disk reads and writes according to
>>>> >>>>> VMware disk statistics.
>>>> >>>>>
>>>> >>>>> A tcpdump with ascii decode only shows this: (No data? And statistics
>>>> >>>>> which I thought I disabled?)
>>>> >>>>>
>>>> >>>>> 20:56:52.891588 IP localhost.42381 > localhost.42217: Flags [P.], seq
>>>> >>>>> 385950220:385950226, ack 4144176433, win 349, options [nop,nop,TS val
>>>> >>>>> 133288705 ecr 133281205], length 6
>>>> >>>>> E..:x. at .@................."....1...].......
>>>> >>>>> ........STATS
>>>> >>>>>
>>>> >>>>> 20:56:52.891734 IP localhost.42217 > localhost.42381: Flags [P.], seq
>>>> >>>>> 1:21, ack 6, win 256, options [nop,nop,TS val 133288705 ecr
>>>> >>>>> 133288705], length 20
>>>> >>>>> ...............1.."......<.....
>>>> >>>>> ........9 Statistics follow
>>>> >>>>>
>>>> >>>>> 20:56:52.891760 IP localhost.42381 > localhost.42217: Flags [.], ack
>>>> >>>>> 21, win 349, options [nop,nop,TS val 133288705 ecr 133288705], length
>>>> >>>>> 0
>>>> >>>>> E..4x. at .@................."....E...])......
>>>> >>>>> ........
>>>> >>>>> 20:56:52.891777 IP localhost.42217 > localhost.42381: Flags [P.], seq
>>>> >>>>> 21:176, ack 6, win 256, options [nop,nop,TS val 133288705 ecr
>>>> >>>>> 133288705], length 155
>>>> >>>>> E...C. at .@..................E.."............
>>>> >>>>> ........QueueLength: 0
>>>> >>>>> UpdatesReceived: 0
>>>> >>>>> FlushesReceived: 0
>>>> >>>>> UpdatesWritten: 0
>>>> >>>>> DataSetsWritten: 0
>>>> >>>>> TreeNodesNumber: 0
>>>> >>>>> TreeDepth: 0
>>>> >>>>> JournalBytes: 0
>>>> >>>>> JournalRotate: 0
>>>> >>>>>
>>>> >>>>> 20:56:52.891783 IP localhost.42381 > localhost.42217: Flags [.], ack
>>>> >>>>> 176, win 357, options [nop,nop,TS val 133288705 ecr 133288705], length
>>>> >>>>> 0
>>>> >>>>> E..4x. at .@................."........e)......
>>>> >>>>> ........
>>>> >>>>>
>>>> >>>>>
>>>> >>>>> Regards,
>>>> >>>>> Stian Øvrevåge
>>>> >>>>>
>>>> >>>>> _______________________________________________
>>>> >>>>> collectd mailing list
>>>> >>>>> collectd at verplant.org
>>>> >>>>> http://mailman.verplant.org/listinfo/collectd