[collectd] rrdc_update failed with status -1

Benjamin DUPUIS benjamin.dupuis at quake.fr
Wed Mar 28 11:29:42 CEST 2012


After debugging: collectdmon ignores the umask and always creates directories with permissions 755.

drwxrwsr-x  4 root rrdcached 4096 mars 28 11:21 rrd2
cd rrd2  && mkdir opuet && ls
drwxrwsr-x  2 root rrdcached 4096 mars 28 11:21 opuet

systest3t:/var/lib/collectd/rrd2# umask
0002
systest3t:/var/lib/collectd/rrd2# collectdmon -P /var/run/collectdmon.pid -c /usr/sbin/collectd -- -C /etc/collectd.conf

After running collectd, the directory is created with:
drwxr-sr-x 12 root rrdcached 4096 mars 28 11:23 systest3t
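
The listing above is consistent with how mkdir(2) applies the umask: the resulting mode is the requested mode with the umask bits cleared, so a umask can only remove permissions, never add them. The sketch below (in Python, not collectd's C) shows that under umask 002 a directory requested as 0777 ends up 0775, while one requested as 0755 stays 0755. Whether collectd or collectdmon actually requests 0755, or resets the umask itself, is an assumption that this thread does not confirm; `mkdir_mode` is a hypothetical helper name.

```python
import os
import stat
import tempfile


def mkdir_mode(requested: int, mask: int) -> int:
    """Create a directory with `requested` mode under umask `mask`
    and return the rwx permission bits it actually received."""
    old_mask = os.umask(mask)          # set the umask, remember the old one
    try:
        with tempfile.TemporaryDirectory() as parent:
            path = os.path.join(parent, "systest3t")
            os.mkdir(path, requested)  # kernel applies requested & ~umask
            # Keep only the rwx bits; setgid is inherited from the parent
            # directory independently of the umask.
            return stat.S_IMODE(os.stat(path).st_mode) & 0o777
    finally:
        os.umask(old_mask)             # always restore the original umask


if __name__ == "__main__":
    print(oct(mkdir_mode(0o777, 0o002)))  # request includes group write
    print(oct(mkdir_mode(0o755, 0o002)))  # request lacks group write
```

If the second case matches what happens here, changing the umask in /etc/default/collectdmon cannot help, and the group-write bit has to be added after creation (for example with chmod g+w).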



----- Original message -----
From: Benjamin DUPUIS <benjamin.dupuis at quake.fr>
Date: Wed, 28 Mar 2012 10:25:25 +0200
Subject: Re: [collectd] rrdc_update failed with status -1
To: Bruno Prémont <bonbons at linux-vserver.org>, Benjamin DUPUIS <benjamin.dupuis at quake.fr>
Cc: collectd at verplant.org

>Hi,
>
>The umask doesn't seem to be taken into account. Here is the output of the debugged init.d script:
>
>+ '[' -r /etc/default/collectdmon ']'
>+ . /etc/default/collectdmon
>++ umask 002
>+ case "$1" in
>+ start
>+ echo -n 'Starting collectd: '
>Starting collectd: + '[' -r /etc/collectd.conf ']'
>+ daemon collectdmon -P /var/run/collectdmon.pid -c /usr/sbin/collectd -- -C /etc/collectd.conf
>...
>+ /bin/bash -c 'ulimit -S -c 0 >/dev/null 2>&1 ; collectdmon -P /var/run/collectdmon.pid -c /usr/sbin/collectd -- -C /etc/collectd.conf'
>
>but folders are created with mode 2755:
>
>/var/lib/collectd/rrd2
>Access: (2775/drwxrwsr-x)  Uid: (    0/    root)   Gid: (  500/rrdcached)
>
>/var/lib/collectd/rrd2/systest3t
>Access: (2755/drwxr-sr-x)  Uid: (    0/    root)   Gid: (  500/rrdcached)
>
>
>Any ideas?
>
>Best regards
>
>----- Original message -----
>From: Bruno Prémont <bonbons at linux-vserver.org>
>Date: Wed, 28 Mar 2012 09:37:52 +0200
>Subject: Re: [collectd] rrdc_update failed with status -1
>To: Benjamin DUPUIS <benjamin.dupuis at quake.fr>
>Cc: Stian Øvrevåge <sovrevage at gmail.com>,         Cyril Feraudet  <collectd at feraudet.com>, collectd at verplant.org
>
>>On Wed, 28 Mar 2012 09:30:23 Benjamin DUPUIS wrote:
>>> I had the same problem
>>> 
>>> Collectd runs as root.
>>> RRDCached runs as rrdcached.
>>> 
>>> When rrd files are created they are owned by root.root, so rrdcached cannot write to them.
>>> Running chmod 777 (yes, it's bad) on your rrd files will make it work.
>>> 
>>> I'm looking for a better solution.
>>
>>One way would be to set UMASK for collectd to allow write access to
>>group and chmod g+s to the directory(ies) where RRDs get created while
>>having those assigned to a group of which rrdcached is a member.
>>
>>e.g.
>>  groups rrdcached
>>     rrdcached rrd
>>
>>  find /path/to/rrd/store/ -type d -exec chgrp rrd {} +
>>  find /path/to/rrd/store/ -type d -exec chmod g+ws {} +
>>
>>Bruno
>>
>>> Best regards
>>> 
>>> 
>>> ----- Original message -----
>>> From: Stian Øvrevåge <sovrevage at gmail.com>
>>> Date: Mon, 26 Mar 2012 23:33:31 +0200
>>> Subject: Re: [collectd] rrdc_update failed with status -1
>>> To: Cyril Feraudet <collectd at feraudet.com>
>>> Cc: collectd at verplant.org
>>> 
>>> >Tried now using
>>> >
>>> >rrdcached -f 7200 -w 3600 -z 900 -b
>>> >/opt/collectd/var/lib/collectd/rrd/ -P FLUSH,BATCH,UPDATE,STATS -l
>>> >127.0.0.1 -l unix:/tmp/rrdcached.sock
>>> >
>>> >No change, though. No packets are sent after the connection has been established.
>>> >
>>> >Tried simulating using telnet, working just fine:
>>> >
>>> >root at collectd-new:/home/kbandusr# telnet localhost 42217
>>> >Trying 127.0.0.1...
>>> >Connected to localhost.
>>> >Escape character is '^]'.
>>> >STATS
>>> >9 Statistics follow
>>> >QueueLength: 0
>>> >UpdatesReceived: 12
>>> >FlushesReceived: 0
>>> >UpdatesWritten: 0
>>> >DataSetsWritten: 0
>>> >TreeNodesNumber: 1
>>> >TreeDepth: 1
>>> >JournalBytes: 0
>>> >JournalRotate: 0
>>> >UPDATE no-kvh020-sw01/snmp/if_drop_discard_err_que-Vlan1328.rrd
>>> >1332797080:0:0:0:0
>>> >0 errors, enqueued 1 value(s).
>>> >BATCH
>>> >0 Go ahead.  End with dot '.' on its own line.
>>> >UPDATE no-kvh020-sw01/snmp/if_drop_discard_err_que-Vlan1328.rrd
>>> >1332797085:0:0:0:0
>>> >UPDATE no-kvh020-sw01/snmp/if_drop_discard_err_que-Vlan1328.rrd
>>> >1332797090:0:0:0:0
>>> >.
>>> >0 errors
>>> >quit
>>> >Connection closed by foreign host.
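
The telnet session above follows rrdcached's line-based protocol: each reply starts with a status line made of a status code and a short message. A negative code signals an error, and a code greater than zero gives the number of lines that follow. Below is a minimal Python sketch of parsing such a reply. The helper names `parse_status` and `split_reply` are hypothetical, and the sample text is copied from the STATS output above rather than read from a live daemon.

```python
from typing import List, Tuple

# Reply text copied from the STATS exchange quoted in this thread.
SAMPLE_REPLY = """9 Statistics follow
QueueLength: 0
UpdatesReceived: 12
FlushesReceived: 0
UpdatesWritten: 0
DataSetsWritten: 0
TreeNodesNumber: 1
TreeDepth: 1
JournalBytes: 0
JournalRotate: 0
"""


def parse_status(line: str) -> Tuple[int, str]:
    """Split a status line such as '9 Statistics follow' into (9, 'Statistics follow')."""
    parts = line.strip().split(None, 1)   # code and message, separated by spaces
    message = parts[1] if len(parts) > 1 else ""
    return int(parts[0]), message


def split_reply(text: str) -> Tuple[int, str, List[str]]:
    """Return (status, message, body_lines) for one complete reply.

    A negative status means the daemon reported an error; no body is expected.
    """
    lines = text.splitlines()
    status, message = parse_status(lines[0])
    body = lines[1:1 + status] if status > 0 else []
    return status, message, body


if __name__ == "__main__":
    status, message, body = split_reply(SAMPLE_REPLY)
    print(status, message)
    for entry in body:
        print("  ", entry)
```

A client that logs the full status line instead of only "failed with status -1" would make the network failure discussed here easier to diagnose.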
>>> >
>>> >Regards,
>>> >Stian Øvrevåge
>>> >
>>> >On Mon, Mar 26, 2012 at 10:08 PM, Cyril Feraudet <collectd at feraudet.com> wrote:
>>> >> Have a look at the "-P" option of rrdcached regarding permissions.
>>> >>
>>> >> I've debugged some issues with rrdcached using strace to see the entire error message sent to collectd.
>>> >>
>>> >> Cyril
>>> >>
>>> >> On 26 March 2012, at 22:01, Stian Øvrevåge wrote:
>>> >>
>>> >>> It always works with sockets and never works with network.
>>> >>>
>>> >>> So I don't think there is an error with the rrd or polling itself.
>>> >>>
>>> >>> Also, tcpdump shows that collectd will connect to rrdcached but does
>>> >>> NOT send ANY
>>> >>> packets when updating, while returning thousands of errors. I believe
>>> >>> that the connection handling in collectd is somewhat faulty... I can
>>> >>> connect to rrdcached by telnet and issue commands without problem...
>>> >>>
>>> >>> Brgds,
>>> >>> Stian Øvrevåge
>>> >>>
>>> >>> On Mon, Mar 26, 2012 at 9:57 PM, Cyril Feraudet <collectd at feraudet.com> wrote:
>>> >>>> Hi,
>>> >>>>
>>> >>>> Several possible issues:
>>> >>>> - Your rrd was previously updated with a timestamp in the future: run rrdtool info /opt/collectd/var/lib/collectd/rrd/no-kvh020-sw01/snmp/if_drop_discard_err_que-Vlan1309.rrd to check the last update
>>> >>>> - The host polling no-kvh020-sw01 has its clock set in the past.
>>> >>>> - More than one no-kvh020-sw01 is configured
>>> >>>> - ...
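
The first point in the list above can be checked mechanically: RRDtool refuses an update whose timestamp is not later than the file's recorded last update, so a file that was once written with a future timestamp rejects all current samples. Below is a minimal Python sketch, assuming the `last_update = <epoch>` line printed by `rrdtool info`. The helper names and the sample `last_update` value are hypothetical and chosen only for illustration.

```python
import re
from typing import Optional


def parse_last_update(info_text: str) -> Optional[int]:
    """Extract the epoch from the 'last_update = N' line of `rrdtool info` output."""
    match = re.search(r"^last_update\s*=\s*(\d+)\s*$", info_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def update_is_too_old(update_ts: int, last_update: int) -> bool:
    """RRDtool only accepts timestamps strictly later than last_update."""
    return update_ts <= last_update


if __name__ == "__main__":
    # Hypothetical excerpt; a real file would be inspected with `rrdtool info FILE`.
    sample_info = 'filename = "example.rrd"\nlast_update = 1332800000\n'
    last = parse_last_update(sample_info)
    # 1332787945 is the timestamp from the failing rrdc_update in this thread.
    if last is not None:
        print(update_is_too_old(1332787945, last))
```

This only covers the "value in the future" case; it says nothing about the missing network packets reported elsewhere in the thread.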
>>> >>>>
>>> >>>> Cyril
>>> >>>> On 26 March 2012, at 21:00, Stian Øvrevåge wrote:
>>> >>>>
>>> >>>>> Hi,
>>> >>>>>
>>> >>>>> When trying to use DaemonAddress "127.0.0.1:42217" I'm receiving
>>> >>>>>
>>> >>>>> Mar 26 20:52:25 collectd-new collectd[13065]: Filter subsystem:
>>> >>>>> Built-in target `write': Dispatching value to all write plugins failed
>>> >>>>> with status -1.
>>> >>>>> Mar 26 20:52:25 collectd-new collectd[13065]: rrdcached plugin:
>>> >>>>> rrdc_update (/opt/collectd/var/lib/collectd/rrd/no-kvh020-sw01/snmp/if_drop_discard_err_que-Vlan1309.rrd,
>>> >>>>> [1332787945:0:0:0:0], 1) failed with status -1.
>>> >>>>>
>>> >>>>> For ALL updates.
>>> >>>>>
>>> >>>>> This is my rrdcached plugin config:
>>> >>>>>
>>> >>>>> <Plugin "rrdcached">
>>> >>>>>  #DaemonAddress "unix:/tmp/rrdcached.sock"
>>> >>>>>  DaemonAddress "127.0.0.1:42217"
>>> >>>>>  DataDir "/opt/collectd/var/lib/collectd/rrd"
>>> >>>>>  #CreateFiles true
>>> >>>>>  #CollectStatistics true
>>> >>>>>  StepSize 30
>>> >>>>>  HeartBeat 600
>>> >>>>>  RRaTimespan 3600
>>> >>>>>  RRaTimespan 86400
>>> >>>>>  RRaTimespan 604800
>>> >>>>>  RRaTimespan 2678400
>>> >>>>>  RRaTimespan 166224000
>>> >>>>> </Plugin>
>>> >>>>>
>>> >>>>> I run rrdcached with "/opt/rrdtool-1.4.7/bin/rrdcached -f 7200 -w 3600
>>> >>>>> -z 900 -b /opt/collectd/var/lib/collectd/rrd/ -l 127.0.0.1 -l
>>> >>>>> unix:/tmp/rrdcached.sock"
>>> >>>>>
>>> >>>>> The reason I want to test using network sockets is that the unix
>>> >>>>> sockets seem to cause thousands of disk reads and writes according to
>>> >>>>> VMware disk statistics.
>>> >>>>>
>>> >>>>> A tcpdump with ascii decode only shows this: (No data? And statistics
>>> >>>>> which I thought I disabled?)
>>> >>>>>
>>> >>>>> 20:56:52.891588 IP localhost.42381 > localhost.42217: Flags [P.], seq
>>> >>>>> 385950220:385950226, ack 4144176433, win 349, options [nop,nop,TS val
>>> >>>>> 133288705 ecr 133281205], length 6
>>> >>>>> E..:x. at .@................."....1...].......
>>> >>>>> ........STATS
>>> >>>>>
>>> >>>>> 20:56:52.891734 IP localhost.42217 > localhost.42381: Flags [P.], seq
>>> >>>>> 1:21, ack 6, win 256, options [nop,nop,TS val 133288705 ecr
>>> >>>>> 133288705], length 20
>>> >>>>> ...............1.."......<.....
>>> >>>>> ........9 Statistics follow
>>> >>>>>
>>> >>>>> 20:56:52.891760 IP localhost.42381 > localhost.42217: Flags [.], ack
>>> >>>>> 21, win 349, options [nop,nop,TS val 133288705 ecr 133288705], length
>>> >>>>> 0
>>> >>>>> E..4x. at .@................."....E...])......
>>> >>>>> ........
>>> >>>>> 20:56:52.891777 IP localhost.42217 > localhost.42381: Flags [P.], seq
>>> >>>>> 21:176, ack 6, win 256, options [nop,nop,TS val 133288705 ecr
>>> >>>>> 133288705], length 155
>>> >>>>> E...C. at .@..................E.."............
>>> >>>>> ........QueueLength: 0
>>> >>>>> UpdatesReceived: 0
>>> >>>>> FlushesReceived: 0
>>> >>>>> UpdatesWritten: 0
>>> >>>>> DataSetsWritten: 0
>>> >>>>> TreeNodesNumber: 0
>>> >>>>> TreeDepth: 0
>>> >>>>> JournalBytes: 0
>>> >>>>> JournalRotate: 0
>>> >>>>>
>>> >>>>> 20:56:52.891783 IP localhost.42381 > localhost.42217: Flags [.], ack
>>> >>>>> 176, win 357, options [nop,nop,TS val 133288705 ecr 133288705], length
>>> >>>>> 0
>>> >>>>> E..4x. at .@................."........e)......
>>> >>>>> ........
>>> >>>>>
>>> >>>>>
>>> >>>>> Regards,
>>> >>>>> Stian Øvrevåge
>>> >>>>>
>>> >>>>> _______________________________________________
>>> >>>>> collectd mailing list
>>> >>>>> collectd at verplant.org
>>> >>>>> http://mailman.verplant.org/listinfo/collectd
>>> >>>>
>>> >>>>
>>> >>
>>> >
>>> >
>>> >
>>> 
>>
>>
>>
>
>
>


