[collectd] rrdc_update failed with status -1

Benjamin DUPUIS benjamin.dupuis at quake.fr
Wed Mar 28 10:25:25 CEST 2012


Hi,

The umask doesn't seem to be taken into account. Here is the output of the init.d script run in debug mode:

+ '[' -r /etc/default/collectdmon ']'
+ . /etc/default/collectdmon
++ umask 002
+ case "$1" in
+ start
+ echo -n 'Starting collectd: '
Starting collectd: + '[' -r /etc/collectd.conf ']'
+ daemon collectdmon -P /var/run/collectdmon.pid -c /usr/sbin/collectd -- -C /etc/collectd.conf
...
+ /bin/bash -c 'ulimit -S -c 0 >/dev/null 2>&1 ; collectdmon -P /var/run/collectdmon.pid -c /usr/sbin/collectd -- -C /etc/collectd.conf'

but the subfolders are still created with mode 2755:

/var/lib/collectd/rrd2
Access: (2775/drwxrwsr-x)  Uid: (    0/    root)   Gid: (  500/rrdcached)

/var/lib/collectd/rrd2/systest3t
Access: (2755/drwxr-sr-x)  Uid: (    0/    root)   Gid: (  500/rrdcached)


Any ideas?
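
For comparison, the effect of the umask on a directory created below a setgid parent can be reproduced outside collectd. The following is only a sketch: show_mkdir_mode is a hypothetical helper, it works in a temporary directory instead of /var/lib/collectd/rrd2, and it assumes Linux with GNU stat. With umask 002 the new directory should come out as 2775 and with umask 022 as 2755, so the 2755 seen above would be consistent with an effective umask of 022 (or an explicitly requested mode of 0755) in the process that creates the directories, rather than the 002 set in /etc/default/collectdmon.

```shell
#!/bin/sh
# Hypothetical helper (not part of collectd): create a directory below a
# setgid parent using the given umask and print the mode of the new directory.
show_mkdir_mode() {
    mask=$1
    parent=$(mktemp -d) || return 1
    chmod 2775 "$parent"            # same mode as /var/lib/collectd/rrd2
    (
        umask "$mask"               # subshell, so the caller's umask is untouched
        mkdir "$parent/host"
    )
    stat -c '%a' "$parent/host"     # GNU stat: octal mode including the setgid bit
    rm -rf "$parent"
}

show_mkdir_mode 002
show_mkdir_mode 022
```

If the two results differ as expected, the next thing to check would be whether anything between the init script and collectd resets the umask.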

Best regards

----- Original message -----
From: Bruno Prémont <bonbons at linux-vserver.org>
Date: Wed, 28 Mar 2012 09:37:52 +0200
Subject: Re: [collectd] rrdc_update failed with status -1
To: Benjamin DUPUIS <benjamin.dupuis at quake.fr>
Cc: Stian Øvrevåge <sovrevage at gmail.com>, Cyril Feraudet <collectd at feraudet.com>, collectd at verplant.org

>On Wed, 28 Mar 2012 09:30:23 Benjamin DUPUIS wrote:
>> I had the same problem.
>> 
>> Collectd runs as root.
>> RRDCached runs as rrdcached.
>> 
>> When RRD files are created they are owned by root.root, so rrdcached cannot write to them.
>> chmod 777 (yes, it's bad) on your RRD files and it'll work.
>> 
>> I'm searching for a better solution.
>
>One way would be to set UMASK for collectd to allow write access to
>group and chmod g+s to the directory(ies) where RRDs get created while
>having those assigned to a group of which rrdcached is a member.
>
>e.g.
>  groups rrdcached
>     rrdcached rrd
>
>  find /path/to/rrd/store/ -type d -exec chgrp rrd {} +
>  find /path/to/rrd/store/ -type d -exec chmod g+ws {} +
>
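
The suggestion above can be tried on a scratch directory before touching the real RRD store. This is a minimal sketch, assuming Linux with GNU stat: setup_rrd_store is a hypothetical helper wrapping the two find commands, and the demo uses the caller's own primary group instead of a group named rrd so that it runs without root.

```shell
#!/bin/sh
# Hypothetical helper: hand every directory below $1 to group $2 and make
# the directories group-writable and setgid, so new entries inherit the group.
setup_rrd_store() {
    store=$1
    group=$2
    find "$store" -type d -exec chgrp "$group" {} +
    find "$store" -type d -exec chmod g+ws {} +
}

# Scratch demo: the numeric primary group of the caller is always permitted.
store=$(mktemp -d)
mkdir -p "$store/host/snmp"
setup_rrd_store "$store" "$(id -g)"
( umask 002; touch "$store/host/snmp/test.rrd" )   # writer must run with umask 002
stat -c '%A %G %n' "$store/host/snmp" "$store/host/snmp/test.rrd"
rm -rf "$store"
```

The group-writable bit on the new file comes from the writer's umask, not from the directory, so this only helps if umask 002 is really in effect in the process that creates the RRD files.
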
>Bruno
>
>> Best regards
>> 
>> 
>> ----- Original message -----
>> From: Stian Øvrevåge <sovrevage at gmail.com>
>> Date: Mon, 26 Mar 2012 23:33:31 +0200
>> Subject: Re: [collectd] rrdc_update failed with status -1
>> To: Cyril Feraudet <collectd at feraudet.com>
>> Cc: collectd at verplant.org
>> 
>> >I have now tried using
>> >
>> >rrdcached -f 7200 -w 3600 -z 900 -b /opt/collectd/var/lib/collectd/rrd/ -P FLUSH,BATCH,UPDATE,STATS -l 127.0.0.1 -l unix:/tmp/rrdcached.sock
>> >
>> >No change, though. Still no packets after the connection has been established.
>> >
>> >Simulating the session with telnet works just fine:
>> >
>> >root@collectd-new:/home/kbandusr# telnet localhost 42217
>> >Trying 127.0.0.1...
>> >Connected to localhost.
>> >Escape character is '^]'.
>> >STATS
>> >9 Statistics follow
>> >QueueLength: 0
>> >UpdatesReceived: 12
>> >FlushesReceived: 0
>> >UpdatesWritten: 0
>> >DataSetsWritten: 0
>> >TreeNodesNumber: 1
>> >TreeDepth: 1
>> >JournalBytes: 0
>> >JournalRotate: 0
>> >UPDATE no-kvh020-sw01/snmp/if_drop_discard_err_que-Vlan1328.rrd 1332797080:0:0:0:0
>> >0 errors, enqueued 1 value(s).
>> >BATCH
>> >0 Go ahead.  End with dot '.' on its own line.
>> >UPDATE no-kvh020-sw01/snmp/if_drop_discard_err_que-Vlan1328.rrd 1332797085:0:0:0:0
>> >UPDATE no-kvh020-sw01/snmp/if_drop_discard_err_que-Vlan1328.rrd 1332797090:0:0:0:0
>> >.
>> >0 errors
>> >quit
>> >Connection closed by foreign host.
>> >
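
The telnet session above can be scripted for repeated tests. This is only a sketch: rrdc_batch is a hypothetical helper that prints the same BATCH sequence as typed above, and piping its output into nc is an assumption about the tools available, not something tested in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print a BATCH block for rrdcached.
# $1 is the RRD file name, the remaining arguments are update values
# in the usual timestamp:value:... form.
rrdc_batch() {
    file=$1
    shift
    printf 'BATCH\n'
    for value in "$@"; do
        printf 'UPDATE %s %s\n' "$file" "$value"
    done
    printf '.\n'        # a dot on its own line ends the batch
    printf 'quit\n'     # close the connection, as in the telnet session
}

rrdc_batch no-kvh020-sw01/snmp/if_drop_discard_err_que-Vlan1328.rrd \
    1332797085:0:0:0:0 1332797090:0:0:0:0
```

To send it to the daemon, the output could be piped into something like nc 127.0.0.1 42217.
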
>> >Regards,
>> >Stian Øvrevåge
>> >
>> >On Mon, Mar 26, 2012 at 10:08 PM, Cyril Feraudet <collectd at feraudet.com> wrote:
>> >> Have a look at the "-P" option of rrdcached regarding permissions.
>> >>
>> >> I've debugged some issues with rrdcached by using strace to see the entire error message sent to collectd.
>> >>
>> >> Cyril
>> >>
>> >> On 26 mars 2012, at 22:01, Stian Øvrevåge wrote:
>> >>
>> >>> It always works with sockets and never works with network.
>> >>>
>> >>> So I don't think there is an error with the rrd or polling itself.
>> >>>
>> >>> Also, tcpdump shows that collectd does connect to rrdcached but does
>> >>> NOT send ANY packets when updating, while returning thousands of
>> >>> errors. I believe that the connection handling in collectd is somehow
>> >>> faulty... I can connect to rrdcached by telnet and issue commands
>> >>> without problem...
>> >>>
>> >>> Brgds,
>> >>> Stian Øvrevåge
>> >>>
>> >>> On Mon, Mar 26, 2012 at 9:57 PM, Cyril Feraudet <collectd at feraudet.com> wrote:
>> >>>> Hi,
>> >>>>
>> >>>> There are several possible issues:
>> >>>> - Your RRD was previously updated with a timestamp in the future: run rrdtool info /opt/collectd/var/lib/collectd/rrd/no-kvh020-sw01/snmp/if_drop_discard_err_que-Vlan1309.rrd to check the last update
>> >>>> - The host polling no-kvh020-sw01 has its clock set in the past.
>> >>>> - More than one no-kvh020-sw01 is configured
>> >>>> - ...
>> >>>>
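
The first two points in the list come down to the same rule: rrdtool only accepts an update whose timestamp is later than the file's last update. A minimal sketch of that check follows; update_is_acceptable is a hypothetical helper and both timestamps are passed in as plain epoch seconds.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the new sample's timestamp is
# strictly later than the RRD's last update (both in epoch seconds).
update_is_acceptable() {
    last_update=$1
    new_ts=$2
    [ "$new_ts" -gt "$last_update" ]
}

if update_is_acceptable 1332787945 1332787975; then
    echo "update would be accepted"
else
    echo "update would be rejected: timestamp not newer than last update"
fi
```

Against a real file, the first argument could be taken from rrdtool last <file> and the second from date +%s on the polling host.
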
>> >>>> Cyril
>> >>>> On 26 mars 2012, at 21:00, Stian Øvrevåge wrote:
>> >>>>
>> >>>>> Hi,
>> >>>>>
>> >>>>> When trying to use DaemonAddress "127.0.0.1:42217" I'm receiving
>> >>>>>
>> >>>>> Mar 26 20:52:25 collectd-new collectd[13065]: Filter subsystem:
>> >>>>> Built-in target `write': Dispatching value to all write plugins failed
>> >>>>> with status -1.
>> >>>>> Mar 26 20:52:25 collectd-new collectd[13065]: rrdcached plugin:
>> >>>>> rrdc_update (/opt/collectd/var/lib/collectd/rrd/no-kvh020-sw01/snmp/if_drop_discard_err_que-Vlan1309.rrd,
>> >>>>> [1332787945:0:0:0:0], 1) failed with status -1.
>> >>>>>
>> >>>>> For ALL updates.
>> >>>>>
>> >>>>> This is my rrdcached plugin config:
>> >>>>>
>> >>>>> <Plugin "rrdcached">
>> >>>>>  #DaemonAddress "unix:/tmp/rrdcached.sock"
>> >>>>>  DaemonAddress "127.0.0.1:42217"
>> >>>>>  DataDir "/opt/collectd/var/lib/collectd/rrd"
>> >>>>>  #CreateFiles true
>> >>>>>  #CollectStatistics true
>> >>>>>  StepSize 30
>> >>>>>  HeartBeat 600
>> >>>>>  RRaTimespan 3600
>> >>>>>  RRaTimespan 86400
>> >>>>>  RRaTimespan 604800
>> >>>>>  RRaTimespan 2678400
>> >>>>>  RRaTimespan 166224000
>> >>>>> </Plugin>
>> >>>>>
>> >>>>> I run rrdcached with "/opt/rrdtool-1.4.7/bin/rrdcached -f 7200 -w 3600
>> >>>>> -z 900 -b /opt/collectd/var/lib/collectd/rrd/ -l 127.0.0.1 -l
>> >>>>> unix:/tmp/rrdcached.sock"
>> >>>>>
>> >>>>> The reason I want to test using network sockets is that the unix
>> >>>>> sockets seem to cause thousands of disk reads and writes according to
>> >>>>> VMware disk statistics.
>> >>>>>
>> >>>>> A tcpdump with ascii decode only shows this: (No data? And statistics
>> >>>>> which I thought I disabled?)
>> >>>>>
>> >>>>> 20:56:52.891588 IP localhost.42381 > localhost.42217: Flags [P.], seq
>> >>>>> 385950220:385950226, ack 4144176433, win 349, options [nop,nop,TS val
>> >>>>> 133288705 ecr 133281205], length 6
>> >>>>> E..:x. at .@................."....1...].......
>> >>>>> ........STATS
>> >>>>>
>> >>>>> 20:56:52.891734 IP localhost.42217 > localhost.42381: Flags [P.], seq
>> >>>>> 1:21, ack 6, win 256, options [nop,nop,TS val 133288705 ecr
>> >>>>> 133288705], length 20
>> >>>>> ...............1.."......<.....
>> >>>>> ........9 Statistics follow
>> >>>>>
>> >>>>> 20:56:52.891760 IP localhost.42381 > localhost.42217: Flags [.], ack
>> >>>>> 21, win 349, options [nop,nop,TS val 133288705 ecr 133288705], length
>> >>>>> 0
>> >>>>> E..4x. at .@................."....E...])......
>> >>>>> ........
>> >>>>> 20:56:52.891777 IP localhost.42217 > localhost.42381: Flags [P.], seq
>> >>>>> 21:176, ack 6, win 256, options [nop,nop,TS val 133288705 ecr
>> >>>>> 133288705], length 155
>> >>>>> E...C. at .@..................E.."............
>> >>>>> ........QueueLength: 0
>> >>>>> UpdatesReceived: 0
>> >>>>> FlushesReceived: 0
>> >>>>> UpdatesWritten: 0
>> >>>>> DataSetsWritten: 0
>> >>>>> TreeNodesNumber: 0
>> >>>>> TreeDepth: 0
>> >>>>> JournalBytes: 0
>> >>>>> JournalRotate: 0
>> >>>>>
>> >>>>> 20:56:52.891783 IP localhost.42381 > localhost.42217: Flags [.], ack
>> >>>>> 176, win 357, options [nop,nop,TS val 133288705 ecr 133288705], length
>> >>>>> 0
>> >>>>> E..4x. at .@................."........e)......
>> >>>>> ........
>> >>>>>
>> >>>>>
>> >>>>> Regards,
>> >>>>> Stian Øvrevåge
>> >>>>>
>> >>>>> _______________________________________________
>> >>>>> collectd mailing list
>> >>>>> collectd at verplant.org
>> >>>>> http://mailman.verplant.org/listinfo/collectd
>> >>>>
>> >>>>
>> >>
>> >
>> >
>> >
>> 
>
>
>


