[collectd] rrdc_update failed with status -1
Benjamin DUPUIS
benjamin.dupuis at quake.fr
Thu Mar 29 13:25:30 CEST 2012
Setting umask in the init script doesn't work for me (RHEL5).
I have never coded in C, only in C++, and that was more than 10 years ago :)
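
For reference, here is a rough sketch of how the mode passed to mkdir interacts with the process umask, which is the point Bruno makes in the quoted message below. It is written in Python rather than C only because os.mkdir applies the umask the same way mkdir(2) does (resulting mode = requested mode & ~umask). The helper name mkdir_mode and the temporary directory are made up for illustration; this is not collectd code, and the 0755 case assumes that was the mode originally hard-coded, which I have only inferred from the 755 directories I observed.

```python
import os
import tempfile


def mkdir_mode(requested_mode, umask):
    """Create a directory with requested_mode under the given umask
    and return the permission bits it actually ends up with."""
    old_umask = os.umask(umask)  # os.umask returns the previous value
    try:
        with tempfile.TemporaryDirectory() as parent:
            path = os.path.join(parent, "host")  # hypothetical directory name
            os.mkdir(path, requested_mode)
            # Keep only the rwx bits; setgid/sticky are not of interest here.
            return os.stat(path).st_mode & 0o777
    finally:
        os.umask(old_umask)  # always restore the caller's umask


if __name__ == "__main__":
    for mode, mask in [(0o755, 0o002), (0o775, 0o002),
                       (0o777, 0o002), (0o777, 0o022)]:
        print("mkdir(%04o) with umask %03o -> %04o"
              % (mode, mask, mkdir_mode(mode, mask)))
```

The umask can only remove bits, so a hard-coded 0755 never yields group write no matter what the init script sets. Requesting 0777 leaves the final decision entirely to the umask.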
----- Original message -----
From: Bruno Prémont <bonbons at linux-vserver.org>
Date: Thu, 29 Mar 2012 12:00:12 +0200
Subject: Re: [collectd] rrdc_update failed with status -1
To: Benjamin DUPUIS <benjamin.dupuis at quake.fr>
Cc: collectd at verplant.org
>On Thu, 29 Mar 2012 11:46:16 Benjamin DUPUIS wrote:
>> I've modified
>> * src/collectd.c Line 211 : if (mkdir (orig_dir, 0775) == -1)
>> * src/common.c Line 551 : if (mkdir (dir, 0775) == 0)
>>
>> It's now working.
>> I don't know if it's a good solution. Perhaps it would be a good idea to
>> have a configuration entry for this in the configuration of rrdcached /
>> rrdtool?
>
>I think it would be even better to change the mode to 0777 and let umask do
>the whole work, possibly with a global umask configuration option in the
>collectd config file so that the umask setting does not have to be delegated
>to the init script (where it is not set and thus depends on the umask of init
>or of the shell [re]starting the daemon).
>
>The same has to apply for csv and the other write plugins that may
>create new files.
>
>Having a configurable mode gets hard, especially as one might then also
>want to control owner/group and possibly ACLs or more!
>
>Would you mind creating a patch for it?
>
>
>Best regards,
>Bruno
>
>> Best regards,
>>
>> ----- Original message -----
>> From: Benjamin DUPUIS <benjamin.dupuis at quake.fr>
>> Date: Wed, 28 Mar 2012 11:29:42 +0200
>> Subject: Re: [collectd] rrdc_update failed with status -1
>> To: collectd at verplant.org
>>
>> >After debugging: collectdmon ignores umask and always creates directories with permissions 755
>> >
>> >drwxrwsr-x 4 root rrdcached 4096 mars 28 11:21 rrd2
>> >cd rrd2 && mkdir opuet && ls
>> >drwxrwsr-x 2 root rrdcached 4096 mars 28 11:21 opuet
>> >
>> >systest3t:/var/lib/collectd/rrd2# umask
>> >0002
>> >systest3t:/var/lib/collectd/rrd2# collectdmon -P /var/run/collectdmon.pid -c /usr/sbin/collectd -- -C /etc/collectd.conf
>> >
>> >After running collectd, the directory is created with:
>> >drwxr-sr-x 12 root rrdcached 4096 mars 28 11:23 systest3t
>> >
>> >
>> >
>> >----- Original message -----
>> >From: Benjamin DUPUIS <benjamin.dupuis at quake.fr>
>> >Date: Wed, 28 Mar 2012 10:25:25 +0200
>> >Subject: Re: [collectd] rrdc_update failed with status -1
>> >To: Bruno Prémont <bonbons at linux-vserver.org>, Benjamin DUPUIS <benjamin.dupuis at quake.fr>
>> >Cc: collectd at verplant.org
>> >
>> >>Hi,
>> >>
>> >>umask doesn't seem to be taken into account; here is the output of the debugged init.d script
>> >>
>> >>+ '[' -r /etc/default/collectdmon ']'
>> >>+ . /etc/default/collectdmon
>> >>++ umask 002
>> >>+ case "$1" in
>> >>+ start
>> >>+ echo -n 'Starting collectd: '
>> >>Starting collectd: + '[' -r /etc/collectd.conf ']'
>> >>+ daemon collectdmon -P /var/run/collectdmon.pid -c /usr/sbin/collectd -- -C /etc/collectd.conf
>> >>...
>> >>+ /bin/bash -c 'ulimit -S -c 0 >/dev/null 2>&1 ; collectdmon -P /var/run/collectdmon.pid -c /usr/sbin/collectd -- -C /etc/collectd.conf'
>> >>
>> >>but folders are created with mode 2755
>> >>
>> >>/var/lib/collectd/rrd2
>> >>Access: (2775/drwxrwsr-x) Uid: ( 0/ root) Gid: ( 500/rrdcached)
>> >>
>> >>/var/lib/collectd/rrd2/systest3t
>> >>Access: (2755/drwxr-sr-x) Uid: ( 0/ root) Gid: ( 500/rrdcached)
>> >>
>> >>
>> >>Any idea?
>> >>
>> >>Best regards
>> >>
>> >>----- Original message -----
>> >>From: Bruno Prémont <bonbons at linux-vserver.org>
>> >>Date: Wed, 28 Mar 2012 09:37:52 +0200
>> >>Subject: Re: [collectd] rrdc_update failed with status -1
>> >>To: Benjamin DUPUIS <benjamin.dupuis at quake.fr>
>> >>Cc: Stian Øvrevåge <sovrevage at gmail.com>, Cyril Feraudet <collectd at feraudet.com>, collectd at verplant.org
>> >>
>> >>>On Wed, 28 Mar 2012 09:30:23 Benjamin DUPUIS wrote:
>> >>>> I had the same problem
>> >>>>
>> >>>> Collectd runs as root.
>> >>>> RRDCached runs as rrdcached.
>> >>>>
>> >>>> When rrd files are created they are owned by root.root, so rrdcached cannot write to them.
>> >>>> Run chmod 777 (yes, it's bad) on your rrd files and it will work.
>> >>>>
>> >>>> I'm searching for a better solution.
>> >>>
>> >>>One way would be to set UMASK for collectd to allow write access to
>> >>>group and chmod g+s to the directory(ies) where RRDs get created while
>> >>>having those assigned to a group of which rrdcached is a member.
>> >>>
>> >>>e.g.
>> >>> groups rrdcached
>> >>> rrdcached rrd
>> >>>
>> >>> find /path/to/rrd/store/ -type d -exec chgrp rrd {} +
>> >>> find /path/to/rrd/store/ -type d -exec chmod g+ws {} +
>> >>>
>> >>>Bruno
>> >>>
>> >>>> Best regards
>> >>>>
>> >>>>
>> >>>> ----- Original message -----
>> >>>> From: Stian Øvrevåge <sovrevage at gmail.com>
>> >>>> Date: Mon, 26 Mar 2012 23:33:31 +0200
>> >>>> Subject: Re: [collectd] rrdc_update failed with status -1
>> >>>> To: Cyril Feraudet <collectd at feraudet.com>
>> >>>> Cc: collectd at verplant.org
>> >>>>
>> >>>> >Tried now using
>> >>>> >
>> >>>> >rrdcached -f 7200 -w 3600 -z 900 -b
>> >>>> >/opt/collectd/var/lib/collectd/rrd/ -P FLUSH,BATCH,UPDATE,STATS -l
>> >>>> >127.0.0.1 -l unix:/tmp/rrdcached.sock
>> >>>> >
>> >>>> >No change tho. No packets after the connection has been established.
>> >>>> >
>> >>>> >Tried simulating using telnet, working just fine:
>> >>>> >
>> >>>> >root@collectd-new:/home/kbandusr# telnet localhost 42217
>> >>>> >Trying 127.0.0.1...
>> >>>> >Connected to localhost.
>> >>>> >Escape character is '^]'.
>> >>>> >STATS
>> >>>> >9 Statistics follow
>> >>>> >QueueLength: 0
>> >>>> >UpdatesReceived: 12
>> >>>> >FlushesReceived: 0
>> >>>> >UpdatesWritten: 0
>> >>>> >DataSetsWritten: 0
>> >>>> >TreeNodesNumber: 1
>> >>>> >TreeDepth: 1
>> >>>> >JournalBytes: 0
>> >>>> >JournalRotate: 0
>> >>>> >UPDATE no-kvh020-sw01/snmp/if_drop_discard_err_que-Vlan1328.rrd
>> >>>> >1332797080:0:0:0:0
>> >>>> >0 errors, enqueued 1 value(s).
>> >>>> >BATCH
>> >>>> >0 Go ahead. End with dot '.' on its own line.
>> >>>> >UPDATE no-kvh020-sw01/snmp/if_drop_discard_err_que-Vlan1328.rrd
>> >>>> >1332797085:0:0:0:0
>> >>>> >UPDATE no-kvh020-sw01/snmp/if_drop_discard_err_que-Vlan1328.rrd
>> >>>> >1332797090:0:0:0:0
>> >>>> >.
>> >>>> >0 errors
>> >>>> >quit
>> >>>> >Connection closed by foreign host.
>> >>>> >
>> >>>> >Regards,
>> >>>> >Stian Øvrevåge
>> >>>> >
>> >>>> >On Mon, Mar 26, 2012 at 10:08 PM, Cyril Feraudet <collectd at feraudet.com> wrote:
>> >>>> >> Have a look at the "-P" option of rrdcached regarding permissions.
>> >>>> >>
>> >>>> >> I've debugged some issues with rrdcached using strace to see the entire error message sent to collectd.
>> >>>> >>
>> >>>> >> Cyril
>> >>>> >>
>> >>>> >> On 26 March 2012, at 22:01, Stian Øvrevåge wrote:
>> >>>> >>
>> >>>> >>> It always works with sockets and never works with network.
>> >>>> >>>
>> >>>> >>> So I don't think there is an error with the rrd or polling itself.
>> >>>> >>>
>> >>>> >>> Also, tcpdump shows that collectd will connect to rrdcached but does
>> >>>> >>> NOT send ANY
>> >>>> >>> packets when updating and returning thousands of errors. I believe
>> >>>> >>> that the connection handled by collectd is somewhat faulty... I can
>> >>>> >>> connect to rrdcached by telnet and issue commands without problem...
>> >>>> >>>
>> >>>> >>> Brgds,
>> >>>> >>> Stian Øvrevåge
>> >>>> >>>
>> >>>> >>> On Mon, Mar 26, 2012 at 9:57 PM, Cyril Feraudet <collectd at feraudet.com> wrote:
>> >>>> >>>> Hi,
>> >>>> >>>>
>> >>>> >>>> There are many possible issues:
>> >>>> >>>> - Your rrd was previously updated with a value in the future: run rrdtool info /opt/collectd/var/lib/collectd/rrd/no-kvh020-sw01/snmp/if_drop_discard_err_que-Vlan1309.rrd to check the last update
>> >>>> >>>> - The host polling no-kvh020-sw01 has its clock set in the past.
>> >>>> >>>> - More than one no-kvh020-sw01 configured
>> >>>> >>>> - ...
>> >>>> >>>>
>> >>>> >>>> Cyril
>> >>>> >>>> On 26 March 2012, at 21:00, Stian Øvrevåge wrote:
>> >>>> >>>>
>> >>>> >>>>> Hi,
>> >>>> >>>>>
>> >>>> >>>>> When trying to use DaemonAddress "127.0.0.1:42217" I'm receiving
>> >>>> >>>>>
>> >>>> >>>>> Mar 26 20:52:25 collectd-new collectd[13065]: Filter subsystem:
>> >>>> >>>>> Built-in target `write': Dispatching value to all write plugins failed
>> >>>> >>>>> with status -1.
>> >>>> >>>>> Mar 26 20:52:25 collectd-new collectd[13065]: rrdcached plugin:
>> >>>> >>>>> rrdc_update (/opt/collectd/var/lib/collectd/rrd/no-kvh020-sw01/snmp/if_drop_discard_err_que-Vlan1309.rrd,
>> >>>> >>>>> [1332787945:0:0:0:0], 1) failed with status -1.
>> >>>> >>>>>
>> >>>> >>>>> For ALL updates.
>> >>>> >>>>>
>> >>>> >>>>> This is my rrdcached plugin config:
>> >>>> >>>>>
>> >>>> >>>>> <Plugin "rrdcached">
>> >>>> >>>>> #DaemonAddress "unix:/tmp/rrdcached.sock"
>> >>>> >>>>> DaemonAddress "127.0.0.1:42217"
>> >>>> >>>>> DataDir "/opt/collectd/var/lib/collectd/rrd"
>> >>>> >>>>> #CreateFiles true
>> >>>> >>>>> #CollectStatistics true
>> >>>> >>>>> StepSize 30
>> >>>> >>>>> HeartBeat 600
>> >>>> >>>>> RRaTimespan 3600
>> >>>> >>>>> RRaTimespan 86400
>> >>>> >>>>> RRaTimespan 604800
>> >>>> >>>>> RRaTimespan 2678400
>> >>>> >>>>> RRaTimespan 166224000
>> >>>> >>>>> </Plugin>
>> >>>> >>>>>
>> >>>> >>>>> I run rrdcached with "/opt/rrdtool-1.4.7/bin/rrdcached -f 7200 -w 3600
>> >>>> >>>>> -z 900 -b /opt/collectd/var/lib/collectd/rrd/ -l 127.0.0.1 -l
>> >>>> >>>>> unix:/tmp/rrdcached.sock"
>> >>>> >>>>>
>> >>>> >>>>> The reason I want to test using network sockets is that the unix
>> >>>> >>>>> sockets seem to cause thousands of disk reads and writes according to
>> >>>> >>>>> VMware disk statistics.
>> >>>> >>>>>
>> >>>> >>>>> A tcpdump with ascii decode only shows this: (No data? And statistics
>> >>>> >>>>> which I thought I disabled?)
>> >>>> >>>>>
>> >>>> >>>>> 20:56:52.891588 IP localhost.42381 > localhost.42217: Flags [P.], seq
>> >>>> >>>>> 385950220:385950226, ack 4144176433, win 349, options [nop,nop,TS val
>> >>>> >>>>> 133288705 ecr 133281205], length 6
>> >>>> >>>>> E..:x. at .@................."....1...].......
>> >>>> >>>>> ........STATS
>> >>>> >>>>>
>> >>>> >>>>> 20:56:52.891734 IP localhost.42217 > localhost.42381: Flags [P.], seq
>> >>>> >>>>> 1:21, ack 6, win 256, options [nop,nop,TS val 133288705 ecr
>> >>>> >>>>> 133288705], length 20
>> >>>> >>>>> ...............1.."......<.....
>> >>>> >>>>> ........9 Statistics follow
>> >>>> >>>>>
>> >>>> >>>>> 20:56:52.891760 IP localhost.42381 > localhost.42217: Flags [.], ack
>> >>>> >>>>> 21, win 349, options [nop,nop,TS val 133288705 ecr 133288705], length
>> >>>> >>>>> 0
>> >>>> >>>>> E..4x. at .@................."....E...])......
>> >>>> >>>>> ........
>> >>>> >>>>> 20:56:52.891777 IP localhost.42217 > localhost.42381: Flags [P.], seq
>> >>>> >>>>> 21:176, ack 6, win 256, options [nop,nop,TS val 133288705 ecr
>> >>>> >>>>> 133288705], length 155
>> >>>> >>>>> E...C. at .@..................E.."............
>> >>>> >>>>> ........QueueLength: 0
>> >>>> >>>>> UpdatesReceived: 0
>> >>>> >>>>> FlushesReceived: 0
>> >>>> >>>>> UpdatesWritten: 0
>> >>>> >>>>> DataSetsWritten: 0
>> >>>> >>>>> TreeNodesNumber: 0
>> >>>> >>>>> TreeDepth: 0
>> >>>> >>>>> JournalBytes: 0
>> >>>> >>>>> JournalRotate: 0
>> >>>> >>>>>
>> >>>> >>>>> 20:56:52.891783 IP localhost.42381 > localhost.42217: Flags [.], ack
>> >>>> >>>>> 176, win 357, options [nop,nop,TS val 133288705 ecr 133288705], length
>> >>>> >>>>> 0
>> >>>> >>>>> E..4x. at .@................."........e)......
>> >>>> >>>>> ........
>> >>>> >>>>>
>> >>>> >>>>>
>> >>>> >>>>> Regards,
>> >>>> >>>>> Stian Øvrevåge
>> >>>> >>>>>
>> >>>> >>>>> _______________________________________________
>> >>>> >>>>> collectd mailing list
>> >>>> >>>>> collectd at verplant.org
>> >>>> >>>>> http://mailman.verplant.org/listinfo/collectd