[collectd] Bug#422208: /etc/init.d/collectd doesn't stop all the daemons
Sebastian Harl
sh at tokkee.org
Wed Aug 8 23:06:22 CEST 2007
Hi,
On Thu, Jun 14, 2007 at 09:43:31PM +0200, Bas Zoetekouw wrote:
> > > > Hmm... I cannot reproduce this. It works fine on all machines I'm running
> > > > collectd on. Please note that it might take some time (a couple of seconds)
> > > > for collectd to shut down cleanly. Could you please verify this?
> > >
> > > Ah, that indeed seems to be the case. It's a bit confusing though. Are
> > > you sure this isn't going to lead to weird race problems?
> >
> > I cannot think of any race conditions. Have you run into any problems so far?
> > What kind of problems are you thinking about?
>
> Well, maybe if the user wants to restart the daemon before the old one
> is exited? Won't the port still be taken then?
Well, a socket might still be in use if the restart takes place before the
former process has terminated. However, the appropriate plugin should retry
opening the connection for a couple of iterations in that case. If it does not
do so, a separate bug should be filed and the plugin should be fixed.
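
To illustrate the retry behaviour described above, here is a minimal sketch in
Python. It is not the actual plugin code (collectd's plugins are written in C),
and the name `bind_with_retry` as well as the `attempts` and `delay` parameters
are made up for this example. It simply retries bind() for a few iterations
while the old process still holds the address:

```python
import errno
import socket
import time


def bind_with_retry(host, port, attempts=5, delay=1.0):
    """Bind a UDP socket, retrying while the address is still in use.

    Hypothetical helper for illustration only; the defaults are arbitrary.
    """
    for attempt in range(1, attempts + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            sock.bind((host, port))
            return sock
        except OSError as exc:
            sock.close()
            # Give up on any other error, or once the last attempt has failed.
            if exc.errno != errno.EADDRINUSE or attempt == attempts:
                raise
            time.sleep(delay)
```

With the defaults, a restarted process would tolerate the old one holding the
port for roughly four seconds before reporting an error.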
> > The shutdown time should have been decreased a fair amount in version 3.10.4.
> > Can you estimate the amount of time it takes for you? I'm going to
> > investigate if it should be further decreased. Any other opinions on this?
>
> Actually, since I set up collectd on my machines, I've never noticed it
> anymore. It's just when you tinker with the config files and start and
> stop the daemon lots of times that it gets noticeable.
>
> As the behaviour is not what most users would expect when running an
> init.d "stop" script, maybe it would be better to just avoid it altogether
> by letting the init script wait for the daemon to exit? Squid seems to
> do it like that...
Starting with version 4, the code which writes to the RRD files caches updates
to the files. In large setups the cache might get quite big, and a large
amount of data might have to be flushed when shutting down the daemon. As the
time this takes might vary a lot depending on your settings and the number of
"clients", I cannot think of any reasonable timeout to wait for the daemon to
stop. Waiting indefinitely is not a good idea imho, and exiting with a
non-zero status is not really what you want in that case either, as e.g. a
restart action would then fail as well.
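
For reference, the "wait for the daemon to exit" approach under discussion
could look roughly like the following Python sketch. It is only an
illustration, not what the init script does: the function name
`wait_for_exit` and the 30 second default are arbitrary assumptions, and
choosing that timeout is exactly the open problem described above. It polls
the PID with signal 0 and is POSIX-only:

```python
import os
import time


def wait_for_exit(pid, timeout=30.0, poll_interval=0.5):
    """Return True once `pid` is gone, False if `timeout` elapses first.

    Hypothetical helper for illustration; the defaults are arbitrary.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            os.kill(pid, 0)  # signal 0 only checks existence, nothing is sent
        except ProcessLookupError:
            return True
        except PermissionError:
            pass  # the process exists but belongs to another user
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

A "stop" or "restart" action would call this after sending SIGTERM and would
still have to decide what to do when it returns False, which is the case that
has no obviously right answer.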
Can anybody think of a good solution? I'd really appreciate some more
opinions.
Cheers,
Sebastian
--
Sebastian "tokkee" Harl +++ GnuPG-ID: 0x8501C7FC +++ http://tokkee.org/
Those who would give up Essential Liberty to purchase a little Temporary
Safety, deserve neither Liberty nor Safety. -- Benjamin Franklin