[collectd-changes] collectd, the system statistics collection daemon: Changes to 'master'
Florian Forster
octo at verplant.org
Tue Aug 18 21:40:50 CEST 2009
src/collectd.conf.pod | 8 ++++++++
src/rrdtool.c | 36 ++++++++++++++++++++++++++++++++++--
2 files changed, 42 insertions(+), 2 deletions(-)
New commits:
commit c35203c82560eba66bb901aa22c5170fb8c389fb
Merge: 4151d975ca93af4570a1ca97a0408ba446ea7485 2bca2a511c1636bf448112d081d115ebc01b5ed4
Author: Florian Forster <octo at leeloo.lan.home.verplant.org>
Date: Tue Aug 18 21:39:03 2009 +0200
Merge branch 'mg/jitter'
commit 2bca2a511c1636bf448112d081d115ebc01b5ed4
Author: Florian Forster <octo at leeloo.lan.home.verplant.org>
Date: Tue Aug 18 21:38:01 2009 +0200
collectd.conf(5): Document the new `RandomTimeout' option.
commit d278a40cab2bcb6bb0387176d087ce13cd3e843b
Author: Florian Forster <octo at leeloo.lan.home.verplant.org>
Date: Tue Aug 18 21:23:21 2009 +0200
rrdtool plugin: Optimize away the `random_timeout_mod' variable.
commit bdcac4078f8052b8e4f425a1e5aea3957551e0d3
Author: Mariusz Gronczewski <xani666 at gmail.com>
Date: Tue Aug 18 21:18:06 2009 +0200
rrdtool plugin: Call rand(3) less often.
2009/8/18 Florian Forster <octo at verplant.org>:
> Hi Mariusz,
>
> On Mon, Aug 17, 2009 at 02:20:29AM +0200, Mariusz Gronczewski wrote:
>> I was thinking about how to "spread out" writes to the RRD files a
>> bit, because right now there is a big spike every CacheTimeout, or a
>> slightly smaller "square" on the graph if you use WritesPerSecond.
>
> in general I like your patch, thank you very much for posting it :)
> I have some doubts about calling rand() in such a busy place though,
> since getting random numbers is potentially costly. Also, rand(3) is not
> thread-safe, though I don't think that's really an issue for us.
Yeah, good point, but that would probably only be noticeable on very slow
machines (PIII-800 slow) with tons of RRD files, and such a machine would
run out of disk bandwidth first.
> Maybe a solution would be to add a `random_timeout' member to the
> `rrd_cache_t' struct, too. This member is then set when creating the
> entry and set again right after the values have been removed. That way
> rand(3) is only called once for each write instead of once for every
> check.
Yeah, very good idea, I didn't think of that (well, to be honest, I
hadn't looked much into the "interiors" of the rrdtool plugin). I've
implemented it in the attached patch; I've been testing it for about an
hour so far and it works pretty well.
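A minimal sketch of the approach described above, with hypothetical names
(the real rrd_cache_t in src/rrdtool.c has more fields): the per-entry
offset is drawn once when the cache entry is created and again right
after a flush, so rand(3) runs once per write instead of once per check.

    #include <stdlib.h>
    #include <time.h>

    /* Hypothetical names; the actual struct in src/rrdtool.c differs. */
    static int cache_timeout  = 600; /* CacheTimeout  (seconds) */
    static int random_timeout = 300; /* RandomTimeout (seconds) */

    typedef struct rrd_cache_s
    {
      time_t first_value;  /* time of the oldest cached value */
      int    random_delta; /* set on creation and again after each flush */
      /* ... cached values, flags, ... */
    } rrd_cache_t;

    /* Draw a uniform offset in [-random_timeout, +random_timeout].
     * rand(3) is not thread-safe; if that mattered, rand_r(3) with a
     * per-thread seed would be the drop-in alternative. */
    static int rrd_cache_random_delta (void)
    {
      if (random_timeout <= 0)
        return (0);
      return ((rand () % (2 * random_timeout + 1)) - random_timeout);
    }

    /* The per-check test is then a plain comparison, no rand(3) call. */
    static int rrd_cache_too_old (const rrd_cache_t *rc, time_t now)
    {
      return ((now - rc->first_value) >= (cache_timeout + rc->random_delta));
    }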
> As an interesting sidenote: With the above approach, the random write
> times are distributed `uniformly', i.e. every delay from 0 to max-1
> seconds has the same probability. With your code, I think the actual
> time a value is written follows a `normal' distribution (you know, that
> famous bell curve). So I'd expect the above approach to spread the
> values more quickly.
Yup, exactly as you said, it's much quicker that way.
I'm wondering what the config variable should be called; the name
"RandomTimeout" doesn't say anything useful on its own ("random timeout
of what?"). Maybe TimeoutSpread? RandomizeTimeout?
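For reference, the option ended up keeping the name `RandomTimeout' (see
the documentation commit above). A configuration sketch using the values
from the example below; DataDir and the path are illustrative, and
collectd.conf(5) has the exact semantics:

    <Plugin rrdtool>
      DataDir       "/var/lib/collectd/rrd"
      CacheTimeout  600
      RandomTimeout 300
    </Plugin>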
commit fd48357ddeb1b58d5795015e845f3105a7ba3103
Author: Mariusz Gronczewski <xani666 at gmail.com>
Date: Mon Aug 17 02:20:29 2009 +0200
Random write timeout for rrdtool plugin
Hi,
I was thinking about how to "spread out" writes to the RRD files a bit,
because right now there is a big spike every CacheTimeout, or a slightly
smaller "square" on the graph if you use WritesPerSecond. So I've written
a little patch which "spreads out" the writes by changing the cache
timeout every time the rrdtool plugin looks for data to save. Basically,
instead of moving data older than CacheTimeout to the write queue, it
moves data older than CacheTimeout +- RandomTimeout (see the sketch after
the examples below). What does this change?
Without it, the gathered data is "synchronised", for example
(CacheTimeout = 600):
1. collectd starts.
2. After 10 minutes, the data from all plugins becomes "too old" at the
   same time, gets pushed into the write queue, and is saved.
3. After another 10 minutes, the same thing happens: all data "ages" at
   the same time and is saved in one big chunk.
With it (RandomTimeout = 300) it works like this:
1. collectd starts.
2. After 5 minutes, some data (let's call it A) starts to go into the
   write queue.
3. Ten minutes after the start, about 50% of the data (on average) has
   been saved (let's call it B).
4. Finally, after 15 minutes, all the "leftover" data gets saved (let's
   call it C).
5. The next "cycle" begins.
6. Data A ages first (because it was written to disk first) and, like
   before, some of it gets written earlier and some later.
7. After that, data B ages, and again its writes are spread over 10
   minutes.
8. The same goes for C.
So the first cycle (looking at the I/O) is shaped like a sine wave; the
next 10-minute cycle is the same sine wave flattened a bit, and so on (a
fading sine wave). After a few cycles this levels out to pretty much the
same number of writes per second, with no ugly spikes.
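A rough sketch of that check, with illustrative names only (note that
this first version draws the jitter on every scan; the follow-up commits
above move the rand(3) call to once per write):

    #include <stdlib.h>
    #include <time.h>

    /* Jitter the effective timeout on every cache scan, so each entry is
     * flushed somewhere in CacheTimeout +- RandomTimeout. */
    static int value_too_old (time_t first_value, time_t now,
        int cache_timeout, int random_timeout)
    {
      int jitter = 0;
      if (random_timeout > 0)
        jitter = (rand () % (2 * random_timeout + 1)) - random_timeout;
      return ((now - first_value) >= (cache_timeout + jitter));
    }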
The effect looks like this:
http://img24.imageshack.us/img24/7294/drrawcgi.png
(after a few more hours it will be smoother)
Regards
Mariusz
Signed-off-by: Florian Forster <octo at huhu.verplant.org>