[collectd] Using utils_latency in the tail module to calculate percentile

Wilfried Goesgens dothebart at citadel.org
Sun Jan 12 20:33:02 CET 2014


Hi everyone,  

as already announced on IRC, I'd like to bring the percentile calculation
available in the new statsd module to the tail module.

Reading through the source, there seems to be one main difference between
how tail.so and statsd work (if I understand it correctly):

The tail infrastructure does all calculation one by one, and it seems that
only one gauge can be added per regex, so if one wants Average & Max, the
regex has to be there twice?  

In contrast, utils_latency adds values to a data-row, and can run several
calculations on one dataset for reporting once the aggregation timespan has
passed, and values are to be delivered.  

   

So I started knitting everything together as I thought was right (see patch)
and did some tests; however, are the results correct?

This was my test config:

    <File "/home/willi/test.log">
        Instance "blarg"
        <Match>
            Regex "S=([1-9][0-9]*)"
            DSType "GaugePercentile"
            Type "gauge"
            Instance "percentile"
        </Match>
    </File>

and I generated a logfile using

    for z in `seq 1 1000000`; do sleep 0.1
      for j in `seq 1 10`; do
        for i in `seq 1 10`; do printf "S=1${i}000\n" >> test.log; done
        touch test.log; sleep 1
      done
    done

I hacked in some test code to print out the calculated gauges:   

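        /* Same pattern for each statistic: convert, store, print. */
        values[0].gauge = CDTIME_T_TO_DOUBLE (
          latency_counter_get_min(match_value->Counter));
        fprintf(stderr, "Min: %f\n", values[0].gauge);
        values[0].gauge = CDTIME_T_TO_DOUBLE (
          latency_counter_get_max(match_value->Counter));
        fprintf(stderr, "Max: %f\n", values[0].gauge);
        values[0].gauge = CDTIME_T_TO_DOUBLE (
          latency_counter_get_average(match_value->Counter));
        fprintf(stderr, "Avg: %f\n", values[0].gauge);
        values[0].gauge = CDTIME_T_TO_DOUBLE (
          latency_counter_get_percentile(match_value->Counter, 95));
        fprintf(stderr, "perc: %f\n", values[0].gauge);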

and it would always give me something like this:

Min: 0.000010
Max: 0.000102
Avg: 0.000023
perc: 0.001000  

while the input data looks like this:  

S=11000
S=12000
S=13000
S=14000
S=15000
S=16000
S=17000
S=18000
S=19000
S=110000  

So for the Min value one would clearly expect something containing '11',
regardless of where the float conversion puts the decimal point, yet the
debug print reads Min: 0.000010?

   

So, my questions are:  

  - do I feed in the data correctly?  

  - do I read out the data correctly?  

  - am I right that the data series can be used to calculate several gauges
in a row?  

  - is my debug printf flawed? how should it look?  

  - to approach a final state, should all (or most) calculations of tail be
replaced by the latency ones?

  - if the old ones should live alongside the new ones, should I add a DSType
'Gauges' with additional settings listing the gauges to calculate?

TIA,  

Wilfried Goesgens
-------------- next part --------------
A non-text attachment was scrubbed...
Name: POC_use_latency_in_tail.diff
Type: text/x-patch
Size: 8276 bytes
Desc: not available
URL: <http://mailman.verplant.org/pipermail/collectd/attachments/20140112/5038701c/attachment.bin>

