[collectd] Metric manipulation plugins

Sat Nov 16 08:49:06 CET 2013

Hi Pierre,

I can't answer your collectd-specific questions, but I'm wondering why a
block-based approach is needed.
If collectd outputs data every 10 seconds, for example, isn't the
value written out every 10 seconds already aggregated in some way?

Thanks,
Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

On Sun, Nov 3, 2013 at 5:04 AM, Pierre-Yves Ritschard <pyr at spootnik.org> wrote:
> Hi list,
>
> Right now in collectd we have read, write, notification and logging plugins
> which cover most of our use cases.
>
> I think the model falls short when implementing plugins like aggregation,
> chaining or threshold. It seems as though we are missing an intermediate
> endpoint to plug in metric manipulation when collection windows end.
>
> As some of you may know I've been playing with a lib which implements
> generic metric manipulation, with a simple language (example syntax:
> https://gist.github.com/pyr/7070364)
>
> Now that the syntax is well implemented in a contained library, I'm looking
> for ways to implement it. I see two ways that "mangling" plugins might want
> to interact with collectd:
>
> - in a streaming fashion: processing metrics as they come in
> - in a block fashion: processing a full window of collected metrics
>
> Writing a streaming mangling plugin is an easy task. The "aggregation"
> plugin is one example: it registers a read plugin, then marks the metrics
> it generates with an attribute to avoid looping. filter_chains also
> implement a similar mechanism, allowing simple streaming handling.
>
> Writing block handling plugins is much more difficult, since there doesn't
> seem to be any notion of a full metric window event. Such plugins currently
> need to be written in one of two ways:
>
> - accumulate metrics and trigger processing at regular intervals
> - accumulate metrics and trigger processing when enough events have been
> input
>
> My current design expects a full window of metrics: it is a "pure" function
> which, for a specific window of metrics and configuration syntax, will
> output the same window of metrics augmented with a sink (a destination
> write plugin) and potentially a state.
>
> This approach has the drawback of forcing accumulation at some point, which
> might be a problem on aggregation instances but will be negligible on
> node-local instances (actually given the in-memory size of metrics, it would
> take a very busy aggregation instance to make this noticeable /
> problematic).
>
> The simplest way of implementing this seems to be queuing up the metrics
> sent to the write plugin and scheduling processing when the read function is
> called (after a small delay, to leave time for other read plugins to
> submit their metrics).
>
> My current questions are:
>
> - are collectd users at large interested in an all-encompassing mangling
> plugin (superseding the functionality found in the chains, thresholds and
> aggregation plugins)?
> - would most people prefer a configuration that integrates into the main
> collectd.conf? It seems a bit unwieldy to me but could be doable
> - is there a way I missed to accumulate metrics between poll intervals in a
> sound way?
>
> Thanks for your help putting this together!
>   - pyr
>
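For readers following the thread, the "block fashion" design in the quoted message (accumulate a window of metrics, then hand the whole window to a pure function that tags each metric with a sink) could be sketched roughly as below. This is only an illustration under stated assumptions: `Metric`, `process_window`, `WindowAccumulator` and the sink name are hypothetical, none of them are collectd APIs, and the time trigger uses metric timestamps rather than a real scheduler.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Metric:
    """Hypothetical stand-in for a collected value (not a collectd type)."""
    name: str
    value: float
    time: float
    sink: Optional[str] = None  # destination write plugin, set by processing


def process_window(window: List[Metric], sink: str) -> List[Metric]:
    """Pure function: the same window and configuration always give the
    same output, here the same metrics augmented with a sink."""
    return [replace(m, sink=sink) for m in window]


class WindowAccumulator:
    """Buffers metrics and triggers processing after `interval` seconds
    of metric time or after `max_events` metrics, whichever comes first."""

    def __init__(self,
                 process: Callable[[List[Metric]], List[Metric]],
                 interval: float = 10.0,
                 max_events: Optional[int] = None) -> None:
        self.process = process
        self.interval = interval
        self.max_events = max_events
        self._buffer: List[Metric] = []
        self._window_start: Optional[float] = None

    def submit(self, metric: Metric) -> Optional[List[Metric]]:
        """Queue one metric; return the processed window if a trigger
        fired, otherwise None."""
        if self._window_start is None:
            self._window_start = metric.time
        self._buffer.append(metric)
        elapsed = metric.time - self._window_start
        full = (self.max_events is not None
                and len(self._buffer) >= self.max_events)
        if elapsed >= self.interval or full:
            return self.flush()
        return None

    def flush(self) -> List[Metric]:
        """Process and clear whatever has been accumulated so far."""
        window, self._buffer = self._buffer, []
        self._window_start = None
        return self.process(window)
```

A caller would wrap the pure function with its configuration, for example `WindowAccumulator(lambda w: process_window(w, "example_sink"), interval=10.0, max_events=3)`, and call `submit()` for each incoming metric. The sketch keeps the accumulation cost mentioned in the message: every metric stays in memory until its window is flushed.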