[collectd] Aggregating statsd metrics

David Blewett david at dawninglight.net
Sat Aug 16 17:31:50 CEST 2014

Hi All:

We are using the statsd plugin to report metrics from inside our apps.
However, by default collectd prefixes every metric with the server's
hostname, so our app metrics end up split into one series per node running
the app. In our environment, we would much prefer a single set of aggregate
statistics across all nodes running the app. To further complicate things,
our chosen data store (InfluxDB via the write_graphite plugin) cannot itself
aggregate more than 2 time series. It is likely that this limitation will be
lifted in the future, but it seems most efficient to aggregate before the
data reaches InfluxDB.

It seems there are 3 options for doing this:

* Use a PreCache filter to strip off the hostname from all statsd plugin
metric values
* Run a single collectd instance after all others that uses the aggregation
plugin to sum values from all statsd metrics
* Run an additional collectd instance on each node with a blank Hostname
value, purely for statsd

The first option seems to be the simplest to implement, but it closes the
door to analyzing per-host metrics down the road.
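
Roughly, such a rule might look like the following. This is an untested
sketch that assumes the match_regex and target_set plugins are available;
the rule name and the replacement host "app" are placeholders. It only
rewrites the host field on each node; it does not by itself combine values
across nodes.

```
LoadPlugin match_regex
LoadPlugin target_set

<Chain "PreCache">
  <Rule "statsd_common_host">
    # Match only values dispatched by the statsd plugin.
    <Match "regex">
      Plugin "^statsd$"
    </Match>
    # Overwrite the host field; "app" is a placeholder name.
    <Target "set">
      Host "app"
    </Target>
  </Rule>
</Chain>
```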

The second option is attractive because it keeps the per-host data, but bug
#297 [1] looks like a blocker for it. I want this metric gathering to be as
bullet-proof as possible. The bug is from 5.2; we're running the latest 5.4.
Has anyone experienced this issue on the latest versions?
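
For the central instance, something along these lines is what I have in
mind. Again an untested sketch: it assumes the nodes forward their values
with the network plugin, that statsd counters arrive with type "derive"
(other statsd metric kinds would need their own Aggregation blocks), and the
listen address and the "app" host name are placeholders.

```
LoadPlugin network
LoadPlugin aggregation

# Receive values forwarded by the per-node collectd instances.
<Plugin "network">
  Listen "0.0.0.0" "25826"
</Plugin>

<Plugin "aggregation">
  <Aggregation>
    Plugin "statsd"
    # Assumption: counters arrive with type "derive"; adjust as needed.
    Type "derive"
    # One output series per metric name, folding all hosts together.
    GroupBy "TypeInstance"
    # Placeholder host name for the aggregated series.
    SetHost "app"
    CalculateSum true
  </Aggregation>
</Plugin>
```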

The last option seems like more of a hack, but because collectd is so
lightweight it seems feasible. Would it possibly perform better than a
single instance that has to check every value to see whether it came from
the statsd plugin? We're emitting statsd metrics quite frequently.
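
A second instance of this kind could be configured roughly as follows. This
is an untested sketch: the file paths, the statsd listen address, and the
Graphite endpoint are all hypothetical, and it uses a shared placeholder
host name ("app") rather than a literally blank value, since I have not
verified that an empty Hostname is accepted.

```
# Second instance, e.g. started with: collectd -C /etc/collectd/statsd.conf
Hostname "app"
FQDNLookup false

# Keep state separate from the main instance (hypothetical paths).
PIDFile "/var/run/collectd-statsd.pid"
BaseDir "/var/lib/collectd-statsd"

LoadPlugin statsd
LoadPlugin write_graphite

<Plugin "statsd">
  Host "127.0.0.1"
  Port "8125"
</Plugin>

<Plugin "write_graphite">
  <Node "influxdb">
    # Hypothetical Graphite-protocol endpoint.
    Host "influxdb.example.com"
    Port "2003"
  </Node>
</Plugin>
```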

1. https://github.com/collectd/collectd/issues/297


David Blewett