[collectd] [Internet] Re: collectd storing data in a db without data condensing (like rrd)?

Grzybek Mathieu CNE (BCQ STIG) mathieu.grzybek at gendarmerie.interieur.gouv.fr
Tue Apr 18 12:07:58 CEST 2017


In order to improve agility and manage bursts, my collectd daemons write 
to a RabbitMQ cluster. Then, the metrics are consumed by carbon-aggregator.
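
As a rough illustration of what travels through the broker: assuming
the collectd amqp plugin publishes in its Graphite format (the post
does not say which format is used), each message body is one or more
plaintext lines of the form "path value timestamp". A minimal Python
sketch of building and parsing such lines; the metric names are
hypothetical:

```python
from typing import List, NamedTuple


class Metric(NamedTuple):
    path: str        # dotted metric name, e.g. "host1.cpu.idle" (hypothetical)
    value: float
    timestamp: int   # seconds since the Unix epoch


def format_line(metric: Metric) -> str:
    """Render one metric as a Graphite plaintext line."""
    return f"{metric.path} {metric.value} {metric.timestamp}\n"


def parse_line(line: str) -> Metric:
    """Parse one 'path value timestamp' line; raises ValueError if malformed."""
    path, value, timestamp = line.split()
    return Metric(path, float(value), int(float(timestamp)))


def parse_payload(payload: str) -> List[Metric]:
    """Parse a message body that may carry several lines, skipping blanks."""
    return [parse_line(l) for l in payload.splitlines() if l.strip()]


if __name__ == "__main__":
    body = format_line(Metric("host1.cpu.idle", 93.5, 1492510078))
    print(parse_payload(body))
```

Any consumer bound to the broker (Graphite, Riemann, or something
else) only has to understand this line format, which is what keeps
the backend interchangeable.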

The current situation:
- about 4,000 metrics / second on the broker (1 point per minute)
- metrics are consumed by 2 queues, for multi-datacenter replication
- about 850 hosts
- 144,135 whisper files
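
To put these figures in perspective, here is a back-of-the-envelope
sketch in Python. The file count, host count and 1-minute step come
from the list above. The retention (a single archive of one year at
1-minute resolution) is purely an assumption for illustration, since
the actual storage schema is not given. The byte sizes are those of
the whisper file layout (16-byte metadata, 12 bytes per archive
header, 12 bytes per point).

```python
# Figures quoted in the post.
WHISPER_FILES = 144_135
HOSTS = 850
STEP_SECONDS = 60  # 1 point per minute

# Whisper on-disk layout.
METADATA_BYTES = 16       # file-level header
ARCHIVE_INFO_BYTES = 12   # one header entry per archive
POINT_BYTES = 12          # 4-byte timestamp + 8-byte double

# ASSUMPTION (not from the post): one archive, one year of 1-minute points.
RETENTION_SECONDS = 365 * 24 * 3600


def points_per_second(files: int, step: int) -> float:
    """Average write rate if every file receives one point per step."""
    return files / step


def whisper_file_bytes(step: int, retention: int) -> int:
    """Size of a whisper file holding a single archive."""
    points = retention // step
    return METADATA_BYTES + ARCHIVE_INFO_BYTES + points * POINT_BYTES


if __name__ == "__main__":
    rate = points_per_second(WHISPER_FILES, STEP_SECONDS)
    per_file = whisper_file_bytes(STEP_SECONDS, RETENTION_SECONDS)
    total_gb = WHISPER_FILES * per_file / 1e9
    print(f"series per host : {WHISPER_FILES / HOSTS:.0f}")
    print(f"points / second : {rate:.0f}")
    print(f"bytes per file  : {per_file}")
    print(f"total size (GB) : {total_gb:.0f}")
```

With these inputs the write rate comes out at roughly 2,400 points
per second for one copy of the data, the same order of magnitude as
the rate seen on the broker. Whisper preallocates each file at its
full size, so the total grows with the number of series and the
configured retention, not with how long the system has been running.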

The advantages:
- any client can consume the metrics (Graphite, real-time monitoring 
with Riemann…) without breaking the whole workflow
- it is easier to change the backend (no collectd reconfiguration)
- you can stop your backend without losing the data thanks to RabbitMQ's 
- creating graphs using Grafana is really easy for the end user
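
The decoupling described in this list can be pictured with a toy
in-memory model in Python. It is only a sketch of the fan-out idea,
not the RabbitMQ API, and the queue names are hypothetical: every
bound queue gets its own copy of each message, and a queue whose
consumer is stopped simply accumulates messages until it comes back.

```python
from collections import deque
from typing import Deque, Dict, List


class ToyFanoutBroker:
    """In-memory stand-in for a fan-out exchange; NOT the RabbitMQ API."""

    def __init__(self) -> None:
        self.queues: Dict[str, Deque[str]] = {}

    def bind(self, queue_name: str) -> None:
        """Create a queue (if needed) that will receive every message."""
        self.queues.setdefault(queue_name, deque())

    def publish(self, message: str) -> None:
        """Copy the message into every bound queue."""
        for queue in self.queues.values():
            queue.append(message)

    def drain(self, queue_name: str) -> List[str]:
        """Consume everything currently waiting in one queue."""
        queue = self.queues[queue_name]
        messages = list(queue)
        queue.clear()
        return messages


if __name__ == "__main__":
    broker = ToyFanoutBroker()
    broker.bind("graphite")   # hypothetical queue names
    broker.bind("riemann")

    broker.publish("host1.cpu.idle 93.5 1492510078")
    print(broker.drain("riemann"))    # the real-time consumer keeps up

    # The graphite backend is stopped: nothing drains its queue.
    broker.publish("host1.cpu.idle 91.0 1492510138")

    # Backend restarted: both points are still waiting.
    print(broker.drain("graphite"))
```

Adding a new consumer amounts to binding one more queue, which is
why the collectd side never has to be reconfigured.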

The disadvantages:
- the architecture is more complex
- graphite / whisper storage is not flexible
- managing a distributed environment is a pain in the ass

The future:
- use Elasticsearch (already used for logs with Graylog) to store metrics
- use OpenTSDB connected to our Hadoop cluster to deal with long-term 
storage and data science


On 28/12/2016 15:40, Dave Cottlehuber wrote:
> On Tue, 20 Dec 2016, at 19:02, Andreas Schuldei wrote:
>> What are the recommended ways of storing data in a database, where the
>> data is not condensed the way rrd does it?
>> I am recording mostly temperature data of ca. 60 sensors, over the time
>> frame of several years/decades, and I need to be able to compare between
>> years. So it won't be a high volume of data coming in per hour, but it
>> will be some data accumulating over the years.
>> This system will run on a resource-constrained server, so something with
>> a modest memory footprint would be appreciated. (I expect to upgrade that
>> system as the hardware dies, but I would prefer not to do migrations
>> between databases every time I switch hardware.)
>> What database do you recommend?
>> What would be a frontend (for plotting the data) to go with that?
> Interesting questions.
>
> I have been using graphite mainly because it's old and stable. There are
> many new shinier alternatives, such as influxdb, but the rate of change
> in these projects is high and I don't have advanced needs or high
> performance requirements.
>
> I use it without the aggregation functions, backed by FreeBSD with ZFS.
> The compression is excellent and I store 7 years of flat metrics as a
> result. It depends on what you mean by memory constrained here, but you
> could reasonably run graphite on a 4GB low end server, and possibly
> lower with some experimentation and tuning, maybe as low as 2GB RAM.
>
> I use the graphite-api layer
> http://graphite-api.readthedocs.io/en/latest/ with grafana to provide
> graphs/plotting.
>
> There are a number of more efficient carbon storage engines for
> graphite, instead of the native format, neither of which I've needed to
> use:
> - https://github.com/lomik/go-carbon
> - https://github.com/tureus/graphite-rust
> and a few more which I can't seem to find atm, which provide a more
> efficient storage layer.
>
> I write metrics out from collectd via write_riemann to riemann.io
> (clojure based) and subsequently trigger alerts if needed, and write the
> rest out to graphite via the carbon daemon. This step wouldn't be needed
> in your situation but it does allow some nice functionality. More
> details available if needed.
>
> Given collectd's write_http plugin you can send metrics to pretty much
> anything you want, although querying
>
> A+
> Dave

Captain Mathieu GRZYBEK
