[collectd] Writing to a set of Graphite relays

Tom Throckmorton throck at gmail.com
Mon Nov 24 18:27:09 CET 2014


On Fri, Nov 21, 2014 at 4:48 PM, Justin Lloyd <jllwyd at gmail.com> wrote:

> Yes, if a relay becomes unavailable, metrics from certain hosts would get
> lost until the relay was restored. It's not ideal but it's an improvement
> over our current architecture. This is in AWS and I'm using Salt for
> configuration management, so ideally I'd be able to spin up a replacement
> relay pretty quickly in the event of a complete relay failure.
>
>
My goal is to lose as few metrics as possible (a single poll-cycle or two)
during an infrastructure blip, so the receiving layer must be
redundant/resilient.



> I am still considering a possible way to use a load balancer (ELB since
> this is in AWS) but, as I mentioned, I have other considerations that may
> limit my ability to do that.
>

FWIW, I'm using ELB to front pairs of relays, and have an issue I haven't
yet been able to solve where restarting/stopping one of the relays does not
lead to a graceful reconnection of clients; they remain stuck in CLOSE_WAIT
for a while, until restarted.  There is apparently some history of other
services fronted by ELBs whose clients end up stuck in this state, so I'm
still looking for a fix and/or alternate configurations.
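
For reference, the client side of the ELB approach is dead simple - a single
write_graphite destination pointed at the load balancer's DNS name.  A minimal
sketch (the hostname below is hypothetical; adjust port/protocol to taste):

```
LoadPlugin write_graphite

<Plugin write_graphite>
  <Node "relay-elb">
    # Hypothetical ELB endpoint fronting the relay pair
    Host "graphite-relay-elb.example.com"
    Port "2003"
    Protocol "tcp"
    LogSendErrors true
  </Node>
</Plugin>
```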


-tt


>
>
> On Fri, Nov 21, 2014 at 1:11 PM, Tom Throckmorton <throck at gmail.com>
> wrote:
>
>> On Fri, Nov 21, 2014 at 3:36 PM, Justin Lloyd <jllwyd at gmail.com> wrote:
>>
>>> Load balancing and/or round-robin DNS were not appropriate for our
>>> situation given other architectural and application considerations in our
>>> environment.
>>>
>>> My solution turned out to be simpler than I expected once I put two and
>>> two together with the multi-node capability of the write_graphite plugin
>>> combined with a PostCacheChain, like
>>> https://gist.github.com/anonymous/b3c1cd692632216c4c35.
>>>
>>
>>
>> Ah, cool - like this ->
>> https://collectd.org/wiki/index.php/Match:Hashed/Config  So you're
>> making collectd send some metrics to one relay, some to another, etc.  What
>> happens when one of the relays becomes unavailable?  Not trying to poke
>> holes, just trying to understand your viewpoint - there are a couple of
>> ways to achieve distribution/redundancy here, and it's always good to hear
>> other ideas...
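
[For readers of the archive: a rough sketch of what that hashed-match approach
could look like in collectd.conf - a PostCacheChain that hashes each value onto
one of two write_graphite nodes.  Hostnames and node names here are made up,
and the gist linked above may differ in detail:]

```
LoadPlugin match_hashed
LoadPlugin write_graphite

<Plugin write_graphite>
  <Node "relay-a">
    Host "relay-a.example.com"
    Port "2003"
  </Node>
  <Node "relay-b">
    Host "relay-b.example.com"
    Port "2003"
  </Node>
</Plugin>

PostCacheChain "HashToRelays"
<Chain "HashToRelays">
  <Rule "to-relay-a">
    <Match "hashed">
      # This instance handles hash bucket 0 of 2
      Match 0 2
    </Match>
    <Target "write">
      Plugin "write_graphite/relay-a"
    </Target>
  </Rule>
  <Rule "to-relay-b">
    <Match "hashed">
      # This instance handles hash bucket 1 of 2
      Match 1 2
    </Match>
    <Target "write">
      Plugin "write_graphite/relay-b"
    </Target>
  </Rule>
</Chain>
```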
>>
>> -tt
>>
>>
>>
>>
>>>
>>>
>>>
>>>
>>>
>>> On Thu, Nov 20, 2014 at 6:26 PM, Tom Throckmorton <throck at gmail.com>
>>> wrote:
>>>
>>>> On Thu, Nov 20, 2014 at 2:33 PM, Justin Lloyd <jllwyd at gmail.com> wrote:
>>>>
>>>>> Hey all,
>>>>>
>>>>> I'm looking for a way to have Collectd be able to send to a set of
>>>>> Graphite relays. Basically right now I have a three-layered Graphite setup,
>>>>> with a layer of four relays (for load distribution and server redundancy)
>>>>> sending to a layer of four aggregators that send to a set of carbon-cache
>>>>> processes on the cache server. I need my Collectd agents to be able to send
>>>>> to any of the four relays but I can't figure out how to accomplish this.
>>>>>
>>>>
>>>> Hi Justin;
>>>>
>>>> If you want clients to truly send to *any* of the relay nodes, then
>>>> maybe you want to look at fronting your relays with load-balancing, or
>>>> maybe even round-robin DNS?  Each has its benefits and tradeoffs, of
>>>> course, but in either case the collectd config is as simple as having a
>>>> single relay destination.
>>>>
>>>>
>>>>> I was hoping the filtering
>>>>> <http://collectd.org/documentation/manpages/collectd.conf.5.shtml#filter_configuration> capability
>>>>> would help but I don't see a way to specify a relay-specific write_graphite
>>>>> plugin per rule.
>>>>>
>>>>
>>>> In that case, would you be sending to *all* of the relays? (vs. any)
>>>>
>>>> -tt
>>>>
>>>>
>>>>>
>>>>> Any suggestions? I did come across backstop
>>>>> <https://github.com/obfuscurity/backstop> but I didn't want to
>>>>> introduce a Rack app into the mix.
>>>>>
>>>>> Thanks,
>>>>> Justin
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> collectd mailing list
>>>>> collectd at verplant.org
>>>>> http://mailman.verplant.org/listinfo/collectd
>>>>>
>>>>>
>>>>
>>>
>>
>