graphite-dev team mailing list archive

[Question #232119]: AMQP, carbon-(relay or cache) and scaling Graphite

To: graphite-dev@xxxxxxxxxxxxxxxxxxx
From: gyre007 <question232119@xxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 09 Jul 2013 13:01:39 -0000
Reply-to: question232119@xxxxxxxxxxxxxxxxxxxxx
Sender: bounces@xxxxxxxxxxxxx

New question #232119 on Graphite:
https://answers.launchpad.net/graphite/+question/232119

Hi guys,

I'm looking for some information about how to approach scaling Graphite using AMQP.
From the research I've done, the best way to scale Graphite for a large number of metrics seems to be to run a carbon-relay on each Graphite node and have it distribute the metrics across all carbon-caches (ideally several per node) on all nodes in the cluster. If you use consistent hashing with a replication factor of 1, your metrics are spread across all nodes in the Graphite cluster, and carbon-relay, thanks to consistent hashing, will always write to and read from the correct carbon-cache where a particular metric is stored.
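To make the consistent-hashing part concrete, here is a minimal sketch of a hash ring that maps metric names to carbon-cache instances. It is not carbon's actual implementation: the `HashRing` class, the MD5-based `ring_position` function, the number of ring entries per node, and the instance names are all assumptions made up for illustration.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    # Hypothetical hash choice: the first 4 hex digits of an MD5 digest.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest()[:4], 16)


class HashRing:
    def __init__(self, nodes, replicas=100):
        # Each node gets `replicas` positions on the ring to even out the load.
        self._ring = sorted(
            (ring_position(f"{node}:{i}"), node)
            for node in nodes
            for i in range(replicas)
        )
        self._positions = [pos for pos, _ in self._ring]

    def get_node(self, metric: str) -> str:
        # First ring entry at or after the metric's position, wrapping around.
        idx = bisect.bisect_left(self._positions, ring_position(metric))
        return self._ring[idx % len(self._ring)][1]


# Hypothetical instance names: two carbon-caches on each of two servers.
caches = ["server1:a", "server1:b", "server2:a", "server2:b"]
ring = HashRing(caches)
for metric in ("web01.cpu.user", "web01.cpu.system", "db01.disk.used"):
    print(metric, "->", ring.get_node(metric))
```

The point is only that the destination is a pure function of the metric name and the node list, so every relay configured with the same nodes picks the same carbon-cache for a given metric.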

Now I'm wondering how to scale Graphite using AMQP - as that's how we are pushing stats to Graphite.
I can see that carbon-relay does not have the ability to talk to AMQP yet (https://github.com/graphite-project/carbon/issues/37), so distributing metrics across all carbon-caches in the cluster using consistent hashing is not possible when AMQP is the data feeder - please correct me if I'm wrong.

I take it that if I configure multiple carbon-caches on multiple servers in the Graphite cluster to read metrics from AMQP, each of them will store what it reads only on the server it is running on. This slightly worries me, because I will end up with whisper files for the same metrics on multiple servers, depending on which cache happened to read them from AMQP. For example, cache1:a reads some datapoints for a metric and stores them on server1, then cache2:a reads datapoints for the same metric at a different time and stores them on server2. Now, if server2 dies, I won't have some datapoints available to graph, i.e. the ones stored on server2. Is this assumption correct?
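The scenario above can be simulated in a few lines. This is a sketch only, and it assumes that both carbon-caches consume from one shared queue, so the broker hands messages to them in turn; whether that matches a given carbon AMQP setup depends on how the queues are declared and bound. The function name and the sample data are made up.

```python
from collections import defaultdict
from itertools import cycle


def deliver_round_robin(messages, consumers):
    """Hand each message to the next consumer in turn, as a shared queue would."""
    stored = {name: defaultdict(list) for name in consumers}
    for consumer, (metric, timestamp, _value) in zip(cycle(consumers), messages):
        # Each consumer only writes what it received to its own local disk.
        stored[consumer][metric].append(timestamp)
    return stored


# Six consecutive datapoints for a single metric.
messages = [("web01.cpu.user", t, 0.5) for t in range(6)]
stored = deliver_round_robin(messages, ["server1", "server2"])
for server, metrics in stored.items():
    print(server, dict(metrics))
# server1 {'web01.cpu.user': [0, 2, 4]}
# server2 {'web01.cpu.user': [1, 3, 5]}
```

Under that assumption each server ends up with only every other datapoint of the same metric, which is exactly the split described above.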

On top of that, given the nature of Whisper, this will take up quite a lot of storage on BOTH servers (unnecessarily, IMHO), depending on the retention configured in storage-schemas. Basically, if I have 1s:12hrs, this will create fairly big whisper files on BOTH server1 and server2, yet most of each file will be empty, because some datapoints will be written on server1 and some on server2.
Would this be the case with carbon-relay too? Or would carbon-relay never cause the same metric to be stored on different servers? In other words, does carbon-relay give you lower storage usage because a given metric is NOT spread across all servers in the cluster but is stored ON THE SAME SERVER?
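To put a number on the storage concern, here is a back-of-the-envelope calculation. It assumes the Whisper on-disk layout as I understand it (a 16-byte metadata header, 12 bytes of info per archive, and 12 bytes per datapoint) and that the file is fully preallocated when it is created; the function name is made up.

```python
METADATA_SIZE = 16      # aggregation type, max retention, xFilesFactor, archive count
ARCHIVE_INFO_SIZE = 12  # offset, seconds per point, number of points
POINT_SIZE = 12         # 4-byte timestamp + 8-byte double value


def whisper_file_size(archives):
    """archives: list of (seconds_per_point, retention_seconds) tuples."""
    size = METADATA_SIZE + ARCHIVE_INFO_SIZE * len(archives)
    for seconds_per_point, retention_seconds in archives:
        # Every slot in the retention window is allocated, written or not.
        size += (retention_seconds // seconds_per_point) * POINT_SIZE
    return size


one_second_for_12h = [(1, 12 * 3600)]
size = whisper_file_size(one_second_for_12h)
print(size)      # 518428 bytes per metric
print(size * 2)  # 1036856 bytes if the same metric exists on two servers
```

So under these assumptions a 1s:12hrs metric costs roughly 0.5 MB per server it lands on, regardless of how many of its slots are actually filled there.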

This makes me wonder about the use of carbon-relay for scaling Graphite. I get the benefit of it distributing writes across all carbon-caches in the cluster and of being able to find metrics quickly based on the consistent-hashing algorithm (when consistent hashing is used). Am I right in assuming that using carbon-relay does not by itself make your metrics data highly available? That is, if one of your Graphite nodes dies, it does not mean that you'll still be able to access the datapoints stored on the "dead" node. Or do you need to set the replication factor to 2 for that?
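The availability question can be illustrated with a small sketch that compares one copy per metric with two. It does not use carbon's hash ring; for brevity it ranks servers per metric by a hash weight (highest-random-weight selection), and the function names and server names are hypothetical.

```python
import hashlib


def weight(server, metric):
    # Hypothetical per-(server, metric) weight used only for ranking.
    return hashlib.md5(f"{server}|{metric}".encode("utf-8")).hexdigest()


def destinations(metric, servers, replication_factor):
    """Pick `replication_factor` distinct servers for a metric."""
    ranked = sorted(servers, key=lambda s: weight(s, metric), reverse=True)
    return ranked[:replication_factor]


def readable_from(metric, servers, replication_factor, dead):
    """Servers that still hold the metric after the `dead` ones are gone."""
    copies = destinations(metric, servers, replication_factor)
    return [s for s in copies if s not in dead]


servers = ["server1", "server2", "server3"]
metric = "web01.cpu.user"
for rf in (1, 2):
    copies = destinations(metric, servers, rf)
    first = copies[0]
    survivors = readable_from(metric, servers, rf, dead={first})
    print(f"replication={rf}: stored on {copies}, after {first} dies: {survivors}")
```

With one copy, losing the server that holds the metric leaves nothing to read from; with two copies on distinct servers, one copy survives, at the cost of storing every metric twice.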

I'd appreciate it if someone could shed some light on this for me.
Thanks
