graphite-dev team mailing list archive

[Question #217773]: Determine storage requirements and instance type for EC2

 

New question #217773 on Graphite:
https://answers.launchpad.net/graphite/+question/217773

Is there a readily available method for calculating the storage requirements for a Graphite deployment on EC2?  Suppose I had 1,000 instances and 10 counters.  How would I work out the storage and IO requirements?  The numbers presumably change at 100,000 instances or 1,000 counters, so how does the formula scale?  Is there any guidance beyond "try x, then tweak and try y"?  I realize there is probably no straightforward "you need an instance type of x with y GB available" answer, but how can I estimate the correct size?  Assume I want some level of detail on the 1,000 instances: how would I estimate the storage for that?  Adding more volumes isn't the issue so much as figuring out the IO requirements and how to add or remove nodes, handle failures, etc.
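To make the question concrete, here is a rough sketch of the kind of estimate I have in mind. It assumes the default Whisper backend, whose files are fixed-size: a 16-byte header, 12 bytes per archive, and 12 bytes per datapoint. It also assumes that "1,000 instances and 10 counters" means 10 counters per instance (one metric file each), and the retention schema shown (10-second points for 1 day, 60-second points for 30 days) is purely hypothetical. The function names are made up for illustration, and filesystem block and inode overhead is ignored.

```python
# Whisper on-disk layout constants, in bytes.
METADATA_SIZE = 16      # file header
ARCHIVE_INFO_SIZE = 12  # one entry per retention archive
POINT_SIZE = 12         # 4-byte timestamp + 8-byte double value


def whisper_file_bytes(retentions):
    """Size of one Whisper file.

    retentions: list of (seconds_per_point, retention_seconds) tuples.
    """
    points = sum(retention // step for step, retention in retentions)
    return METADATA_SIZE + ARCHIVE_INFO_SIZE * len(retentions) + POINT_SIZE * points


def estimate(instances, counters_per_instance, retentions):
    """Return (metric_count, total_bytes, datapoints_per_second)."""
    metrics = instances * counters_per_instance  # assumption: one file per counter per instance
    total_bytes = metrics * whisper_file_bytes(retentions)
    finest_step = min(step for step, _ in retentions)
    datapoints_per_second = metrics / finest_step
    return metrics, total_bytes, datapoints_per_second


# Hypothetical schema: 10s resolution for 1 day, 60s resolution for 30 days.
schema = [(10, 86400), (60, 30 * 86400)]
metrics, total_bytes, rate = estimate(1000, 10, schema)
print(metrics, total_bytes, rate)  # 10000 6221200000 1000.0
```

Under these assumptions, storage grows linearly with instances times counters, so 100,000 instances would need about 100 times the space (roughly 6.2 GB becomes roughly 622 GB). The datapoints-per-second figure is only the incoming rate, not disk IOPS: carbon-cache can batch several points for one metric into a single write, so the real write load depends on its configuration and on how much it holds in memory.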

It's also very unclear to me how to distribute the data files. I think this is what Ceres is meant to address, but everything I can find on it appears a bit dated, so I don't know whether it is the right way to go or what the implications of that choice would be.

Basically, given where the project is now, if you were planning for a potentially extremely large data set on EC2, what would you do? (Assume that an intermediate layer can do some level of near-real-time aggregation, so Graphite wouldn't be network bound and would receive batched data.)

-- 
You received this question notification because you are a member of
graphite-dev, which is an answer contact for Graphite.