graphite-dev team mailing list archive

[Question #243703]: Finding top N across 100,000 metrics

New question #243703 on Graphite:
https://answers.launchpad.net/graphite/+question/243703

Hey - 

We are using Graphite to track web analytics for a consumer website. We are trying to find which pages are the most popular in the last 10 hours.

We run 20 frontend web servers, which report metric counts into Graphite using the following namespace pattern:

<host>.page.<page_id> 

Our query looks like this:
target=highestMax(groupByNode(*.page.*,2,"sumSeries"),10)&from=-10h&format=json

The website has over 100,000 pages, and the query takes over 2 minutes to run. We have two m1.large EC2 instances, each running two carbon-cache daemons. Ideally I'd like the query to return in under 2 seconds, and I'm trying to figure out exactly where we should optimize Graphite.

I see a couple of options:
1. We're using Graphite wrong - querying across 100k metrics over a 10-hour window is inherently expensive, and we should look for a solution other than Graphite.
2. We need more carbon-cache instances on more machines - spread the caching load across more than two servers.
3. We should aggregate metrics using carbon-aggregator - consolidate metrics to remove the host from the namespace, thus reducing the number of series that need to be queried.
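For reference, option 3 might look something like the following aggregation-rules.conf entry. This is a sketch only: the `all` prefix and the 60-second aggregation frequency are assumptions, not existing names, and the frequency should match the finest retention in storage-schemas.conf.

```
# aggregation-rules.conf sketch (assumed "all" prefix and 60s frequency):
# collapse <host>.page.<page_id> from all 20 hosts into one series per page
all.page.<page_id> (60) = sum *.page.<page_id>
```

With a rule like this in place, the dashboard query could presumably drop groupByNode entirely, e.g. target=highestMax(all.page.*,10), so the render touches ~100k pre-summed series instead of 20 hosts x 100k pages at query time.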

Any suggestions would be greatly appreciated.

Thanks,

Jason

-- 
You received this question notification because you are a member of
graphite-dev, which is an answer contact for Graphite.