graphite-dev team mailing list archive

Re: [Question #136096]: Collecting system metrics

To: graphite-dev@xxxxxxxxxxxxxxxxxxx
From: Pete Emerson <question136096@xxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 01 Dec 2010 23:42:33 -0000
Reply-to: question136096@xxxxxxxxxxxxxxxxxxxxx
Sender: bounces@xxxxxxxxxxxxx

Question #136096 on Graphite changed:
https://answers.launchpad.net/graphite/+question/136096

    Status: Answered => Open

Pete Emerson is still having a problem:
I'm still running 0.9.4 and am starting to migrate to 0.9.6.

We're currently running a single server, using gmond / gmetric to
collect our stats. A cron job then whips through all of those stats
on the metrics box and opens TCP sockets to the carbon-agent on
port 2003. A small set of metrics are sent direct-to-carbon.
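
For readers following along, here is a rough sketch of what a push to
carbon's plaintext listener on port 2003 could look like. The line
format ("path value timestamp", newline-terminated) is carbon's
plaintext protocol; the function names, the host, and the choice to
send a whole batch over a single connection are illustrative
assumptions, not a description of the cron job mentioned above.

```python
import socket
import time


def format_line(path, value, timestamp=None):
    """Build one plaintext-protocol line: 'path value timestamp\\n'."""
    if timestamp is None:
        timestamp = time.time()
    return "%s %s %d\n" % (path, value, int(timestamp))


def send_lines(lines, host="127.0.0.1", port=2003):
    """Send a batch of lines over one TCP connection.

    host and port are assumptions; 2003 is the port named in this thread.
    """
    payload = "".join(lines).encode("ascii")
    sock = socket.create_connection((host, port), timeout=5)
    try:
        sock.sendall(payload)
    finally:
        sock.close()


# Hypothetical metric name, for illustration only.
example = format_line("servers.web01.load", 1.5, 1291246953)
```

Calling send_lines([example]) would push that one datapoint; batching
many lines per connection avoids a TCP handshake per metric.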

I'm migrating to 0.9.6 with federated servers (one per datacenter) and
will have stats sent directly to the carbon-agent. I'm not sure whether
I'll pickle or not; I suppose if pickling saves overhead I'd do that.
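
As a point of comparison for the pickle option, the sketch below
builds a message in the framing carbon's pickle receiver expects: a
pickled list of (path, (timestamp, value)) tuples, prefixed with a
4-byte big-endian length. The function name is hypothetical, and
whether the 0.9.6 install has the pickle listener enabled (commonly
port 2004) is an assumption to check against your carbon.conf. Whether
it actually saves overhead is worth measuring rather than assuming.

```python
import pickle
import struct


def build_pickle_message(datapoints):
    """Frame a batch of datapoints for carbon's pickle receiver.

    datapoints: list of (path, (timestamp, value)) tuples.
    Returns a 4-byte big-endian length header followed by the payload.
    """
    # Protocol 2 is used here so older Python 2 carbon daemons can read it.
    payload = pickle.dumps(datapoints, protocol=2)
    header = struct.pack("!L", len(payload))
    return header + payload


# Hypothetical metric names, for illustration only.
batch = [
    ("servers.web01.load", (1291246953, 1.5)),
    ("servers.web01.mem_free", (1291246953, 2048)),
]
message = build_pickle_message(batch)
```

The resulting bytes can be written to a TCP socket with sendall(), in
the same way as the plaintext lines.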

I'd love to help provide a collection process per Question 136096; I'll
see if I can get permissions to open source what I write.

Currently the cron job updates over 100k metrics every minute (it
runs in about 6 seconds), writing to a 15GB tmpfs (RAM-backed, not solid
state drives, so if the box goes down, we lose all the metrics).

The hardware underneath is currently a Dell PowerEdge M600 with a
quad-core Xeon L5420 (2.5GHz, 2x6MB cache), 16GB of 667MHz memory, and
2x94GB 10k RPM SAS disks in RAID-1.

I can move this to 8 disks in a RAID-10 configuration if I need to, and
*might* be able to get dual-proc quad cores as well.

In addition, the server is running Xen, so there is a bit of overhead
there. If I really need to I can probably get this layer stripped out,
but I'd definitely rather leave it in.

Pete

-- 
You received this question notification because you are a member of
graphite-dev, which is an answer contact for Graphite.