
graphite-dev team mailing list archive

Re: [Question #158830]: MAX_CREATES_PER_MINUTE discarding metrics

 

Question #158830 on Graphite changed:
https://answers.launchpad.net/graphite/+question/158830

    Status: Open => Answered

Nicholas Leskiw proposed the following answer:
Several things:

1.) There are two distinct operations - creates and updates.  They are
very different.

Creates only happen once during the life of a .wsp file - updates
happen when the carbon-cache fills up and many datapoints are sent to
disk.

Here's the reasoning:

Let's say you set up a bunch of new monitoring, sending 30,000 new
metrics to Graphite every minute, and MAX_CREATES_PER_MINUTE is set to
500.  Let's further say that each .wsp file is going to be about 5MB
(large, but not uncommon if you're storing years of minutely data).

Graphite makes 500 new .wsp files the first minute (allocating nearly
2.5GB of disk space); the other 29,500 don't get created yet, and
their queued datapoints are dropped for now.
Next minute, the same 30,000 metrics are sent.  500 more metrics get
created (you're down to 29,000), but the first 500 still get their
data - that's an update, not a create. After 3 minutes, 28,500 are
left. After an hour, you've gotten all the files created, all the
updates are rolling along and you didn't impair the existing
monitoring by thrashing the disk while simultaneously creating all
those new files.
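
To make the arithmetic above concrete, here is a rough sketch of the
backlog draining at a fixed create rate. It only models the numbers in
this example (30,000 new metrics, a limit of 500); the function name
simulate_creates is made up for illustration and this is not carbon's
actual code.

```python
def simulate_creates(new_metrics, max_creates_per_minute):
    """Return how many metrics still lack a .wsp file after each minute.

    Simplified model: every minute, at most max_creates_per_minute of
    the not-yet-created metrics get a file; the rest wait.
    """
    remaining = new_metrics
    backlog = []
    while remaining > 0:
        created = min(max_creates_per_minute, remaining)
        remaining -= created
        backlog.append(remaining)
    return backlog


backlog = simulate_creates(30_000, 500)
print(backlog[:3])   # backlog after minutes 1, 2 and 3
print(len(backlog))  # minutes until every file exists
```

With these inputs the first three values are 29,500, 29,000 and
28,500, and the list has 60 entries, which matches the "after an hour"
figure.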

This was a commonplace occurrence at Orbitz (the forge in which
Graphite was created). They would install a new codebase, and ERMA
(Extensible Reusable Monitoring API) would start sending tens of
thousands of new metrics.  Since they were all new metrics, and many
of the applications weren't even taking customer traffic yet, it
wasn't critical that they all get created at the same time.

I hope this helps you understand the reason why it's set up like this.

2.) For perf testing - just send the data for a while; eventually all
the files will get created.

3.) For help with performance problems, please describe what kind of
performance problems you're having.  Also, carbon saves some
performance info about itself - check:
carbon.agents.HOSTNAME.updateOperations
carbon.agents.HOSTNAME.avgUpdateTime
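
If it helps, here is a small sketch that builds a render URL for those
two metrics so you can pull them as JSON. The base URL and hostname are
placeholders, the helper name carbon_stats_url is made up, and it
assumes your graphite-web version supports format=json on /render.

```python
from urllib.parse import urlencode


def carbon_stats_url(base_url, hostname, window="-1h"):
    """Build a /render URL for carbon's own update metrics.

    hostname is whatever appears under carbon.agents in your metric
    tree; base_url is your graphite-web address (both placeholders here).
    """
    targets = [
        f"carbon.agents.{hostname}.updateOperations",
        f"carbon.agents.{hostname}.avgUpdateTime",
    ]
    params = [("target", t) for t in targets]
    params += [("from", window), ("format", "json")]
    return f"{base_url.rstrip('/')}/render?{urlencode(params)}"


url = carbon_stats_url("http://graphite.example.com", "myhost")
print(url)
```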

Are you using memcached?  That caches both graphs and data used for
drawing graphs and can reduce both CPU usage and I/O.

Let us know.

-Nick


On Tue, May 24, 2011 at 9:50 AM, ziggy
<question158830@xxxxxxxxxxxxxxxxxxxxx> wrote:
> New question #158830 on Graphite:
> https://answers.launchpad.net/graphite/+question/158830
>
> I was performance testing Graphite when I discovered that only a small number of the metrics I was sending to Graphite were ever being written to disk.  I found this bit of relevant code in writer.py:
>
> elif createCount >= settings.MAX_CREATES_PER_MINUTE:
>   # dropping queued up datapoints for new metrics prevents filling up the entire cache
>   # when a bunch of new metrics are received.
>   try:
>     MetricCache.pop(metric)
>   except KeyError:
>     pass
>
> So am I correct that Graphite discards metrics in the queue once it hits MAX_CREATES_PER_MINUTE?  I was very surprised to see this behavior.  The comments in carbon.conf don't mention that setting this low will lose data; I thought it would just take longer to create new whisper files.  Is there a way to make Graphite non-lossy other than setting this very high?  We've had performance problems, so I was hoping to slow disk writes in order to increase Graphite's ability to receive metrics quickly, but it appears this is not the way to do that.  Can anyone verify I'm understanding this right?
>
> Thanks!
>
> --
> You received this question notification because you are a member of
> graphite-dev, which is an answer contact for Graphite.
>
> _______________________________________________
> Mailing list: https://launchpad.net/~graphite-dev
> Post to     : graphite-dev@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~graphite-dev
> More help   : https://help.launchpad.net/ListHelp
>
