graphite-dev team mailing list archive

Re: [Question #186672]: carbon does not resume receiving metrics until cache is empty

To: graphite-dev@xxxxxxxxxxxxxxxxxxx
From: Sidnei da Silva <question186672@xxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 09 Feb 2012 18:01:14 -0000
Reply-to: question186672@xxxxxxxxxxxxxxxxxxxxx
Sender: bounces@xxxxxxxxxxxxx

Question #186672 on Graphite changed:
https://answers.launchpad.net/graphite/+question/186672

    Status: Open => Answered

Sidnei da Silva proposed the following answer:
If I understand correctly, the issue is that the check for available
cache space happens only once, at the beginning of 'optimalWriteOrder'
(http://bazaar.launchpad.net/~graphite-dev/graphite/main/view/head:/carbon/lib/carbon/writer.py#L49).

This should be obvious from the logs, I think. In my case, the time
between log entries saying 'Sorted %d cache queues in %.6f seconds' was
upwards of 40 minutes while we were running rsync to get logs from
other machines, due to heavy concurrent IO. That means
cacheSpaceAvailable would not get called again for that long, even
if cache space had become available in the meantime.

My understanding is that the check above could be performed more often,
perhaps every N times that function yields, where N is a reasonable,
configurable number or even a ratio of MAX_CACHE_SIZE (say,
MAX_CACHE_SIZE * 0.5).
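
A minimal sketch of that idea, not the actual carbon code: the
function name, the 'has_space' and 'resume' callables, and the
'recheck_every' parameter are all hypothetical stand-ins for the
cache-size check and the cacheSpaceAvailable event. It only shows the
shape of re-running the check every N yields instead of once per sort.

```python
def write_order_with_recheck(queues, has_space, resume, recheck_every=100):
    """Yield (metric, datapoints) pairs, largest queue first.

    has_space     -- hypothetical callable: True when the cache has room again
    resume        -- hypothetical callable standing in for cacheSpaceAvailable
    recheck_every -- hypothetical N: repeat the check after every N yields
    """
    # Check once up front, as the message describes the existing behaviour.
    if has_space():
        resume()

    # Largest queues first.
    ordered = sorted(queues.items(), key=lambda kv: len(kv[1]), reverse=True)

    for yielded, (metric, datapoints) in enumerate(ordered, start=1):
        yield metric, datapoints
        # Proposed addition: repeat the check periodically, so that a slow
        # pass over the queues does not delay resuming for the whole pass.
        if yielded % recheck_every == 0 and has_space():
            resume()
```

With slow disk IO each yield may take a long time to come back, so a
smaller N trades a little extra checking for a shorter delay before
receiving resumes.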

Another interesting bit from looking at that function is that if there
are more creates than MAX_CREATES_PER_MINUTE, new data will be dropped
from the cache even if the cache is not full. Seems a bit controversial
to me. On one hand, if we don't drop it then yes, the cache could be
filled with new metrics only, not leaving space for existing metrics,
but OTOH it might end up dropping new data even if there's space
available in the cache, so not sure what to do there.
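
To make the trade-off concrete, here is a simplified sketch, again
not the actual carbon code. The function, its parameters and the
'drop_only_when_full' switch are hypothetical, and the per-minute
window is left out: 'max_creates' is simply the create budget for one
pass. With the switch off, new metrics over the budget are dropped
regardless of cache space, as described above. With it on, they are
only dropped when the cache is full and are otherwise left queued.
This is one possible middle ground, not something settled here.

```python
def classify_metrics(metrics, existing, max_creates, cache_is_full,
                     drop_only_when_full=False):
    """Split metric names into (write, drop, defer) lists.

    existing            -- set of metrics that already have a database file
    max_creates         -- create budget for this pass (stand-in for the
                           MAX_CREATES_PER_MINUTE limit)
    cache_is_full       -- whether the cache has reached its size limit
    drop_only_when_full -- hypothetical switch for the alternative policy
    """
    write, drop, defer = [], [], []
    creates = 0
    for metric in metrics:
        if metric in existing:
            write.append(metric)       # existing metric: always written
        elif creates < max_creates:
            creates += 1
            write.append(metric)       # new metric within the create budget
        elif cache_is_full or not drop_only_when_full:
            drop.append(metric)        # over budget: data is discarded
        else:
            defer.append(metric)       # over budget, but there is room to wait
    return write, drop, defer
```

The deferred case is where the risk mentioned above lives: if many new
metrics keep arriving, deferred queues can crowd out existing metrics
until the cache does fill up.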

-- 
You received this question notification because you are a member of
graphite-dev, which is an answer contact for Graphite.