graphite-dev team mailing list archive

[Question #186672]: carbon does not resume receiving metrics until cache is empty

 

New question #186672 on Graphite:
https://answers.launchpad.net/graphite/+question/186672

v0.9.9 of carbon, whisper and graphite web.
Twisted 11.0.0
Python 2.6.7

With USE_FLOW_CONTROL set to True and a MAX_CACHE_SIZE of, say, 20000000 (though the exact value doesn't matter),
carbon pauses receiving metrics when its cache hits this max size. This works.

When the cache drains to 95% of that number, it's supposed to start receiving again. It doesn't do this until the cache completely drains.
This is a problem because the cache drains at a steadily decreasing rate. When it is very full, an update_many() includes a lot of points per update, which is efficient, so it drains fast. As it drains, the writes get smaller and smaller, and it drains slower and slower...
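The slowdown described above can be sketched with a toy model. This is only an illustration, not carbon's actual writer: it assumes receiving is paused (no new points arrive), that each write flushes every queued point of the fullest metric, and that a fixed number of writes fit in each tick. The function name, the tick concept, and the queue sizes are all hypothetical.

```python
import heapq

def simulate_drain(queue_sizes, updates_per_tick=1):
    """Toy model: each write flushes all queued points of the fullest metric."""
    # Max-heap via negated sizes, so the deepest queue is written first.
    heap = [-n for n in queue_sizes if n > 0]
    heapq.heapify(heap)
    drained_per_tick = []
    while heap:
        drained = 0
        for _ in range(updates_per_tick):
            if not heap:
                break
            drained += -heapq.heappop(heap)
        drained_per_tick.append(drained)
    return drained_per_tick

# Hypothetical cache: a few metrics with deep queues, many with shallow ones.
sizes = [500, 200, 100] + [10] * 5 + [1] * 20
per_tick = simulate_drain(sizes, updates_per_tick=2)
print(per_tick)
```

In this model the number of points removed per tick never increases, so the last few percent of the cache take most of the ticks to drain.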

Anyway, if I follow the Python code correctly, and judging from some debug lines I put in, it DOES hit the check in optimalWriteOrder() here:
  if state.cacheTooFull and MetricCache.size < CACHE_SIZE_LOW_WATERMARK:

and it calls this:

    events.cacheSpaceAvailable()

That event has the handler for un-pausing the receiver, and this executes:

  def resumeReceiving(self):
    # it DID successfully call this at the right time.
    log.listener("debugtrieger - resuming receiving.")
    self.transport.resumeProducing()
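
For reference, the pause/resume behavior expected here is a high/low watermark scheme. A minimal sketch of that idea, assuming a 95% low watermark as described above; the class and method names are hypothetical and are not carbon's or Twisted's API:

```python
class FlowControl:
    """Pause at a high watermark, resume below a low watermark (hysteresis)."""

    def __init__(self, max_size, low_fraction=0.95):
        self.max_size = max_size
        self.low_watermark = max_size * low_fraction
        self.paused = False
        self.events = []  # record of transitions, for illustration only

    def on_size_change(self, size):
        if not self.paused and size >= self.max_size:
            # Cache is full: stop accepting new metrics.
            self.paused = True
            self.events.append(("pause", size))
        elif self.paused and size < self.low_watermark:
            # Cache has drained below the low watermark: accept metrics again.
            self.paused = False
            self.events.append(("resume", size))
        return self.paused

fc = FlowControl(max_size=1000)
for size in (900, 1000, 980, 960, 949, 900):
    fc.on_size_change(size)
print(fc.events)
```

The gap between the two thresholds keeps the receiver from flapping between paused and resumed when the cache size hovers near the maximum.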


However, it doesn't actually start taking in metrics again.
Once the cache fully drains, it does, although at that point this method is not called again. It's almost as if there's only one thread, and the receiver thread is busy in the "write everything in the cache to disk" for loop in writer.py.
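
The single-thread guess above can be illustrated with a toy event loop. This models only the hypothesis, not confirmed behavior of carbon or Twisted: a callback queued while a long-running loop is busy does not run until that loop returns. All names here are hypothetical.

```python
from collections import deque

def run_single_threaded(cache_points, low_watermark):
    """Toy loop: callbacks queued during a long task run only after it ends."""
    pending = deque()  # callbacks waiting for the loop to become free
    log = []
    cache = list(range(cache_points))

    def resume_receiving():
        # Records how full the cache is when the resume actually happens.
        log.append(("resumed", len(cache)))

    signalled = False
    # One long-running task that drains the whole cache without yielding.
    while cache:
        cache.pop()
        if not signalled and len(cache) < low_watermark:
            pending.append(resume_receiving)  # scheduled, but not yet run
            log.append(("signalled", len(cache)))
            signalled = True

    # Only now does the loop get back to its queue of callbacks.
    while pending:
        pending.popleft()()
    return log

result = run_single_threaded(cache_points=100, low_watermark=95)
print(result)
```

In this sketch the resume is signalled just below the low watermark but only takes effect once the cache is empty, which matches the symptom described, though it does not prove this is the cause.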

I have no idea what self is in that last method, nor transport, so maybe the problem is actually in Twisted?

My only workaround is to make the cache size humongous, but weird things started happening after about 5 GB of RAM was used (60 GB of RAM in the box). So I'm guessing it doesn't like a cache map that huge.

Thanks. - Drew


