graphite-dev team mailing list archive
Message #02854
[Question #201494]: Not all metrics are saved on EC2 installation
New question #201494 on Graphite:
https://answers.launchpad.net/graphite/+question/201494
I am currently evaluating how Graphite performs when handling 100k metrics per minute. I've created two identical setups, one on a local VM and one on a medium EC2 instance, and written a script that posts metrics of the form "systemN.loadavg_1min {rand} {now}", with N ranging from 1 to 50k and a random metric value. The script sleeps for 0.0006s after each metric, which works out to 100k metrics per minute.
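For reference, a minimal sketch of such a load script. It assumes carbon's plaintext listener ("metric value timestamp" per line) on localhost port 2003; the host, port, and function names here are illustrative and not taken from the actual script.

```python
import random
import socket
import time

DELAY = 0.0006   # seconds slept after each metric, as described above
COUNT = 50000    # N ranges from 1 to 50k


def metric_line(n, value, now):
    """Format one datapoint as "systemN.loadavg_1min value timestamp"."""
    return "system%d.loadavg_1min %s %d\n" % (n, value, now)


def send_metrics(host="localhost", port=2003, count=COUNT, delay=DELAY):
    """Send one random datapoint for each of system1..systemN, then return.

    host and port are assumptions (carbon's default plaintext listener).
    """
    sock = socket.create_connection((host, port))
    try:
        for n in range(1, count + 1):
            line = metric_line(n, random.random(), int(time.time()))
            sock.sendall(line.encode("ascii"))
            time.sleep(delay)
    finally:
        sock.close()
```

Calling send_metrics() in a loop keeps the stream going; one pass over 50k metrics takes at least 30 seconds because of the sleeps alone.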
After a while I tried counting the number of directories in the storage location locally:
me@ubuntu:~/graphite-dev$ ls /opt/graphite/storage/whisper/ | wc
50000 50000 588889
and on EC2 (whisper dir is symlinked to /mnt):
ubuntu@ip-x-x-x-x:/opt/graphite$ ls /mnt/whisper/ | wc
31998 31998 372865
The number 31998 does not grow and the strangest thing is that when I delete /mnt/whisper completely, create it back and restart the script, the directory count stops at 31998 again.
console.log contains entries like this:
26/06/2012 12:27:37 :: Unhandled Error
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 504, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/local/lib/python2.7/dist-packages/twisted/python/threadpool.py", line 167, in _worker
    result = context.call(ctx, function, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/twisted/python/context.py", line 118, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/usr/local/lib/python2.7/dist-packages/twisted/python/context.py", line 81, in callWithContext
    return func(*args,**kw)
--- <exception caught here> ---
  File "/opt/graphite/lib/carbon/writer.py", line 158, in writeForever
    writeCachedDataPoints()
  File "/opt/graphite/lib/carbon/writer.py", line 118, in writeCachedDataPoints
    whisper.create(dbFilePath, archiveConfig, xFilesFactor, aggregationMethod, settings.WHISPER_SPARSE_CREATE)
  File "/usr/local/lib/python2.7/dist-packages/whisper.py", line 327, in create
    fh = open(path,'wb')
exceptions.IOError: [Errno 2] No such file or directory: '/opt/graphite/storage/whisper/system31851/loadavg_1min.wsp'
The permissions are evidently OK, since most of the directories do get created; only some do not. The box has 1 CPU and 4G of memory, and the /mnt filesystem has 300GB+ of free space.
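The Errno 2 in the traceback is raised by open(path, 'wb'), which means some component of the path (presumably the system31851 directory) did not exist when whisper tried to create the file. A minimal sketch for checking whether a new directory can still be created by hand under the storage root, and which OS error comes back if it cannot. The helper name try_mkdir and the test path are hypothetical, not part of carbon or whisper.

```python
import errno
import os


def try_mkdir(path):
    """Try to create a single directory.

    Returns None on success, or the symbolic errno name
    (for example "ENOENT" or "EEXIST") if os.mkdir fails.
    """
    try:
        os.mkdir(path)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return None
```

For example, try_mkdir("/mnt/whisper/manual_test") run as the carbon user on the EC2 box would show whether directory creation itself fails there, and with which error code.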
I have set MAX_CACHE_SIZE to 100000 to force carbon to write the data to disk sooner; MAX_UPDATES_PER_SECOND and MAX_CREATES_PER_SECOND are both "inf". However, the disk usage is not high:
ubuntu@ip-x-x-x-x:/opt/graphite$ iostat -dxk 10
Device:    rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
xvdap1       0.00     0.32    0.48    0.55     5.89     5.98    23.12     0.01    6.35    8.14    4.77   2.45   0.25
xvdb         0.00   304.39    4.97   60.72    38.24  1460.47    45.63    10.81  164.52    5.54  177.54   1.26   8.28
Since the logs show "Unhandled Error", my guess is that the Python writer threads are dying and taking part of the metrics with them.
How can I fix that?
--
You received this question notification because you are a member of
graphite-dev, which is an answer contact for Graphite.