graphite-dev team mailing list archive
Message #05987
Re: [Question #285063]: Twisted MemoryError / MetricCache is full
Question #285063 on Graphite changed:
https://answers.launchpad.net/graphite/+question/285063
Will gave more information on the question:
Ok, I've made the following adjustments:
=====
/opt/graphite/bin/carbon.conf:
MAX_CACHE_SIZE=10000000
MAX_UPDATES_PER_SECOND=50000
=====
10MM metrics/minute divided by 60 seconds, divided by 8 instances, is
about 21,000 metrics per instance per second, so a limit of 50000 should
be more than enough.
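As a quick check of that arithmetic, here is a minimal Python sketch. The
variable names are illustrative only, and it assumes the relay spreads
metrics evenly across all 8 instances, which may not hold in practice.

```python
# Figures taken from the message above; variable names are illustrative.
metrics_per_minute = 10_000_000
instances = 8
max_updates_per_second = 50_000  # MAX_UPDATES_PER_SECOND in carbon.conf

# Assumption: metrics are spread evenly across the instances.
per_instance_per_second = metrics_per_minute / 60 / instances
headroom = max_updates_per_second / per_instance_per_second

print(round(per_instance_per_second))  # 20833
print(round(headroom, 1))              # 2.4
```

So under an even spread, the configured limit is roughly 2.4 times the
expected per-instance rate.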
=====
/opt/graphite/bin/ccrelay.conf:
cluster lga
fnv1a_ch
0.0.0.0:2013=a
0.0.0.0:2113=b
0.0.0.0:2213=c
0.0.0.0:2313=d
0.0.0.0:2413=e
0.0.0.0:2513=f
0.0.0.0:2613=g
0.0.0.0:2713=h
;
match *
send to lga
;
=====
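To illustrate why the per-instance load depends on how metric names hash,
here is a rough Python sketch of the 32-bit FNV-1a hash with a plain
modulo over the 8 destinations. This is NOT how carbon-c-relay's fnv1a_ch
places metrics (it uses a consistent-hash ring); the modulo step, the
function names, and the sample metric names are all assumptions made for
illustration.

```python
from collections import Counter

def fnv1a_32(data: bytes) -> int:
    """Standard 32-bit FNV-1a hash."""
    h = 0x811C9DC5                          # FNV offset basis
    for byte in data:
        h ^= byte
        h = (h * 0x01000193) & 0xFFFFFFFF   # FNV prime, kept to 32 bits
    return h

def pick_instance(metric: str, instances: list[str]) -> str:
    # Simplification: plain modulo instead of a consistent-hash ring.
    return instances[fnv1a_32(metric.encode()) % len(instances)]

instances = list("abcdefgh")  # instance names from the cluster definition

# Hypothetical metric names, only to show how a spread could be inspected.
sample = [f"servers.host{i}.cpu.load" for i in range(10_000)]
spread = Counter(pick_instance(m, instances) for m in sample)
for name in instances:
    print(name, spread[name])
```

If one instance ends up with a noticeably larger share of the names, it
would see a higher update rate than the even-spread estimate suggests.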
ps out:
root 4996 77.9 7.1 12455596 9446004 ? Ssl 20:58 26:42
/opt/graphite/bin/relay -f /opt/graphite/bin/ccrelay.conf -l
/opt/graphite/storage/log/ccrelay/ccrelay.log -S 1 -D -P
/var/run/ccrelay.pid -q 150000000 -b 200000
=====
Graphs:
Graphite Stats: https://imgur.com/n28Q2Z5
Carbon-C-Relay Stats: https://imgur.com/ObHAum6
Looks like we could actually pare down the number of threads that
carbon-c-relay runs but it otherwise seems to be handling the load quite
well. However, I have some concerns at this point:
1) In the Graphite stats, committed points are always lower than metrics
received.
2) The carbon-c-relay logfile occasionally shows this for a random
instance:
(ERR) failed to write() to 10.201.12.199:2013: uncomplete write
The cache size on that instance approaches MAX_CACHE_SIZE within 15
minutes, and its RAM usage is significantly higher than on the other
instances. The message goes away after I kill and restart the process.
I'm not sure what to do here, but these changes have caused the cache
sizes to max out faster than before.
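For a sense of scale, the fill time implies a rough backlog growth rate.
The sketch below assumes the cache starts near empty and grows linearly
until it reaches MAX_CACHE_SIZE, and that the instance receives an even
one-eighth share of the traffic; neither assumption is confirmed by the
graphs, and the variable names are illustrative.

```python
# Figures from the message; variable names are illustrative.
max_cache_size = 10_000_000      # MAX_CACHE_SIZE in carbon.conf
minutes_to_fill = 15
expected_incoming = 10_000_000 / 60 / 8   # even share per instance, per second

# Assumption: linear growth from an empty cache.
net_growth_per_second = max_cache_size / (minutes_to_fill * 60)

# Under the even-share assumption, the implied commit rate is the difference.
implied_commit_rate = expected_incoming - net_growth_per_second

print(round(net_growth_per_second))  # 11111
print(round(implied_commit_rate))    # 9722
```

In other words, the affected instance would be falling behind by roughly
11,000 points per second, which is consistent with committed points
staying below metrics received.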
--
You received this question notification because your team graphite-dev
is an answer contact for Graphite.