graphite-dev team mailing list archive
Message #01663
Re: [Question #177377]: Is there a Threshold plugin for Graphite?
Question #177377 on Graphite changed:
https://answers.launchpad.net/graphite/+question/177377
Status: Open => Answered
Nicholas Leskiw proposed the following answer:
First, Graphite is not a monitoring tool. It's a graphing tool. It
just happens that the data used to make graphs is usually the same data
people want to alarm on ;)
Second, this is a very common pattern. Most people quickly find out
that simple "if X goes above Y" alerts are too chatty: they just spam
your alarm console and send useless mails. The SLA (Service Level
Agreement) at my organization for detecting problems is 5 minutes, so we
compare 5-minute averages against the average for the same time last
week, the week before, and the week before that (sometimes further back)
to detect problems. Cry wolf too many times and people stop reading the
alarms coming from your system.
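To make the comparison above concrete, here is a minimal sketch, not our
actual alerting code. The function names and the 50% tolerance are
assumptions chosen for illustration; each window is a list of datapoints
for one 5-minute period, with None for missing values.

```python
from statistics import mean


def window_average(points):
    """Average the non-null values in one window; None if the window is empty."""
    values = [v for v in points if v is not None]
    return mean(values) if values else None


def is_anomalous(current_window, past_windows, tolerance=0.5):
    """Compare the current window's average with the average of past windows.

    tolerance is a hypothetical relative threshold: 0.5 means "alert when
    the current average differs from the historical baseline by more
    than 50%".
    """
    current = window_average(current_window)
    baselines = [avg for avg in map(window_average, past_windows)
                 if avg is not None]
    if current is None or not baselines:
        # Not enough data to decide; stay quiet rather than cry wolf.
        return False
    baseline = mean(baselines)
    if baseline == 0:
        # Avoid dividing by zero: any non-zero value is a change.
        return current != 0
    return abs(current - baseline) / abs(baseline) > tolerance


# Same 5-minute slot from each of the last three weeks (made-up numbers).
history = [
    [95, 105, 100, 100, 100],
    [100, 100, 100, 100, 100],
    [105, 95, 100, 100, 100],
]
normal_now = is_anomalous([100, 110, None, 90, 100], history)
dropped_now = is_anomalous([10, 10, 10, 10, 10], history)
```

Comparing against the same slot in earlier weeks means a normal Monday
morning peak does not trip the alarm, which a fixed "X above Y" rule
would.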
As for being 'reliant on the web frontend running', we're talking about
Apache here. It's pretty rock-solid, and I've never seen it crash, not
even after requesting the past 5 minutes of data for 5,000 metrics as a
test. You may get some errors if you string together some strange
functions, but that only makes a single request fail, not the whole
system. If we were using Nagios, we'd be 'reliant on the email system'
and 'reliant on the SNMP system' instead.
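For reference, a request for the past 5 minutes of a metric through the
web frontend's render API can be built roughly like this. This is only a
sketch: the host and metric name are made up, and the helper name is an
assumption. It relies on the render API's target, from, and format
parameters and on the timeShift() function to fetch the same window from
a week earlier.

```python
from urllib.parse import urlencode


def render_url(base_url, metric, weeks_back=0):
    """Build a render API URL for the last 5 minutes of one metric.

    weeks_back > 0 wraps the metric in timeShift() so the same 5-minute
    window is returned from that many weeks ago.
    """
    target = metric
    if weeks_back:
        target = 'timeShift(%s,"%dw")' % (metric, weeks_back)
    query = urlencode([
        ("target", target),
        ("from", "-5min"),
        ("format", "json"),
    ])
    return "%s/render?%s" % (base_url.rstrip("/"), query)


# Hypothetical host and metric, for illustration only.
now_url = render_url("http://graphite.example.com", "servers.web01.load")
last_week_url = render_url("http://graphite.example.com",
                           "servers.web01.load", weeks_back=1)
```

If one of these requests fails, only that check is affected; the next
poll simply tries again.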
I don't use collectd, so I can't speak to that.
--
You received this question notification because you are a member of
graphite-dev, which is an answer contact for Graphite.