graphite-dev team mailing list archive

Re: [Question #136096]: Collecting system metrics

 

Question #136096 on Graphite changed:
https://answers.launchpad.net/graphite/+question/136096

    Status: Open => Answered

chrismd proposed the following answer:
Unfortunately I have always used custom-built monitoring agents for this
purpose, and so far I have been unsuccessful in convincing my various
employers to let me release them publicly. I have heard of some people
working on integrating with collectd (http://collectd.org/), but I have
never tried it personally.

I've found that building a monitoring agent generally isn't too hard and
can be quite educational. It's also fun because you can do it one piece
at a time. Plugin-based is always the way to go, and simplicity and
flexibility are the keys to making a good plugin API. The tool I am
currently using is a Python daemon that executes plugin scripts once a
minute and simply captures their stdout, which is a pickled dict (this
is very flexible but tied to Python; JSON would be more general-purpose)
containing lists of metrics, errors, and descriptive information about
the application the plugin is monitoring. My plugins use a simple
library of convenience functions for common operations (outputting their
results, scraping a log, saving/loading their state, etc.). My agents
send the data collected by their plugins to a central server, which
applies alarming logic, computes synthetic metrics, and ultimately
forwards the data on to Graphite.
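
To make the idea concrete, here is a rough sketch of the agent side. It
is not my actual tool (which uses pickle and is not public), just an
illustration under some assumptions: plugins print a JSON dict instead
of a pickled one, the keys "metrics", "errors" and "info" and the
function names run_plugin and to_graphite_lines are made up for this
example, and metrics are [name, value] pairs. The only part that is not
an assumption is the output line format, which is Graphite's plaintext
protocol ("<metric path> <value> <timestamp>").

```python
import json
import subprocess
import sys
import time


def run_plugin(command, timeout=30):
    """Run one plugin command and return its report as a dict.

    The report layout (keys "metrics", "errors", "info") is an
    assumption made for this sketch, not a fixed format.
    """
    empty = {"metrics": [], "errors": [], "info": {}}
    try:
        proc = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # The plugin could not be started or ran too long.
        return dict(empty, errors=[f"plugin failed to run: {exc}"])

    if proc.returncode != 0:
        msg = f"exit code {proc.returncode}: {proc.stderr.strip()}"
        return dict(empty, errors=[msg])

    try:
        report = json.loads(proc.stdout)
    except ValueError as exc:
        return dict(empty, errors=[f"unparseable output: {exc}"])

    if not isinstance(report, dict):
        return dict(empty, errors=["plugin output is not a JSON object"])

    # Fill in any keys the plugin left out.
    for key, default in empty.items():
        report.setdefault(key, default)
    return report


def to_graphite_lines(report, prefix, timestamp=None):
    """Format a report's metrics as Graphite plaintext protocol lines."""
    ts = int(time.time() if timestamp is None else timestamp)
    return [f"{prefix}.{name} {value} {ts}" for name, value in report["metrics"]]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python agent.py <plugin command...>
    for line in to_graphite_lines(run_plugin(sys.argv[1:]), "servers.example"):
        print(line)
```

A real agent would loop over a plugin directory once a minute and ship
the reports (including the errors and info) to the central server rather
than printing lines; that part is left out here on purpose.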

If you or anyone else is interested in working on such a tool, I would
be more than happy to help.

-- 
You received this question notification because you are a member of
graphite-dev, which is an answer contact for Graphite.