
graphite-dev team mailing list archive

Re: [Question #206043]: Best practice for counter data

 

Question #206043 on Graphite changed:
https://answers.launchpad.net/graphite/+question/206043

    Status: Open => Answered

Michael Leinartas proposed the following answer:
Whether or not you should store rates is really a matter of preference.
Many do since it's the most readily useful for charting - integral() can
be used to get back to an ongoing summation if necessary. When storing
rates there are (at least) two ways it's commonly done - one is to do as
derivative() does and take the raw difference between sample values.
Another method (how statsd does it) is to store the per-second average
rate - that is, the difference between sample values divided by the
seconds between samples. The latter makes storage-aggregation config
simpler (the default of average works well) and doesn't require the user
to know the specific data precisions, but the former can be more
intuitive to some.
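
To make the two conventions concrete, here is a minimal Python sketch
(the function names and sample numbers are just for illustration, not
part of Graphite or statsd):

    def raw_difference(prev_value, curr_value):
        # Rate as the raw difference between successive counter samples,
        # i.e. the same quantity derivative() reconstructs at render time.
        return curr_value - prev_value

    def per_second_rate(prev_value, curr_value, prev_ts, curr_ts):
        # statsd-style per-second average rate: the difference divided
        # by the elapsed seconds between the two samples.
        return (curr_value - prev_value) / float(curr_ts - prev_ts)

    # A counter that went from 1000 to 1600 over a 60-second interval:
    # raw_difference(1000, 1600)          -> 600   (per interval)
    # per_second_rate(1000, 1600, 0, 60)  -> 10.0  (per second)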

As for your questions:
1. What is the maximum value for a counter value? Is it stored as an integer or as floating-point? (Single-precision floating point would only have effectively 23 bits of accuracy, and double-precision 52 bits; 64-bit counters would therefore not be usable)
All values are stored as doubles on disk and are Python floats internally - with Python 2.6+ you can run python -c 'import sys; print sys.float_info' for details on the limits and precision.
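
As an aside, a double's 53-bit significand means integer counter values
stay exact only up to 2**53; a quick way to see this (assuming any
recent Python):

    import sys

    print(sys.float_info)                      # mantissa bits, max value, epsilon, ...
    print(float(2**53) + 1 == float(2**53))    # True: counter values this large lose exactness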

2. What happens when a counter wraps around? Does the derivative() function take care of this? Does it assume that values are monotonically increasing, or would I get a huge negative spike when it wraps?
The derivative() function will give you the huge negative spike on wrap, but nonNegativeDerivative() handles a wrap and also lets you specify a max value: http://graphite.readthedocs.org/en/0.9.x/functions.html#graphite.render.functions.nonNegativeDerivative
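
For example, a render target along these lines (the metric path is
hypothetical) treats the series as a 32-bit counter and copes with the
wrap:

    nonNegativeDerivative(servers.web01.if_octets_rx.counter, 4294967295)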

3. For a series of counter data, I note that the aggregation function would have to be set to "last" instead of the default "average".
http://graphite.readthedocs.org/en/1.0/config-carbon.html
Is there a best practice for this? For example, should I add ".counter" to the end of all metrics which are counters?
In my opinion this (or some variation) is the only manageable way to set storage-aggregation policies. Without a clear convention like this it can be hard to keep up with new categories of metrics.
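
For instance, a storage-aggregation.conf rule following that convention
might look like this (the section name and pattern are only an example):

    [counters]
    pattern = \.counter$
    xFilesFactor = 0
    aggregationMethod = last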

4. When plotting a graph, you have to ask for "derivative" in the UI. Can I configure it so that all metrics matching a particular pattern automatically use the derivative function? Otherwise there's an extra UI step involved every time you draw one of these, which would be a good reason for storing the rate in the first place.
No, there's currently no way to create a template like this that would be applied automatically. Avoiding the extra step is indeed the reason many of us store rates.

5. As each counter value is submitted to graphite, it has a timestamp value. Does graphite/whisper store the exact timestamp against each data point, and use it when deriving rates? Or does it just use the timestamp to assign the data point into the nearest sample bin?
No, it's the nearest sample bin that's stored. This does introduce the possibility of some jitter, but in practice it doesn't seem to be significant enough to worry about (for my own cases at least).
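
A rough sketch of that binning (just the idea, not whisper's actual
code):

    def bin_timestamp(timestamp, precision):
        # Quantize the submitted timestamp down to the start of its bin,
        # e.g. with precision=60, 1339430417 -> 1339430400.
        return timestamp - (timestamp % precision)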

Hope this helps
