openstack team mailing list archive

Re: [Metering] schema and counter definitions

 

On Mon, Apr 30, 2012 at 3:43 PM, Loic Dachary <loic@xxxxxxxxxxxx> wrote:

>  On 04/30/2012 08:03 PM, Doug Hellmann wrote:
>
>
>
> On Mon, Apr 30, 2012 at 11:43 AM, Loic Dachary <loic@xxxxxxxxxxxx> wrote:
>
>>   On 04/30/2012 03:49 PM, Doug Hellmann wrote:
>>
>>
>>
>> On Mon, Apr 30, 2012 at 6:46 AM, Loic Dachary <loic@xxxxxxxxxxxx> wrote:
>>
>>> On 04/30/2012 12:15 PM, Loic Dachary wrote:
>>> > We could start a discussion from the content of the following sections:
>>> >
>>> > http://wiki.openstack.org/EfficientMetering#Counters
>>>  I think the rationale of the counter aggregation needs to be explained.
>>> My understanding is that the metering system will be able to deliver the
>>> following information: 10 floating IPv4 addresses were allocated to the
>>> tenant during three months and were leased from provider NNN. From this,
>>> the billing system could add a line to the invoice : 10 IPv4, $N each =
>>> $10xN because it has been configured to invoice each IPv4 leased from
>>> provider NNN for $N.
>>>
>>> It is not the purpose of the metering system to display each IPv4 used,
>>> therefore it only exposes the aggregated information. The counters define
>>> how the information should be aggregated. If the idea was to expose each
>>> resource usage individually, defining counters would be meaningless as they
>>> would duplicate the activity log from each OpenStack component.
>>>
>>> What do you think ?
>>>
>>
>>  At DreamHost we are going to want to show each individual resource (the
>> IPv4 address, the instance, etc.) along with the charge information. Having
>> the metering system aggregate that data will make it difficult/impossible
>> to present the bill summary and detail views that we want. It would be much
>> more useful for us if it tracked the usage details for each resource, and
>> let us aggregate the data ourselves.
>>
>>  If other vendors want to show the data differently, perhaps we should
>> provide separate APIs for retrieving the detailed and aggregate data.
>>
>>  Doug
>>
>>    Hi,
>>
>> For the record, here is the unfinished conversation we had on IRC
>>
>> (04:29:06 PM) dhellmann: dachary, did you see my reply about counter
>> definitions on the list today?
>> (04:39:05 PM) dachary: It means some counters must not be aggregated.
>> Only the amount associated with it is but there is one counter per IP.
>> (04:55:01 PM) dachary: dhellmann: what about this: the id of the
>> resource controls the aggregation of all counters: if it is missing, all
>> resources of the same kind and their measures are aggregated. Otherwise
>> only the measures are aggregated.
>> http://wiki.openstack.org/EfficientMetering?action=diff&rev2=40&rev1=39
>> (04:55:58 PM) dachary: it makes me a little uncomfortable to define such
>> an "ad-hoc" grouping
>> (04:56:53 PM) dachary: i.e. you actuall control the aggregation by
>> choosing which value to put in the id column
>> (04:58:43 PM) dachary: s/actuall/actually/
>> (05:05:38 PM) ***dachary reading http://www.ogf.org/documents/GFD.98.pdf
>> (05:05:54 PM) dachary: I feel like we're trying to resolve a non problem
>> here
>> (05:08:42 PM) dachary: values need to be aggregated. The raw input is a
>> full description of the resource and a value ( gauge ). The question is how
>> to control the aggregation in a reasonably flexible way.
>> (05:11:34 PM) dachary: The definition of a counter could probably be
>> described as : the id of a resource and code to fill each column associated
>> with it.
>>
>> I tried to append the following, but the wiki kept failing.
>>
>> I propose that the counters be defined by a function instead of being
>> fixed. That helps address the issue of aggregating the bandwidth
>> associated with a given IP into a single counter.
>>
>> Alternate idea :
>>  * a counter is defined by
>>   * a name ( o1, n2, etc. ) that uniquely identifies the nature of the
>> measure ( outbound internet transit, amount of RAM, etc. )
>>   * the component in which it can be found ( nova, swift etc.)
>>  * and by columns, each one is set with the result of
>> aggregate(find(record),record) where
>>   * find() looks up the existing row, selected by the unique key
>> ( maybe the name and the resource id )
>>   * record is a detailed description of the metering event to be
>> aggregated (
>> http://wiki.openstack.org/SystemUsageData#compute.instance.exists: )
>>   * the aggregate() function returns the updated row. By default it just
>> adds the counter value to the old row returned by find()
>>
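One way to read the aggregate(find(record), record) proposal above, as a minimal sketch rather than an agreed design. The `Record` and `Row` field names, the in-memory `table`, and the `update()` helper are all assumptions made for illustration; only `find()` and `aggregate()` come from the proposal. The unique key is assumed to be the counter name plus the resource id, as the proposal tentatively suggests.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Record:
    """A metering event to be aggregated (field names are hypothetical)."""
    counter: str                 # nature of the measure, e.g. "o1"
    component: str               # where it was found, e.g. "nova"
    resource_id: Optional[str]   # None means "no specific resource"
    value: float


@dataclass(frozen=True)
class Row:
    """One aggregated row, keyed by (counter, resource_id)."""
    counter: str
    resource_id: Optional[str]
    total: float


Key = Tuple[str, Optional[str]]
table: Dict[Key, Row] = {}  # stand-in for the metering database


def find(record: Record) -> Optional[Row]:
    """Return the existing row for this record's unique key, if any."""
    return table.get((record.counter, record.resource_id))


def aggregate(old: Optional[Row], record: Record) -> Row:
    """Default aggregation: add the record's value to the old row."""
    if old is None:
        return Row(record.counter, record.resource_id, record.value)
    return replace(old, total=old.total + record.value)


def update(record: Record) -> Row:
    """Apply aggregate(find(record), record) and store the result."""
    row = aggregate(find(record), record)
    table[(row.counter, row.resource_id)] = row
    return row
```

With this reading, two events for the same IP accumulate into one row, while events for different IPs stay in separate rows. Passing `resource_id=None` for every event of a given counter would collapse all resources of that kind into a single row, which matches the "if the id is missing" rule from the IRC log.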
>
>  Would we want aggregation to occur within the database where we are
> collecting events, or should that move somewhere else?
>
> I assume the events collected by the metering agents will all be archived
> for auditing (or re-building the database)
> http://wiki.openstack.org/EfficientMetering?action=diff&rev2=45&rev1=44
>
> Therefore the aggregation should occur when the database is updated to
> account for a new event.
>
> Does this make sense ? I may have misunderstood part of your question.
>

I guess what I don't understand is why the aggregated data is written back
to the metering database at all. If it's in the same database, it seems
like it should be in a different "table" (or equivalent) so the original
data is left alone.
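To make the "different table" idea concrete, here is a minimal sketch using Python's standard sqlite3 module. The table names (`events`, `aggregates`), the column names, and the helper functions are assumptions chosen for illustration, not part of any agreed schema. Raw events are only ever appended; the aggregate table is derived data that can be thrown away and rebuilt.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with separate raw and aggregate tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE events (
            id          INTEGER PRIMARY KEY AUTOINCREMENT,
            counter     TEXT NOT NULL,
            resource_id TEXT NOT NULL,
            value       REAL NOT NULL
        );
        CREATE TABLE aggregates (
            counter     TEXT NOT NULL,
            resource_id TEXT NOT NULL,
            total       REAL NOT NULL,
            PRIMARY KEY (counter, resource_id)
        );
    """)
    return conn


def record_event(conn: sqlite3.Connection, counter: str,
                 resource_id: str, value: float) -> None:
    """Append the raw event, then update the aggregate in the same transaction."""
    with conn:
        conn.execute(
            "INSERT INTO events (counter, resource_id, value) VALUES (?, ?, ?)",
            (counter, resource_id, value),
        )
        cur = conn.execute(
            "UPDATE aggregates SET total = total + ? "
            "WHERE counter = ? AND resource_id = ?",
            (value, counter, resource_id),
        )
        if cur.rowcount == 0:  # no aggregate row yet for this key
            conn.execute(
                "INSERT INTO aggregates (counter, resource_id, total) "
                "VALUES (?, ?, ?)",
                (counter, resource_id, value),
            )


def rebuild_aggregates(conn: sqlite3.Connection) -> None:
    """Recompute every aggregate from the untouched raw events."""
    with conn:
        conn.execute("DELETE FROM aggregates")
        conn.execute(
            "INSERT INTO aggregates (counter, resource_id, total) "
            "SELECT counter, resource_id, SUM(value) FROM events "
            "GROUP BY counter, resource_id"
        )
```

Because the raw events are never modified, a detail view can read `events` directly, a summary view can read `aggregates`, and a different grouping can be introduced later by rebuilding from the same events.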

Maybe it's time to start focusing these discussions on user stories?
