openstack team mailing list archive
Message #10904
Re: [Metering] schema and counter definitions
On Tue, May 1, 2012 at 11:49 AM, Doug Hellmann <doug.hellmann@xxxxxxxxxxxxx> wrote:
>
>
> On Tue, May 1, 2012 at 10:38 AM, Nick Barcet <nick.barcet@xxxxxxxxxxxxx> wrote:
>
>> On 05/01/2012 02:23 AM, Loic Dachary wrote:
>> > On 04/30/2012 11:39 PM, Doug Hellmann wrote:
>> >>
>> >>
>> >> On Mon, Apr 30, 2012 at 3:43 PM, Loic Dachary <loic@xxxxxxxxxxxx> wrote:
>> >>
>> >> On 04/30/2012 08:03 PM, Doug Hellmann wrote:
>> >>>
>> >>>
>> >>> On Mon, Apr 30, 2012 at 11:43 AM, Loic Dachary <loic@xxxxxxxxxxxx> wrote:
>> >>>
>> >>> On 04/30/2012 03:49 PM, Doug Hellmann wrote:
>> >>>>
>> >>>>
>> >>>> On Mon, Apr 30, 2012 at 6:46 AM, Loic Dachary <loic@xxxxxxxxxxxx> wrote:
>> >>>>
>> >>>> On 04/30/2012 12:15 PM, Loic Dachary wrote:
>> >>>> > We could start a discussion from the content of the following sections:
>> >>>> >
>> >>>> > http://wiki.openstack.org/EfficientMetering#Counters
>> >>>> I think the rationale of the counter aggregation needs
>> >>>> to be explained. My understanding is that the metering
>> >>>> system will be able to deliver the following
>> >>>> information: 10 floating IPv4 addresses were allocated
>> >>>> to the tenant during three months and were leased from
>> >>>> provider NNN. From this, the billing system could add a
>> >>>> line to the invoice : 10 IPv4, $N each = $10xN because
>> >>>> it has been configured to invoice each IPv4 leased from
>> >>>> provider NNN for $N.
>> >>>>
>> >>>> It is not the purpose of the metering system to display
>> >>>> each IPv4 used, therefore it only exposes the aggregated
>> >>>> information. The counters define how the information
>> >>>> should be aggregated. If the idea was to expose each
>> >>>> resource usage individually, defining counters would be
>> >>>> meaningless as they would duplicate the activity log
>> >>>> from each OpenStack component.
>> >>>>
>> >>>> What do you think ?
>> >>>>
>> >>>>
>> >>>> At DreamHost we are going to want to show each individual
>> >>>> resource (the IPv4 address, the instance, etc.) along with
>> >>>> the charge information. Having the metering system aggregate
>> >>>> that data will make it difficult/impossible to present the
>> >>>> bill summary and detail views that we want. It would be much
>> >>>> more useful for us if it tracked the usage details for each
>> >>>> resource, and let us aggregate the data ourselves.
>> >>>>
>> >>>> If other vendors want to show the data differently, perhaps
>> >>>> we should provide separate APIs for retrieving the detailed
>> >>>> and aggregate data.
>> >>>>
>> >>>> Doug
>> >>>>
>> >>> Hi,
>> >>>
>> >>> For the record, here is the unfinished conversation we had on IRC:
>> >>>
>> >>> (04:29:06 PM) dhellmann: dachary, did you see my reply about
>> >>> counter definitions on the list today?
>> >>> (04:39:05 PM) dachary: It means some counters must not be
>> >>> aggregated. Only the amount associated with it is but there
>> >>> is one counter per IP.
>> >>> (04:55:01 PM) dachary: dhellmann: what about this: the id of
>> >>> the resource controls the aggregation of all counters: if it
>> >>> is missing, all resources of the same kind and their measures
>> >>> are aggregated. Otherwise only the measures are aggregated.
>> >>> http://wiki.openstack.org/EfficientMetering?action=diff&rev2=40&rev1=39
>> >>> (04:55:58 PM) dachary: it makes me a little uncomfortable to
>> >>> define such an "ad-hoc" grouping
>> >>> (04:56:53 PM) dachary: i.e. you actuall control the
>> >>> aggregation by choosing which value to put in the id column
>> >>> (04:58:43 PM) dachary: s/actuall/actually/
>> >>> (05:05:38 PM) ***dachary reading
>> >>> http://www.ogf.org/documents/GFD.98.pdf
>> >>> (05:05:54 PM) dachary: I feel like we're trying to resolve a
>> >>> non problem here
>> >>> (05:08:42 PM) dachary: values need to be aggregated. The raw
>> >>> input is a full description of the resource and a value (
>> >>> gauge ). The question is how to control the aggregation in a
>> >>> reasonably flexible way.
>> >>> (05:11:34 PM) dachary: The definition of a counter could
>> >>> probably be described as : the id of a resource and code to
>> >>> fill each column associated with it.
>> >>>
>> >>> I tried to append the following, but the wiki kept failing.
>> >>>
>> >>> I propose that the counters be defined by a function instead
>> >>> of being fixed. That helps address the issue of
>> >>> aggregating the bandwidth associated with a given IP into a
>> >>> single counter.
>> >>>
>> >>> Alternate idea:
>> >>> * a counter is defined by
>> >>>   * a name ( o1, n2, etc. ) that uniquely identifies the
>> >>>     nature of the measure ( outbound internet transit, amount of
>> >>>     RAM, etc. )
>> >>>   * the component in which it can be found ( nova, swift etc.)
>> >>>   * and by columns, each one is set with the result of
>> >>>     aggregate(find(record),record) where
>> >>>     * find() looks for the existing column as found by
>> >>>       selecting with the unique key ( maybe the name and the
>> >>>       resource id )
>> >>>     * record is a detailed description of the metering event to
>> >>>       be aggregated
>> >>>       ( http://wiki.openstack.org/SystemUsageData#compute.instance.exists: )
>> >>>     * the aggregate() function returns the updated row. By
>> >>>       default it just += the counter value with the old row
>> >>>       returned by find()
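
The aggregate(find(record), record) idea quoted above could be sketched roughly as follows. This is only an illustration under assumptions: the Row and CounterTable classes, the record keys ("name", "resource_id", "value"), and the in-memory dict standing in for the database are all hypothetical and do not come from the wiki page. A record with no resource id falls into a single shared row, which mirrors the "if it is missing, all resources of the same kind are aggregated" suggestion from the IRC log.

```python
from dataclasses import dataclass


@dataclass
class Row:
    """One aggregated counter row, keyed by (name, resource_id)."""
    name: str
    resource_id: str
    value: float = 0.0


class CounterTable:
    """In-memory stand-in for the metering database (an assumption)."""

    def __init__(self):
        self._rows = {}

    def find(self, record):
        """Return the existing row for the record's key, or a fresh zero row."""
        key = (record["name"], record.get("resource_id", ""))
        return self._rows.get(key, Row(key[0], key[1]))

    def aggregate(self, row, record):
        """Default aggregation: add the record's value to the row found."""
        row.value += record["value"]
        return row

    def update(self, record):
        """Apply aggregate(find(record), record) and store the result."""
        row = self.aggregate(self.find(record), record)
        self._rows[(row.name, row.resource_id)] = row
        return row


table = CounterTable()
table.update({"name": "o1", "resource_id": "ip-1", "value": 100})
table.update({"name": "o1", "resource_id": "ip-1", "value": 50})
table.update({"name": "o1", "resource_id": "ip-2", "value": 7})
table.update({"name": "n2", "value": 3})  # no resource id: shared row
```

A counter that needs different behaviour would override aggregate(); the key selection stays in find().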
>> >>>
>> >>>
>> >>> Would we want aggregation to occur within the database where we
>> >>> are collecting events, or should that move somewhere else?
>> >> I assume the events collected by the metering agents will all be
>> >> archived for auditing (or re-building the database)
>> >>
>> http://wiki.openstack.org/EfficientMetering?action=diff&rev2=45&rev1=44
>> >> <
>> http://wiki.openstack.org/EfficientMetering?action=diff&rev2=45&rev1=44>
>> >>
>> >> Therefore the aggregation should occur when the database is
>> >> updated to account for a new event.
>> >>
>> >> Does this make sense? I may have misunderstood part of your question.
>> >>
>> >>
>> >> I guess what I don't understand is why the aggregated data is written
>> >> back to the metering database at all. If it's in the same database, it
>> >> seems like it should be in a different "table" (or equivalent) so the
>> >> original data is left alone.
>> > In my view the events are not stored in a database, they are merely
>> > appended to a log file. The database is built from the events with
>> > aggregated data. I now understand that you (and Joshua Harlow) think
>> > it's better to not aggregate the data and let the billing system do this
>> > job.
>>
>> My intent when writing the blueprint was that each event would be
>> recorded atomically in the database, as it is the only way to check
>> that we have not missed any. Aggregation should be done at the external
>> API level if the request is to get the sum of a given counter.
>>
>
> That matches what I was thinking. The "log file" that Loic mentioned would
> in fact be a database that can handle a lot of writes. We could use some
> sort of simple file format, but since we're going to have to read and parse
> the log anyway, we might as well use a tool that makes that easy.
>
> Aggregation could happen either in a metering API based on the query, or
> an external app could retrieve a large dataset and manage the aggregation
> itself.
>
>
>> What I missed in the blueprint, and what seems to be appearing clearly
>> now, is that an event needs to be able to carry the "object-reference"
>> for which it was collected, and this would seem highly necessary looking
>> at the messages in this thread. A metering event would essentially be
>> defined by (who, what, which) instead of a simple (who, what). As a
>> consequence we would need to extend the DB schema to add this
>> [which/object reference], and make sure that we carry it as well when we
>> work on the message API format definition.
>>
>> How does this sound?
>>
>
> I think so. A lot of these sorts of issues can probably be fixed by being
> careful about how we define the measurements. For example, I may want to be
> able to show a customer the network bandwidth used per server, not just per
> network. If we measure the bandwidth consumed by each VIF, the aggregation
> code can take care of summarizing by network (because we know where the VIF
> is) and/or server (because we know which server has the VIF).
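
A minimal sketch of the per-VIF measurement idea above, assuming each event carries the (who, what, which) triple plus the network and server the VIF belonged to at the time. The BandwidthEvent fields, the summarize() helper, and the sample identifiers are hypothetical; only the counter name o1 (outbound internet transit) is borrowed from the earlier discussion.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class BandwidthEvent:
    tenant: str      # who
    counter: str     # what
    vif: str         # which
    network: str     # where the VIF was attached when measured
    server: str      # which server owned the VIF when measured
    bytes_used: int


def summarize(events, key):
    """Sum bytes_used over events, grouped by the named event field."""
    totals = defaultdict(int)
    for event in events:
        totals[getattr(event, key)] += event.bytes_used
    return dict(totals)


events = [
    BandwidthEvent("t1", "o1", "vif-a", "net-1", "srv-1", 100),
    BandwidthEvent("t1", "o1", "vif-b", "net-1", "srv-2", 40),
    BandwidthEvent("t1", "o1", "vif-c", "net-2", "srv-1", 10),
]

by_network = summarize(events, "network")  # {'net-1': 140, 'net-2': 10}
by_server = summarize(events, "server")    # {'srv-1': 110, 'srv-2': 40}
```

Because the raw events stay per-VIF, the same data supports both views without storing either aggregate.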
>
> We may need to record more detail than a simple "which," though, because
> it may be possible to change some information relevant for calculating the
> billing rate later. For example, a tenant can resize an instance, which
> would usually cause a change in the billing rate. Some of the relationships
> might change, too (Is it possible to move a VIF between networks?).
>
> At first I thought this might require separate table definitions per
> resource type (instance, network, etc.) but re-reading the table of
> counters in EfficientMetering I guess this is handled by measuring things
> like CPU, RAM, and block storage as separate counters? So a single event
> for creating a new instance might result in several records being written
> to the database, with the "which" set to the instance identifier. The data
> could then be presented as a unified "resource usage" report for that
> server.
>
> I think that works, but it may make the job of calculating the bill
> harder. We are planning to follow the model of specifying rates per size,
> so we would have to figure out which combination of CPU, RAM, and root
> volume storage matches up with a given size to determine the rate.
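
To make the size-matching step concrete, here is a rough sketch that assumes the metering data for one instance arrives as separate per-counter records and that the billing side keeps its own table of sizes. The counter names ("cpu", "ram_mb", "root_gb"), the size names, and the rates are invented for illustration and are not taken from EfficientMetering.

```python
from decimal import Decimal

# Hypothetical billing-side table: (vcpus, RAM in MB, root disk in GB)
# mapped to a size name and an hourly rate.
SIZES = {
    (1, 512, 10): ("small", Decimal("0.02")),
    (2, 2048, 20): ("medium", Decimal("0.08")),
}


def rate_for(records):
    """Match one instance's per-counter records against the size table.

    Returns (size_name, hourly_rate), or None when the combination does
    not correspond to any known size.
    """
    by_counter = {r["counter"]: r["value"] for r in records}
    key = (by_counter["cpu"], by_counter["ram_mb"], by_counter["root_gb"])
    return SIZES.get(key)


instance_records = [
    {"which": "instance-42", "counter": "cpu", "value": 2},
    {"which": "instance-42", "counter": "ram_mb", "value": 2048},
    {"which": "instance-42", "counter": "root_gb", "value": 20},
]
match = rate_for(instance_records)
```

The None case is the awkward one: an unmatched combination means the billing system cannot pick a rate from the counters alone.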
>
> Another piece I've been thinking about is handling boundary conditions
> when resource create and delete events don't both fall inside a billing
> cycle (or within the granularity of the metering system). That shouldn't be
> part of logging the events, necessarily, but it could be a reusable
> component that feeds into producing the aggregated data (either through the
> API, or as a way of processing the results returned by the API).
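
One possible shape for that reusable component is a helper that clips a resource's lifetime to a billing cycle. The function name, its parameters, and the convention that a still-running resource has deleted=None are assumptions made for this sketch.

```python
from datetime import datetime


def billable_seconds(created, deleted, cycle_start, cycle_end):
    """Seconds of a resource's lifetime that fall inside one billing cycle.

    `deleted` may be None for a resource that still exists; it is then
    treated as running until the end of the cycle.
    """
    start = max(created, cycle_start)
    end = cycle_end if deleted is None else min(deleted, cycle_end)
    return max((end - start).total_seconds(), 0.0)


cycle_start = datetime(2012, 5, 1)
cycle_end = datetime(2012, 6, 1)

# Created before the cycle, deleted one day into it: one day is billable.
straddling = billable_seconds(
    datetime(2012, 4, 28), datetime(2012, 5, 2), cycle_start, cycle_end)

# Created and deleted entirely before the cycle: nothing is billable.
outside = billable_seconds(
    datetime(2012, 4, 1), datetime(2012, 4, 2), cycle_start, cycle_end)
```

The same function works for a metering granularity window by passing the window bounds instead of the cycle bounds.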
>
>> >> Maybe it's time to start focusing these discussions on user stories?
>> >>
>> > I agree. Would you like to go first ?
>>
>
> These are "things that might happen" use cases rather than "user stories,"
> but let's see where they take us:
>
> 1. User creates an instance, waits some period of time, then terminates it.
> - Vary the period of time to allow the events to both fall within the
> metering granularity window, to overlap an entire window, to start in one
> window and end in another.
> - The same variations for "billing cycle" instead of "metering
> granularity window."
> 2. User creates an instance, waits some period of time, then resizes it.
> - Vary the period of time as above.
> - Do we need variations for resizing up and down?
> 3. User creates an instance but it fails to create properly (provider
> issue).
> 4. User creates an instance but it fails to boot after creation (bad
> image).
> 5. User creates volume storage, adds it to an existing instance, waits a
> period of time, then deletes the volume.
> - Vary the period of time as above.
> 6. User creates volume storage, adds it to an existing instance, waits a
> period of time, then terminates the instance (I'm not sure what happens to
> the volume in that case, maybe it still exists?)
>
> A provider-related story might be:
>
> 1. As a provider, I can query the metering API to determine the activity
> for a tenant within a given period of time.
>
> Although that's pretty vague. :-)
>
I thought of another provider story:
2. As a provider, I can install a metering plugin to start collecting data
about events not handled by the core metering app.