
openstack team mailing list archive

Re: [metering] resources metadata

 

On Mon, May 14, 2012 at 1:04 PM, Loic Dachary <loic@xxxxxxxxxxxx> wrote:

>  On 05/14/2012 04:15 PM, Doug Hellmann wrote:
>
>
>
> On Fri, May 11, 2012 at 3:55 PM, Loic Dachary <loic@xxxxxxxxxxxx> wrote:
>
>>
>> > - The interesting metadata for a resource may depend on the type of
>> > resource. Do we need separate "tables" for that or can we normalize
>> > somehow?
>> > - How do we map a resource to the correct version of its metadata at
>> > any given time? Timestamps seem brittle.
>> > - Do we need to reflect the metadata in the aggregation API?
>> >
>> Hi,
>>
>> I started a new thread for the "metadata" topic. I suspect it deserves
>> it. Although I was reluctant to acknowledge that the metadata should be
>> stored by the metering, yesterday's meeting made me realize that it was
>> mandatory. The compelling reason (for me ;-) is that it would be much
>> more difficult to implement a billing system if the metering does not
>> provide a simple way to extract metadata and display it in a
>> human-readable way (or one meaningful to accountants?).
>>
>> I see two separate questions:
>>
>> a) how to store and query metadata?
>> b) what are the semantics of the metadata for a given resource?
>>
>> My hunch is that there will never be a definitive answer to b) and that
>> the best we can do is to provide a format and leave the semantics to the
>> documentation of the metering system, explaining the metadata of a resource.
>>
>> Regarding the storage of the metadata, the metering could listen / poll
>> events creating / updating / deleting a given resource and store a history
>> log indexed by the resource id. Something like:
>>
>> { meter_type: TTT,
>>   resource_id: RRR,
>>   metadata: [{ version: VVVV,
>>                timestamp: TIME1,
>>                payload: PAYLOAD1 },
>>              { version: VVVV,
>>                timestamp: TIME3,
>>                payload: PAYLOAD2 }]
>> }
>>
>> With PAYLOAD being the resource-dependent metadata, which depends on the
>> type of the resource. The metadata array is an ordered list of the
>> successive states of the resource over time. The VVVV version accounts for
>> changes in the format of the payload.
>>
>> The query would be :
>>
>> GET /resource/<meter_type>/<resource_id>/<TIME2>
>>
>> and it would return PAYLOAD1 if TIME2 is in the half-open range
>> [TIME1, TIME3), i.e. TIME1 <= TIME2 < TIME3.
>>
>> I'm not sure why you think "timestamp is brittle". Maybe I'm missing
>> something.
>>
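
As a rough sketch of the lookup described above, the history could be searched with a binary search on the timestamps. The class and field names here (`ResourceHistory`, `MetadataEntry`, `payload_at`) and the use of plain numbers as timestamps are assumptions made for illustration; nothing like this exists in the metering code yet.

```python
import bisect
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class MetadataEntry:
    version: int      # format version of the payload (VVVV above)
    timestamp: float  # time at which this state became current
    payload: Any      # resource-dependent metadata


@dataclass
class ResourceHistory:
    meter_type: str
    resource_id: str
    # Entries are kept sorted by timestamp, oldest first.
    metadata: List[MetadataEntry] = field(default_factory=list)

    def record(self, entry: MetadataEntry) -> None:
        """Insert an entry, keeping the list ordered by timestamp."""
        keys = [e.timestamp for e in self.metadata]
        self.metadata.insert(bisect.bisect_right(keys, entry.timestamp), entry)

    def payload_at(self, when: float) -> Optional[Any]:
        """Return the payload in effect at `when`, or None if `when`
        predates the first recorded state."""
        keys = [e.timestamp for e in self.metadata]
        index = bisect.bisect_right(keys, when) - 1
        return self.metadata[index].payload if index >= 0 else None
```

With entries recorded at TIME1 and TIME3, `payload_at(TIME2)` gives PAYLOAD1 for any TIME2 from TIME1 up to, but not including, TIME3.
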
>
>  Each set of metering data will need to be associated with the
> appropriate metadata from the resource at the time the metering information
> was collected. Metadata changes and metering events occur at different
> rates, though, so the timestamps of the metadata records are unlikely
> to match exactly with the values in the metering records. Depending on the
> clock resolution, it would be possible to have metadata changes and meter
> data with the same timestamp, resulting in an incorrect association.
>
> Indeed, good point.
>
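
To make the same-timestamp problem concrete, here is a small sketch. The one-second clock resolution, the "small"/"large" payloads, and the helper names `truncate` and `lookup` are all assumptions chosen for the example.

```python
import math
from typing import List, Optional, Tuple


def truncate(t: float, resolution: float) -> float:
    """Round a timestamp down to the clock resolution."""
    return math.floor(t / resolution) * resolution


def lookup(history: List[Tuple[float, str]], when: float,
           inclusive: bool) -> Optional[str]:
    """Return the latest payload whose timestamp is <= `when` (inclusive)
    or < `when` (strict). `history` is sorted oldest first."""
    result = None
    for timestamp, payload in history:
        if timestamp < when or (inclusive and timestamp == when):
            result = payload
    return result


RESOLUTION = 1.0  # hypothetical one-second clock

# The resource starts as "small" and is resized to "large" at t=10.7.
history = [(truncate(0.0, RESOLUTION), "small"),
           (truncate(10.7, RESOLUTION), "large")]

# One sample is taken just before the resize and one just after it.
sample_before = truncate(10.2, RESOLUTION)
sample_after = truncate(10.9, RESOLUTION)

# After truncation all three events carry the timestamp 10.0, so both
# samples get the same answer whichever comparison is used.
print(lookup(history, sample_before, inclusive=True))
print(lookup(history, sample_after, inclusive=False))
```

An inclusive comparison attributes both samples to "large" and a strict one attributes both to "small", so one of the two samples is mis-associated either way.
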

Although it turns out the case I was actually worried about, resizing
instances, may be supported by only some hypervisors. As a result, this is
less of a concern and I could afford to have us postpone handling changing
metadata until a later version of ceilometer. We still need to collect the
initial data, in case the resource is deleted, but that is far less
complicated and there is no sense making extra trouble for ourselves if
other users of 1.0 will not need the feature, either. Does anyone else in
the group have feedback on how important it is?

>
>  We can work around that by maintaining proper foreign key references
> using the metadata version field as you describe in the schema above (so
> the resource id and metadata version value point to the correct metadata
> record). It will make recording the metering data less efficient because we
> will need to determine the current version for the resource metadata, but
> we can optimize that eventually through indexes and caching.
>
>  Aggregation will also need to take the metadata version into account, so
> everywhere in the list of queries we say "by resource_id" we need to change
> that to "by resource_id and version".
>
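
A minimal in-memory sketch of that arrangement might look like the following. The `MeterStore` class and its methods are hypothetical, and `version` here is assumed to be a per-resource revision counter that identifies one metadata record, which is separate from the payload format version.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple


class MeterStore:
    def __init__(self) -> None:
        # (resource_id, version) -> metadata payload
        self._metadata: Dict[Tuple[str, int], Any] = {}
        # resource_id -> current metadata version (acts as a cache)
        self._current: Dict[str, int] = {}
        # Each sample carries the key of the metadata record it belongs to.
        self._samples: List[Tuple[str, int, float]] = []

    def update_metadata(self, resource_id: str, payload: Any) -> int:
        """Store a new metadata record and make it the current one."""
        version = self._current.get(resource_id, 0) + 1
        self._metadata[(resource_id, version)] = payload
        self._current[resource_id] = version
        return version

    def record_sample(self, resource_id: str, volume: float) -> None:
        """Record a sample tagged with the current metadata version."""
        version = self._current[resource_id]
        self._samples.append((resource_id, version, volume))

    def metadata_for(self, resource_id: str, version: int) -> Any:
        return self._metadata[(resource_id, version)]

    def totals(self) -> Dict[Tuple[str, int], float]:
        """Aggregate volumes by resource_id and version."""
        result: Dict[Tuple[str, int], float] = defaultdict(float)
        for resource_id, version, volume in self._samples:
            result[(resource_id, version)] += volume
        return dict(result)
```

Because each sample stores the version that was current when it was recorded, the association no longer depends on comparing timestamps; the cost is the extra lookup in `record_sample`.
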
> I added the idea of a format version for when the payload format changes
> and tried to write down a description of the metadata storage matching this
> thread in the wiki.
>
> http://wiki.openstack.org/EfficientMetering?action=diff&rev2=80&rev1=78
>
> What do you think?
>

That looks good. I am looking forward to getting Julien's code merged in so
I can start working with it.


>
>
> --
> Loïc Dachary         Chief Research Officer
> // eNovance labs   http://labs.enovance.com
> // ✉ loic@xxxxxxxxxxxx  ☎ +33 1 49 70 99 82
>
>
