Re: [Metering] Agent configuration mechanism
On 06/05/2012 09:03 PM, Doug Hellmann wrote:
>
>
> On Tue, Jun 5, 2012 at 12:59 PM, Nick Barcet <nick.barcet@xxxxxxxxxxxxx> wrote:
>
> On 06/05/2012 04:44 PM, Doug Hellmann wrote:
> > On Tue, Jun 5, 2012 at 10:41 AM, Doug Hellmann <doug.hellmann@xxxxxxxxxxxxx> wrote:
> > On Tue, Jun 5, 2012 at 9:56 AM, Nick Barcet <nick.barcet@xxxxxxxxxxxxx> wrote:
> >
> > Following up on our last meeting, here is a proposal for centrally
> > hosting configuration of agents in ceilometer.
> >
> > The main idea is that all agents of a given type should be sending
> > similarly formatted information in order for the information to be
> > usable, hence the need to ensure that configuration info is centrally
> > stored and retrieved. This would rule out, in my mind, the idea that we
> > could use the global flags object, as distribution of the configuration
> > file is left to the cloud implementor and does not lend itself to easy
> > and synchronized updates of agent config.
> >
> > Configuration format and content is left to the agent's implementation,
> > but it is assumed that each meter covered by an agent can be:
> > * enabled or disabled
> > * set to send information at a specified interval.
> >
> >
> > Right now we only have one interval for all polling. Do you think we
> > need to add support for polling different values at different
> > intervals? Do we need other per-agent settings, or are all of the
> > settings the same for all agents? (I had assumed the latter would be
> > all we needed.)
>
> I would have thought that we may want to support different intervals per
> meter, based on the billing rules that one may want to offer. For
> example, I may want to bill compute by the hour but floating IPs by the
> day, hence have a different reporting interval for each.
>
>
> I was planning to aggregate the values for items being billed over the
> longer time frames, but we can make the polling interval configurable.
> It will take some work, because of the way the scheduled tasks are
> configured in the service and manager (right now we just schedule one
> method to run, and it invokes each pollster).
>
> How important is it to include this in Folsom?
Not crucial. I would classify this as "Nice to have".
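Just to illustrate what I mean, per-meter settings would be easy to express in the JSON blob the agent stores; something along these lines (shape only, the meter names are made up):

# Illustrative shape only: per-meter enable flag and polling interval
# inside the agent-defined JSON Config blob (meter names are examples).
example_compute_agent_config = {
    'meters': {
        'instance': {'enabled': True, 'interval': 3600},      # bill hourly
        'floating_ip': {'enabled': True, 'interval': 86400},  # bill daily
        'disk_io': {'enabled': False, 'interval': 600},
    },
}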
> > 1/ Configuration is stored for each agent in the database as follows:
> >
> > +-----------+----------+--------------------------------------------+
> > | Field     | Type     | Note                                       |
> > +-----------+----------+--------------------------------------------+
> > | AgentType | String   | Unique agent type                          |
> > | ConfVers  | Integer  | Version of the configuration               |
> > | Config    | Text     | JSON Configuration info (defined by agent) |
> > +-----------+----------+--------------------------------------------+
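To make the record shape concrete, here is a rough sketch of how that table could be declared with SQLAlchemy (purely illustrative, nothing here exists in the code base):

# Hypothetical model for the per-agent-type config record; column names
# mirror the table above but are only placeholders.
from sqlalchemy import Column, Integer, String, Text
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class AgentConfig(Base):
    __tablename__ = 'agent_config'

    # Unique agent type, e.g. 'compute'
    agent_type = Column(String(255), primary_key=True)
    # Sequence number bumped on every configuration change
    conf_vers = Column(Integer, nullable=False, default=1)
    # JSON blob whose format is defined by the agent implementation
    config = Column(Text)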
> >
> > 2/ Config is retrieved via the messaging queue upon boot and once a day
> > (this should be defined in the global flags object) to check if the
> > config has changed.
> >
> >
> > Updating the config once a day is not going to be enough in an
> > environment with a lot of compute nodes.
> >
> >
> > Two thoughts merged into one sentence there. Need more caffeine.
> >
> > What I was trying to say was that updating the config once a day might
> > not be enough, and in environments with a lot of compute nodes going
> > around to manually restart the services each time the config changes
> > will be a pain. See below for more discussion of pushing config settings
> > out.
>
> Agreed, and that's why I proposed that the interval for configuration
> refresh should be set in the global flags object (this is something that
> can be shared among all the agents).
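For what it's worth, that flag could be declared roughly like this, assuming the openstack-common cfg module (the import path, option name and default are all placeholders):

from ceilometer.openstack.common import cfg  # assumed location of openstack-common cfg

# Placeholder option, not an existing ceilometer flag.
OPTS = [
    cfg.IntOpt('config_refresh_interval',
               default=86400,  # once a day, as in the proposal above
               help='Seconds between agent configuration refreshes'),
]

cfg.CONF.register_opts(OPTS)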
>
> >
> >
> > Request sent by the agent upon boot and at each config refresh:
> >
> >     'reply_to': 'get_config_data',
> >     'correlation_id': xxxxx
> >     'version': '1.0',
> >     'args': {'data': {
> >         'AgentType': agent.type,
> >         'CurrentVersion': agent.version,
> >         'ConfigDefault': agent.default,
> >         },
> >     },
> >
> >
> > Is this a standard OpenStack RPC call?
>
> Not sure about that, but if it can be, it would be easier :)
>
>
> Yeah, I think a regular RPC call would be the easiest implementation. So
> we still need to specify the arguments to that call, but we don't have
> to worry about how the messages travel back and forth.
Agreed.
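For the record, as a plain RPC call the agent side could look roughly like this (a sketch only; the rpc module, topic and method names are assumptions, not decided):

# Sketch of the agent-side config request as a blocking RPC call.
# 'rpc' stands in for whichever RPC layer we reuse (a nova-style
# rpc.call is assumed); 'ceilometer.collector' is a placeholder topic.

def fetch_config(rpc, context, agent):
    msg = {
        'method': 'get_config_data',
        'args': {'data': {
            'AgentType': agent.type,
            'CurrentVersion': agent.version,
            'ConfigDefault': agent.default,
        }},
    }
    # Returns the reply described below (Result, ConfVers, Config).
    return rpc.call(context, 'ceilometer.collector', msg)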
>
> > Where ConfigDefault is the "sane" default proposed by the agent
> > authors.
> >
> >
> > Why is the agent proposing default settings?
>
> So that the first agent of a given type can populate its info with sane
> defaults that can then be edited later on?
>
>
> If the agent plugins are installed on the server where the collector is
> located, the collector can ask them for defaults.
Better indeed.
> > If no config record is found the collector creates the record, sets
> > ConfVers to 1 and sends back a normal reply.
> >
> > Reply sent by the collector:
> >     'correlation_id': xxxxx
> >     'version': '1.0',
> >
> >
> > Do we need minor versions for the config settings, or are those
> > simple sequence numbers to track which settings are the "most current"?
>
> Simple sequence was what I was thinking about.
>
>
> Wouldn't it be simpler if the configuration settings were pushed to the
> agent as an idempotent operation?
Simpler but we would still have to send a request for it to be sent.
Can you clarify?
> >     'args': {'data': {
> >         'Result': result.code,
> >         'ConfVers': ConfVers,
> >         'Config': Config,
> >         },
> >     },
> > }
> >
> > Result is set as follows:
> >     200 -> Config was retrieved successfully
> >     201 -> Config was created based on received default (Config is empty)
> >     304 -> Config version is identical to CurrentVersion (Config is empty)
> >
> >
> > Why does the agent need to know the difference between those?
> > Shouldn't it simply use the settings it is given?
>
> To avoid processing update code if the update is not needed?
>
>
> That optimization doesn't need to be built into the protocol, though.
> The only way to get that right is for the central server to have a
> representation of the state of the configuration of each agent. It is
> simpler for the agent to ask the collector, "what should my
> configuration be?" and then handle the changes locally.
Not really: the agent would send "CurrentVersion" stating "this is the
version I have," and the server would just have to check whether ConfVers
is greater. Or am I missing something?
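To illustrate the check I have in mind on the collector side (sketch only; the record object mirrors the table above and the codes follow the 200/201/304 convention from the proposal, none of this exists yet):

# Collector-side handling of a config request, ignoring record creation.
def build_config_reply(record, current_version):
    if current_version is not None and current_version >= record.conf_vers:
        # The agent already has this version: no need to resend the config.
        return {'Result': 304, 'ConfVers': record.conf_vers, 'Config': None}
    return {'Result': 200,
            'ConfVers': record.conf_vers,
            'Config': record.config}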
> The simplest implementation will be to just throw away all of the
> pollsters and instantiate new ones when the configuration changes. It
> isn't expensive to construct those objects, and doing it this way should
> be easier to implement than trying to adjust settings (especially the
> schedule).
So basically you are suggesting a push from server mechanism instead of
a pull from agent? That would work fine too, but we will still need a
pull for new agents coming online to get the current version.
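To make sure we are talking about the same thing, the push side I have in mind would be roughly the following (sketch; the rpc interface, method and topic names are assumptions):

# Sketch of pushing a new config to every agent of a given type.
# 'rpc' is assumed to provide a nova-style fanout_cast(); the topic
# name is a placeholder.

def push_config(rpc, context, agent_type, conf_vers, config):
    msg = {
        'method': 'update_config_data',
        'args': {'data': {
            'AgentType': agent_type,
            'ConfVers': conf_vers,
            'Config': config,
        }},
    }
    # Fire-and-forget: every agent listening on the topic picks up the
    # new settings without the sender waiting for replies.
    rpc.fanout_cast(context, 'ceilometer.agent.' + agent_type, msg)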
> > This leaves open the question of having some UI to change the config,
> > but I think we can live with manual updating of the records for the
> > time being.
> >
> >
> > Since we're using the service and RPC frameworks from nova
> > elsewhere, we have the option of issuing commands to all of the
> > agents from a central server. That would let us, for example, use a
> > cast() call to push a new configuration out to all of the agents at
> > once, on demand (from a command line program, for example).
>
> Sounds nifty. Let's amend.
>
> > I don't see the need for storing the configuration in the database.
> > It seems just as easy to have a configuration file on the central
> > server. The collector could read the file each time it is asked for
> > the agent configuration, and the command line program that pushes
> > config changes out could do the same.
>
> Over-engineering on my side, maybe. You are right that the database is
> NOT needed and we can do with a simple file, but then the collector
> becomes stateful and HA considerations will start kicking in if we
> want to have 2 collectors running in parallel. If the DB is shared, the
> issue is pushed to the DB, which will, hopefully, be redundant by nature.
>
>
> That's a reasonable point. I assumed the collector configuration is
> going to need to be shared among those nodes already. How does that work
> in other OpenStack components?
I think that by sourcing the agents' default config from values provided
by the server-side agent plugins, we are addressing my concern about
synchronization of multiple instances, and we can assume, as other
OpenStack projects do, that the deployer will maintain those config files
in parallel when they need to change them.
> > Have you given any thought to distributing the secret value used for
> > signing incoming messages? A central configuration authority does
> > not give us a secure way to deliver secrets like that. If anyone
> > with access to the message queue can retrieve the key by sending RPC
> > requests, we might as well not sign the messages.
>
> Actually, the private key used to generate a signature should be unique
> to each host, if we want the signatures to have any value at all, so
> distributing a common signing key should NOT be part of this, or we would
> fall under the notion of a shared secret, which is, IMHO, not any better
> than having a global password.
>
> I would recommend that, for the time being, we just generate a random
> key pair per host the first time the agent is run, allowing for someone
> with further requirements to eventually populate this value by other
> means.
>
> In any case, if we want to effectively check the signature, the public
> key does need to be accessible by the collector, and we have yet to
> define a way to do so... Proposals welcome, but again, while I think
> we should lay the ground for a great security experience, we certainly
> don't need to solve it all in v1.
>
>
> The current implementation uses hmac message signatures, which use a
> shared secret instead of public/private key pairs. We can have a
> separate secret for each agent, but we still need the collector(s) to
> have them all. I thought the point of signing the messages was to
> prevent an arbitrary agent from signing on to the message queue and
> sending bogus data. Do we need to be doing more than hmac for security?
As stated since the beginning of the project, the purpose of the signature
is non-repudiation, not only authentication. If I understand
correctly, an hmac signature will only provide authentication through a
shared secret, a shared secret which then should not be transmitted on the
wire to the agent, or else it would lose all purpose. I was
envisioning a scenario that would allow both:
1. each agent instance generates a keypair
2. the public key is added to the collector's trusted agent list
3. messages are signed by the agent using its private key, and each message
emitted carries a sequence number
4. the collector checks the signature using the public key
As a result, individual agents are authenticated, their signatures are
unique so they can be traced, and you can't fool the system through a
replay since the sequencing would be off.
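A rough sketch of the collector-side check for that scheme, with the actual
signature primitive left abstract since we have not picked a key mechanism
yet (all names here are illustrative):

# Per-agent public keys plus a sequence counter to reject replays.
# verify_signature() stands in for whatever asymmetric primitive we choose.

trusted_keys = {}   # agent_id -> public key, populated when the agent enrolls
last_sequence = {}  # agent_id -> highest sequence number seen so far


def accept_message(agent_id, payload, signature, sequence, verify_signature):
    pub_key = trusted_keys.get(agent_id)
    if pub_key is None:
        return False  # unknown agent, not in the trusted list
    if not verify_signature(pub_key, payload, signature):
        return False  # bad signature
    if sequence <= last_sequence.get(agent_id, -1):
        return False  # stale or repeated sequence number: likely a replay
    last_sequence[agent_id] = sequence
    return True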
Nick