openstack team mailing list archive

resolving the current keystone api URL via metadata?

To: "openstack@xxxxxxxxxxxxxxxxxxx (openstack@xxxxxxxxxxxxxxxxxxx)" <openstack@xxxxxxxxxxxxxxxxxxx>
From: Matt Joyce <matt.joyce@xxxxxxxxxxxxxxxx>
Date: Wed, 20 Jun 2012 10:30:49 -0700
This continues a thread I was having with Scott Moser about his
blueprint on config drive improvements:

https://blueprints.launchpad.net/nova/+spec/config-drive-v2

I had suggested adding the current keystone API URL ( from flags ) to the
metadata API response.

My reasoning is that it gives instances a known static path for
discovering the dynamic location of the keystone API.  With that, an
instance can perform any function in openstack it may need to ( so long as
it has valid auth credentials to do so ).

The concern raised, however, is that this may not be considered metadata
and is more of a user-data thing, and thus has no place in metadata.  I
certainly understand that argument.

However, I can think of no other way to make this available to instances
in an automated fashion that does not involve external configuration
management.  While I generally have no problem with the "so what?" answer
to that, in this case the problem relates specifically to the
functionality of openstack core applications.  So it's sitting on the
dividing line.

I feel like this is useful functionality for openstack and its users.  I
think that metadata is the right place for it.  But maybe I am in a
minority in wanting to see this happen.  Or maybe I am simply wrong about
where it should / could be placed.

Example usage:

In my case, the specific use case I wish to achieve is having fully
portable images that can authenticate against keystone ( via a pam auth
module ) and provide environment data from the nova APIs.  With this
functionality I could simply query the metadata API, look up the keystone
API, authenticate with credentials, and discover the tenancy / owner of
the instance to deny or allow access.
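
To make the flow above concrete, here is a rough Python sketch of what the
instance side might look like.  It is only an illustration under stated
assumptions: the "keystone-api" metadata key is hypothetical ( it is the
thing being proposed, it does not exist today ), the helper names are made
up, and the request body assumes Keystone v2.0-style password credentials.

```python
import json
import urllib.request

# Well-known EC2-style metadata address as seen from inside an instance.
METADATA_BASE = "http://169.254.169.254/latest/meta-data/"
# Hypothetical key: this is the proposed addition, not an existing field.
KEYSTONE_KEY = "keystone-api"


def http_get(url):
    """Fetch a URL and return the response body as text."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def lookup_keystone_url(fetch=http_get):
    """Read the (hypothetical) keystone endpoint from the metadata service."""
    return fetch(METADATA_BASE + KEYSTONE_KEY).strip().rstrip("/")


def build_token_request(keystone_url, username, password, tenant_name=None):
    """Build the URL and JSON body for a Keystone v2.0-style token request."""
    auth = {"passwordCredentials": {"username": username,
                                    "password": password}}
    if tenant_name:
        auth["tenantName"] = tenant_name
    return keystone_url + "/tokens", json.dumps({"auth": auth})
```

A pam module ( or any other tool in the image ) would call
lookup_keystone_url() once, POST the body from build_token_request() to
the returned URL, and then use the resulting token to ask nova who owns
the instance.  The fetch argument exists only so the lookup can be
exercised without a live metadata service.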

Pretty simple.

Would love input from others...

Here is a dissenting opinion from Scott :

-------- Begin Paste --------

> >   Regarding 'tenant' or 'keystone_api' being present in the metadata
> > service I have the following thoughts:
> >  * in general we should not put data there that provides no added
> >    benefit. For example, is there a benefit to having keystone server
> >    information available in the metadata service versus being passed
> >    in via user-data, with an initialization script in the instance
> >    reading the information from user-data rather than meta-data?
> >
> >
> I disagree here.  Having just the address of the current cluster's keystone
> API server available at meta-data means that any application can figure out
> its environment data on the fly to a fairly high degree.  Coupled with a
> pam auth it can fully populate env data for the cluster it is in
> with nothing other than the username and password or hash pass.
>
> I think it's highly beneficial for making instances that can be directly
> ported between clusters and require no modification or injection.  More to

The term 'injection' has a bad connotation for me.  user-data is not
injection.  user-data is information about an instance's purpose provided
to the instance at launch time by the entity that requested its creation.

This data may include information such as:
 * location of package mirror
 * location of puppet / chef master server
 * location of juju's zookeeper node
 * users to add, or public keys to import
 * code to run on first boot

There are numerous existing complex and successful examples of using
user-data with such information.  The first three items above can be
simplified to "location of an external service to integrate with".

Compare the above with the list of things that are inside the existing EC2
metadata (see http://paste.ubuntu.com/1027440/ for a full crawl):
 * ami-id
 * ami-launch-index
 * block-device-mapping
 * network information
 * public-keys
 * reservation id
 * public-ipv4
 * instance-id

The information there is almost entirely composed of information that the
entity launching the instance could not know at request time (when
user-data is provided).  It is information that is filled in by the cloud.
There are exactly zero external services mentioned.
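
For anyone who wants to reproduce such a crawl from inside an instance, a
minimal Python sketch follows.  It assumes the usual EC2-style listing
format ( one entry per line, with a trailing "/" marking a sub-listing );
the function names are illustrative, and special cases such as the
"0=keyname" form of the public-keys listing are deliberately not handled.

```python
import urllib.request

# Well-known EC2-style metadata root as seen from inside an instance.
METADATA_ROOT = "http://169.254.169.254/latest/meta-data/"


def http_get(url):
    """Fetch a URL and return the response body as text."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def crawl(url=METADATA_ROOT, fetch=http_get):
    """Recursively walk a metadata listing into a nested dict."""
    tree = {}
    for entry in fetch(url).splitlines():
        if not entry:
            continue
        if entry.endswith("/"):
            # Trailing slash: a sub-listing, so descend into it.
            tree[entry.rstrip("/")] = crawl(url + entry, fetch)
        else:
            # Leaf: fetch the value itself.
            tree[entry] = fetch(url + entry)
    return tree
```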

I'm not sure why "keystone api server" is special when compared to puppet
master or juju zookeeper node.  The only difference I see is that
openstack knows what the keystone server is for a given tenant and thus
could provide it, while it has no knowledge at all of juju-server or
puppet server.

Between the two locations of 'user-data' and 'meta-data' it seems fairly
obvious to me that 'keystone api server' fits more in line with user-data
than meta-data.

> the point it can make interfacing with the APIs from within user space
> substantially easier.  Usability wise I think it has definite benefits.

I'm honestly not sure what those benefits are.  It would be trivial to
write an openstack-instance-launch tool that does:
 * get keystone-api server and tenant information of user doing the launch
 * create user-data with:
  keystone-api-server: ....
  tenant-id: abcdefg
 * launch instance

This is under 100 lines of code that would then provide the information to
the instance using existing infrastructure, and would also work on EC2,
the existing HP cloud, rackspace ....
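
As a rough illustration of the user-data half of such a tool, here is a
Python sketch.  The function names and the simple "key: value" layout are
assumptions made for the example; the actual launch call is left out, since
it depends on whichever client is used to boot the instance.

```python
def build_user_data(keystone_api_server, tenant_id):
    """Launcher side: render the two values as 'key: value' lines."""
    return "keystone-api-server: %s\ntenant-id: %s\n" % (
        keystone_api_server, tenant_id)


def parse_user_data(text):
    """Instance side: turn 'key: value' lines back into a dict."""
    values = {}
    for line in text.splitlines():
        # Split on the first colon only, so URLs in the value survive.
        key, sep, value = line.partition(":")
        if sep:
            values[key.strip()] = value.strip()
    return values
```

The launcher would pass the output of build_user_data() as the user-data
of the boot request, and a first-boot script in the instance would read it
back ( on EC2-style clouds from http://169.254.169.254/latest/user-data )
and hand it to parse_user_data().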

> >    Basically, user-data is free and uncontrolled.  But if something goes
> >    into the metadata service, then Nova is involved and will potentially
> >    have to maintain this code path indefinitely.  If there is no benefit
> >    to nova involvement, then it should not really be involved.
> >
>
> I feel like this piece is worth the effort and I am willing to write and
> maintain it.
>
>
> >    Separation from Nova is also a good thing from the client's perspective
> >    because it means it does not depend on openstack.  If the keystone_api
> >    and tenant are instead passed in via user-data, the implementation is
> >    more easily moved to another cloud provider.  You could even launch an
> >    ec2 instance and point it at a keystone server running on an openstack
> >    node.
> >
>
> I think that's a case by case thing.  In this case we're providing a
> feature that I think people will use.  However, it's not ALWAYS the best
> choice.  That's fine.  Different strokes for different folks.

The key is that you need a good reason why a given bit of information has
to be exposed in the metadata service.  What does it enable that cannot be
done equally well without it?

The metadata service is basically an API from openstack to the instance.
We need to treat it as such and scrutinize changes to it.

This conversation is probably best suited for the openstack mailing list.

Maybe we should re-play it there?