openstack team mailing list archive

Re: Agreeing a common set of Image Properties

 



On 04/09/2012 02:41 PM, Justin Santa Barbara wrote:
I should probably clarify my terminology a little here, as I may have
mangled it:

  * I'm talking about additional/extension properties, not properties
    that are well-known to Glance
  * However, we agree to use a common set of properties to mean certain
    things.

In other words, we all agree to call Debian
"openstack.org:distro=debian.org" rather than "os_distro=Debian" on
Cloud1, "distro=debian.org" on Cloud2, "name=Debian Squeeze" on Cloud3,
etc.

Sure.

        3 main pieces of metadata: os:distro, os:version_major,
        os:version_minor


    In the proposed Images API 2.0, we use JSON Schema's
    additionalProperties collection to allow for discoverable custom
    properties. For the properties you list above, you could either add
    those properties with the prefix you propose above (in
    additionalProperties), or argue for inclusion in the standard
    "properties" collection of the Image Entry schema. You can see both
    the properties and a sample of what additionalProperties are used
    for shown in the schema:

    https://docs.google.com/document/d/1rb7ZVn0Du_5NZqUyQpqUZSmv7Qd66TMHYAtvsow7LH4/edit#heading=h.1nk6x0hs4enq
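As a rough illustration of the distinction between "properties" and additionalProperties described above, here is a minimal sketch. The schema fragment and the `split_properties` helper are hypothetical; they are not taken from the linked document and only show how a client might separate standard fields from prefixed custom ones.

```python
# Hypothetical schema fragment in the style of a JSON Schema image entry.
# The field names are assumptions, not the real Images API 2.0 schema.
IMAGE_SCHEMA = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "name": {"type": "string"},
    },
    # Anything not listed under "properties" is a custom property.
    "additionalProperties": {"type": "string"},
}


def split_properties(image, schema=IMAGE_SCHEMA):
    """Split an image dict into (standard, custom) properties."""
    known = schema["properties"]
    standard = {k: v for k, v in image.items() if k in known}
    custom = {k: v for k, v in image.items() if k not in known}
    return standard, custom


image = {
    "id": "417",
    "name": "Debian Squeeze 6.0.3 Server 64-bit 20120123",
    "os:distro": "debian.org",
    "os:version_major": "6",
}
standard, custom = split_properties(image)
print(sorted(custom))
```

Under these assumptions, os:distro and os:version_major land in the custom set until they are argued into the standard "properties" collection.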

Ah, OK, I understand the proposed 2.0 API a bit better now.

I'd like to solve this now though, not in 6 / 12 / 18 months.  It sounds
like my proposal is compatible with the new API as well, so no problem
there!

Yes, I hear you :)

          * Some clouds will automatically respin images, e.g. weekly, with
            the latest updates.  This could also be exposed through metadata.
              os:updated_through="20120301" ?

    Possible, though a version identifier would also work. Or simply a
    query like:
    GET
    /images?sort_by=created_at&sort_order=desc&limit=1&property-os:distro=Debian
    (and you can do that in the existing API as well, BTW)
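A minimal sketch of building that query from code. The parameter names (sort_by, sort_order, limit, property-os:distro) are copied from the message above and are not checked against any Glance release; the `latest_image_query` helper is hypothetical, and the sketch assumes a conventional `?` query string.

```python
from urllib.parse import urlencode


def latest_image_query(distro):
    """Build the 'newest image for this distro' query path."""
    # Parameter names are taken from the message, not from a verified API.
    params = [
        ("sort_by", "created_at"),
        ("sort_order", "desc"),
        ("limit", 1),
        ("property-os:distro", distro),
    ]
    # safe=":" leaves the colon in the property name unescaped.
    return "/images?" + urlencode(params, safe=":")


print(latest_image_query("Debian"))
# /images?sort_by=created_at&sort_order=desc&limit=1&property-os:distro=Debian
```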

created_at != updated_through in general.   As a simple example, the
cloud provider might make a "clean" image (stock Debian 6.0.3) at the
same time as you make a totally up-to-date image.

You can use updated_at as well in the existing API... though of course, it only refers to the image metadata, not the image file itself since that is static read-only once uploaded...

Not sure what you mean by a version identifier?

Debian 6.0.3 is a version identifier...

          * Some clouds will offer only bare images, some will provide a
            variety, e.g. bare, LAMP, Hadoop, etc.  Should we use the native
            package names to indicate additional packages?  e.g.
            os:packages="apache2,mysql,php5" ?

    IMHO, no. That would be overkill. In the new API, you could use
    tags, though. Or you could add a custom extension in the new API so
    that you could expose packages as a subresource of /images/<IMAGE_ID>/.

It is important to differentiate a "minimal" image from a "loaded"
image: users will typically want a loaded image, code will typically
want a bare image. Alternative suggestions are welcome.  If we're going
to use tags in 2.0, shouldn't we use properties/metadata today?

Your definition of loaded != others' definition of loaded ;) I'm not sure there is going to be a whole lot of success in defining what "loaded" means, though there might be better success in determining what "minimal" means.

Yes, in the existing API, you would use properties. And prefixing those properties is perfectly acceptable. Just pointing out we've come up with a different solution (tags plus a discoverable schema) in the 2.0 proposed API.

        As a (programmatic) consumer of these images, my wishlist:

          * A cloud will have to put on whatever drivers / agents they need
            to, but ideally these should otherwise be clean images, with
            minimal deviation from the stock install.  (Or 'clean' images
            should be available alongside other images.)  It would be great
            if I could launch a 'clean' image on any OpenStack cloud and have
            it behave more or less the same; I shouldn't have to second-guess
            any additional tweaks.


    No disagreement from me here, but I don't see how this relates to a
    common set of image properties? Could you elaborate?

Here's what I see on Cloud X (edited to protect the party, and I removed
the kernels/ramdisks):

417 Debian Squeeze 6.0.3 Server 64-bit 20120123            ACTIVE
x_image_type=machine, image_location=local, image_state=available,
project_id=None, x_md_version=1, kernel_id=415, min_ram=0,
ramdisk_id=416, x_image_id=c89dee3bca7a62103f7d88d2a02f4dc8, owner=None,
x_image_builddate=20120123, architecture=amd64, min_disk=0,
x_image_version=1x1.1
414 CentOS 6.2 Server 64-bit 20120125                      ACTIVE
x_image_type=machine, image_location=local, image_state=available,
project_id=None, x_md_version=1, kernel_id=412, min_ram=0,
ramdisk_id=413, x_image_id=f2fbb1bf37a13e7c5da897c7082684df, owner=None,
x_image_builddate=20120125, architecture=x86_64, min_disk=0,
x_image_version=1x1
229 Ubuntu Oneiric 11.10 Server 64-bit 20111212            ACTIVE
image_location=local, image_state=available, kernel_id=228, min_ram=0,
min_disk=0, architecture=amd64, owner=None, project_id=None
227 Ubuntu Natty 11.04 Server 64-bit 20111212              ACTIVE
image_location=local, image_state=available, kernel_id=226, min_ram=0,
min_disk=0, architecture=amd64, owner=None, project_id=None
225 Ubuntu Maverick 10.10 Server 64-bit 20111212           ACTIVE
image_location=local, image_state=available, kernel_id=224, min_ram=0,
min_disk=0, architecture=amd64, owner=None, project_id=None
223 Ubuntu Lucid 10.04 LTS Server 64-bit 20111212          ACTIVE
image_location=local, image_state=available, kernel_id=222, min_ram=0,
min_disk=0, architecture=amd64, owner=None, project_id=None
221 CentOS 5.6 Server 64-bit 20111207                      ACTIVE
image_location=local, image_state=available, kernel_id=219, min_ram=0,
ramdisk_id=220, min_disk=0, architecture=x86_64, owner=None,
project_id=None

How do I write code to determine:

  * Is Debian Squeeze 6.0.4 available?
  * Is 6.0.3 the best I can get?
  * Is Debian Wheezy available?
  * If there are two Debian 6.0.3 images, one bare and one loaded, how do I
    tell them apart?
  * Is CentOS 6.2 a newer version of Debian (supposing I've never heard of
    CentOS, for some reason)?

Gotcha, OK, I understand you better now... and I maintain that in the current API, standardizing on property prefixes is a perfectly reasonable solution, but that in the proposed 2.0 API, you'd use the schema document and tags.
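To make the questions above concrete, here is a sketch of how a client might answer them if every cloud used the os:distro, os:version_major and os:version_minor properties proposed earlier in the thread. The image dict shape, the distro values other than debian.org, and the helper names are all assumptions for illustration; none of this is an existing Glance or Nova client call.

```python
def version_key(props):
    """Return a sortable (major, minor) tuple from image properties."""
    def as_int(value):
        # Missing or non-numeric fields sort below any real version.
        try:
            return int(value)
        except (TypeError, ValueError):
            return -1
    return (as_int(props.get("os:version_major")),
            as_int(props.get("os:version_minor")))


def newest_image(images, distro):
    """Return the highest-versioned image for a distro, or None."""
    candidates = [img for img in images
                  if img["properties"].get("os:distro") == distro]
    if not candidates:
        return None
    return max(candidates, key=lambda img: version_key(img["properties"]))


# Hypothetical listing, loosely modelled on the Cloud X output above.
images = [
    {"id": "414", "properties": {"os:distro": "centos.org",
                                 "os:version_major": "6",
                                 "os:version_minor": "2"}},
    {"id": "221", "properties": {"os:distro": "centos.org",
                                 "os:version_major": "5",
                                 "os:version_minor": "6"}},
    {"id": "229", "properties": {"os:distro": "ubuntu.com",
                                 "os:version_major": "11",
                                 "os:version_minor": "10"}},
    {"id": "227", "properties": {"os:distro": "ubuntu.com",
                                 "os:version_major": "11",
                                 "os:version_minor": "04"}},
]

best = newest_image(images, "centos.org")
print(best["id"] if best else "not available")
```

Because the lookup filters on os:distro before comparing versions, a CentOS 6.2 image is never mistaken for a newer Debian. A third version component (as in 6.0.3) or a bare/loaded flag would need further agreed properties.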

          * I would like to be able to launch the clean image and install
            updates myself, in case I don't want a particular update.
            Providing a fast apt cache is much better than providing respun
            images, for my use-case.  It would be great if old images stuck
            around, therefore!

    Again, no disagreement, but I'm also confused how this relates to
    standard image properties...

Just a request to bear in mind that some people won't want to run the
latest image.  Once we agree the metadata we can tell the images apart
easily (or you can hide them in the GUI), so there's no need to delete
old images unless there's an SSH exploit.

Understood. :)

Best,
-jay

