openstack team mailing list archive

Re: Glance API semantics when image sizes aren't known

 

Thanks Tom, that's exactly the code I was looking for.  As Jay said yesterday, the Swift backend is expected to update image_size in this situation, so it's obviously an accident that this code got removed.

I've just proposed using manifest-style upload for images with unknown size, so if people like that idea, we can tot up the total data transferred on the way through.  Otherwise, we can use your code exactly as below.
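
To make the "tot up on the way through" idea concrete, here is a minimal sketch in modern Python. It is not Glance code: CountingReader, upload_in_segments and put_segment are hypothetical names made up for illustration, and a dict stands in for the Swift container. It only shows that the total size falls out for free when the stream is read segment by segment.

```python
import io


class CountingReader:
    """Hypothetical wrapper: counts the bytes read through a file-like object."""

    def __init__(self, fileobj):
        self._fileobj = fileobj
        self.bytes_read = 0

    def read(self, size=-1):
        chunk = self._fileobj.read(size)
        self.bytes_read += len(chunk)
        return chunk


def upload_in_segments(reader, segment_size, put_segment):
    """Read fixed-size segments until EOF and hand each one to put_segment.

    put_segment(index, data) is an assumed callback standing in for
    whatever call stores one segment in the backend.
    """
    index = 0
    while True:
        segment = reader.read(segment_size)
        if not segment:
            break
        put_segment(index, segment)
        index += 1
    return index


# A dict plays the role of the object store in this sketch.
stored = {}
source = CountingReader(io.BytesIO(b"x" * 2500))
segment_count = upload_in_segments(source, 1024, stored.__setitem__)

# The size is known once the stream is exhausted, with no seek or HEAD needed.
image_size = source.bytes_read
```

With something along these lines the registry could be updated from image_size after the last segment is written, assuming the manifest-style upload goes ahead.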

Cheers,

Ewan.

> -----Original Message-----
> From: openstack-bounces+ewan.mellor=citrix.com@xxxxxxxxxxxxxxxxxxx
> [mailto:openstack-bounces+ewan.mellor=citrix.com@xxxxxxxxxxxxxxxxxxx]
> On Behalf Of Hancock, Tom (HP Cloud Services)
> Sent: 16 November 2011 04:49
> To: openstack@xxxxxxxxxxxxxxxxxxx
> Subject: Re: [Openstack] Glance API semantics when image sizes aren't
> known
> 
> We ran into a backward compat issue closely related to this last week.
> 
> If the newer Glance code in Diablo is running on a server
> configured with a Swift backend and interacting with an
> older Glance client on an older Nova system that doesn't set
> content-length then (even assuming the image is < 5GB)
> it gets stored in the backend using the usual means, but
> the image size in the registry is never set correctly.
> These few lines were in the pre-diablo code to do a HEAD
> on the object in swift if the size was unknown and if
> they're re-added then the registry gets updated with
> the correct length. It looks like this still applies to essex.
> 
> regards,
> Tom
> 
> ---------------------------- glance/store/swift.py ----------------------------
> @@ -339,6 +339,12 @@ class Store(glance.store.base.Store):
>                  # best...
>                  obj_etag = swift_conn.put_object(self.container, obj_name,
>                                                   image_file)
> +                if image_size == 0:
> +                    resp_headers = swift_conn.head_object(self.container, obj_name)
> +                    # header keys are lowercased by Swift
> +                    if 'content-length' in resp_headers:
> +                        image_size = int(resp_headers['content-length'])
> +                        logger.debug(_("Zero image size so setting object size to %s") % image_size)
>              else:
>                  # Write the image into Swift in chunks. We cannot
>                  # stream chunks of the webob.Request.body_file, unfortunately,
> 
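
For anyone who wants to try the HEAD fallback outside of Glance, here is a self-contained sketch of the same logic in modern Python. FakeSwiftConnection is a hypothetical in-memory stand-in, not the real Swift client, and add_image is a made-up helper name. Only the put_object / head_object call shapes and the lowercased 'content-length' key are taken from the patch above.

```python
import hashlib
import io


class FakeSwiftConnection:
    """Hypothetical in-memory stand-in for a Swift connection."""

    def __init__(self):
        self._objects = {}

    def put_object(self, container, name, contents):
        # Store the whole stream and return an MD5 etag.
        data = contents.read()
        self._objects[(container, name)] = data
        return hashlib.md5(data).hexdigest()

    def head_object(self, container, name):
        # Header keys are lowercased, as the patch comment notes.
        data = self._objects[(container, name)]
        return {"content-length": str(len(data))}


def add_image(conn, container, name, image_file, image_size=0):
    """Upload in one go; if the size was unknown (0), ask the backend for it."""
    etag = conn.put_object(container, name, image_file)
    if image_size == 0:
        headers = conn.head_object(container, name)
        if "content-length" in headers:
            image_size = int(headers["content-length"])
    return etag, image_size


# Same payload as "echo a": the letter plus a trailing newline.
etag, size = add_image(FakeSwiftConnection(), "glance", "6", io.BytesIO(b"a\n"))
```

The extra HEAD costs one round trip per upload of unknown size, which is cheap next to the upload itself.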
> ---
> 
> Tomas Hancock, HP Cloud Services, Hewlett Packard, Galway. Ireland
> +353-91-754765
> 
> Postal Address   : Hewlett Packard Galway Limited, European Software
> Centre, Ballybrit Business Park, Galway, Ireland
> Registered Office: Hewlett Packard Galway Limited, 63-74 Sir John
> Rogerson's Quay, Dublin 2 Registered Number: 361933
> 
> The contents of this message and any attachments to it are confidential
> and may be legally privileged. If you have received this message in
> error you should delete it from your system immediately and advise the
> sender. To any recipient of this message within HP, unless otherwise
> stated, you should consider this message and attachments as "HP
> CONFIDENTIAL".
> 
> 
> -----Original Message-----
> From: openstack-bounces+tom.hancock=hp.com@xxxxxxxxxxxxxxxxxxx
> [mailto:openstack-bounces+tom.hancock=hp.com@xxxxxxxxxxxxxxxxxxx] On
> Behalf Of Jay Pipes
> Sent: 16 November 2011 03:48
> To: Ewan Mellor
> Cc: openstack@xxxxxxxxxxxxxxxxxxx
> Subject: Re: [Openstack] Glance API semantics when image sizes aren't
> known
> 
> On Tue, Nov 15, 2011 at 5:26 PM, Ewan Mellor
> <Ewan.Mellor@xxxxxxxxxxxxx> wrote:
> >> From: Ewan Mellor
> >> > From: Jay Pipes [mailto:jaypipes@xxxxxxxxx]
> >> > > From: Ewan Mellor
> >> > > ~ # cat test_glance.py
> >> > > import sys
> >> > > import glance.client
> >> > >
> >> > > client = glance.client.Client('localhost', 9292,
> >> > >                               auth_tok="999888777666")
> >> > > print client.add_image({}, sys.stdin)
> >> > > ~ # echo a | python26 ./test_glance.py
> >> > > {u'status': u'active', u'name': None, u'deleted': False,
> >> > > u'container_format': None, u'created_at': u'2011-11-15T21:44:21',
> >> > > u'disk_format': None, u'updated_at': u'2011-11-15T21:44:22',
> >> > > u'id': 6, u'owner': u'Administrator', u'location':
> >> > > u'swift+http://root:password@localhost:5000/v1.0/glance/6',
> >> > > u'min_ram': 0, u'checksum': u'60b725f10c9c85c70d97880dfe8191b3',
> >> > > u'min_disk': 0, u'is_public': False, u'deleted_at': None,
> >> > > u'properties': {}, u'size': 0}
> >> > >
> >> > > Note that size is returned as 0, not 2 (the letter plus the
> >> > > newline that echo appends).
> >> >
> >> > Yes, indeed. That is because the client is not designed to be used
> >> > that way [snip]
> >>
> >> But it could perfectly well be used this way, modulo the bug above, and
> >> this is a very useful thing to be able to do (stdin here could be a
> >> decompression or decryption pipeline, or a read from a remote socket,
> >> or whatever).
> >>
> >> With the bug in the Swift backend fixed, I think it will work just fine
> >> to stream through a glance client in this way.
> >
> > Erm, well it will until we hit the 5GB Swift chunking limit, anyway.
> >
> > Do you know what this means (glance/store/swift.py +347)?
> > Why can't we stream from body_file?
> >
> >            else:
> >                # Write the image into Swift in chunks. We cannot
> >                # stream chunks of the webob.Request.body_file, unfortunately,
> >                # so we must write chunks of the body_file into a temporary
> >                # disk buffer, and then pass this disk buffer to Swift.
> 
> We can (and we do in the case when image_size=0 when calling the Swift
> backend store's add() method). Since you don't know the file size, you
> begin streaming to Swift (essentially by passing body_file on to the
> swift.client.Connection.put_object() method). Unfortunately, if the
> file is >5GB, what will happen is that Swift will vomit after
> attempting to upload 5G of data, resulting in a situation where the
> Glance Swift storage driver would then have to try to restart the
> upload using a segmented approach. But in order to do that, the driver
> would have to seek(0) back on the body_file.
> 
> However, by default, webob.Request.body_file is NOT seekable. To make
> it seekable, you need to either do this:
> 
> webob.Request.make_body_seekable()
> 
> or this:
> 
> webob.Request.is_body_seekable = True
> 
> Unfortunately, doing either of the above results in the entire request
> body being buffered to disk on the server side in order for webob to
> convert the (unseekable) StringIO it uses for webob.Request.body into
> a (seekable) regular file object that it then assigns to its
> webob.Request.body_file attribute.
> 
> So, let's say we tried to handle re-trying a segmented upload after we
> first try to upload a 6GB image file *without* setting the image_size
> to the actual size. What would happen is this:
> 
> 1. Write 5GB of data to the swift socket.
> 2. Get error, make body_file seekable in order to seek(0) back to the
>    request body's starting point, which results in copying the entire
>    body_file (6GB) into a temporary file on the server.
> 3. seek(SEEK_END) on that body_file to get the real image size, and then
>    stream 6GB of data in chunks to Swift using its segmented upload
>    method.
> 
> Total data written to disk and network for 6GB image file: 5GB + 6GB +
> 6GB = 17GB
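
The same sum, generalised to any image size, as a small sketch. retry_cost_gb is a hypothetical helper name, and the 5GB default is simply the single-object limit quoted in this thread. It models only the three steps described above.

```python
def retry_cost_gb(image_gb, limit_gb=5):
    """Total GB written to network and disk when the size is not known up front.

    Models the fallback described above: a failed single-object attempt,
    a full copy into a temporary file, then a segmented re-upload.
    """
    if image_gb <= limit_gb:
        # Fits in a single object, so it is streamed exactly once.
        return image_gb
    failed_attempt = limit_gb    # streamed to Swift before the error
    disk_buffer = image_gb       # whole body copied to a temporary file
    segmented_upload = image_gb  # whole body streamed again in segments
    return failed_attempt + disk_buffer + segmented_upload


total_for_6gb = retry_cost_gb(6)
```

For the 6GB case this gives the 17GB figure above, against 6GB when the size is known in advance.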
> 
> Or you can set the image_size properly by determining the file size
> ahead of time. :)
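
On the client side, "determining the file size ahead of time" is only possible when the source is a regular file. Here is a minimal sketch of that check in modern Python, using only the standard library. known_size is a hypothetical helper name, and how the result is handed to the Glance client is deliberately left out.

```python
import io
import os
import stat
import tempfile


def known_size(fileobj):
    """Return the size in bytes if fileobj is backed by a regular file, else None.

    Pipes, sockets and in-memory streams have no size that can be known
    up front, so they yield None. The value is the file's total size,
    regardless of the current read offset.
    """
    try:
        st = os.fstat(fileobj.fileno())
    except (AttributeError, OSError, ValueError):
        # No usable file descriptor (for example io.BytesIO).
        return None
    return st.st_size if stat.S_ISREG(st.st_mode) else None


with tempfile.TemporaryFile() as tmp:
    tmp.write(b"hello")
    tmp.flush()
    size_of_regular_file = known_size(tmp)

size_of_memory_stream = known_size(io.BytesIO(b"hello"))
```

When the helper returns None, as it would for the stdin pipeline in the test script earlier in the thread, the size really is unknown and the backend has to cope.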
> 
> You can read more about this issue here:
> 
> https://bugs.launchpad.net/glance/+bug/794718
> https://bugs.launchpad.net/glance/+bug/818292
> upstream: https://bitbucket.org/ianb/webob/issue/12/fix-for-issue-6-broke-chunked-transfer
> 
> Cheers,
> -jay
> 
> _______________________________________________
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp
