
openstack team mailing list archive

glance performance gains via sendfile()

 


Hi Reynolds,

I've been looking into your interesting idea around sendfile()[1]
usage; here are a few initial thoughts:

- There's potentially even more speed-up to be had from using
  sendfile() to serve images out of the filesystem store than from
  using it client-side on the initial upload. This assumes that images
  are typically uploaded once but downloaded many times, and that
  download time matters more for perceived responsiveness, since an
  instance being spun up by nova may be held up until the image is
  retrieved from glance (unless it is already cached).

- I'd suspect that some of the potential gain on the client side is
  currently thrown away by the syntax of the glance add CLI, specifically
  the use of shell redirection to pass the image content:

    glance add name=MyImage < /path/to/image/file

  I'm open to correction, but this seems to needlessly copy the image
  content via user-space, even if the glance client avoids a second
  copy internally via the sendfile() usage. So I'd propose also adding
  a new command-line arg to allow the file to be specified directly, e.g.

    glance add name=MyImage path=/path/to/image/file

  This would have different semantics to the location field, e.g.

    location=file:///path/to/image/file

  (which would imply that the content is not uploaded to the remote
  store).

- The structure of typical pysendfile usage gets in the way of
  glance's image iterator pattern. On the client side this is more of
  an inconvenience, requiring some restructuring of the code. However,
  on the service side we seem to be a bit hamstrung by the WSGI/webob
  APIs. For example, webob.response.body_file is file-like but
  doesn't expose a fileno attribute, as there's no real underlying FD
  available initially, so the following kind of approach isn't
  workable:

    sendfile(response.body_file.fileno(), 
             filestore_path.fileno(), ...)

  A better approach would seem to be allowing glance to be optionally
  deployed on a WSGI container that directly supports the
  wsgi.file_wrapper extension[2] (e.g. mod_wsgi on Apache, or uWSGI),
  or even allowing one of the non-standard headers like X-Sendfile or
  X-Accel-Redirect to be set where supported.
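
To make the wsgi.file_wrapper idea a little more concrete, here is a
rough sketch of how a WSGI callable might hand an open image file to
the container when the extension is offered, and fall back to chunked
iteration when it is not. The serve_image name, its arguments and
CHUNK_SIZE are made up for illustration and are not part of glance or
webob; only the "wsgi.file_wrapper" environ key comes from the PEP.

```python
CHUNK_SIZE = 64 * 1024  # hypothetical block size, not a glance setting


def serve_image(environ, start_response, image_file, size):
    """Return a WSGI response body for an already-open image file.

    Hypothetical helper: image_file is a binary file object and size is
    its length in bytes.
    """
    start_response("200 OK", [
        ("Content-Type", "application/octet-stream"),
        ("Content-Length", str(size)),
    ])
    wrapper = environ.get("wsgi.file_wrapper")
    if wrapper is not None:
        # The container decides what to do with the file. If the object
        # has a usable fileno() it may use sendfile(); otherwise it is
        # expected to fall back to ordinary reads.
        return wrapper(image_file, CHUNK_SIZE)
    # Extension not offered: stream the file through user space in chunks.
    return iter(lambda: image_file.read(CHUNK_SIZE), b"")
```

Whether any zero-copy transfer actually happens is up to the container;
the application only passes the file object along rather than reading
it itself.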

In any case, I'll crack on with the basic client-side usage to begin with,
so that we can quantify the performance gain.
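
As a rough illustration of what basic client-side usage might look
like, the sketch below pushes a file down a socket with sendfile-style
semantics. It makes several assumptions: it uses the standard library's
socket.sendfile() (Python 3.5+) rather than pysendfile, a local socket
pair stands in for the HTTP connection to glance, and the send_image
and demo names are hypothetical.

```python
import os
import socket
import tempfile
import threading


def send_image(sock, path):
    """Send the file at path over sock; return the number of bytes sent."""
    with open(path, "rb") as image:
        # socket.sendfile() uses os.sendfile() where the platform supports
        # it and falls back to plain send() calls otherwise.
        return sock.sendfile(image)


def demo(payload=b"fake image data " * 4096):
    """Round-trip a temporary file through a local socket pair."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
    sender, receiver = socket.socketpair()
    chunks = []

    def drain():
        # Collect everything until the sending side is closed.
        while True:
            data = receiver.recv(65536)
            if not data:
                break
            chunks.append(data)

    reader = threading.Thread(target=drain)
    reader.start()
    try:
        sent = send_image(sender, tmp.name)
    finally:
        sender.close()  # signals end-of-stream to the reader thread
        reader.join()
        receiver.close()
        os.unlink(tmp.name)
    return sent, b"".join(chunks)
```

Timing send_image() against a plain read()/send() loop over the same
file would be one way to start quantifying the gain.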

Cheers,
Eoghan

[1] http://code.google.com/p/pysendfile/
[2] http://www.python.org/dev/peps/pep-0333/#optional-platform-specific-file-handling

