openstack team mailing list archive
Message #07287
glance performance gains via sendfile()
Hi Reynolds,
I've been looking into your interesting idea around sendfile()[1]
usage, here are a few initial thoughts:
- There's potentially even more speed-up to be harnessed in serving
out images from the filesystem store via sendfile(), than from using
it client-side on the initial upload (based on the assumption that
images would typically be uploaded once but downloaded many times, also
that the download time is more crucial for perceived responsiveness
as an instance being spun up by nova may be held up until the image
is retrieved from glance, unless already cached).
- I'd suspect that some of the potential gain on the client side is
currently thrown away by the syntax of the glance add CLI, specifically
the use of shell redirection to pass the image content:
glance add name=MyImage < /path/to/image/file
I'm open to correction, but this seems to needlessly copy the image
content via user-space, even if the glance client avoids a second
copy internally via the sendfile() usage. So I'd propose to also add
a new cmd line arg to allow the file to be directly specified, e.g.
glance add name=MyImage path=/path/to/image/file
This would have different semantics to the location field, e.g.
location=file:///path/to/image/file
(which would imply that the content is not uploaded to the remote
store).
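To make the path= idea concrete, here's a rough sketch of what the
client could do once it has a real file to open rather than a stdin
pipe. It assumes Python 3.5+ and uses the stdlib socket.sendfile() in
place of pysendfile; the function name send_image_from_path is made up
for illustration and is not part of the glance client.

```python
import socket


def send_image_from_path(sock, path):
    """Stream the file at `path` over a connected stream socket.

    socket.sendfile() uses os.sendfile() where the platform supports
    it and falls back to plain send() otherwise, so the in-kernel copy
    is an optimisation rather than a requirement.
    Returns the number of bytes sent.
    """
    # Hypothetical helper: a real client would also write the HTTP
    # request line and headers before the body.
    with open(path, "rb") as image:  # binary mode is required
        return sock.sendfile(image)
```

The point is only that the client needs a regular file it opened
itself for this to work, which shell redirection doesn't provide.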
- The structure of typical pysendfile usage gets in the way of
glance's image iterator pattern. On the client side this is more of
an inconvenience, requiring some restructuring of the code. However
on the service-side, it seems we're a bit hamstrung by the
WSGI/webob APIs. For example the webob.response.body_file is
filelike but doesn't expose a fileno attribute as there's no real
underlying FD available initially, so the following kind of approach
isn't workable:
sendfile(response.body_file.fileno(),
filestore_path.fileno(), ...)
Seems a better approach would be to allow glance to be optionally
deployed on a WSGI container that directly supports the
wsgi.file_wrapper extension[2] (e.g. mod_wsgi on Apache, or uWSGI) or
even allow one of the non-standard headers like X-Sendfile or
X-Accel-Redirect to be set where supported.
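For the wsgi.file_wrapper route, something along these lines might be
the shape of it. This is only a sketch under assumptions: serve_image
and BLOCK_SIZE are names invented here, it bypasses webob entirely,
and it falls back to the stdlib wsgiref FileWrapper when the container
doesn't provide the extension. Whether sendfile() actually gets used
is up to the container.

```python
from wsgiref.util import FileWrapper

BLOCK_SIZE = 64 * 1024  # illustrative chunk size, not a glance setting


def serve_image(environ, start_response, path):
    """Return the image at `path` as a WSGI response body.

    If the container offers wsgi.file_wrapper (PEP 333), hand it the
    open file so it can pick an optimised transfer such as sendfile();
    otherwise fall back to a plain chunked iterator.
    """
    image = open(path, "rb")
    start_response("200 OK",
                   [("Content-Type", "application/octet-stream")])
    wrapper = environ.get("wsgi.file_wrapper", FileWrapper)
    # The server is expected to call close() on the returned iterable,
    # which in turn closes the underlying file.
    return wrapper(image, BLOCK_SIZE)
```

The file is deliberately handed over still open, since the container
needs the real file object to get at its file descriptor.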
In any case, I'll crack on with the basic client-side usage to begin with,
so that we can quantify the performance gain.
Cheers,
Eoghan
[1] http://code.google.com/p/pysendfile/
[2] http://www.python.org/dev/peps/pep-0333/#optional-platform-specific-file-handling