
yahoo-eng-team team mailing list archive

[Bug 1340993] [NEW] Glance + SSL - Image download errors

 

Public bug reported:

Hello,

I have a latest stable Havana (2013.2.3) OpenStack setup and I am
occasionally seeing issues when new backing files for VMs are
downloaded to compute nodes.  I occasionally end up with VMs that are
stuck spawning.  Upon investigation I can see that the backing file
/var/nova/instances/_base/<sha1 of imageuuid>.part has been created but
is only partially downloaded and hasn't been updated in some time
(sometimes days).  Side note: you are unable to delete a VM in this
state successfully.  It will always be stuck in deleting until you
restart nova-compute on the compute node and perform the delete again.
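
To spot backing files in this state, something like the following could be used. It is only a minimal sketch, not part of the scripts linked in this report: the function name and the one-hour threshold are made up for illustration, and it assumes that a stalled download shows up as a .part file whose modification time has stopped advancing.

```python
import os
import time


def find_stalled_parts(base_dir, max_age_seconds=3600, now=None):
    """Return paths of *.part files not modified within max_age_seconds.

    The threshold is an arbitrary illustrative default, not a value
    taken from nova.
    """
    now = time.time() if now is None else now
    stalled = []
    for name in sorted(os.listdir(base_dir)):
        if not name.endswith(".part"):
            continue
        path = os.path.join(base_dir, name)
        # A download that is still running keeps touching the file, so an
        # old modification time suggests the transfer has stopped.
        if now - os.path.getmtime(path) > max_age_seconds:
            stalled.append(path)
    return stalled
```

On a compute node this would be called as find_stalled_parts("/var/nova/instances/_base").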

I have managed to create some scripts that replicate the issue in
multiple ways.  The image files that I have been testing with are 8.8 GB,
8.6 GB and a large 60 GB image (however, another large 8 GB image would
also reproduce the issue).

The first script:
https://gist.github.com/krislindgren/fc519aa03d350f42e9e6#file-multiboot-sh

It takes the image files that you give it and deploys one VM per
image file to the compute node that you have specified.  With SSL
enabled, typically only 1 VM will ever boot successfully.  Errors here
range from failed image downloads (md5sum mismatches) to backing
files that are only partially downloaded.  To narrow down the issue I
switched over to using the glance client to do image downloads.
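
For reference, an md5sum mismatch of the kind mentioned above can be checked by hand along these lines.  This is a hedged sketch rather than the check nova or glance actually performs: the function names are hypothetical, and the expected value is assumed to be the md5 checksum that glance reports for the image.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the md5 hex digest of a file without loading it all at once."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in fixed-size chunks so multi-gigabyte images fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def download_is_intact(path, expected_md5):
    """Return True if the file's md5 matches the expected checksum."""
    return md5_of_file(path) == expected_md5.lower()
```

A truncated or corrupted download would make download_is_intact() return False.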

The second script:
https://gist.github.com/krislindgren/fc519aa03d350f42e9e6#file-multi-img-download-sh

It takes the images specified on the command line and runs the glance
image-download command for each one in a parallel bash subshell.  This
script removes nova from the mix.  However, the errors seen here are the
same as those I have seen with the first script.
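
The same idea can be sketched in Python for anyone who prefers it to bash.  This is not the linked script, only an approximation of what it is described as doing: the helper names are hypothetical, it assumes the glance CLI is on the PATH with credentials already set in the environment, and it assumes one concurrent "glance image-download --file <path> <image id>" per image.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_download_command(image_id, dest_path):
    """Build the glance CLI invocation for downloading one image."""
    return ["glance", "image-download", "--file", dest_path, image_id]


def run_in_parallel(commands):
    """Run all commands concurrently and return their exit codes in order."""
    def run(cmd):
        return subprocess.call(cmd)

    # One worker per command, so every download starts at the same time.
    with ThreadPoolExecutor(max_workers=max(1, len(commands))) as pool:
        return list(pool.map(run, commands))
```

Passing three commands built with build_download_command() to run_in_parallel() corresponds to the three-image case described in this report.  A non-zero exit code marks a failed download, though a zero exit code does not by itself prove that the file is complete.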

The third script:
https://gist.github.com/krislindgren/fc519aa03d350f42e9e6#file-multi-img-download-newclient-sh

It uses
https://gist.github.com/krislindgren/fc519aa03d350f42e9e6#file-client-py
instead of the glance CLI to download the image.  I believe it also uses
a different download library.  With this client I will usually get 2
successful image downloads (sometimes 3), but the issue still exists.

With all the scripts, and after a lot of testing, I have found that this
issue is 100% reproducible when trying to download 3 images at the same
time.  But I have also noticed in production that this issue happens
when downloading only a single image on a compute node.

** Affects: glance
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1340993


To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1340993/+subscriptions
