yahoo-eng-team team mailing list archive
Message #17224
[Bug 1340993] [NEW] Glance + SSL - Image download errors
Public bug reported:
Hello,
I have a latest stable Havana (2013.2.3) OpenStack setup and I am
occasionally noticing issues when downloading new backing files for VMs
to compute nodes. I will occasionally end up with VMs that are stuck
spawning. Upon investigation I can see that the backing file under
/var/nova/instances/_base/<sha1 of imageuuid>.part has been created but is
only partially downloaded and hasn't been updated in some time (sometimes
days). Side note: you are unable to delete a VM in this state
successfully. It will always be stuck in deleting until you restart
nova-compute on the compute node and perform the delete again.
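One way to spot the stalled backing files described above is to look for .part files whose modification time is old. The following is only a minimal sketch: the function name and the age threshold are hypothetical, and the only detail taken from this report is the /var/nova/instances/_base path.

```python
import os
import time


def find_stale_part_files(base_dir, max_age_seconds, now=None):
    """Return (path, size_bytes, age_seconds) for *.part files in base_dir
    whose last modification is older than max_age_seconds."""
    if not os.path.isdir(base_dir):
        return []
    now = time.time() if now is None else now
    stale = []
    for name in sorted(os.listdir(base_dir)):
        if not name.endswith(".part"):
            continue  # only partially downloaded backing files are of interest
        path = os.path.join(base_dir, name)
        info = os.stat(path)
        age = now - info.st_mtime
        if age > max_age_seconds:
            stale.append((path, info.st_size, age))
    return stale
```

For example, find_stale_part_files("/var/nova/instances/_base", 3600) would list .part files that have not been written to for over an hour. The one-hour value is an arbitrary assumption and should be longer than a normal download takes.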
I have managed to create some scripts that reproduce the issue in
multiple ways. The image files that I have been testing with are 8.8 GB,
8.6 GB and a large 60 GB image (however, another large image of around
8 GB would also reproduce the issue).
The first script:
https://gist.github.com/krislindgren/fc519aa03d350f42e9e6#file-multiboot-sh
takes the image files that you give it and deploys one VM per
image file to the compute node that you have specified. With SSL
enabled, typically only 1 VM will ever boot successfully. Errors here
range from failed image downloads (md5sum mismatches) to backing
files that are only partially downloaded. To narrow down the issue I
switched over to using the glance client to do image downloads.
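The md5sum mismatches mentioned above can be checked independently of nova by hashing the downloaded file and comparing it with the checksum Glance reports for the image. This is a rough sketch rather than what the linked scripts do. Both function names are hypothetical, and the expected checksum is assumed to be supplied by the caller.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks so that
    multi-gigabyte images do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file's MD5 matches the expected checksum."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

A truncated download fails this check just as a corrupted one does, so it does not distinguish the two failure modes. Comparing the file size with the image size would be needed for that.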
The second script:
https://gist.github.com/krislindgren/fc519aa03d350f42e9e6#file-multi-img-download-sh
takes the images specified on the command line and runs a glance
image-download command for each one in parallel bash subshells. This
script removes nova from the mix. However, the errors seen here are the
same as what I have seen with the first script.
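The parallel-download approach can be approximated as below. This is a sketch, not the contents of the linked gist. It assumes the glance CLI is installed with credentials already in the environment, and the helper names and the optional timeout are hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def glance_download_command(image_id, dest_path):
    """Build a glance image-download command line (assumes the glance CLI
    is on PATH and OS_* credentials are set in the environment)."""
    return ["glance", "image-download", "--file", dest_path, image_id]


def run_parallel(commands, timeout=None):
    """Run all commands at the same time and return their exit codes in
    order. A command that exceeds the timeout is killed and reported as None."""
    def run(cmd):
        try:
            result = subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return None
        return result.returncode

    # One worker per command so that every download starts immediately.
    with ThreadPoolExecutor(max_workers=max(1, len(commands))) as pool:
        return list(pool.map(run, commands))
```

Passing three commands built with glance_download_command to run_parallel corresponds to the three simultaneous downloads used to reproduce the issue. The timeout is there because a stalled download would otherwise block indefinitely.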
The third script:
https://gist.github.com/krislindgren/fc519aa03d350f42e9e6#file-multi-img-download-newclient-sh
uses https://gist.github.com/krislindgren/fc519aa03d350f42e9e6#file-client-py
instead of the glance CLI to download the image. I believe it
also uses a different download library. With this client I
usually get 2 successful image downloads (sometimes 3), but the issue
still exists.
With all of the scripts, and after a lot of testing, I have found that
this issue is 100% reproducible when trying to download 3 images at the
same time. However, I have also noticed in production that this issue
happens when downloading only a single image on a compute node.
** Affects: glance
Importance: Undecided
Status: New
https://bugs.launchpad.net/bugs/1340993