← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1334398] Re: test_images_oneserver times out in tearDown during task_state "image_pending_upload"

 

Here is a logstash query on the libvirt connection reset error:

http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOlwicmFpc2UgbGlidmlydEVycm9yICgndmlyRG9tYWluQmxvY2tKb2JBYm9ydCgpIGZhaWxlZCdcIiBBTkQgbWVzc2FnZTpcImxpYnZpcnRFcnJvcjogVW5hYmxlIHRvIHJlYWQgZnJvbSBtb25pdG9yOiBDb25uZWN0aW9uIHJlc2V0IGJ5IHBlZXJcIiBBTkQgdGFnczpcInNjcmVlbi1uLWNwdS50eHRcIiIsImZpZWxkcyI6W10sIm9mZnNldCI6MCwidGltZWZyYW1lIjoiMTcyODAwIiwiZ3JhcGhtb2RlIjoiY291bnQiLCJ0aW1lIjp7InVzZXJfaW50ZXJ2YWwiOjB9LCJzdGFtcCI6MTQwMzcyMjMxMDczNX0=

159 hits in 2 days, there is a small successful percentage when that
shows up.

Looking at the tests when this fails, they are doing image snapshots so
this seems like a pretty good query.

** Changed in: nova
   Importance: Undecided => High

** No longer affects: glance

** Summary changed:

- test_images_oneserver times out in tearDown during task_state "image_pending_upload"
+ snapshot hangs when libvirt connection is reset

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1334398

Title:
  snapshot hangs when libvirt connection is reset

Status in OpenStack Compute (Nova):
  New

Bug description:
  Seeing this here:

  http://logs.openstack.org/70/97670/5/check/check-tempest-dsvm-
  postgres-full/7d4c7cf/console.html

  2014-06-24 23:15:41.714 | tempest.api.compute.images.test_images_oneserver.ImagesOneServerTestJSON.test_create_image_specify_multibyte_character_image_name[gate]
  2014-06-24 23:15:41.714 | ---------------------------------------------------------------------------------------------------------------------------------------
  2014-06-24 23:15:41.714 | 
  2014-06-24 23:15:41.714 | Captured traceback-1:
  2014-06-24 23:15:41.714 | ~~~~~~~~~~~~~~~~~~~~~
  2014-06-24 23:15:41.715 |     Traceback (most recent call last):
  2014-06-24 23:15:41.715 |       File "tempest/services/compute/json/images_client.py", line 86, in delete_image
  2014-06-24 23:15:41.715 |         resp, body = self.delete("images/%s" % str(image_id))
  2014-06-24 23:15:41.715 |       File "tempest/common/rest_client.py", line 224, in delete
  2014-06-24 23:15:41.715 |         return self.request('DELETE', url, extra_headers, headers, body)
  2014-06-24 23:15:41.715 |       File "tempest/common/rest_client.py", line 430, in request
  2014-06-24 23:15:41.715 |         resp, resp_body)
  2014-06-24 23:15:41.715 |       File "tempest/common/rest_client.py", line 474, in _error_checker
  2014-06-24 23:15:41.715 |         raise exceptions.NotFound(resp_body)
  2014-06-24 23:15:41.715 |     NotFound: Object not found
  2014-06-24 23:15:41.715 |     Details: {"itemNotFound": {"message": "Image not found.", "code": 404}}
  2014-06-24 23:15:41.716 |     
  2014-06-24 23:15:41.716 | 
  2014-06-24 23:15:41.716 | Captured traceback:
  2014-06-24 23:15:41.716 | ~~~~~~~~~~~~~~~~~~~
  2014-06-24 23:15:41.716 |     Traceback (most recent call last):
  2014-06-24 23:15:41.716 |       File "tempest/api/compute/images/test_images_oneserver.py", line 31, in tearDown
  2014-06-24 23:15:41.716 |         self.server_check_teardown()
  2014-06-24 23:15:41.716 |       File "tempest/api/compute/base.py", line 161, in server_check_teardown
  2014-06-24 23:15:41.716 |         'ACTIVE')
  2014-06-24 23:15:41.716 |       File "tempest/services/compute/json/servers_client.py", line 173, in wait_for_server_status
  2014-06-24 23:15:41.716 |         raise_on_error=raise_on_error)
  2014-06-24 23:15:41.717 |       File "tempest/common/waiters.py", line 107, in wait_for_server_status
  2014-06-24 23:15:41.717 |         raise exceptions.TimeoutException(message)
  2014-06-24 23:15:41.717 |     TimeoutException: Request timed out
  2014-06-24 23:15:41.717 |     Details: (ImagesOneServerTestJSON:tearDown) Server 90c79adf-4df1-497c-a786-13bdc5cca98d failed to reach ACTIVE status and task state "None" within the required time (196 s). Current status: ACTIVE. Current task state: image_pending_upload.

  
  Looks like it's trying to delete image with uuid 518a32d0-f323-413c-95c2-dd8299716c19 which doesn't exist, because it's still uploading?

  
  This is maybe related to bug 1320617 as a general performance issue with glance.

  Looking in the glance registry log, the image is created here:

  2014-06-24 22:51:23.538 15740 INFO glance.registry.api.v1.images
  [13c1b477-cd22-44ca-ba0d-bf1b19202df6 d01d4977b5cc4e20a99e1d7ca58ce444
  207d083a31944716b9cd2ecda0f09ce7 - - -] Successfully created image
  518a32d0-f323-413c-95c2-dd8299716c19

  The image is deleted here:

  2014-06-24 22:54:53.146 15740 INFO glance.registry.api.v1.images
  [7c29f253-acef-41a0-b62b-c3087f7617ef d01d4977b5cc4e20a99e1d7ca58ce444
  207d083a31944716b9cd2ecda0f09ce7 - - -] Successfully deleted image
  518a32d0-f323-413c-95c2-dd8299716c19

  And the 'not found' is here:

  2014-06-24 22:54:56.508 15740 INFO glance.registry.api.v1.images
  [c708cf1f-27a8-4003-9c29-6afca7dd9bb8 d01d4977b5cc4e20a99e1d7ca58ce444
  207d083a31944716b9cd2ecda0f09ce7 - - -] Image 518a32d0-f323-413c-
  95c2-dd8299716c19 not found

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1334398/+subscriptions


References