yahoo-eng-team team mailing list archive

[Bug 1277494] [NEW] Stuck in vm_state SHUTOFF and task_state rebuild_spawning - stuck in to_xml() ?

 

Public bug reported:

See http://logs.openstack.org/32/71532/1/gate/gate-tempest-dsvm-full/a90b312/

The first test to fail is test_rebuild_server_in_stop_state. Sequence of
events is:

  1. Rebuild with new image id
  2. Wait to transition to SHUTOFF/None
  3. Test is done, run cleanup operations ...
  4. Rebuild to old image id
  5. Wait to hit SHUTOFF/None
  6. Start the instance again

It is at step 5 that we get stuck and time out. We make this transition:

  2014-02-06 22:29:51,724 State transition "SHUTOFF/rebuilding" ==> "SHUTOFF/rebuild_spawning" after 2 second wait

and we never transition to ACTIVE/powering-off ... which suggests we get
stuck somewhere in spawn().
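
For anyone trying to reproduce the wait outside the gate, here is a minimal
sketch of the kind of polling loop that produces a "State transition" message
like the one above and times out when the instance never leaves
SHUTOFF/rebuild_spawning. This is not the tempest code: wait_for_state, its
get_state callable and the 60 second default are all hypothetical, chosen only
for illustration.

```python
import time
from typing import Callable, List, Optional, Tuple

# (vm_state, task_state), e.g. ("SHUTOFF", "rebuild_spawning") or ("SHUTOFF", None)
State = Tuple[str, Optional[str]]


def wait_for_state(
    get_state: Callable[[], State],      # hypothetical: returns the current state
    target: State,
    timeout: float = 60.0,               # arbitrary default, not the gate's value
    interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Tuple[State, State]]:
    """Poll until `target` is reached and return the transitions observed.

    Raises TimeoutError if the target is not reached within `timeout` seconds.
    """
    start = clock()
    last = get_state()
    transitions: List[Tuple[State, State]] = []
    while last != target:
        if clock() - start >= timeout:
            raise TimeoutError(
                "stuck in %s/%s, never reached %s/%s"
                % (last[0], last[1], target[0], target[1])
            )
        sleep(interval)
        current = get_state()
        if current != last:
            # Record each change so the caller can log it.
            transitions.append((last, current))
            last = current
    return transitions
```

In the failing run the loop would record the SHUTOFF/rebuilding to
SHUTOFF/rebuild_spawning change and then raise, because no further change is
ever observed.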

The last log entry for req-60fa2fbb-de78-4379-8ffa-bd0c70f52039 in n-cpu
is:

  [instance: a3ac8847-db15-4f5a-b087-5256b54a36f5] Start to_xml

We never get the corresponding 'End to_xml'.
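
To check whether other instances hit the same thing, a rough sketch like the
following can scan an n-cpu log for 'Start to_xml' entries with no matching
'End to_xml'. The regular expression assumes the "[instance: <uuid>] Start
to_xml" shape quoted above and that the End entry carries the same prefix; the
function name unmatched_to_xml is made up for this example.

```python
import re
from typing import Dict, Iterable, List

# Assumed marker format, based on the log line quoted in this report.
_TO_XML = re.compile(
    r"\[instance: (?P<uuid>[0-9a-f-]{36})\] (?P<edge>Start|End) to_xml"
)


def unmatched_to_xml(lines: Iterable[str]) -> List[str]:
    """Return instance UUIDs that logged 'Start to_xml' without a later 'End to_xml'."""
    open_counts: Dict[str, int] = {}
    for line in lines:
        match = _TO_XML.search(line)
        if match is None:
            continue
        uuid = match.group("uuid")
        if match.group("edge") == "Start":
            open_counts[uuid] = open_counts.get(uuid, 0) + 1
        elif open_counts.get(uuid, 0) > 0:
            # An End closes the most recent outstanding Start for that instance.
            open_counts[uuid] -= 1
    return [uuid for uuid, count in open_counts.items() if count > 0]
```

It could be run with something like unmatched_to_xml(open(path)), where path
points at a local copy of the n-cpu log.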

Now ... interestingly, there's a resume operation happening around the
same time (req-aeedee18-be57-419e-8775-0af26dd796de) and it fails with:

  "An error occurred while trying to launch a defined domain with xml:"

Hmm ... also interestingly, there isn't another "Start to_xml" in the
logs after this one - perhaps the stuck thread is holding a lock.

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1277494

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1277494/+subscriptions

