yahoo-eng-team team mailing list archive
Message #09408
[Bug 1277494] [NEW] Stuck in vm_state SHUTOFF and task_state rebuild_spawning - stuck in to_xml() ?
Public bug reported:
See http://logs.openstack.org/32/71532/1/gate/gate-tempest-dsvm-full/a90b312/
The first test to fail is test_rebuild_server_in_stop_state. The sequence of
events is:
1. Rebuild with the new image id
2. Wait for the transition to SHUTOFF/None
3. Test is done, run cleanup operations ...
4. Rebuild to the old image id
5. Wait to hit SHUTOFF/None
6. Start the instance again
It is at step 5 that we get stuck and time out. We make this transition:
2014-02-06 22:29:51,724 State transition "SHUTOFF/rebuilding" ==> "SHUTOFF/rebuild_spawning" after 2 second wait
and never transition to ACTIVE/powering-off, which suggests we get
stuck somewhere in spawn().
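The waits in steps 2 and 5 amount to polling the server until it reports the expected vm_state/task_state pair, or giving up after a timeout. A minimal sketch of such a loop is below. It is not the tempest implementation; wait_for_state, get_state, StateTimeout and the default timeout values are hypothetical names chosen for illustration.

```python
import time


class StateTimeout(Exception):
    """Raised when the expected state is not reached in time."""


def wait_for_state(get_state, expected, timeout=10.0, interval=0.5,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll get_state() until it returns `expected` or `timeout` elapses.

    get_state is a hypothetical callable returning a
    (vm_state, task_state) tuple such as ("SHUTOFF", None).
    Returns the list of distinct states observed, in order.
    """
    start = clock()
    last = get_state()
    transitions = [last]
    while last != expected:
        if clock() - start >= timeout:
            # Report where we were stuck and what we were waiting for.
            raise StateTimeout(
                "stuck in %s/%s, wanted %s/%s" % (last + expected))
        sleep(interval)
        current = get_state()
        if current != last:
            transitions.append(current)
            last = current
    return transitions
```

In the failure described here, such a loop would keep seeing SHUTOFF/rebuild_spawning until the timeout fires, because the compute node never moves the instance on to the next task state.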
The last log entry for req-60fa2fbb-de78-4379-8ffa-bd0c70f52039 in n-cpu
is:
[instance: a3ac8847-db15-4f5a-b087-5256b54a36f5] Start to_xml
We never get the corresponding 'End to_xml'
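One way to confirm the missing 'End to_xml' across a whole n-cpu log is to pair up the markers per instance. The sketch below assumes each relevant line contains a fragment of the form "[instance: <uuid>] Start to_xml" or "[instance: <uuid>] End to_xml"; only the Start fragment is quoted in this report, so the End layout and the helper name unmatched_to_xml are assumptions.

```python
import re

# Assumed marker layout; only the "Start to_xml" form appears in the report.
MARKER = re.compile(
    r"\[instance: (?P<uuid>[0-9a-f-]+)\] (?P<kind>Start|End) to_xml")


def unmatched_to_xml(lines):
    """Return instance UUIDs with a 'Start to_xml' and no later 'End to_xml'."""
    open_calls = {}
    for line in lines:
        match = MARKER.search(line)
        if not match:
            continue
        uuid = match.group("uuid")
        if match.group("kind") == "Start":
            open_calls[uuid] = open_calls.get(uuid, 0) + 1
        elif open_calls.get(uuid, 0) > 0:
            # An End closes the most recent outstanding Start.
            open_calls[uuid] -= 1
    return [uuid for uuid, count in open_calls.items() if count > 0]
```

Feeding it the lines of the n-cpu log (for example via open(path) as the iterable) should list a3ac8847-db15-4f5a-b087-5256b54a36f5 if the observation above holds.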
Now ... interestingly, there's a resume operation happening around the
same time (req-aeedee18-be57-419e-8775-0af26dd796de) and it fails with:
"An error occurred while trying to launch a defined domain with xml:"
Hmm ... also interestingly, there isn't another "Start to_xml" in the
logs after this one - perhaps the stuck thread is holding a lock.
** Affects: nova
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1277494
Title:
Stuck in vm_state SHUTOFF and task_state rebuild_spawning - stuck in
to_xml() ?
Status in OpenStack Compute (Nova):
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1277494/+subscriptions