Re: [NOVA] Possible causes for hung VMs (Diablo)

 

Compute/api sets the task_state to "deleting" at the start of delete() without updating the vm_state, so if these were VMs that failed to build, or that were deleted during the build, you could get that combination.
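As a rough illustration of that behaviour (this is not the actual Nova code; the Instance class and the delete() function below are hypothetical stand-ins that only model the two state columns), a delete issued while an instance is still building would leave the row looking like this:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instance:
    """Hypothetical stand-in for a row in the instances table."""
    vm_state: str = "building"
    task_state: Optional[str] = "scheduling"


def delete(instance: Instance) -> Instance:
    """Simplified model of the delete path described above.

    Only task_state is changed at the start of the delete;
    vm_state is left as whatever it was before the call.
    """
    instance.task_state = "deleting"
    return instance


# An instance that is still in the middle of its build...
inst = Instance(vm_state="building", task_state="networking")

# ...is deleted before the build finishes (or after it has stalled).
delete(inst)

# The row now shows the combination seen in the query results.
print(inst.vm_state, inst.task_state)
```

If whatever is supposed to finish the delete never runs, nothing in this simplified model moves vm_state away from "building", which would be consistent with the counts in the table below.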



-----Original Message-----
From: openstack-bounces+philip.day=hp.com@xxxxxxxxxxxxxxxxxxx [mailto:openstack-bounces+philip.day=hp.com@xxxxxxxxxxxxxxxxxxx] On Behalf Of Jay Pipes
Sent: 12 March 2012 16:27
To: openstack@xxxxxxxxxxxxxxxxxxx
Subject: [Openstack] [NOVA] Possible causes for hung VMs (Diablo)

Hey Stackers,

We've noticed while administering the TryStack site that VMs tend to get into a stuck 'building' VM state, but we haven't been able to track down exactly what might be causing the problems. Hoping I can get some insight from folks running Diablo-based clouds.

Here is what the Nova database has recorded for VMs in the building or error VM states:

mysql> select vm_state, task_state, count(*) from instances
    -> where vm_state in ('building', 'error')
    -> group by vm_state, task_state
    -> order by count(*) desc;
+----------+------------+----------+
| vm_state | task_state | count(*) |
+----------+------------+----------+
| building | deleting   |      128 |
| building | networking |       40 |
| building | scheduling |       26 |
| error    | spawning   |       10 |
| building | spawning   |        1 |
+----------+------------+----------+
5 rows in set (0.01 sec)


As you can see, the majority of stuck VMs are in a "building" vm_state but with a "deleting" task_state.

Could someone explain how an instance can be in a "deleting" task state during the build process?

Thanks in advance for any hints!
-jay

_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to     : openstack@xxxxxxxxxxxxxxxxxxx
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp

