[NOVA] Possible causes for hung VMs (Diablo)


Hey Stackers,

While administering the TryStack site, we've noticed that VMs tend to get stuck in the 'building' vm_state, but we haven't been able to track down what is causing it. I'm hoping to get some insight from folks running Diablo-based clouds.

Here is what the Nova database has recorded for VMs in the building or error VM states:

mysql> select vm_state, task_state, count(*) from instances where vm_state in ('building', 'error') group by vm_state, task_state order by count(*) desc;
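+----------+------------+----------+
| vm_state | task_state | count(*) |
+----------+------------+----------+
| building | deleting   |      128 |
| building | networking |       40 |
| building | scheduling |       26 |
| error    | spawning   |       10 |
| building | spawning   |        1 |
+----------+------------+----------+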
5 rows in set (0.01 sec)

As you can see, the majority of stuck VMs (128 of the 205 rows above) are in a "building" vm_state but with a "deleting" task_state.
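
For a sense of proportion, here is a quick Python sketch that tallies the counts from the query result above. The numbers are copied by hand from that output, and the names (counts, share) are only for illustration, not anything from Nova itself.

```python
# Counts copied from the query result above: (vm_state, task_state) -> count
counts = {
    ("building", "deleting"): 128,
    ("building", "networking"): 40,
    ("building", "scheduling"): 26,
    ("error", "spawning"): 10,
    ("building", "spawning"): 1,
}

# Total number of instances in the 'building' or 'error' vm_state
total = sum(counts.values())


def share(key):
    """Return the percentage of all listed instances that fall under `key`."""
    return 100.0 * counts[key] / total


# Print each state pair with its count and share, largest first
for key, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
    print("%-9s %-11s %4d  %5.1f%%" % (key[0], key[1], n, share(key)))
```

So the building/deleting combination is well over half of everything listed, which is why it is the case I'm most curious about.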

Could someone explain how an instance can end up in a "deleting" task_state while it is still in the "building" vm_state?

Thanks in advance for any hints!

