Message #06394
[Bug 1236930] Re: attempting to reboot a shutdown/suspened/crashed/paused instance appears to have failed, but then surprisingly succeeds two minutes later
** Changed in: nova
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1236930
Title:
attempting to reboot a shutdown/suspened/crashed/paused instance
appears to have failed, but then surprisingly succeeds two minutes
later
Status in OpenStack Compute (Nova):
Fix Released
Bug description:
I am running Havana from precise-proposed in the UCA (nova
1:2013.2~b3-0ubuntu1~cloud0).
To reproduce:
- start an instance
- reboot (sudo reboot) the compute node on which it is running
- after the compute node is done booting, the instance will be off:
root@xen10:~# nova list
+--------------------------------------+------+---------+------------+-------------+-------------------------+
| ID | Name | Status | Task State | Power State | Networks |
+--------------------------------------+------+---------+------------+-------------+-------------------------+
| 4824dce8-d876-4022-a446-3fc8d708ac62 | test | SHUTOFF | None | Shutdown | novanetwork=172.20.46.3 |
+--------------------------------------+------+---------+------------+-------------+-------------------------+
(note that although my hostname has "xen" in it, I'm using KVM.
Haven't updated DNS yet...)
- attempt to reboot the instance (nova reboot
4824dce8-d876-4022-a446-3fc8d708ac62)
# nova show 4824dce8-d876-4022-a446-3fc8d708ac62
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| status | SHUTOFF |
| updated | 2013-10-08T15:28:47Z |
| OS-EXT-STS:task_state | rebooting |
The reboot fails. The compute node will log:
2013-10-08 11:28:55.579 1400 WARNING nova.compute.manager [req-11fe1624-22f6-4348-81c5-185d0ce0d3a0 a70453729dd84bfd8f31019b1bb91e40 46ab32189ab64a4c92f8f64e6c9ed028] [instance: 4824dce8-d876-4022-a446-3fc8d708ac62] trying to reboot a non-running instance: (state: 4 expected: 1)
- attempt to start the instance (nova start
4824dce8-d876-4022-a446-3fc8d708ac62), which produces this console
output:
ERROR: Instance 4824dce8-d876-4022-a446-3fc8d708ac62 in task_state rebooting. Cannot start while the instance is in this state. (HTTP 400) (Request-ID: req-732224e1-8c34-4754-84f7-7a8476673185)
- wait about 120 seconds, and the compute node will log:
2013-10-08 11:30:56.082 1400 WARNING nova.virt.libvirt.driver [req-11fe1624-22f6-4348-81c5-185d0ce0d3a0 a70453729dd84bfd8f31019b1bb91e40 46ab32189ab64a4c92f8f64e6c9ed028] [instance: 4824dce8-d876-4022-a446-3fc8d708ac62] Failed to soft reboot instance. Trying hard reboot.
Afterwards, the instance will be running.
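The "about 120 seconds" figure can be checked against the two WARNING lines quoted above. This is only a minimal sketch: it uses the two timestamps exactly as logged, and the variable names are made up for illustration.

```python
from datetime import datetime

# Timestamps copied from the two WARNING lines in the steps above.
FMT = "%Y-%m-%d %H:%M:%S.%f"
soft_reboot_warning = datetime.strptime("2013-10-08 11:28:55.579", FMT)
hard_reboot_fallback = datetime.strptime("2013-10-08 11:30:56.082", FMT)

# Seconds between "trying to reboot a non-running instance" and
# "Failed to soft reboot instance. Trying hard reboot."
delay = (hard_reboot_fallback - soft_reboot_warning).total_seconds()
print(f"{delay:.3f} seconds")
```

During that interval the instance stays in task_state "rebooting", which is why the "nova start" attempt in between is rejected.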
It's confusing that the reboot logs a failure for a very obvious
reason (an instance that is not running can't be *re*booted), yet the
instance's state remains "rebooting". I had expected that the
reboot had failed and that OpenStack was in a consistent state. I was
then surprised again when in fact it *was* still rebooting -- it just
took two minutes to do so. It would be less confusing to catch the
original error and report the reboot as failed. The log messages are
confusing because the first sets the expectation that a non-running
instance can't be rebooted, but it can (two minutes later).
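The behaviour asked for here (fail the reboot immediately and do not leave the task state set) could look roughly like the sketch below. It is not Nova's code and not necessarily the fix that was released: `request_reboot`, `InstanceNotRunning`, and the dict standing in for an instance record are all hypothetical, and the constant names are assumed from the "(state: 4 expected: 1)" warning.

```python
# Power-state codes as they appear in the warning "(state: 4 expected: 1)".
# The constant names are an assumption for illustration.
RUNNING = 1
SHUTDOWN = 4


class InstanceNotRunning(Exception):
    """Hypothetical error: reboot requested for a non-running instance."""


def request_reboot(instance):
    """Reject the reboot up front instead of leaving task_state set.

    `instance` is a plain dict standing in for the real instance record.
    """
    if instance["power_state"] != RUNNING:
        # Clear the task state so a later "nova start" is not refused.
        instance["task_state"] = None
        raise InstanceNotRunning(
            "cannot reboot: state %d, expected %d"
            % (instance["power_state"], RUNNING)
        )
    instance["task_state"] = "rebooting"


# Usage: the instance from this report is in power state 4 (Shutdown).
instance = {"power_state": SHUTDOWN, "task_state": None}
try:
    request_reboot(instance)
    outcome = "rebooting"
except InstanceNotRunning as exc:
    outcome = "failed: %s" % exc
print(outcome)
```

With this shape the caller sees the failure right away and the task state is never left at "rebooting". An alternative would be to skip the soft reboot and go straight to a hard reboot for non-running instances. Either way the two-minute wait disappears.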
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1236930/+subscriptions