yahoo-eng-team team mailing list archive
Message #03651
[Bug 1197514] Re: migrate from stopped goes back to ACTIVE state after resize-confirm
** Changed in: nova
Status: Fix Committed => Fix Released
** Changed in: nova
Milestone: None => havana-2
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1197514
Title:
migrate from stopped goes back to ACTIVE state after resize-confirm
Status in OpenStack Compute (Nova):
Fix Released
Bug description:
On havana master, running with this setup:
[root@ngp02-hv1 ~]# nova host-list
+-----------------------------+-------------+----------+
| host_name                   | service     | zone     |
+-----------------------------+-------------+----------+
| CN10                        | compute     | nova     |
| cn12                        | compute     | nova     |
| cn4                         | compute     | nova     |
| cn6                         | compute     | nova     |
| cn8                         | compute     | nova     |
| ngp02-hv1.private.cloud.com | cells       | internal |
| ngp02-hv1.private.cloud.com | cert        | internal |
| ngp02-hv1.private.cloud.com | compute     | nova     |
| ngp02-hv1.private.cloud.com | conductor   | internal |
| ngp02-hv1.private.cloud.com | console     | internal |
| ngp02-hv1.private.cloud.com | consoleauth | internal |
| ngp02-hv1.private.cloud.com | scheduler   | internal |
+-----------------------------+-------------+----------+
The compute nodes are all running the hyper-v driver.
1. Boot an instance to CN10:
nova boot --image quantal-server-cloudimg-amd64-disk1 --flavor 2 \
--availability-zone nova:CN10 --user-data \
/home/testscripts/myuserdata.txt --config-drive true Ivt_July2_9
2. Then stop it:
[root@ngp02-hv1 testscripts]# nova stop Ivt_July2_9
[root@ngp02-hv1 testscripts]# nova show Ivt_July2_9
+-------------------------------------+----------------------------------------------------------------------------+
| Property | Value |
+-------------------------------------+----------------------------------------------------------------------------+
| status | SHUTOFF |
| updated | 2013-07-02T17:21:36Z |
| OS-EXT-STS:task_state | None |
| OS-EXT-SRV-ATTR:host | CN10 |
| key_name | None |
| image | quantal-server-cloudimg-amd64-disk1 (8495ff12-ee3c-4fe4-b46b-7b3b10641c87) |
| network1 network | 10.0.1.49 |
| hostId | 114d3be3249df7d93321ac50d4672be322989685e6f99d4cd8f87fb1 |
| OS-EXT-STS:vm_state | stopped |
| OS-EXT-SRV-ATTR:instance_name | instance-00000090 |
| OS-SRV-USG:launched_at | 2013-07-02T16:59:30.310000 |
| OS-EXT-SRV-ATTR:hypervisor_hostname | CN10 |
| flavor | m1.small (2) |
| id | 6c9a28c8-c52c-4046-ad83-00e12ee0994d |
| security_groups | [{u'name': u'default'}] |
| OS-SRV-USG:terminated_at | None |
| user_id | b25a17a8fcfa44098e45e0cfe5ae4fee |
| name | Ivt_July2_9 |
| created | 2013-07-02T16:59:00Z |
| tenant_id | 066c47e2d09b440aad1845fcea9959ba |
| OS-DCF:diskConfig | MANUAL |
| metadata | {} |
| accessIPv4 | |
| accessIPv6 | |
| OS-EXT-STS:power_state | 4 |
| OS-EXT-AZ:availability_zone | nova |
| config_drive | 1 |
+-------------------------------------+----------------------------------------------------------------------------+
3. Then migrate it (moves from CN10 to cn4):
[root@ngp02-hv1 testscripts]# nova migrate Ivt_July2_9
[root@ngp02-hv1 testscripts]# nova show Ivt_July2_9
+-------------------------------------+----------------------------------------------------------------------------+
| Property | Value |
+-------------------------------------+----------------------------------------------------------------------------+
| status | VERIFY_RESIZE |
| updated | 2013-07-02T17:22:27Z |
| OS-EXT-STS:task_state | None |
| OS-EXT-SRV-ATTR:host | cn4 |
| key_name | None |
| image | quantal-server-cloudimg-amd64-disk1 (8495ff12-ee3c-4fe4-b46b-7b3b10641c87) |
| network1 network | 10.0.1.49 |
| hostId | aec601f205a7665ea600c0fed93cf50e9bb779c2d7625e4f7da23944 |
| OS-EXT-STS:vm_state | resized |
| OS-EXT-SRV-ATTR:instance_name | instance-00000090 |
| OS-SRV-USG:launched_at | 2013-07-02T17:22:34.143000 |
| OS-EXT-SRV-ATTR:hypervisor_hostname | cn4 |
| flavor | m1.small (2) |
| id | 6c9a28c8-c52c-4046-ad83-00e12ee0994d |
| security_groups | [{u'name': u'default'}] |
| OS-SRV-USG:terminated_at | None |
| user_id | b25a17a8fcfa44098e45e0cfe5ae4fee |
| name | Ivt_July2_9 |
| created | 2013-07-02T16:59:00Z |
| tenant_id | 066c47e2d09b440aad1845fcea9959ba |
| OS-DCF:diskConfig | MANUAL |
| metadata | {} |
| accessIPv4 | |
| accessIPv6 | |
| progress | 0 |
| OS-EXT-STS:power_state | 4 |
| OS-EXT-AZ:availability_zone | nova |
| config_drive | 1 |
+-------------------------------------+----------------------------------------------------------------------------+
4. Then confirm the migration. This is where the problem happens:
the vm_state goes to active (status ACTIVE) even though the
power_state is still 4 (SHUTOFF):
[root@ngp02-hv1 testscripts]# nova show Ivt_July2_9
+-------------------------------------+----------------------------------------------------------------------------+
| Property | Value |
+-------------------------------------+----------------------------------------------------------------------------+
| status | ACTIVE |
| updated | 2013-07-02T17:23:03Z |
| OS-EXT-STS:task_state | None |
| OS-EXT-SRV-ATTR:host | cn4 |
| key_name | None |
| image | quantal-server-cloudimg-amd64-disk1 (8495ff12-ee3c-4fe4-b46b-7b3b10641c87) |
| network1 network | 10.0.1.49 |
| hostId | aec601f205a7665ea600c0fed93cf50e9bb779c2d7625e4f7da23944 |
| OS-EXT-STS:vm_state | active |
| OS-EXT-SRV-ATTR:instance_name | instance-00000090 |
| OS-SRV-USG:launched_at | 2013-07-02T17:22:34.143000 |
| OS-EXT-SRV-ATTR:hypervisor_hostname | cn4 |
| flavor | m1.small (2) |
| id | 6c9a28c8-c52c-4046-ad83-00e12ee0994d |
| security_groups | [{u'name': u'default'}] |
| OS-SRV-USG:terminated_at | None |
| user_id | b25a17a8fcfa44098e45e0cfe5ae4fee |
| name | Ivt_July2_9 |
| created | 2013-07-02T16:59:00Z |
| tenant_id | 066c47e2d09b440aad1845fcea9959ba |
| OS-DCF:diskConfig | MANUAL |
| metadata | {} |
| accessIPv4 | |
| accessIPv6 | |
| progress | 0 |
| OS-EXT-STS:power_state | 4 |
| OS-EXT-AZ:availability_zone | nova |
| config_drive | 1 |
+-------------------------------------+----------------------------------------------------------------------------+
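The mismatch in step 4 can be checked mechanically. What follows is a minimal Python sketch and not part of the original report: `parse_nova_show` and `looks_inconsistent` are hypothetical helper names, and the assumption that power_state 1 means "running" comes from nova's power_state constants rather than from the output above.

```python
RUNNING = "1"  # assumption: nova reports a running instance as power_state 1


def parse_nova_show(table_text):
    """Turn the two-column `nova show` table into a dict of property -> value."""
    props = {}
    for line in table_text.splitlines():
        line = line.strip()
        # Skip borders ("+---+") and anything that is not a table row.
        if not (line.startswith("|") and line.endswith("|")):
            continue
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        # Keep only property/value rows, and drop the header row.
        if len(cells) != 2 or cells[0] == "Property":
            continue
        props[cells[0]] = cells[1]
    return props


def looks_inconsistent(props):
    """True when vm_state is active but power_state is not the running value."""
    return (props.get("OS-EXT-STS:vm_state") == "active"
            and props.get("OS-EXT-STS:power_state") != RUNNING)


# A few rows copied from the step 4 output.
sample = """
| Property | Value |
| status | ACTIVE |
| OS-EXT-STS:vm_state | active |
| accessIPv4 | |
| OS-EXT-STS:power_state | 4 |
"""
print(looks_inconsistent(parse_nova_show(sample)))
```

Fed the step 2 output instead (vm_state stopped, power_state 4), the same check reports no inconsistency.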
I checked the compute logs for CN10 and cn4. On the source host, CN10,
I found an InstanceNotFound exception coming from the
nova.virt.hyperv.vmops.get_info method (detailed in bug 1197506).
For some history here, change
I19fa61d467edd5a7572040d084824972569ef65a fixed bug 1177811, where an
instance resized/migrated from a stopped state would go back to ACTIVE
state after the resize/migrate was confirmed/reverted. As part of that
change, when confirming the migration/resize we check the power_state
to see whether the user powered on the instance to verify the resize
before confirming it (if they had resized/migrated from stopped
state):
https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2300
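The intent of that check can be sketched roughly as follows. This is an illustration only, not the code at the link above: `vm_state_after_confirm` is a hypothetical name, the state strings are taken from the `nova show` output in this report, and the exact comparison nova performs is an assumption on my part.

```python
SHUTDOWN = 4  # power_state shown for the stopped instance in this report
RUNNING = 1   # assumption: nova's value for a running instance


def vm_state_after_confirm(old_vm_state, current_power_state):
    """Pick the vm_state to store once the resize/migrate is confirmed."""
    # Stay stopped only if the instance started out stopped and the
    # hypervisor still reports it as shut down.
    if old_vm_state == "stopped" and current_power_state == SHUTDOWN:
        return "stopped"
    # Otherwise (it was active before, or the user powered it on to
    # check the resize) the instance ends up active.
    return "active"


print(vm_state_after_confirm("stopped", SHUTDOWN))
print(vm_state_after_confirm("stopped", RUNNING))
```

Under this sketch the outcome depends entirely on the power state that is read at confirm time, so a wrong reading produces a wrong vm_state.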
Now for this bug, it looks like when the instance has been migrated
and we check its power_state, the old (source) host is the one asked
for the state. The instance is no longer there, so we get the
InstanceNotFound (which is swallowed here):
https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L698
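A small self-contained simulation of that failure mode, under stated assumptions: `FakeDriver`, `get_power_state` and `vm_state_after_confirm` are hypothetical stand-ins rather than nova's classes, NOSTATE = 0 is assumed to be the value returned when the exception is swallowed, and the confirm step is assumed to keep the instance stopped only when it sees the shut-down power state.

```python
NOSTATE = 0   # assumption: value returned when the lookup fails
SHUTDOWN = 4  # power_state shown for the stopped instance in this report


class InstanceNotFound(Exception):
    """Stand-in for the exception seen in the CN10 compute log."""


class FakeDriver:
    """Hypothetical stand-in for the virt driver on a single host."""

    def __init__(self, instances):
        self.instances = instances  # instance name -> power_state

    def get_info(self, name):
        if name not in self.instances:
            raise InstanceNotFound(name)
        return {"state": self.instances[name]}


def get_power_state(driver, name):
    """Ask a host's driver for the power state, swallowing 'not found'."""
    try:
        return driver.get_info(name)["state"]
    except InstanceNotFound:
        return NOSTATE


def vm_state_after_confirm(old_vm_state, power_state):
    """Assumed confirm logic: stay stopped only if still shut down."""
    if old_vm_state == "stopped" and power_state == SHUTDOWN:
        return "stopped"
    return "active"


# After the migration the instance exists only on the destination host.
source = FakeDriver({})
dest = FakeDriver({"instance-00000090": SHUTDOWN})

asked_source = get_power_state(source, "instance-00000090")
asked_dest = get_power_state(dest, "instance-00000090")
print(vm_state_after_confirm("stopped", asked_source))
print(vm_state_after_confirm("stopped", asked_dest))
```

In this sketch, asking the source host yields NOSTATE and the instance is marked active, which matches what step 4 shows; asking the host that actually has the instance keeps it stopped.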