yahoo-eng-team team mailing list archive
Message #46477
[Bug 1536916] Re: nova's task_state still is migrating after live migration failed
Reviewed: https://review.openstack.org/275650
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=30d5d805c10b0cc6e474fe1292b2c6549fc07d33
Submitter: Jenkins
Branch: master
commit 30d5d805c10b0cc6e474fe1292b2c6549fc07d33
Author: ShaoHe Feng <shaohe.feng@xxxxxxxxx>
Date: Tue Feb 2 09:41:33 2016 +0000
reset task_state after select_destinations failed.
During live migration, an exception may be raised while the scheduler
selects a destination, and the live migration aborts. However, the
task_state of the instance remains "migrating", so no further action
can be taken on the instance.
We need to reset the task_state to None.
We should also restore the vm_state.
Change-Id: If1cae8f4c9037f7821554a94d4440f66d9538794
Closes-bug: #1536916
** Changed in: nova
Status: In Progress => Fix Released
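The idea behind the fix can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the nova commit: `Instance`, `SchedulerError`, `failing_scheduler` and `live_migrate` are hypothetical stand-ins, and the only thing taken from the commit message is that the vm_state is restored and the task_state is set back to None when destination selection fails.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class SchedulerError(Exception):
    """Hypothetical stand-in for a failure while selecting a destination."""


@dataclass
class Instance:
    """Hypothetical, heavily simplified instance record."""
    name: str
    vm_state: str = "active"
    task_state: Optional[str] = None


def live_migrate(instance: Instance,
                 find_destination: Callable[[Instance], str]) -> str:
    """Begin a live migration and roll the state back if scheduling fails."""
    original_vm_state = instance.vm_state
    instance.task_state = "migrating"
    try:
        return find_destination(instance)
    except Exception:
        # Nothing has been moved yet, so restore the previous vm_state and
        # clear task_state; otherwise later actions would be rejected.
        instance.vm_state = original_vm_state
        instance.task_state = None
        raise


def failing_scheduler(instance: Instance) -> str:
    """Hypothetical scheduler call that always fails, as in this bug report."""
    raise SchedulerError("Endpoint does not support RPC version 4.3")
```

Calling `live_migrate(Instance(name="tt"), failing_scheduler)` still raises the scheduler error, but the instance is left with its original vm_state and a task_state of None, so it remains manageable.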
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1536916
Title:
nova's task_state still is migrating after live migration failed
Status in OpenStack Compute (nova):
Fix Released
Bug description:
Environment:
distribution:
$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=14.04
DISTRIB_CODENAME=trusty
DISTRIB_DESCRIPTION="Ubuntu 14.04.2 LTS"
$ sudo virsh version
Compiled against library: libvirt 1.2.2
Using library: libvirt 1.2.2
Using API: QEMU 1.2.2
Running hypervisor: QEMU 2.0.0
$ git log --oneline
806113e Merge "Changed filter_by() to filter() during filtering instances in db API"
...
There are two hosts in my environment.
Host A is the controller and also a compute node. Host B is a compute
node only.
Steps to reproduce:
1. Upgrade the nova code on Host A and restart n-sch, n-cond, and n-cpu.
Upgrade the nova code on Host B and restart n-cpu.
Keep n-sch at the old version.
2. Do a live migration:
$ nova live-migration tt
It reports an error: "Remote error: UnsupportedVersion Endpoint does not support RPC version 4.3. Attempted method: select_destinations"
$ sudo virsh list
Id Name State
----------------------------------------------------
26 instance-00000002 running
The error details are as follows:
+--------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | shaohe1 |
| OS-EXT-SRV-ATTR:hostname | tt |
| OS-EXT-SRV-ATTR:hypervisor_hostname | shaohe1 |
| OS-EXT-SRV-ATTR:instance_name | instance-00000002 |
| OS-EXT-SRV-ATTR:kernel_id | 89e91cc1-a40c-4c9f-bcfa-37b0e94d5f57 |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | d4c09101-c0fa-40de-9c88-28b9428d03fb |
| OS-EXT-SRV-ATTR:reservation_id | r-p0884maw |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | migrating |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2015-12-18T07:41:00.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | True |
| created | 2015-12-18T07:40:44Z |
| fault | {"message": "Remote error: UnsupportedVersion Endpoint does not support RPC version 4.3. Attempted method: select_destinations |
| | [u'Traceback (most recent call last):\ |
| | ', u' File \"/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py\", line 142, in _d", "code": 500, "details": " File \"/opt/stack/nova/nova/conductor/manager.py\", line 295, in _live_migrate |
| | task.execute() |
| | File \"/opt/stack/nova/nova/conductor/tasks/base.py\", line 27, in wrap |
| | self.rollback() |
| | File \"/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py\", line 204, in __exit__ |
| | six.reraise(self.type_, self.value, self.tb) |
| | File \"/opt/stack/nova/nova/conductor/tasks/base.py\", line 24, in wrap |
| | return original(self) |
| | File \"/opt/stack/nova/nova/conductor/tasks/base.py\", line 42, in execute |
| | return self._execute() |
| | File \"/opt/stack/nova/nova/conductor/tasks/live_migrate.py\", line 58, in _execute |
| | self.destination = self._find_destination() |
| | File \"/opt/stack/nova/nova/conductor/tasks/live_migrate.py\", line 181, in _find_destination |
| | spec_obj)[0]['host'] |
| | File \"/opt/stack/nova/nova/scheduler/utils.py\", line 358, in wrapped |
| | return func(*args, **kwargs) |
| | File \"/opt/stack/nova/nova/scheduler/client/__init__.py\", line 51, in select_destinations |
| | return self.queryclient.select_destinations(context, spec_obj) |
| | File \"/opt/stack/nova/nova/scheduler/client/__init__.py\", line 37, in __run_method |
| | return getattr(self.instance, __name)(*args, **kwargs) |
| | File \"/opt/stack/nova/nova/scheduler/client/query.py\", line 32, in select_destinations |
| | return self.scheduler_rpcapi.select_destinations(context, spec_obj) |
| | File \"/opt/stack/nova/nova/scheduler/rpcapi.py\", line 121, in select_destinations |
| | return cctxt.call(ctxt, 'select_destinations', **msg_args) |
| | File \"/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py\", line 158, in call |
| | retry=self.retry) |
| | File \"/usr/local/lib/python2.7/dist-packages/oslo_messaging/transport.py\", line 90, in _send |
| | timeout=timeout, retry=retry) |
| | File \"/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py\", line 464, in send |
| | retry=retry) |
| | File \"/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py\", line 455, in _send |
| | raise result |
| | ", "created": "2016-01-22T03:41:21Z"} |
| flavor | m1.tiny (1) |
| hostId | 94170a2af829a4a136f9f9c9d3c3f8f10df72bc1238a5eeef5b4ffa5 |
| id | 058fc419-a8a8-4e08-b62c-a9841ef9cd3f |
| image | cirros-0.3.4-x86_64-uec (3476ee05-b80e-4a80-aac4-78350ce132a2) |
| key_name | - |
| metadata | {} |
| name | tt |
| os-extended-volumes:volumes_attached | [] |
| private network | fdf1:3ce8:94a:0:f816:3eff:fee1:2a95, 10.0.0.4 |
| security_groups | default |
| status | ERROR |
| tenant_id | f5a8829cc14c4825a2728b273aa91aa1 |
| updated | 2016-01-22T03:41:21Z |
| user_id | b5450f0c30154d2bb1506968a05c6f80 |
+--------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3. Attempt the live migration again:
$ nova live-migration tt
ERROR (Conflict): Cannot 'os-migrateLive' instance 058fc419-a8a8-4e08-b62c-a9841ef9cd3f while it is in vm_state error (HTTP 409) (Request-ID: req-2712a326-c3ab-4a6d-9afb-5001b3822047)
4. Then try to stop the server:
$ nova stop tt
Cannot 'stop' instance 058fc419-a8a8-4e08-b62c-a9841ef9cd3f while it is in task_state migrating (HTTP 409) (Request-ID: req-580ae847-1a5d-44c8-a285-7cc31dc9e0e2)
ERROR (CommandError): Unable to stop the specified server(s).
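Steps 3 and 4 are rejected for two different reasons: the live migration is refused because of the vm_state, and the stop is refused because of the task_state. The sketch below illustrates that kind of guard. It is a simplification under assumptions and not nova's implementation: `ALLOWED_VM_STATES`, `InstanceInvalidState` and `check_action` are hypothetical names, and the allowed-state sets are chosen only to reproduce the two messages above.

```python
from typing import Optional


class InstanceInvalidState(Exception):
    """Hypothetical error that an API layer would report as HTTP 409."""


# Hypothetical rules: the vm_states in which each action is permitted.
# In this sketch every action also requires that no task is in progress.
ALLOWED_VM_STATES = {
    "os-migrateLive": {"active", "paused"},
    "stop": {"active", "error"},
}


def check_action(action: str, instance_id: str,
                 vm_state: str, task_state: Optional[str]) -> None:
    """Raise InstanceInvalidState if the action is not allowed right now."""
    if vm_state not in ALLOWED_VM_STATES[action]:
        raise InstanceInvalidState(
            f"Cannot '{action}' instance {instance_id} "
            f"while it is in vm_state {vm_state}")
    if task_state is not None:
        raise InstanceInvalidState(
            f"Cannot '{action}' instance {instance_id} "
            f"while it is in task_state {task_state}")
```

With vm_state "error" and task_state "migrating", this guard refuses both actions, which is why the instance cannot be recovered through these commands until the stale task_state is cleared.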
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1536916/+subscriptions