yahoo-eng-team team mailing list archive
-
yahoo-eng-team team
-
Mailing list archive
-
Message #70233
[Bug 1741817] [NEW] vm rebuild fails but confusing state
Public bug reported:
Description
===========
If nova-compute crashes on the host and I am trying to rebuild the
instance on this host, it will stuck in rebulding which is reasonable
because no one consumes the message from MQ. However, if nova-compute on
this host is started then, the instance gets confusing state.
I am not sure it is designed like this or a bug.
Steps to reproduce
==================
(1) boot an instance
$nova boot --image cirros --flavor mini --nic net-id=4ccb0bc2-272f-46a2-ad32-5f5012a5a49f y
$nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ACTIVE | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
(2) stop nova-compute on compute1
$nova service-list|grep compute1
| 985b4f89-92b9-4c4e-abd3-3b7d8f5f80a8 | nova-compute | compute1 | nova | enabled | down | 2018-01-08T03:17:25.000000 | - | False |
(3) rebuild the instance
$nova rebuild y cir
$nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ rabbitmqctl list_queues
Listing queues ...
...
compute.compute1 1
(4) start nova-compute on compute1
first:
$ nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ERROR | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
but soon later it stucks in following state:
$ nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ nova show y
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | y |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-000025db |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-wo128mco |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | rebuilding |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2018-01-08T03:17:03.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2018-01-08T03:16:39Z |
| description | - |
| flavor:disk | 1 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | mini |
| flavor:ram | 64 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | a7837285ad98259e17a7ca2a0fdd9194c47113068489c17ec3c3c4ce |
| host_status | UP |
| id | 5c6c8913-a160-4784-9e0e-c3e24cf0511c |
| image | cir (14440220-3505-4d45-b4b5-f2a184145140) |
| key_name | - |
| locked | False |
| metadata | {} |
| name | y |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | REBUILD |
| tags | [] |
| tenant_id | 5673331fd93740f58e9b36ebc56b7183 |
| updated | 2018-01-08T03:20:19Z |
| user_id | 59a94426c0404b5fae4f81d10bfc9b2f |
| vxlan-l network | 100.53.0.8 |
+--------------------------------------+----------------------------------------------------------+
Expected result
===============
Instance status should become error or the rebuild operarion should continue to be finished.
Actual result
=============
Instance status stucks in REBUILD and task_state stucks in rebuilding.
Environment
===========
nova-api 2:16.0.1-0ubuntu1~cloud0
nova-common 2:16.0.1-0ubuntu1~cloud0
nova-conductor 2:16.0.1-0ubuntu1~cloud0
nova-consoleauth 2:16.0.1-0ubuntu1~cloud0
nova-novncproxy 2:16.0.1-0ubuntu1~cloud0
nova-placement-api 2:16.0.1-0ubuntu1~cloud0
nova-scheduler 2:16.0.1-0ubuntu1~cloud0
python-nova 2:16.0.1-0ubuntu1~cloud0
python-novaclient 2:9.1.0-0ubuntu1~cloud0
Libvirt + KVM, CEPH, NEUTRON with linuxbridge
Logs
==============
On compute node:
(after nova-compute started)
2018-01-08 11:20:15.685 2352 DEBUG nova.compute.manager [req-ded7ce8a-
9ac4-4fb3-8f4e-a35dd5505c4e - - - - -] [instance: 5c6c8913-a160-4784
-9e0e-c3e24cf0511c] Instance failed to rebuild correctly, setting to
ERROR state _init_instance /usr/lib/python2.7/dist-
packages/nova/compute/manager.py:823
(nova-compute picks the message from MQ and continues to rebuild)
2018-01-08 11:20:18.900 2352 INFO nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Rebuilding instance
2018-01-08 11:20:19.101 2352 DEBUG nova.notifications.objects.base [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] Defaulting the value of the field 'projects' to None in FlavorPayload due to 'Cannot call _load_projects on orphaned Flavor object' populate_schema /usr/lib/python2.7/dist-packages/nova/notifications/objects/base.py:125
2018-01-08 11:20:19.108 2352 DEBUG nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Checking state _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1184
2018-01-08 11:20:19.242 2352 DEBUG nova.compute.utils [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} notify_about_instance_usage /usr/lib/python2.7/dist-packages/nova/compute/utils.py:328
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Setting instance vm_state to ERROR: UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 123, in _object_dispatch
return getattr(target, method)(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
return fn(self, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 785, in save
columns_to_join=_expected_cols(expected_attrs))
File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 859, in instance_update_and_get_original
expected=expected)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
ectxt.value = e.inner_exc
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
return f(context, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2666, in instance_update_and_get_original
context, instance_uuid, values, expected, original=instance_ref))
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2802, in _instance_update
raise exc(**exc_props)
UnexpectedTaskStateError: Conflict updating instance 5c6c8913-a160-4784
-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual:
{'task_state': None}
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Traceback (most recent call last):
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6839, in _error_out_instance_on_exception
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] yield
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2821, in rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] bdms, recreate, on_shared_storage, preserve_ephemeral)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2869, in _do_rebuild_instance_with_claim
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] self._do_rebuild_instance(*args, **kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2943, in _do_rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] instance.save(expected_task_state=[task_states.REBUILDING])
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 210, in wrapper
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] ctxt, self, fn.__name__, args, kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 245, in object_action
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] objmethod=objmethod, args=args, kwargs=kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=self.retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] timeout=timeout, retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 569, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] raise result
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
** Affects: nova
Importance: Undecided
Status: New
** Description changed:
Description
===========
- If nova-compute crahes on the host and rebuild the instance on this host,
- it will stuck in rebulding which is reasonable because no one consumes
- the message from MQ. However, if restart nova-compute on this host, the
- instance gets confusing state.
+ If nova-compute crashes on the host and rebuild the instance on this host,
+ it will stuck in rebulding which is reasonable because no one consumes
+ the message from MQ. However, if restart nova-compute on this host, the
+ instance gets confusing state.
I am not sure it is designed like this or a bug.
Steps to reproduce
==================
(1) boot an instance
-
+
$nova boot --image cirros --flavor mini --nic net-id=4ccb0bc2-272f-46a2-ad32-5f5012a5a49f y
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ACTIVE | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
-
(2) stop nova-compute on compute1
$nova service-list|grep compute1
| 985b4f89-92b9-4c4e-abd3-3b7d8f5f80a8 | nova-compute | compute1 | nova | enabled | down | 2018-01-08T03:17:25.000000 | - | False |
-
(3) rebuild the instance
$nova rebuild y cir
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ rabbitmqctl list_queues
Listing queues ...
...
compute.compute1 1
-
(4) start nova-compute on compute1
fisrt:
$ nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ERROR | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
but soon later:
$ nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ nova show y
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | y |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-000025db |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-wo128mco |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | rebuilding |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2018-01-08T03:17:03.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2018-01-08T03:16:39Z |
| description | - |
| flavor:disk | 1 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | mini |
| flavor:ram | 64 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | a7837285ad98259e17a7ca2a0fdd9194c47113068489c17ec3c3c4ce |
| host_status | UP |
| id | 5c6c8913-a160-4784-9e0e-c3e24cf0511c |
| image | cir (14440220-3505-4d45-b4b5-f2a184145140) |
| key_name | - |
| locked | False |
| metadata | {} |
| name | y |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | REBUILD |
| tags | [] |
| tenant_id | 5673331fd93740f58e9b36ebc56b7183 |
| updated | 2018-01-08T03:20:19Z |
| user_id | 59a94426c0404b5fae4f81d10bfc9b2f |
| vxlan-l network | 100.53.0.8 |
+--------------------------------------+----------------------------------------------------------+
-
Expected result
===============
- Instance status should become error or the rebuild operarion should continue to be finished.
-
+ Instance status should become error or the rebuild operarion should continue to be finished.
Actual result
=============
Instance status stucks in REBUILD and task_state stucks in rebuilding.
-
Environment
===========
- nova-api 2:16.0.1-0ubuntu1~cloud0
- nova-common 2:16.0.1-0ubuntu1~cloud0
- nova-conductor 2:16.0.1-0ubuntu1~cloud0
- nova-consoleauth 2:16.0.1-0ubuntu1~cloud0
- nova-novncproxy 2:16.0.1-0ubuntu1~cloud0
- nova-placement-api 2:16.0.1-0ubuntu1~cloud0
- nova-scheduler 2:16.0.1-0ubuntu1~cloud0
- python-nova 2:16.0.1-0ubuntu1~cloud0
- python-novaclient 2:9.1.0-0ubuntu1~cloud0
-
+ nova-api 2:16.0.1-0ubuntu1~cloud0
+ nova-common 2:16.0.1-0ubuntu1~cloud0
+ nova-conductor 2:16.0.1-0ubuntu1~cloud0
+ nova-consoleauth 2:16.0.1-0ubuntu1~cloud0
+ nova-novncproxy 2:16.0.1-0ubuntu1~cloud0
+ nova-placement-api 2:16.0.1-0ubuntu1~cloud0
+ nova-scheduler 2:16.0.1-0ubuntu1~cloud0
+ python-nova 2:16.0.1-0ubuntu1~cloud0
+ python-novaclient 2:9.1.0-0ubuntu1~cloud0
Libvirt + KVM, CEPH, NEUTRON with linuxbridge
-
Logs
==============
On compute node:
(after nova-compute started)
2018-01-08 11:20:15.685 2352 DEBUG nova.compute.manager [req-ded7ce8a-
9ac4-4fb3-8f4e-a35dd5505c4e - - - - -] [instance: 5c6c8913-a160-4784
-9e0e-c3e24cf0511c] Instance failed to rebuild correctly, setting to
ERROR state _init_instance /usr/lib/python2.7/dist-
packages/nova/compute/manager.py:823
-
(nova-compute picks the message from MQ and continues to rebuild)
2018-01-08 11:20:18.900 2352 INFO nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Rebuilding instance
2018-01-08 11:20:19.101 2352 DEBUG nova.notifications.objects.base [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] Defaulting the value of the field 'projects' to None in FlavorPayload due to 'Cannot call _load_projects on orphaned Flavor object' populate_schema /usr/lib/python2.7/dist-packages/nova/notifications/objects/base.py:125
2018-01-08 11:20:19.108 2352 DEBUG nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Checking state _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1184
2018-01-08 11:20:19.242 2352 DEBUG nova.compute.utils [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} notify_about_instance_usage /usr/lib/python2.7/dist-packages/nova/compute/utils.py:328
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Setting instance vm_state to ERROR: UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
Traceback (most recent call last):
- File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 123, in _object_dispatch
- return getattr(target, method)(*args, **kwargs)
-
- File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
- return fn(self, *args, **kwargs)
-
- File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 785, in save
- columns_to_join=_expected_cols(expected_attrs))
-
- File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 859, in instance_update_and_get_original
- expected=expected)
-
- File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
- return f(*args, **kwargs)
-
- File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
- ectxt.value = e.inner_exc
-
- File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
- self.force_reraise()
-
- File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
- six.reraise(self.type_, self.value, self.tb)
-
- File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
- return f(*args, **kwargs)
-
- File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
- return f(context, *args, **kwargs)
-
- File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2666, in instance_update_and_get_original
- context, instance_uuid, values, expected, original=instance_ref))
-
- File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2802, in _instance_update
- raise exc(**exc_props)
+ File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 123, in _object_dispatch
+ return getattr(target, method)(*args, **kwargs)
+
+ File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
+ return fn(self, *args, **kwargs)
+
+ File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 785, in save
+ columns_to_join=_expected_cols(expected_attrs))
+
+ File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 859, in instance_update_and_get_original
+ expected=expected)
+
+ File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
+ return f(*args, **kwargs)
+
+ File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
+ ectxt.value = e.inner_exc
+
+ File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
+ self.force_reraise()
+
+ File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
+ six.reraise(self.type_, self.value, self.tb)
+
+ File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
+ return f(*args, **kwargs)
+
+ File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
+ return f(context, *args, **kwargs)
+
+ File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2666, in instance_update_and_get_original
+ context, instance_uuid, values, expected, original=instance_ref))
+
+ File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2802, in _instance_update
+ raise exc(**exc_props)
UnexpectedTaskStateError: Conflict updating instance 5c6c8913-a160-4784
-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual:
{'task_state': None}
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Traceback (most recent call last):
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6839, in _error_out_instance_on_exception
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] yield
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2821, in rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] bdms, recreate, on_shared_storage, preserve_ephemeral)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2869, in _do_rebuild_instance_with_claim
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] self._do_rebuild_instance(*args, **kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2943, in _do_rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] instance.save(expected_task_state=[task_states.REBUILDING])
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 210, in wrapper
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] ctxt, self, fn.__name__, args, kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 245, in object_action
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] objmethod=objmethod, args=args, kwargs=kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=self.retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] timeout=timeout, retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 569, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] raise result
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
** Description changed:
Description
===========
- If nova-compute crashes on the host and rebuild the instance on this host,
+ If nova-compute crashes on the host and I rebuild the instance on this host,
it will stuck in rebulding which is reasonable because no one consumes
the message from MQ. However, if restart nova-compute on this host, the
instance gets confusing state.
I am not sure it is designed like this or a bug.
Steps to reproduce
==================
(1) boot an instance
$nova boot --image cirros --flavor mini --nic net-id=4ccb0bc2-272f-46a2-ad32-5f5012a5a49f y
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ACTIVE | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
(2) stop nova-compute on compute1
$nova service-list|grep compute1
| 985b4f89-92b9-4c4e-abd3-3b7d8f5f80a8 | nova-compute | compute1 | nova | enabled | down | 2018-01-08T03:17:25.000000 | - | False |
(3) rebuild the instance
$nova rebuild y cir
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ rabbitmqctl list_queues
Listing queues ...
...
compute.compute1 1
(4) start nova-compute on compute1
fisrt:
$ nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ERROR | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
but soon later:
$ nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ nova show y
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | y |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-000025db |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-wo128mco |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | rebuilding |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2018-01-08T03:17:03.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2018-01-08T03:16:39Z |
| description | - |
| flavor:disk | 1 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | mini |
| flavor:ram | 64 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | a7837285ad98259e17a7ca2a0fdd9194c47113068489c17ec3c3c4ce |
| host_status | UP |
| id | 5c6c8913-a160-4784-9e0e-c3e24cf0511c |
| image | cir (14440220-3505-4d45-b4b5-f2a184145140) |
| key_name | - |
| locked | False |
| metadata | {} |
| name | y |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | REBUILD |
| tags | [] |
| tenant_id | 5673331fd93740f58e9b36ebc56b7183 |
| updated | 2018-01-08T03:20:19Z |
| user_id | 59a94426c0404b5fae4f81d10bfc9b2f |
| vxlan-l network | 100.53.0.8 |
+--------------------------------------+----------------------------------------------------------+
Expected result
===============
Instance status should become error or the rebuild operarion should continue to be finished.
Actual result
=============
Instance status stucks in REBUILD and task_state stucks in rebuilding.
Environment
===========
nova-api 2:16.0.1-0ubuntu1~cloud0
nova-common 2:16.0.1-0ubuntu1~cloud0
nova-conductor 2:16.0.1-0ubuntu1~cloud0
nova-consoleauth 2:16.0.1-0ubuntu1~cloud0
nova-novncproxy 2:16.0.1-0ubuntu1~cloud0
nova-placement-api 2:16.0.1-0ubuntu1~cloud0
nova-scheduler 2:16.0.1-0ubuntu1~cloud0
python-nova 2:16.0.1-0ubuntu1~cloud0
python-novaclient 2:9.1.0-0ubuntu1~cloud0
Libvirt + KVM, CEPH, NEUTRON with linuxbridge
Logs
==============
On compute node:
(after nova-compute started)
2018-01-08 11:20:15.685 2352 DEBUG nova.compute.manager [req-ded7ce8a-
9ac4-4fb3-8f4e-a35dd5505c4e - - - - -] [instance: 5c6c8913-a160-4784
-9e0e-c3e24cf0511c] Instance failed to rebuild correctly, setting to
ERROR state _init_instance /usr/lib/python2.7/dist-
packages/nova/compute/manager.py:823
(nova-compute picks the message from MQ and continues to rebuild)
2018-01-08 11:20:18.900 2352 INFO nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Rebuilding instance
2018-01-08 11:20:19.101 2352 DEBUG nova.notifications.objects.base [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] Defaulting the value of the field 'projects' to None in FlavorPayload due to 'Cannot call _load_projects on orphaned Flavor object' populate_schema /usr/lib/python2.7/dist-packages/nova/notifications/objects/base.py:125
2018-01-08 11:20:19.108 2352 DEBUG nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Checking state _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1184
2018-01-08 11:20:19.242 2352 DEBUG nova.compute.utils [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} notify_about_instance_usage /usr/lib/python2.7/dist-packages/nova/compute/utils.py:328
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Setting instance vm_state to ERROR: UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 123, in _object_dispatch
return getattr(target, method)(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
return fn(self, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 785, in save
columns_to_join=_expected_cols(expected_attrs))
File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 859, in instance_update_and_get_original
expected=expected)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
ectxt.value = e.inner_exc
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
return f(context, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2666, in instance_update_and_get_original
context, instance_uuid, values, expected, original=instance_ref))
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2802, in _instance_update
raise exc(**exc_props)
UnexpectedTaskStateError: Conflict updating instance 5c6c8913-a160-4784
-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual:
{'task_state': None}
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Traceback (most recent call last):
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6839, in _error_out_instance_on_exception
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] yield
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2821, in rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] bdms, recreate, on_shared_storage, preserve_ephemeral)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2869, in _do_rebuild_instance_with_claim
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] self._do_rebuild_instance(*args, **kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2943, in _do_rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] instance.save(expected_task_state=[task_states.REBUILDING])
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 210, in wrapper
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] ctxt, self, fn.__name__, args, kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 245, in object_action
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] objmethod=objmethod, args=args, kwargs=kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=self.retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] timeout=timeout, retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 569, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] raise result
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
** Description changed:
Description
===========
- If nova-compute crashes on the host and I rebuild the instance on this host,
- it will stuck in rebulding which is reasonable because no one consumes
- the message from MQ. However, if restart nova-compute on this host, the
- instance gets confusing state.
+ If nova-compute crashes on the host and I rebuild the instance on this
+ host, it will stuck in rebulding which is reasonable because no one
+ consumes the message from MQ. However, if restart nova-compute on this
+ host, the instance gets confusing state.
I am not sure it is designed like this or a bug.
Steps to reproduce
==================
(1) boot an instance
$nova boot --image cirros --flavor mini --nic net-id=4ccb0bc2-272f-46a2-ad32-5f5012a5a49f y
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ACTIVE | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
(2) stop nova-compute on compute1
$nova service-list|grep compute1
| 985b4f89-92b9-4c4e-abd3-3b7d8f5f80a8 | nova-compute | compute1 | nova | enabled | down | 2018-01-08T03:17:25.000000 | - | False |
(3) rebuild the instance
$nova rebuild y cir
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ rabbitmqctl list_queues
Listing queues ...
...
compute.compute1 1
(4) start nova-compute on compute1
fisrt:
$ nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ERROR | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
but soon later:
$ nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ nova show y
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | y |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-000025db |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-wo128mco |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | rebuilding |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2018-01-08T03:17:03.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2018-01-08T03:16:39Z |
| description | - |
| flavor:disk | 1 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | mini |
| flavor:ram | 64 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | a7837285ad98259e17a7ca2a0fdd9194c47113068489c17ec3c3c4ce |
| host_status | UP |
| id | 5c6c8913-a160-4784-9e0e-c3e24cf0511c |
| image | cir (14440220-3505-4d45-b4b5-f2a184145140) |
| key_name | - |
| locked | False |
| metadata | {} |
| name | y |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | REBUILD |
| tags | [] |
| tenant_id | 5673331fd93740f58e9b36ebc56b7183 |
| updated | 2018-01-08T03:20:19Z |
| user_id | 59a94426c0404b5fae4f81d10bfc9b2f |
| vxlan-l network | 100.53.0.8 |
+--------------------------------------+----------------------------------------------------------+
Expected result
===============
Instance status should become error or the rebuild operarion should continue to be finished.
Actual result
=============
Instance status stucks in REBUILD and task_state stucks in rebuilding.
Environment
===========
nova-api 2:16.0.1-0ubuntu1~cloud0
nova-common 2:16.0.1-0ubuntu1~cloud0
nova-conductor 2:16.0.1-0ubuntu1~cloud0
nova-consoleauth 2:16.0.1-0ubuntu1~cloud0
nova-novncproxy 2:16.0.1-0ubuntu1~cloud0
nova-placement-api 2:16.0.1-0ubuntu1~cloud0
nova-scheduler 2:16.0.1-0ubuntu1~cloud0
python-nova 2:16.0.1-0ubuntu1~cloud0
python-novaclient 2:9.1.0-0ubuntu1~cloud0
Libvirt + KVM, CEPH, NEUTRON with linuxbridge
Logs
==============
On compute node:
(after nova-compute started)
2018-01-08 11:20:15.685 2352 DEBUG nova.compute.manager [req-ded7ce8a-
9ac4-4fb3-8f4e-a35dd5505c4e - - - - -] [instance: 5c6c8913-a160-4784
-9e0e-c3e24cf0511c] Instance failed to rebuild correctly, setting to
ERROR state _init_instance /usr/lib/python2.7/dist-
packages/nova/compute/manager.py:823
(nova-compute picks the message from MQ and continues to rebuild)
2018-01-08 11:20:18.900 2352 INFO nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Rebuilding instance
2018-01-08 11:20:19.101 2352 DEBUG nova.notifications.objects.base [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] Defaulting the value of the field 'projects' to None in FlavorPayload due to 'Cannot call _load_projects on orphaned Flavor object' populate_schema /usr/lib/python2.7/dist-packages/nova/notifications/objects/base.py:125
2018-01-08 11:20:19.108 2352 DEBUG nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Checking state _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1184
2018-01-08 11:20:19.242 2352 DEBUG nova.compute.utils [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} notify_about_instance_usage /usr/lib/python2.7/dist-packages/nova/compute/utils.py:328
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Setting instance vm_state to ERROR: UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 123, in _object_dispatch
return getattr(target, method)(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
return fn(self, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 785, in save
columns_to_join=_expected_cols(expected_attrs))
File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 859, in instance_update_and_get_original
expected=expected)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
ectxt.value = e.inner_exc
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
return f(context, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2666, in instance_update_and_get_original
context, instance_uuid, values, expected, original=instance_ref))
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2802, in _instance_update
raise exc(**exc_props)
UnexpectedTaskStateError: Conflict updating instance 5c6c8913-a160-4784
-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual:
{'task_state': None}
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Traceback (most recent call last):
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6839, in _error_out_instance_on_exception
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] yield
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2821, in rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] bdms, recreate, on_shared_storage, preserve_ephemeral)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2869, in _do_rebuild_instance_with_claim
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] self._do_rebuild_instance(*args, **kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2943, in _do_rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] instance.save(expected_task_state=[task_states.REBUILDING])
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 210, in wrapper
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] ctxt, self, fn.__name__, args, kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 245, in object_action
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] objmethod=objmethod, args=args, kwargs=kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=self.retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] timeout=timeout, retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 569, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] raise result
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
** Description changed:
Description
===========
- If nova-compute crashes on the host and I rebuild the instance on this
- host, it will stuck in rebulding which is reasonable because no one
- consumes the message from MQ. However, if restart nova-compute on this
- host, the instance gets confusing state.
+ If nova-compute crashes on the host and I am trying to rebuild the
+ instance on this host, it will stuck in rebulding which is reasonable
+ because no one consumes the message from MQ. However, if restart nova-
+ compute on this host, the instance gets confusing state.
I am not sure it is designed like this or a bug.
Steps to reproduce
==================
(1) boot an instance
$nova boot --image cirros --flavor mini --nic net-id=4ccb0bc2-272f-46a2-ad32-5f5012a5a49f y
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ACTIVE | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
(2) stop nova-compute on compute1
$nova service-list|grep compute1
| 985b4f89-92b9-4c4e-abd3-3b7d8f5f80a8 | nova-compute | compute1 | nova | enabled | down | 2018-01-08T03:17:25.000000 | - | False |
(3) rebuild the instance
$nova rebuild y cir
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ rabbitmqctl list_queues
Listing queues ...
...
compute.compute1 1
(4) start nova-compute on compute1
fisrt:
$ nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ERROR | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
but soon later:
$ nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ nova show y
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | y |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-000025db |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-wo128mco |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | rebuilding |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2018-01-08T03:17:03.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2018-01-08T03:16:39Z |
| description | - |
| flavor:disk | 1 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | mini |
| flavor:ram | 64 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | a7837285ad98259e17a7ca2a0fdd9194c47113068489c17ec3c3c4ce |
| host_status | UP |
| id | 5c6c8913-a160-4784-9e0e-c3e24cf0511c |
| image | cir (14440220-3505-4d45-b4b5-f2a184145140) |
| key_name | - |
| locked | False |
| metadata | {} |
| name | y |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | REBUILD |
| tags | [] |
| tenant_id | 5673331fd93740f58e9b36ebc56b7183 |
| updated | 2018-01-08T03:20:19Z |
| user_id | 59a94426c0404b5fae4f81d10bfc9b2f |
| vxlan-l network | 100.53.0.8 |
+--------------------------------------+----------------------------------------------------------+
Expected result
===============
Instance status should become error or the rebuild operarion should continue to be finished.
Actual result
=============
Instance status stucks in REBUILD and task_state stucks in rebuilding.
Environment
===========
nova-api 2:16.0.1-0ubuntu1~cloud0
nova-common 2:16.0.1-0ubuntu1~cloud0
nova-conductor 2:16.0.1-0ubuntu1~cloud0
nova-consoleauth 2:16.0.1-0ubuntu1~cloud0
nova-novncproxy 2:16.0.1-0ubuntu1~cloud0
nova-placement-api 2:16.0.1-0ubuntu1~cloud0
nova-scheduler 2:16.0.1-0ubuntu1~cloud0
python-nova 2:16.0.1-0ubuntu1~cloud0
python-novaclient 2:9.1.0-0ubuntu1~cloud0
Libvirt + KVM, CEPH, NEUTRON with linuxbridge
Logs
==============
On compute node:
(after nova-compute started)
2018-01-08 11:20:15.685 2352 DEBUG nova.compute.manager [req-ded7ce8a-
9ac4-4fb3-8f4e-a35dd5505c4e - - - - -] [instance: 5c6c8913-a160-4784
-9e0e-c3e24cf0511c] Instance failed to rebuild correctly, setting to
ERROR state _init_instance /usr/lib/python2.7/dist-
packages/nova/compute/manager.py:823
(nova-compute picks the message from MQ and continues to rebuild)
2018-01-08 11:20:18.900 2352 INFO nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Rebuilding instance
2018-01-08 11:20:19.101 2352 DEBUG nova.notifications.objects.base [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] Defaulting the value of the field 'projects' to None in FlavorPayload due to 'Cannot call _load_projects on orphaned Flavor object' populate_schema /usr/lib/python2.7/dist-packages/nova/notifications/objects/base.py:125
2018-01-08 11:20:19.108 2352 DEBUG nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Checking state _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1184
2018-01-08 11:20:19.242 2352 DEBUG nova.compute.utils [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} notify_about_instance_usage /usr/lib/python2.7/dist-packages/nova/compute/utils.py:328
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Setting instance vm_state to ERROR: UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 123, in _object_dispatch
return getattr(target, method)(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
return fn(self, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 785, in save
columns_to_join=_expected_cols(expected_attrs))
File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 859, in instance_update_and_get_original
expected=expected)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
ectxt.value = e.inner_exc
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
return f(context, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2666, in instance_update_and_get_original
context, instance_uuid, values, expected, original=instance_ref))
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2802, in _instance_update
raise exc(**exc_props)
UnexpectedTaskStateError: Conflict updating instance 5c6c8913-a160-4784
-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual:
{'task_state': None}
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Traceback (most recent call last):
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6839, in _error_out_instance_on_exception
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] yield
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2821, in rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] bdms, recreate, on_shared_storage, preserve_ephemeral)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2869, in _do_rebuild_instance_with_claim
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] self._do_rebuild_instance(*args, **kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2943, in _do_rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] instance.save(expected_task_state=[task_states.REBUILDING])
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 210, in wrapper
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] ctxt, self, fn.__name__, args, kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 245, in object_action
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] objmethod=objmethod, args=args, kwargs=kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=self.retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] timeout=timeout, retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 569, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] raise result
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
** Description changed:
Description
===========
If nova-compute crashes on the host and I am trying to rebuild the
instance on this host, it will stuck in rebulding which is reasonable
- because no one consumes the message from MQ. However, if restart nova-
- compute on this host, the instance gets confusing state.
+ because no one consumes the message from MQ. However, if nova-compute on
+ this host is started then, the instance gets confusing state.
I am not sure it is designed like this or a bug.
Steps to reproduce
==================
(1) boot an instance
$nova boot --image cirros --flavor mini --nic net-id=4ccb0bc2-272f-46a2-ad32-5f5012a5a49f y
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ACTIVE | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
(2) stop nova-compute on compute1
$nova service-list|grep compute1
| 985b4f89-92b9-4c4e-abd3-3b7d8f5f80a8 | nova-compute | compute1 | nova | enabled | down | 2018-01-08T03:17:25.000000 | - | False |
(3) rebuild the instance
$nova rebuild y cir
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ rabbitmqctl list_queues
Listing queues ...
...
compute.compute1 1
(4) start nova-compute on compute1
fisrt:
$ nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ERROR | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
but soon later:
$ nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ nova show y
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | y |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-000025db |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-wo128mco |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | rebuilding |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2018-01-08T03:17:03.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2018-01-08T03:16:39Z |
| description | - |
| flavor:disk | 1 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | mini |
| flavor:ram | 64 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | a7837285ad98259e17a7ca2a0fdd9194c47113068489c17ec3c3c4ce |
| host_status | UP |
| id | 5c6c8913-a160-4784-9e0e-c3e24cf0511c |
| image | cir (14440220-3505-4d45-b4b5-f2a184145140) |
| key_name | - |
| locked | False |
| metadata | {} |
| name | y |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | REBUILD |
| tags | [] |
| tenant_id | 5673331fd93740f58e9b36ebc56b7183 |
| updated | 2018-01-08T03:20:19Z |
| user_id | 59a94426c0404b5fae4f81d10bfc9b2f |
| vxlan-l network | 100.53.0.8 |
+--------------------------------------+----------------------------------------------------------+
Expected result
===============
Instance status should become error or the rebuild operarion should continue to be finished.
Actual result
=============
Instance status stucks in REBUILD and task_state stucks in rebuilding.
Environment
===========
nova-api 2:16.0.1-0ubuntu1~cloud0
nova-common 2:16.0.1-0ubuntu1~cloud0
nova-conductor 2:16.0.1-0ubuntu1~cloud0
nova-consoleauth 2:16.0.1-0ubuntu1~cloud0
nova-novncproxy 2:16.0.1-0ubuntu1~cloud0
nova-placement-api 2:16.0.1-0ubuntu1~cloud0
nova-scheduler 2:16.0.1-0ubuntu1~cloud0
python-nova 2:16.0.1-0ubuntu1~cloud0
python-novaclient 2:9.1.0-0ubuntu1~cloud0
Libvirt + KVM, CEPH, NEUTRON with linuxbridge
Logs
==============
On compute node:
(after nova-compute started)
2018-01-08 11:20:15.685 2352 DEBUG nova.compute.manager [req-ded7ce8a-
9ac4-4fb3-8f4e-a35dd5505c4e - - - - -] [instance: 5c6c8913-a160-4784
-9e0e-c3e24cf0511c] Instance failed to rebuild correctly, setting to
ERROR state _init_instance /usr/lib/python2.7/dist-
packages/nova/compute/manager.py:823
(nova-compute picks the message from MQ and continues to rebuild)
2018-01-08 11:20:18.900 2352 INFO nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Rebuilding instance
2018-01-08 11:20:19.101 2352 DEBUG nova.notifications.objects.base [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] Defaulting the value of the field 'projects' to None in FlavorPayload due to 'Cannot call _load_projects on orphaned Flavor object' populate_schema /usr/lib/python2.7/dist-packages/nova/notifications/objects/base.py:125
2018-01-08 11:20:19.108 2352 DEBUG nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Checking state _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1184
2018-01-08 11:20:19.242 2352 DEBUG nova.compute.utils [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} notify_about_instance_usage /usr/lib/python2.7/dist-packages/nova/compute/utils.py:328
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Setting instance vm_state to ERROR: UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 123, in _object_dispatch
return getattr(target, method)(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
return fn(self, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 785, in save
columns_to_join=_expected_cols(expected_attrs))
File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 859, in instance_update_and_get_original
expected=expected)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
ectxt.value = e.inner_exc
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
return f(context, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2666, in instance_update_and_get_original
context, instance_uuid, values, expected, original=instance_ref))
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2802, in _instance_update
raise exc(**exc_props)
UnexpectedTaskStateError: Conflict updating instance 5c6c8913-a160-4784
-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual:
{'task_state': None}
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Traceback (most recent call last):
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6839, in _error_out_instance_on_exception
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] yield
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2821, in rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] bdms, recreate, on_shared_storage, preserve_ephemeral)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2869, in _do_rebuild_instance_with_claim
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] self._do_rebuild_instance(*args, **kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2943, in _do_rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] instance.save(expected_task_state=[task_states.REBUILDING])
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 210, in wrapper
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] ctxt, self, fn.__name__, args, kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 245, in object_action
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] objmethod=objmethod, args=args, kwargs=kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=self.retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] timeout=timeout, retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 569, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] raise result
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
** Description changed:
Description
===========
If nova-compute crashes on the host and I am trying to rebuild the
instance on this host, it will stuck in rebulding which is reasonable
because no one consumes the message from MQ. However, if nova-compute on
this host is started then, the instance gets confusing state.
I am not sure it is designed like this or a bug.
Steps to reproduce
==================
(1) boot an instance
$nova boot --image cirros --flavor mini --nic net-id=4ccb0bc2-272f-46a2-ad32-5f5012a5a49f y
+ $nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ACTIVE | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
(2) stop nova-compute on compute1
$nova service-list|grep compute1
| 985b4f89-92b9-4c4e-abd3-3b7d8f5f80a8 | nova-compute | compute1 | nova | enabled | down | 2018-01-08T03:17:25.000000 | - | False |
(3) rebuild the instance
$nova rebuild y cir
+ $nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ rabbitmqctl list_queues
Listing queues ...
...
compute.compute1 1
(4) start nova-compute on compute1
fisrt:
$ nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ERROR | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
but soon later:
$ nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ nova show y
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | y |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-000025db |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-wo128mco |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | rebuilding |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2018-01-08T03:17:03.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2018-01-08T03:16:39Z |
| description | - |
| flavor:disk | 1 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | mini |
| flavor:ram | 64 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | a7837285ad98259e17a7ca2a0fdd9194c47113068489c17ec3c3c4ce |
| host_status | UP |
| id | 5c6c8913-a160-4784-9e0e-c3e24cf0511c |
| image | cir (14440220-3505-4d45-b4b5-f2a184145140) |
| key_name | - |
| locked | False |
| metadata | {} |
| name | y |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | REBUILD |
| tags | [] |
| tenant_id | 5673331fd93740f58e9b36ebc56b7183 |
| updated | 2018-01-08T03:20:19Z |
| user_id | 59a94426c0404b5fae4f81d10bfc9b2f |
| vxlan-l network | 100.53.0.8 |
+--------------------------------------+----------------------------------------------------------+
Expected result
===============
Instance status should become error or the rebuild operarion should continue to be finished.
Actual result
=============
Instance status stucks in REBUILD and task_state stucks in rebuilding.
Environment
===========
nova-api 2:16.0.1-0ubuntu1~cloud0
nova-common 2:16.0.1-0ubuntu1~cloud0
nova-conductor 2:16.0.1-0ubuntu1~cloud0
nova-consoleauth 2:16.0.1-0ubuntu1~cloud0
nova-novncproxy 2:16.0.1-0ubuntu1~cloud0
nova-placement-api 2:16.0.1-0ubuntu1~cloud0
nova-scheduler 2:16.0.1-0ubuntu1~cloud0
python-nova 2:16.0.1-0ubuntu1~cloud0
python-novaclient 2:9.1.0-0ubuntu1~cloud0
Libvirt + KVM, CEPH, NEUTRON with linuxbridge
Logs
==============
On compute node:
(after nova-compute started)
2018-01-08 11:20:15.685 2352 DEBUG nova.compute.manager [req-ded7ce8a-
9ac4-4fb3-8f4e-a35dd5505c4e - - - - -] [instance: 5c6c8913-a160-4784
-9e0e-c3e24cf0511c] Instance failed to rebuild correctly, setting to
ERROR state _init_instance /usr/lib/python2.7/dist-
packages/nova/compute/manager.py:823
(nova-compute picks the message from MQ and continues to rebuild)
2018-01-08 11:20:18.900 2352 INFO nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Rebuilding instance
2018-01-08 11:20:19.101 2352 DEBUG nova.notifications.objects.base [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] Defaulting the value of the field 'projects' to None in FlavorPayload due to 'Cannot call _load_projects on orphaned Flavor object' populate_schema /usr/lib/python2.7/dist-packages/nova/notifications/objects/base.py:125
2018-01-08 11:20:19.108 2352 DEBUG nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Checking state _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1184
2018-01-08 11:20:19.242 2352 DEBUG nova.compute.utils [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} notify_about_instance_usage /usr/lib/python2.7/dist-packages/nova/compute/utils.py:328
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Setting instance vm_state to ERROR: UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 123, in _object_dispatch
return getattr(target, method)(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
return fn(self, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 785, in save
columns_to_join=_expected_cols(expected_attrs))
File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 859, in instance_update_and_get_original
expected=expected)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
ectxt.value = e.inner_exc
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
return f(context, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2666, in instance_update_and_get_original
context, instance_uuid, values, expected, original=instance_ref))
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2802, in _instance_update
raise exc(**exc_props)
UnexpectedTaskStateError: Conflict updating instance 5c6c8913-a160-4784
-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual:
{'task_state': None}
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Traceback (most recent call last):
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6839, in _error_out_instance_on_exception
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] yield
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2821, in rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] bdms, recreate, on_shared_storage, preserve_ephemeral)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2869, in _do_rebuild_instance_with_claim
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] self._do_rebuild_instance(*args, **kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2943, in _do_rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] instance.save(expected_task_state=[task_states.REBUILDING])
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 210, in wrapper
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] ctxt, self, fn.__name__, args, kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 245, in object_action
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] objmethod=objmethod, args=args, kwargs=kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=self.retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] timeout=timeout, retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 569, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] raise result
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
** Description changed:
Description
===========
If nova-compute crashes on the host and I am trying to rebuild the
instance on this host, it will stuck in rebulding which is reasonable
because no one consumes the message from MQ. However, if nova-compute on
this host is started then, the instance gets confusing state.
I am not sure it is designed like this or a bug.
Steps to reproduce
==================
(1) boot an instance
$nova boot --image cirros --flavor mini --nic net-id=4ccb0bc2-272f-46a2-ad32-5f5012a5a49f y
$nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ACTIVE | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
(2) stop nova-compute on compute1
$nova service-list|grep compute1
| 985b4f89-92b9-4c4e-abd3-3b7d8f5f80a8 | nova-compute | compute1 | nova | enabled | down | 2018-01-08T03:17:25.000000 | - | False |
(3) rebuild the instance
$nova rebuild y cir
$nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ rabbitmqctl list_queues
Listing queues ...
...
compute.compute1 1
(4) start nova-compute on compute1
- fisrt:
+ first:
$ nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ERROR | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
but soon later:
$ nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ nova show y
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | y |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-000025db |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-wo128mco |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | rebuilding |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2018-01-08T03:17:03.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2018-01-08T03:16:39Z |
| description | - |
| flavor:disk | 1 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | mini |
| flavor:ram | 64 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | a7837285ad98259e17a7ca2a0fdd9194c47113068489c17ec3c3c4ce |
| host_status | UP |
| id | 5c6c8913-a160-4784-9e0e-c3e24cf0511c |
| image | cir (14440220-3505-4d45-b4b5-f2a184145140) |
| key_name | - |
| locked | False |
| metadata | {} |
| name | y |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | REBUILD |
| tags | [] |
| tenant_id | 5673331fd93740f58e9b36ebc56b7183 |
| updated | 2018-01-08T03:20:19Z |
| user_id | 59a94426c0404b5fae4f81d10bfc9b2f |
| vxlan-l network | 100.53.0.8 |
+--------------------------------------+----------------------------------------------------------+
Expected result
===============
Instance status should become error or the rebuild operarion should continue to be finished.
Actual result
=============
Instance status stucks in REBUILD and task_state stucks in rebuilding.
Environment
===========
nova-api 2:16.0.1-0ubuntu1~cloud0
nova-common 2:16.0.1-0ubuntu1~cloud0
nova-conductor 2:16.0.1-0ubuntu1~cloud0
nova-consoleauth 2:16.0.1-0ubuntu1~cloud0
nova-novncproxy 2:16.0.1-0ubuntu1~cloud0
nova-placement-api 2:16.0.1-0ubuntu1~cloud0
nova-scheduler 2:16.0.1-0ubuntu1~cloud0
python-nova 2:16.0.1-0ubuntu1~cloud0
python-novaclient 2:9.1.0-0ubuntu1~cloud0
Libvirt + KVM, CEPH, NEUTRON with linuxbridge
Logs
==============
On compute node:
(after nova-compute started)
2018-01-08 11:20:15.685 2352 DEBUG nova.compute.manager [req-ded7ce8a-
9ac4-4fb3-8f4e-a35dd5505c4e - - - - -] [instance: 5c6c8913-a160-4784
-9e0e-c3e24cf0511c] Instance failed to rebuild correctly, setting to
ERROR state _init_instance /usr/lib/python2.7/dist-
packages/nova/compute/manager.py:823
(nova-compute picks the message from MQ and continues to rebuild)
2018-01-08 11:20:18.900 2352 INFO nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Rebuilding instance
2018-01-08 11:20:19.101 2352 DEBUG nova.notifications.objects.base [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] Defaulting the value of the field 'projects' to None in FlavorPayload due to 'Cannot call _load_projects on orphaned Flavor object' populate_schema /usr/lib/python2.7/dist-packages/nova/notifications/objects/base.py:125
2018-01-08 11:20:19.108 2352 DEBUG nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Checking state _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1184
2018-01-08 11:20:19.242 2352 DEBUG nova.compute.utils [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} notify_about_instance_usage /usr/lib/python2.7/dist-packages/nova/compute/utils.py:328
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Setting instance vm_state to ERROR: UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 123, in _object_dispatch
return getattr(target, method)(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
return fn(self, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 785, in save
columns_to_join=_expected_cols(expected_attrs))
File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 859, in instance_update_and_get_original
expected=expected)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
ectxt.value = e.inner_exc
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
return f(context, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2666, in instance_update_and_get_original
context, instance_uuid, values, expected, original=instance_ref))
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2802, in _instance_update
raise exc(**exc_props)
UnexpectedTaskStateError: Conflict updating instance 5c6c8913-a160-4784
-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual:
{'task_state': None}
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Traceback (most recent call last):
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6839, in _error_out_instance_on_exception
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] yield
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2821, in rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] bdms, recreate, on_shared_storage, preserve_ephemeral)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2869, in _do_rebuild_instance_with_claim
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] self._do_rebuild_instance(*args, **kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2943, in _do_rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] instance.save(expected_task_state=[task_states.REBUILDING])
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 210, in wrapper
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] ctxt, self, fn.__name__, args, kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 245, in object_action
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] objmethod=objmethod, args=args, kwargs=kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=self.retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] timeout=timeout, retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 569, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] raise result
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
** Description changed:
Description
===========
If nova-compute crashes on the host and I am trying to rebuild the
instance on this host, it will stuck in rebulding which is reasonable
because no one consumes the message from MQ. However, if nova-compute on
this host is started then, the instance gets confusing state.
I am not sure it is designed like this or a bug.
Steps to reproduce
==================
(1) boot an instance
$nova boot --image cirros --flavor mini --nic net-id=4ccb0bc2-272f-46a2-ad32-5f5012a5a49f y
$nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ACTIVE | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
(2) stop nova-compute on compute1
$nova service-list|grep compute1
| 985b4f89-92b9-4c4e-abd3-3b7d8f5f80a8 | nova-compute | compute1 | nova | enabled | down | 2018-01-08T03:17:25.000000 | - | False |
(3) rebuild the instance
$nova rebuild y cir
$nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ rabbitmqctl list_queues
Listing queues ...
...
compute.compute1 1
(4) start nova-compute on compute1
first:
$ nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ERROR | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
- but soon later:
+ but soon later it stucks in following state:
$ nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ nova show y
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | y |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-000025db |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-wo128mco |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | rebuilding |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2018-01-08T03:17:03.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2018-01-08T03:16:39Z |
| description | - |
| flavor:disk | 1 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | mini |
| flavor:ram | 64 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | a7837285ad98259e17a7ca2a0fdd9194c47113068489c17ec3c3c4ce |
| host_status | UP |
| id | 5c6c8913-a160-4784-9e0e-c3e24cf0511c |
| image | cir (14440220-3505-4d45-b4b5-f2a184145140) |
| key_name | - |
| locked | False |
| metadata | {} |
| name | y |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | REBUILD |
| tags | [] |
| tenant_id | 5673331fd93740f58e9b36ebc56b7183 |
| updated | 2018-01-08T03:20:19Z |
| user_id | 59a94426c0404b5fae4f81d10bfc9b2f |
| vxlan-l network | 100.53.0.8 |
+--------------------------------------+----------------------------------------------------------+
Expected result
===============
Instance status should become error or the rebuild operarion should continue to be finished.
Actual result
=============
Instance status stucks in REBUILD and task_state stucks in rebuilding.
Environment
===========
nova-api 2:16.0.1-0ubuntu1~cloud0
nova-common 2:16.0.1-0ubuntu1~cloud0
nova-conductor 2:16.0.1-0ubuntu1~cloud0
nova-consoleauth 2:16.0.1-0ubuntu1~cloud0
nova-novncproxy 2:16.0.1-0ubuntu1~cloud0
nova-placement-api 2:16.0.1-0ubuntu1~cloud0
nova-scheduler 2:16.0.1-0ubuntu1~cloud0
python-nova 2:16.0.1-0ubuntu1~cloud0
python-novaclient 2:9.1.0-0ubuntu1~cloud0
Libvirt + KVM, CEPH, NEUTRON with linuxbridge
Logs
==============
On compute node:
(after nova-compute started)
2018-01-08 11:20:15.685 2352 DEBUG nova.compute.manager [req-ded7ce8a-
9ac4-4fb3-8f4e-a35dd5505c4e - - - - -] [instance: 5c6c8913-a160-4784
-9e0e-c3e24cf0511c] Instance failed to rebuild correctly, setting to
ERROR state _init_instance /usr/lib/python2.7/dist-
packages/nova/compute/manager.py:823
(nova-compute picks the message from MQ and continues to rebuild)
2018-01-08 11:20:18.900 2352 INFO nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Rebuilding instance
2018-01-08 11:20:19.101 2352 DEBUG nova.notifications.objects.base [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] Defaulting the value of the field 'projects' to None in FlavorPayload due to 'Cannot call _load_projects on orphaned Flavor object' populate_schema /usr/lib/python2.7/dist-packages/nova/notifications/objects/base.py:125
2018-01-08 11:20:19.108 2352 DEBUG nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Checking state _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1184
2018-01-08 11:20:19.242 2352 DEBUG nova.compute.utils [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} notify_about_instance_usage /usr/lib/python2.7/dist-packages/nova/compute/utils.py:328
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Setting instance vm_state to ERROR: UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 123, in _object_dispatch
return getattr(target, method)(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
return fn(self, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 785, in save
columns_to_join=_expected_cols(expected_attrs))
File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 859, in instance_update_and_get_original
expected=expected)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
ectxt.value = e.inner_exc
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
return f(context, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2666, in instance_update_and_get_original
context, instance_uuid, values, expected, original=instance_ref))
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2802, in _instance_update
raise exc(**exc_props)
UnexpectedTaskStateError: Conflict updating instance 5c6c8913-a160-4784
-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual:
{'task_state': None}
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Traceback (most recent call last):
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6839, in _error_out_instance_on_exception
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] yield
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2821, in rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] bdms, recreate, on_shared_storage, preserve_ephemeral)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2869, in _do_rebuild_instance_with_claim
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] self._do_rebuild_instance(*args, **kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2943, in _do_rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] instance.save(expected_task_state=[task_states.REBUILDING])
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 210, in wrapper
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] ctxt, self, fn.__name__, args, kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 245, in object_action
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] objmethod=objmethod, args=args, kwargs=kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=self.retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] timeout=timeout, retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 569, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] raise result
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
** Summary changed:
- vm rebuild fail but confusing state
+ vm rebuild fails but confusing state
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1741817
Title:
vm rebuild fails but confusing state
Status in OpenStack Compute (nova):
New
Bug description:
Description
===========
If nova-compute crashes on the host and I am trying to rebuild the
instance on this host, it will stuck in rebulding which is reasonable
because no one consumes the message from MQ. However, if nova-compute
on this host is started then, the instance gets confusing state.
I am not sure it is designed like this or a bug.
Steps to reproduce
==================
(1) boot an instance
$nova boot --image cirros --flavor mini --nic net-id=4ccb0bc2-272f-46a2-ad32-5f5012a5a49f y
$nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ACTIVE | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
(2) stop nova-compute on compute1
$nova service-list|grep compute1
| 985b4f89-92b9-4c4e-abd3-3b7d8f5f80a8 | nova-compute | compute1 | nova | enabled | down | 2018-01-08T03:17:25.000000 | - | False |
(3) rebuild the instance
$nova rebuild y cir
$nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ rabbitmqctl list_queues
Listing queues ...
...
compute.compute1 1
(4) start nova-compute on compute1
first:
$ nova list
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | ERROR | - | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+--------+------------+-------------+--------------------+
but soon later it stucks in following state:
$ nova list
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| ID | Name | Host | Status | Task State | Power State | Networks |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
| 5c6c8913-a160-4784-9e0e-c3e24cf0511c | y | compute1 | REBUILD | rebuilding | Running | vxlan-l=100.53.0.8 |
+--------------------------------------+------+----------+---------+------------+-------------+--------------------+
$ nova show y
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | compute1 |
| OS-EXT-SRV-ATTR:hostname | y |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute1 |
| OS-EXT-SRV-ATTR:instance_name | instance-000025db |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-wo128mco |
| OS-EXT-SRV-ATTR:root_device_name | /dev/vda |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | rebuilding |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2018-01-08T03:17:03.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2018-01-08T03:16:39Z |
| description | - |
| flavor:disk | 1 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | mini |
| flavor:ram | 64 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | a7837285ad98259e17a7ca2a0fdd9194c47113068489c17ec3c3c4ce |
| host_status | UP |
| id | 5c6c8913-a160-4784-9e0e-c3e24cf0511c |
| image | cir (14440220-3505-4d45-b4b5-f2a184145140) |
| key_name | - |
| locked | False |
| metadata | {} |
| name | y |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | REBUILD |
| tags | [] |
| tenant_id | 5673331fd93740f58e9b36ebc56b7183 |
| updated | 2018-01-08T03:20:19Z |
| user_id | 59a94426c0404b5fae4f81d10bfc9b2f |
| vxlan-l network | 100.53.0.8 |
+--------------------------------------+----------------------------------------------------------+
Expected result
===============
Instance status should become error or the rebuild operarion should continue to be finished.
Actual result
=============
Instance status stucks in REBUILD and task_state stucks in rebuilding.
Environment
===========
nova-api 2:16.0.1-0ubuntu1~cloud0
nova-common 2:16.0.1-0ubuntu1~cloud0
nova-conductor 2:16.0.1-0ubuntu1~cloud0
nova-consoleauth 2:16.0.1-0ubuntu1~cloud0
nova-novncproxy 2:16.0.1-0ubuntu1~cloud0
nova-placement-api 2:16.0.1-0ubuntu1~cloud0
nova-scheduler 2:16.0.1-0ubuntu1~cloud0
python-nova 2:16.0.1-0ubuntu1~cloud0
python-novaclient 2:9.1.0-0ubuntu1~cloud0
Libvirt + KVM, CEPH, NEUTRON with linuxbridge
Logs
==============
On compute node:
(after nova-compute started)
2018-01-08 11:20:15.685 2352 DEBUG nova.compute.manager [req-ded7ce8a-
9ac4-4fb3-8f4e-a35dd5505c4e - - - - -] [instance: 5c6c8913-a160-4784
-9e0e-c3e24cf0511c] Instance failed to rebuild correctly, setting to
ERROR state _init_instance /usr/lib/python2.7/dist-
packages/nova/compute/manager.py:823
(nova-compute picks the message from MQ and continues to rebuild)
2018-01-08 11:20:18.900 2352 INFO nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Rebuilding instance
2018-01-08 11:20:19.101 2352 DEBUG nova.notifications.objects.base [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] Defaulting the value of the field 'projects' to None in FlavorPayload due to 'Cannot call _load_projects on orphaned Flavor object' populate_schema /usr/lib/python2.7/dist-packages/nova/notifications/objects/base.py:125
2018-01-08 11:20:19.108 2352 DEBUG nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Checking state _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1184
2018-01-08 11:20:19.242 2352 DEBUG nova.compute.utils [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None} notify_about_instance_usage /usr/lib/python2.7/dist-packages/nova/compute/utils.py:328
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [req-d635de5c-d5b4-4b20-96c6-663291e5554b 59a94426c0404b5fae4f81d10bfc9b2f 5673331fd93740f58e9b36ebc56b7183 - default default] [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Setting instance vm_state to ERROR: UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 123, in _object_dispatch
return getattr(target, method)(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
return fn(self, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/objects/instance.py", line 785, in save
columns_to_join=_expected_cols(expected_attrs))
File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 859, in instance_update_and_get_original
expected=expected)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
ectxt.value = e.inner_exc
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
return f(context, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2666, in instance_update_and_get_original
context, instance_uuid, values, expected, original=instance_ref))
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2802, in _instance_update
raise exc(**exc_props)
UnexpectedTaskStateError: Conflict updating instance
5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state':
[u'rebuilding']}. Actual: {'task_state': None}
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] Traceback (most recent call last):
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6839, in _error_out_instance_on_exception
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] yield
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2821, in rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] bdms, recreate, on_shared_storage, preserve_ephemeral)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2869, in _do_rebuild_instance_with_claim
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] self._do_rebuild_instance(*args, **kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2943, in _do_rebuild_instance
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] instance.save(expected_task_state=[task_states.REBUILDING])
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 210, in wrapper
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] ctxt, self, fn.__name__, args, kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 245, in object_action
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] objmethod=objmethod, args=args, kwargs=kwargs)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=self.retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] timeout=timeout, retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] retry=retry)
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 569, in _send
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] raise result
2018-01-08 11:20:19.274 2352 ERROR nova.compute.manager [instance: 5c6c8913-a160-4784-9e0e-c3e24cf0511c] UnexpectedTaskStateError_Remote: Conflict updating instance 5c6c8913-a160-4784-9e0e-c3e24cf0511c. Expected: {'task_state': [u'rebuilding']}. Actual: {'task_state': None}
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1741817/+subscriptions