yahoo-eng-team team mailing list archive

[Bug 1324041] [NEW] nova-compute cannot restart if _init_instance failed

 

Public bug reported:

In my OpenStack deployment, a power outage crashed my compute nodes. I
then booted the compute nodes and tried to start the nova-compute
service, but the service would not start. I checked compute.log and
found errors like the following:

2014-05-28 16:21:12.558 2724 DEBUG nova.compute.manager [-] [instance: ac57aab0-1864-4335-aa4a-bbfcc75a9624] Checking state _get_power_state /usr/lib/python2.6/site-packages/nova/compute/manager.py:1043
2014-05-28 16:21:12.563 2724 DEBUG nova.compute.manager [-] [instance: ac57aab0-1864-4335-aa4a-bbfcc75a9624] Checking state _get_power_state /usr/lib/python2.6/site-packages/nova/compute/manager.py:1043
2014-05-28 16:21:12.567 2724 DEBUG nova.virt.libvirt.vif [-] vif_type=bridge instance=<nova.objects.instance.Instance object at 0x3fcead0> vif=VIF({'ovs_interfaceid': None, 'network': Network({'bridge': u'brqf29d33d2-7c', 'subnets': [Subnet({'ips': [FixedIP({'meta': {}, 'version': 4, 'type': u'fixed', 'floating_ips': [IP({'meta': {}, 'version': 4, 'type': u'floating', 'address': u'10.0.0.101'})], 'address': u'192.168.0.2'})], 'version': 4, 'meta': {u'dhcp_server': u'192.168.0.3'}, 'dns': [], 'routes': [], 'cidr': u'192.168.0.0/24', 'gateway': IP({'meta': {}, 'version': 4, 'type': u'gateway', 'address': u'192.168.0.1'})})], 'meta': {u'injected': False, u'tenant_id': u'5d56667c799c46ef81b87455445af457', u'should_create_bridge': True}, 'id': u'f29d33d2-7c70-456a-96b0-03a59fe0b40f', 'label': u'admin_net'}), 'devname': u'tap0780a643-9a', 'qbh_params': None, 'meta': {}, 'details': {u'port_filter': True}, 'address': u'fa:16:3e:dc:23:66', 'active': True, 'type': u'bridge', 'id': u'0780a643-9ad4-4388-a51d-3456a1e88ae6', 'qbg_params': None}) plug /usr/lib/python2.6/site-packages/nova/virt/libvirt/vif.py:592
2014-05-28 16:21:12.568 2724 DEBUG nova.virt.libvirt.vif [-] [instance: ac57aab0-1864-4335-aa4a-bbfcc75a9624] Ensuring bridge brqf29d33d2-7c plug_bridge /usr/lib/python2.6/site-packages/nova/virt/libvirt/vif.py:408
2014-05-28 16:21:12.568 2724 DEBUG nova.openstack.common.lockutils [-] Got semaphore "lock_bridge" lock /usr/lib/python2.6/site-packages/nova/openstack/common/lockutils.py:168
2014-05-28 16:21:12.569 2724 DEBUG nova.openstack.common.lockutils [-] Attempting to grab file lock "lock_bridge" lock /usr/lib/python2.6/site-packages/nova/openstack/common/lockutils.py:178
2014-05-28 16:21:12.569 2724 DEBUG nova.openstack.common.lockutils [-] Got file lock "lock_bridge" at /var/lib/nova/tmp/nova-lock_bridge lock /usr/lib/python2.6/site-packages/nova/openstack/common/lockutils.py:206
2014-05-28 16:21:12.569 2724 DEBUG nova.openstack.common.lockutils [-] Got semaphore / lock "ensure_bridge" inner /usr/lib/python2.6/site-packages/nova/openstack/common/lockutils.py:248
2014-05-28 16:21:12.570 2724 DEBUG nova.openstack.common.lockutils [-] Released file lock "lock_bridge" at /var/lib/nova/tmp/nova-lock_bridge lock /usr/lib/python2.6/site-packages/nova/openstack/common/lockutils.py:210
2014-05-28 16:21:12.570 2724 DEBUG nova.openstack.common.lockutils [-] Semaphore / lock released "ensure_bridge" inner /usr/lib/python2.6/site-packages/nova/openstack/common/lockutils.py:252
2014-05-28 16:21:12.570 2724 DEBUG nova.compute.manager [-] [instance: ac57aab0-1864-4335-aa4a-bbfcc75a9624] Checking state _get_power_state /usr/lib/python2.6/site-packages/nova/compute/manager.py:1043
2014-05-28 16:21:12.575 2724 DEBUG nova.compute.manager [-] [instance: ac57aab0-1864-4335-aa4a-bbfcc75a9624] Current state is 4, state in DB is 1. _init_instance /usr/lib/python2.6/site-packages/nova/compute/manager.py:920
2014-05-28 16:21:12.575 2724 DEBUG nova.compute.manager [-] [instance: 8047e688-d189-4d35-a9c8-634f34cdda86] Checking state _get_power_state /usr/lib/python2.6/site-packages/nova/compute/manager.py:1043
2014-05-28 16:21:12.579 2724 DEBUG nova.compute.manager [-] [instance: 8047e688-d189-4d35-a9c8-634f34cdda86] Checking state _get_power_state /usr/lib/python2.6/site-packages/nova/compute/manager.py:1043
2014-05-28 16:21:12.584 2724 DEBUG nova.virt.libvirt.vif [-] vif_type=binding_failed instance=<nova.objects.instance.Instance object at 0x3fcef50> vif=VIF({'ovs_interfaceid': None, 'network': Network({'bridge': None, 'subnets': [Subnet({'ips': [FixedIP({'meta': {}, 'version': 4, 'type': u'fixed', 'floating_ips': [IP({'meta': {}, 'version': 4, 'type': u'floating', 'address': u'10.0.0.112'})], 'address': u'172.16.0.180'})], 'version': 4, 'meta': {u'dhcp_server': u'172.16.0.3'}, 'dns': [], 'routes': [], 'cidr': u'172.16.0.0/24', 'gateway': IP({'meta': {}, 'version': 4, 'type': u'gateway', 'address': u'172.16.0.1'})})], 'meta': {u'injected': False, u'tenant_id': u'b0df2063f0ae4830880ce544643c15e2'}, 'id': u'1ddbf12d-9adc-4371-9741-a2d89cd40686', 'label': u'plcloud_net'}), 'devname': u'tap75168900-ca', 'qbh_params': None, 'meta': {}, 'details': {}, 'address': u'fa:16:3e:35:47:1e', 'active': True, 'type': u'binding_failed', 'id': u'75168900-cafe-4cb1-9e20-1ed9cd33de44', 'qbg_params': None}) plug /usr/lib/python2.6/site-packages/nova/virt/libvirt/vif.py:592
2014-05-28 16:21:12.588 2724 ERROR nova.openstack.common.threadgroup [-] Unexpected vif_type=binding_failed
2014-05-28 16:21:12.588 2724 TRACE nova.openstack.common.threadgroup Traceback (most recent call last):
2014-05-28 16:21:12.588 2724 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.6/site-packages/nova/openstack/common/threadgroup.py", line 117, in wait
2014-05-28 16:21:12.588 2724 TRACE nova.openstack.common.threadgroup     x.wait()
2014-05-28 16:21:12.588 2724 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.6/site-packages/nova/openstack/common/threadgroup.py", line 49, in wait
2014-05-28 16:21:12.588 2724 TRACE nova.openstack.common.threadgroup     return self.thread.wait()
2014-05-28 16:21:12.588 2724 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.6/site-packages/eventlet/greenthread.py", line 168, in wait
2014-05-28 16:21:12.588 2724 TRACE nova.openstack.common.threadgroup     return self._exit_event.wait()
2014-05-28 16:21:12.588 2724 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.6/site-packages/eventlet/event.py", line 116, in wait
2014-05-28 16:21:12.588 2724 TRACE nova.openstack.common.threadgroup     return hubs.get_hub().switch()
2014-05-28 16:21:12.588 2724 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.6/site-packages/eventlet/hubs/hub.py", line 187, in switch
2014-05-28 16:21:12.588 2724 TRACE nova.openstack.common.threadgroup     return self.greenlet.switch()
2014-05-28 16:21:12.588 2724 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.6/site-packages/eventlet/greenthread.py", line 194, in main
2014-05-28 16:21:12.588 2724 TRACE nova.openstack.common.threadgroup     result = function(*args, **kwargs)
2014-05-28 16:21:12.588 2724 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.6/site-packages/nova/openstack/common/service.py", line 483, in run_service
2014-05-28 16:21:12.588 2724 TRACE nova.openstack.common.threadgroup     service.start()
2014-05-28 16:21:12.588 2724 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.6/site-packages/nova/service.py", line 163, in start
2014-05-28 16:21:12.588 2724 TRACE nova.openstack.common.threadgroup     self.manager.init_host()
2014-05-28 16:21:12.588 2724 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1026, in init_host
2014-05-28 16:21:12.588 2724 TRACE nova.openstack.common.threadgroup     self._init_instance(context, instance)
2014-05-28 16:21:12.588 2724 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 884, in _init_instance
2014-05-28 16:21:12.588 2724 TRACE nova.openstack.common.threadgroup     self.driver.plug_vifs(instance, net_info)
2014-05-28 16:21:12.588 2724 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/driver.py", line 855, in plug_vifs
2014-05-28 16:21:12.588 2724 TRACE nova.openstack.common.threadgroup     self.vif_driver.plug(instance, vif)
2014-05-28 16:21:12.588 2724 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/vif.py", line 616, in plug
2014-05-28 16:21:12.588 2724 TRACE nova.openstack.common.threadgroup     _("Unexpected vif_type=%s") % vif_type)
2014-05-28 16:21:12.588 2724 TRACE nova.openstack.common.threadgroup NovaException: Unexpected vif_type=binding_failed
2014-05-28 16:21:12.588 2724 TRACE nova.openstack.common.threadgroup 


Then nova-compute exits.

There is no doubt that we should check and recover the instances when
starting a crashed compute node. However, I do not think the
nova-compute service should exit if a call to _init_instance fails.
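
The behaviour requested here could look roughly like the sketch below. It
is not nova's actual code: init_host, init_instance and FakeVifError are
hypothetical stand-ins that only borrow names from the traceback, and the
instance dictionaries are simplified for illustration. The idea is that a
failure while initializing one instance is logged and skipped, so the
remaining instances are still processed and the service can finish
starting.

```python
import logging

LOG = logging.getLogger(__name__)


class FakeVifError(Exception):
    """Hypothetical stand-in for NovaException in this sketch."""


def init_instance(instance):
    # Hypothetical stand-in for _init_instance: plugging VIFs fails
    # when the port binding did not succeed.
    if instance["vif_type"] == "binding_failed":
        raise FakeVifError("Unexpected vif_type=%s" % instance["vif_type"])
    return instance["uuid"]


def init_host(instances):
    """Initialize every instance; log and skip the ones that fail."""
    recovered, failed = [], []
    for instance in instances:
        try:
            recovered.append(init_instance(instance))
        except Exception:
            # Record the failure and continue, so that one broken
            # instance does not abort service startup.
            LOG.exception("Failed to initialize instance %s",
                          instance["uuid"])
            failed.append(instance["uuid"])
    return recovered, failed


# The two instance UUIDs from the log above, with simplified data.
instances = [
    {"uuid": "ac57aab0-1864-4335-aa4a-bbfcc75a9624", "vif_type": "bridge"},
    {"uuid": "8047e688-d189-4d35-a9c8-634f34cdda86",
     "vif_type": "binding_failed"},
]
recovered, failed = init_host(instances)
```

Whether a failed instance should also be put into an error state is a
separate decision that this sketch leaves open.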

** Affects: nova
     Importance: Undecided
     Assignee: zhangjialong (zhangjl)
         Status: New


** Tags: init instance nova-compute start

** Changed in: nova
     Assignee: (unassigned) => zhangjialong (zhangjl)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1324041

Title:
  nova-compute cannot restart if _init_instance failed

Status in OpenStack Compute (Nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1324041/+subscriptions

