yahoo-eng-team team mailing list archive

[Bug 1183633] Re: nova-compute wedged by deleting an in-use baremetal node

 

I talked to Robert Collins and tried to reproduce the bug, but it didn't
happen again. I tried several times and finally ran a test he suggested,
and the bug still didn't occur, so he said that I could mark this bug as
Fix Released.

** Changed in: nova
       Status: Triaged => Fix Released

** Changed in: tripleo
       Status: Triaged => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1183633

Title:
  nova-compute wedged by deleting an in-use baremetal node

Status in OpenStack Compute (Nova):
  Fix Released
Status in tripleo - openstack on openstack:
  Fix Released

Bug description:
  We deleted a node that had an instance sort of on it, because we
  thought it was dead.

  symptoms:
  nova compute stops checking in:
  $ nova service-list 
  +------------------+---------------+----------+---------+-------+----------------------------+
  | Binary           | Host          | Zone     | Status  | State | Updated_at                 |
  +------------------+---------------+----------+---------+-------+----------------------------+
  | nova-cert        | foo.novalocal | internal | enabled | up    | 2013-05-24T01:45:02.000000 |
  | nova-compute     | foo.novalocal | nova     | enabled | down  | 2013-05-23T22:16:26.000000 |
  | nova-conductor   | foo.novalocal | internal | enabled | up    | 2013-05-24T01:45:04.000000 |
  | nova-consoleauth | foo.novalocal | internal | enabled | up    | 2013-05-24T01:45:03.000000 |
  | nova-scheduler   | foo.novalocal | internal | enabled | up    | 2013-05-24T01:45:03.000000 |
  +------------------+---------------+----------+---------+-------+----------------------------+

  and its log is full of repeated startup attempts, each of which dies [because upstart keeps restarting it]:
  2013-05-24 01:48:52,227.227 28649 INFO nova.openstack.common.periodic_task [-] Skipping periodic task _periodic_update_dns because its interval is negative
  2013-05-24 01:48:52,291.291 28649 INFO nova.virt.driver [-] Loading compute driver 'baremetal.driver.BareMetalDriver'
  2013-05-24 01:48:52,346.346 28649 INFO nova.openstack.common.rpc.common [req-08ee3ab5-fc1f-4550-a203-c0fb37d9a9e3 None None] Connected to AMQP server on 127.0.0.1:5672
  2013-05-24 01:48:52,409.409 28649 AUDIT nova.service [-] Starting compute node (version 2013.2)
  2013-05-24 01:48:52,817.817 28649 ERROR nova.compute.manager [req-76be156c-c2d9-4c7c-aa00-3ce45b3b49a8 None None] Instance test-332aac1b-8987-488d-9ea2-19f26a16907d found in the hypervisor, but not in the database
  2013-05-24 01:48:52,817.817 28649 ERROR nova.compute.manager [req-76be156c-c2d9-4c7c-aa00-3ce45b3b49a8 None None] Instance bootstack-vm.notcompute found in the hypervisor, but not in the database
  2013-05-24 01:48:52,817.817 28649 ERROR nova.compute.manager [req-76be156c-c2d9-4c7c-aa00-3ce45b3b49a8 None None] Instance bootstack-vm-4.notcompute found in the hypervisor, but not in the database
  2013-05-24 01:48:52,818.818 28649 ERROR nova.compute.manager [req-76be156c-c2d9-4c7c-aa00-3ce45b3b49a8 None None] Instance bootstack-vm-testing.notcompute found in the hypervisor, but not in the database
  2013-05-24 01:48:52,818.818 28649 ERROR nova.compute.manager [req-76be156c-c2d9-4c7c-aa00-3ce45b3b49a8 None None] Instance compute-test.novacompute-0 found in the hypervisor, but not in the database
  2013-05-24 01:48:52,818.818 28649 ERROR nova.compute.manager [req-76be156c-c2d9-4c7c-aa00-3ce45b3b49a8 None None] Instance test-8b80ff43-ba81-44c6-a22a-fffd6034579a found in the hypervisor, but not in the database
  2013-05-24 01:48:52,818.818 28649 ERROR nova.compute.manager [req-76be156c-c2d9-4c7c-aa00-3ce45b3b49a8 None None] Instance test-cd715548-afd7-4342-8c74-b4d5e5984dd6 found in the hypervisor, but not in the database
  2013-05-24 01:48:52,818.818 28649 ERROR nova.compute.manager [req-76be156c-c2d9-4c7c-aa00-3ce45b3b49a8 None None] Instance test-bb35ffdf-9fae-4e23-8e46-ec76b89c1ce4 found in the hypervisor, but not in the database
  2013-05-24 01:48:52,818.818 28649 ERROR nova.compute.manager [req-76be156c-c2d9-4c7c-aa00-3ce45b3b49a8 None None] Instance test-d01059f8-97ab-4f0a-968b-7411b2ab717c found in the hypervisor, but not in the database
  2013-05-24 01:48:52,818.818 28649 ERROR nova.compute.manager [req-76be156c-c2d9-4c7c-aa00-3ce45b3b49a8 None None] Instance test-f7862b82-268d-4971-b961-a8fe51488b21 found in the hypervisor, but not in the database
  2013-05-24 01:48:52,819.819 28649 ERROR nova.compute.manager [req-76be156c-c2d9-4c7c-aa00-3ce45b3b49a8 None None] Instance test-3f0cdb8f-70ae-43f7-bb98-83c48f5da317 found in the hypervisor, but not in the database
  2013-05-24 01:48:52,819.819 28649 ERROR nova.compute.manager [req-76be156c-c2d9-4c7c-aa00-3ce45b3b49a8 None None] Instance test-d3d7d58f-408c-47ff-993a-4b8327f27541 found in the hypervisor, but not in the database
  2013-05-24 01:48:52,819.819 28649 ERROR nova.compute.manager [req-76be156c-c2d9-4c7c-aa00-3ce45b3b49a8 None None] Instance test-30405362-c307-428a-94c5-dbe6284b8f28 found in the hypervisor, but not in the database
  2013-05-24 01:48:52,819.819 28649 ERROR nova.compute.manager [req-76be156c-c2d9-4c7c-aa00-3ce45b3b49a8 None None] Instance test-54fb06f0-325c-4d98-9a54-2ab4d3ab9794 found in the hypervisor, but not in the database
  2013-05-24 01:48:52,820.820 28649 ERROR nova.compute.manager [req-76be156c-c2d9-4c7c-aa00-3ce45b3b49a8 None None] Instance test-091264f9-830b-4279-92e3-20ff56375973 found in the hypervisor, but not in the database
  Traceback (most recent call last):
    File "/usr/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 336, in fire_timers
      timer()
    File "/usr/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 56, in __call__
      cb(*args, **kw)
    File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 192, in main
      result = function(*args, **kwargs)
    File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/service.py", line 148, in run_server
      server.start()
    File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/service.py", line 430, in start
      self.manager.init_host()
    File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/compute/manager.py", line 631, in init_host
      self._init_instance(context, instance)
    File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/compute/manager.py", line 520, in _init_instance
      self.driver.plug_vifs(instance, legacy_net_info)
    File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/virt/baremetal/driver.py", line 460, in plug_vifs
      self._plug_vifs(instance, network_info)
    File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/virt/baremetal/driver.py", line 465, in _plug_vifs
      node = _get_baremetal_node_by_instance_uuid(instance['uuid'])
    File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/virt/baremetal/driver.py", line 88, in _get_baremetal_node_by_instance_uuid
      node = db.bm_node_get_by_instance_uuid(ctx, instance_uuid)
    File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/virt/baremetal/db/api.py", line 101, in bm_node_get_by_instance_uuid
      instance_uuid)
    File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 97, in wrapper
      return f(*args, **kwargs)
    File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/virt/baremetal/db/sqlalchemy/api.py", line 151, in bm_node_get_by_instance_uuid
      raise exception.InstanceNotFound(instance_id=instance_uuid)
  InstanceNotFound: Instance 9dc0aba0-27a5-47cb-a85a-574763e8243e could not be found.
  2013-05-24 01:48:54,716.716 28649 CRITICAL nova [-] Instance 9dc0aba0-27a5-47cb-a85a-574763e8243e could not be found.
  2013-05-24 01:48:54,716.716 28649 TRACE nova Traceback (most recent call last):
  2013-05-24 01:48:54,716.716 28649 TRACE nova   File "/opt/stack/venvs/nova/bin/nova-compute", line 8, in <module>
  2013-05-24 01:48:54,716.716 28649 TRACE nova     load_entry_point('nova==2013.2.a2.gaf90386', 'console_scripts', 'nova-compute')()
  2013-05-24 01:48:54,716.716 28649 TRACE nova   File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/cmd/compute.py", line 65, in main
  2013-05-24 01:48:54,716.716 28649 TRACE nova     service.wait()
  2013-05-24 01:48:54,716.716 28649 TRACE nova   File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/service.py", line 690, in wait
  2013-05-24 01:48:54,716.716 28649 TRACE nova     _launcher.wait()
  2013-05-24 01:48:54,716.716 28649 TRACE nova   File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/service.py", line 210, in wait
  2013-05-24 01:48:54,716.716 28649 TRACE nova     super(ServiceLauncher, self).wait()
  2013-05-24 01:48:54,716.716 28649 TRACE nova   File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/service.py", line 180, in wait
  2013-05-24 01:48:54,716.716 28649 TRACE nova     service.wait()
  2013-05-24 01:48:54,716.716 28649 TRACE nova   File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 166, in wait
  2013-05-24 01:48:54,716.716 28649 TRACE nova     return self._exit_event.wait()
  2013-05-24 01:48:54,716.716 28649 TRACE nova   File "/usr/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait
  2013-05-24 01:48:54,716.716 28649 TRACE nova     return hubs.get_hub().switch()
  2013-05-24 01:48:54,716.716 28649 TRACE nova   File "/usr/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 177, in switch
  2013-05-24 01:48:54,716.716 28649 TRACE nova     return self.greenlet.switch()
  2013-05-24 01:48:54,716.716 28649 TRACE nova   File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 192, in main
  2013-05-24 01:48:54,716.716 28649 TRACE nova     result = function(*args, **kwargs)
  2013-05-24 01:48:54,716.716 28649 TRACE nova   File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/service.py", line 148, in run_server
  2013-05-24 01:48:54,716.716 28649 TRACE nova     server.start()
  2013-05-24 01:48:54,716.716 28649 TRACE nova   File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/service.py", line 430, in start
  2013-05-24 01:48:54,716.716 28649 TRACE nova     self.manager.init_host()
  2013-05-24 01:48:54,716.716 28649 TRACE nova   File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/compute/manager.py", line 631, in init_host
  2013-05-24 01:48:54,716.716 28649 TRACE nova     self._init_instance(context, instance)
  2013-05-24 01:48:54,716.716 28649 TRACE nova   File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/compute/manager.py", line 520, in _init_instance
  2013-05-24 01:48:54,716.716 28649 TRACE nova     self.driver.plug_vifs(instance, legacy_net_info)
  2013-05-24 01:48:54,716.716 28649 TRACE nova   File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/virt/baremetal/driver.py", line 460, in plug_vifs
  2013-05-24 01:48:54,716.716 28649 TRACE nova     self._plug_vifs(instance, network_info)
  2013-05-24 01:48:54,716.716 28649 TRACE nova   File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/virt/baremetal/driver.py", line 465, in _plug_vifs
  2013-05-24 01:48:54,716.716 28649 TRACE nova     node = _get_baremetal_node_by_instance_uuid(instance['uuid'])
  2013-05-24 01:48:54,716.716 28649 TRACE nova   File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/virt/baremetal/driver.py", line 88, in _get_baremetal_node_by_instance_uuid
  2013-05-24 01:48:54,716.716 28649 TRACE nova     node = db.bm_node_get_by_instance_uuid(ctx, instance_uuid)
  2013-05-24 01:48:54,716.716 28649 TRACE nova   File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/virt/baremetal/db/api.py", line 101, in bm_node_get_by_instance_uuid
  2013-05-24 01:48:54,716.716 28649 TRACE nova     instance_uuid)
  2013-05-24 01:48:54,716.716 28649 TRACE nova   File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 97, in wrapper
  2013-05-24 01:48:54,716.716 28649 TRACE nova     return f(*args, **kwargs)
  2013-05-24 01:48:54,716.716 28649 TRACE nova   File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/virt/baremetal/db/sqlalchemy/api.py", line 151, in bm_node_get_by_instance_uuid
  2013-05-24 01:48:54,716.716 28649 TRACE nova     raise exception.InstanceNotFound(instance_id=instance_uuid)
  2013-05-24 01:48:54,716.716 28649 TRACE nova InstanceNotFound: Instance 9dc0aba0-27a5-47cb-a85a-574763e8243e could not be found.
  2013-05-24 01:48:54,716.716 28649 TRACE nova 

  Note that the last line is truncated in the logs themselves; nothing
  was lost in the copy-paste.
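
  The traceback above shows InstanceNotFound propagating out of
  init_host, so a single instance whose baremetal node was deleted
  aborts the whole service start. The following is only a minimal
  sketch of that failure mode and of one way a startup loop could
  tolerate it. It is not the actual Nova code and not the change that
  resolved this bug; the InstanceNotFound class, the init_host
  signature, and fake_init_instance are simplified stand-ins invented
  for illustration.

```python
import logging

LOG = logging.getLogger(__name__)


class InstanceNotFound(Exception):
    """Hypothetical stand-in for nova.exception.InstanceNotFound."""


def init_host(instances, init_instance):
    """Initialise each instance, skipping any whose backing node is gone.

    Simplified illustration only. Returns the UUIDs that were skipped.
    """
    failed = []
    for instance in instances:
        try:
            init_instance(instance)
        except InstanceNotFound:
            # One orphaned instance should not abort startup for the rest.
            LOG.warning("Skipping instance %s: no baremetal node found",
                        instance["uuid"])
            failed.append(instance["uuid"])
    return failed


def fake_init_instance(instance):
    # Hypothetical driver call: fails for the instance whose node was deleted.
    if instance["uuid"] == "9dc0aba0-27a5-47cb-a85a-574763e8243e":
        raise InstanceNotFound(instance["uuid"])
```

  Without the try/except, the first InstanceNotFound would escape the
  loop and end the process, which matches the restart loop in the log.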

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1183633/+subscriptions