
yahoo-eng-team team mailing list archive

[Bug 1543010] Re: Nova clears DB if ESX nova-compute node restarted

 

It's been almost 3 months with no activity on this issue. I'm closing
this out as we don't have enough information to reproduce it. Please
feel free to reopen this bug or file a new one once you are able to
provide us with the additional information we've requested.

** Changed in: nova
       Status: Incomplete => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1543010

Title:
  Nova clears DB if ESX nova-compute node restarted

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  I had a 12-node ESX nova-compute cluster with 100 ESX hypervisors. For some reason, one of the nova-compute nodes went down.
  After a couple of attempts, nova-compute came up fine. However:

  1. Nova deleted all the instances running on that particular node (esx-compute11) from its DB.
  2. All the instances were deleted from the backend as well.

  Filing this bug to track whether there is an issue with the nova
  scheduler on an ESX setup.

  Logs:

  stack@runner:~/nsbu_cqe_openstack/nested$ nova service-list | grep nova-compute | grep esx
  | 6 | nova-compute | esx-compute2 | nova | enabled | up | 2016-02-03T09:45:15.000000 | - |
  | 7 | nova-compute | esx-compute1 | nova | enabled | up | 2016-02-03T09:45:17.000000 | - |
  | 8 | nova-compute | esx-compute4 | nova | enabled | up | 2016-02-03T09:45:18.000000 | - |
  | 9 | nova-compute | esx-compute3 | nova | enabled | up | 2016-02-03T09:45:21.000000 | - |
  | 10 | nova-compute | esx-compute8 | nova | enabled | up | 2016-02-03T09:45:20.000000 | - |
  | 11 | nova-compute | esx-compute7 | nova | enabled | up | 2016-02-03T09:45:19.000000 | - |
  | 12 | nova-compute | esx-compute12 | nova | enabled | up | 2016-02-03T09:45:19.000000 | - |
  | 13 | nova-compute | esx-compute5 | nova | enabled | up | 2016-02-03T09:45:19.000000 | - |
  | 14 | nova-compute | esx-compute9 | nova | enabled | up | 2016-02-03T09:45:17.000000 | - |
  | 15 | nova-compute | esx-compute6 | nova | enabled | up | 2016-02-03T09:45:19.000000 | - |
  | 16 | nova-compute | esx-compute10 | nova | enabled | up | 2016-02-03T09:45:20.000000 | - |
  | 17 | nova-compute | esx-compute11 | nova | enabled | down | 2016-02-03T09:26:53.000000 | - |
  stack@runner:~/nsbu_cqe_openstack/nested$
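
For anyone trying to spot the affected node in output like the above, here is a minimal sketch that pulls the "down" nova-compute rows out of the pipe-delimited table. The function name `down_services` and the column order (Id, Binary, Host, Zone, Status, State, Updated_at, Disabled Reason) are assumptions taken from the rows shown in this report, not from any nova API.

```python
def down_services(table_text, host_prefix="esx"):
    """Return (host, updated_at) for nova-compute rows whose State is 'down'.

    Assumes the column order seen in this report:
    Id, Binary, Host, Zone, Status, State, Updated_at, Disabled Reason.
    """
    down = []
    for line in table_text.splitlines():
        # Drop the outer pipes, then split the row into trimmed cells.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) < 7 or cells[1] != "nova-compute":
            continue  # header, separator, blank line, or another service
        host, state, updated_at = cells[2], cells[5], cells[6]
        if host.startswith(host_prefix) and state == "down":
            down.append((host, updated_at))
    return down


# Two rows copied from the service list in this report.
sample = """
| 16 | nova-compute | esx-compute10 | nova | enabled | up | 2016-02-03T09:45:20.000000 | - |
| 17 | nova-compute | esx-compute11 | nova | enabled | down | 2016-02-03T09:26:53.000000 | - |
"""

print(down_services(sample))
```

With the two sample rows, only esx-compute11 is reported, along with its last heartbeat timestamp.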

  
  stack@controller:~$ sudo netstat -anp | grep 62.24.1.87
  tcp6 0 0 62.24.1.111:5672 62.24.1.87:58180 ESTABLISHED 8687/beam.smp
  tcp6 0 0 62.24.1.111:5672 62.24.1.87:58179 ESTABLISHED 8687/beam.smp
  stack@controller:~$

  
  2016-02-03 01:27:03.217 INFO nova.service [-] Starting compute node (version 13.0.0)
  Traceback (most recent call last):
    File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 457, in fire_timers
      timer()
    File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 58, in __call__
      cb(*args, **kw)
    File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 214, in main
      result = function(*args, **kwargs)
    File "/usr/local/lib/python2.7/dist-packages/oslo_service/service.py", line 671, in run_service
      service.start()
    File "/opt/stack/nova/nova/service.py", line 183, in start
      self.manager.init_host()
    File "/opt/stack/nova/nova/compute/manager.py", line 1313, in init_host
      context, self.host, expected_attrs=['info_cache', 'metadata'])
    File "/usr/local/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 172, in wrapper
      args, kwargs)
    File "/opt/stack/nova/nova/conductor/rpcapi.py", line 241, in object_class_action_versions
      args=args, kwargs=kwargs)
    File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 158, in call
      retry=self.retry)
    File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 90, in _send
      timeout=timeout, retry=retry)
    File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 464, in send
      retry=retry)
    File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 453, in _send
      result = self._waiter.wait(msg_id, timeout)
    File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 336, in wait
      message = self.waiters.get(msg_id, timeout=timeout)
    File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 239, in get
      'to message ID %s' % msg_id)
  MessagingTimeout: Timed out waiting for a reply to message ID 5a19ba4d2a694453b5db95fb2f73f9e8
  2016-02-03 01:28:58.448 INFO oslo_messaging._drivers.amqpdriver [-] No calling threads waiting for msg_id : 5a19ba4d2a694453b5db95fb2f73f9e8

  
  Version:

  M-Release, master branch

  stack@esx-compute3:/opt/stack/nova$ git log -1
  commit 197bd6dd1231f1f57cdd6c0acb1dfbdc3b2b0989
  Merge: 1ec0b56 5f5590f
  Author: Jenkins <jenkins@xxxxxxxxxxxxxxxxxxxx>
  Date:   Sun Feb 7 04:08:54 2016 +0000

      Merge "libvirt: use osinfo when configuring the disk bus"
  stack@esx-compute3:/opt/stack/nova$

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1543010/+subscriptions

