← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1610679] [NEW] race conditions between compute and schedule disk report

 

Public bug reported:

devstack recently built and *not* with in-tree virt layer but I think
it's not related to it


got following error in conductor log 

  File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 133, in _process_incoming
    res = self.dispatcher.dispatch(message)

  File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 150, in dispatch
    return self._do_dispatch(endpoint, method, ctxt, args)

  File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 121, in _do_dispatch
    result = func(ctxt, **new_args)

  File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 199, in inner
    return func(*args, **kwargs)

  File "/opt/stack/new/nova/nova/scheduler/manager.py", line 104, in select_destinations
    dests = self.driver.select_destinations(ctxt, spec_obj)

  File "/opt/stack/new/nova/nova/scheduler/filter_scheduler.py", line 53, in select_destinations
    selected_hosts = self._schedule(context, spec_obj)

  File "/opt/stack/new/nova/nova/scheduler/filter_scheduler.py", line 104, in _schedule
    hosts = self._get_all_host_states(elevated)

  File "/opt/stack/new/nova/nova/scheduler/filter_scheduler.py", line 145, in _get_all_host_states
    return self.host_manager.get_all_host_states(context)

  File "/opt/stack/new/nova/nova/scheduler/host_manager.py", line 597, in get_all_host_states
    self._get_instance_info(context, compute))

  File "/opt/stack/new/nova/nova/scheduler/host_manager.py", line 180, in update
    return _locked_update(self, compute, service, aggregates, inst_dict)

  File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 271, in inner
    return f(*args, **kwargs)

  File "/opt/stack/new/nova/nova/scheduler/host_manager.py", line 169, in _locked_update
    self._update_from_compute_node(compute)

  File "/opt/stack/new/nova/nova/scheduler/host_manager.py", line 201, in _update_from_compute_node
    free_disk_mb = free_gb * 1024

TypeError: unsupported operand type(s) for *: 'NoneType' and 'int'


scheduler log shows following 

2016-07-27 13:45:00.934 21123 DEBUG oslo_concurrency.lockutils [req-fc02af23-3279-4c93-8732-9b4f9f3a0b8d tempest-ServersTestJSON-2014112056 tempest-ServersTestJSON-2014112056] Lock "(u'opnssi1', u'OPNSSI1')" acquired by "nova.scheduler.host_manager._locked_update" :: waited 0.000s inner /usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:270
2016-07-27 13:45:00.935 21123 DEBUG nova.scheduler.host_manager [req-fc02af23-3279-4c93-8732-9b4f9f3a0b8d tempest-ServersTestJSON-2014112056 tempest-ServersTestJSON-2014112056] Update host state from compute node: ComputeNode(cpu_allocation_ratio=16.0,cpu_info='{"cec_model": "2827", "architecture": "s390x"}',created_at=2016-07-27T13:45:00Z,current_workload=None,deleted=False,deleted_at=None,disk_allocation_ratio=1.0,disk_available_least=432,free_disk_gb=None,free_ram_mb=None,host='opnssi1',host_ip=10.32.201.222,hypervisor_hostname='OPNSSI1',hypervisor_type='zvm',hypervisor_version=630,id=2,local_gb=556,local_gb_used=124,memory_mb=24576,memory_mb_used=0,metrics=None,numa_topology=None,pci_device_pools=None,ram_allocation_ratio=10.0,running_vms=None,service_id=None,stats={},supported_hv_specs=[HVSpec],updated_at=None,uuid=366d9b37-8570-4188-8aab-bc9819cb2312,vcpus=8,vcpus_used=8) _locked_update /opt/stack/new/nova/nova/scheduler/host_manager.py:168
2016-07-27 13:45:00.936 21123 WARNING nova.scheduler.host_manager [req-fc02af23-3279-4c93-8732-9b4f9f3a0b8d tempest-ServersTestJSON-2014112056 tempest-ServersTestJSON-2014112056] Host OPNSSI1 has more disk space than database expected (432 GB > None GB)


and compute logs shows the compute node was created first time here

2016-07-27 13:45:00.396 21125 WARNING nova.compute.resource_tracker
[req-1ec17be4-cc14-494b-8f88-8bce1999fba1 - -] No compute node record
for opnssi1:OPNSSI1

2016-07-27 13:45:01.009 21125 INFO nova.compute.resource_tracker [req-
1ec17be4-cc14-494b-8f88-8bce1999fba1 - -] Compute_service record updated
for opnssi1:OPNSSI1

** Affects: nova
     Importance: Undecided
     Assignee: jichenjc (jichenjc)
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1610679

Title:
  race conditions between compute and schedule disk report

Status in OpenStack Compute (nova):
  New

Bug description:
  devstack recently built and *not* with in-tree virt layer but I think
  it's not related to it

  
  got following error in conductor log 

    File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 133, in _process_incoming
      res = self.dispatcher.dispatch(message)

    File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 150, in dispatch
      return self._do_dispatch(endpoint, method, ctxt, args)

    File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 121, in _do_dispatch
      result = func(ctxt, **new_args)

    File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 199, in inner
      return func(*args, **kwargs)

    File "/opt/stack/new/nova/nova/scheduler/manager.py", line 104, in select_destinations
      dests = self.driver.select_destinations(ctxt, spec_obj)

    File "/opt/stack/new/nova/nova/scheduler/filter_scheduler.py", line 53, in select_destinations
      selected_hosts = self._schedule(context, spec_obj)

    File "/opt/stack/new/nova/nova/scheduler/filter_scheduler.py", line 104, in _schedule
      hosts = self._get_all_host_states(elevated)

    File "/opt/stack/new/nova/nova/scheduler/filter_scheduler.py", line 145, in _get_all_host_states
      return self.host_manager.get_all_host_states(context)

    File "/opt/stack/new/nova/nova/scheduler/host_manager.py", line 597, in get_all_host_states
      self._get_instance_info(context, compute))

    File "/opt/stack/new/nova/nova/scheduler/host_manager.py", line 180, in update
      return _locked_update(self, compute, service, aggregates, inst_dict)

    File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 271, in inner
      return f(*args, **kwargs)

    File "/opt/stack/new/nova/nova/scheduler/host_manager.py", line 169, in _locked_update
      self._update_from_compute_node(compute)

    File "/opt/stack/new/nova/nova/scheduler/host_manager.py", line 201, in _update_from_compute_node
      free_disk_mb = free_gb * 1024

  TypeError: unsupported operand type(s) for *: 'NoneType' and 'int'

  
  scheduler log shows following 

  2016-07-27 13:45:00.934 21123 DEBUG oslo_concurrency.lockutils [req-fc02af23-3279-4c93-8732-9b4f9f3a0b8d tempest-ServersTestJSON-2014112056 tempest-ServersTestJSON-2014112056] Lock "(u'opnssi1', u'OPNSSI1')" acquired by "nova.scheduler.host_manager._locked_update" :: waited 0.000s inner /usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:270
  2016-07-27 13:45:00.935 21123 DEBUG nova.scheduler.host_manager [req-fc02af23-3279-4c93-8732-9b4f9f3a0b8d tempest-ServersTestJSON-2014112056 tempest-ServersTestJSON-2014112056] Update host state from compute node: ComputeNode(cpu_allocation_ratio=16.0,cpu_info='{"cec_model": "2827", "architecture": "s390x"}',created_at=2016-07-27T13:45:00Z,current_workload=None,deleted=False,deleted_at=None,disk_allocation_ratio=1.0,disk_available_least=432,free_disk_gb=None,free_ram_mb=None,host='opnssi1',host_ip=10.32.201.222,hypervisor_hostname='OPNSSI1',hypervisor_type='zvm',hypervisor_version=630,id=2,local_gb=556,local_gb_used=124,memory_mb=24576,memory_mb_used=0,metrics=None,numa_topology=None,pci_device_pools=None,ram_allocation_ratio=10.0,running_vms=None,service_id=None,stats={},supported_hv_specs=[HVSpec],updated_at=None,uuid=366d9b37-8570-4188-8aab-bc9819cb2312,vcpus=8,vcpus_used=8) _locked_update /opt/stack/new/nova/nova/scheduler/host_manager.py:168
  2016-07-27 13:45:00.936 21123 WARNING nova.scheduler.host_manager [req-fc02af23-3279-4c93-8732-9b4f9f3a0b8d tempest-ServersTestJSON-2014112056 tempest-ServersTestJSON-2014112056] Host OPNSSI1 has more disk space than database expected (432 GB > None GB)


  
  and compute logs shows the compute node was created first time here 

  2016-07-27 13:45:00.396 21125 WARNING nova.compute.resource_tracker
  [req-1ec17be4-cc14-494b-8f88-8bce1999fba1 - -] No compute node record
  for opnssi1:OPNSSI1

  2016-07-27 13:45:01.009 21125 INFO nova.compute.resource_tracker [req-
  1ec17be4-cc14-494b-8f88-8bce1999fba1 - -] Compute_service record
  updated for opnssi1:OPNSSI1

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1610679/+subscriptions


Follow ups