yahoo-eng-team team mailing list archive

[Bug 1651704] Re: Overcloud deployment fails when node introspection is enabled

 

I've reproduced this locally running RDO from master. I have an ironic
installation (a tripleo undercloud), and I've registered a node and
introspected it.

This is how the node looks from ironic's perspective:

$ openstack baremetal show e14be55d-8dd9-4b08-aa5d-efab2b5a5c01
This command is deprecated. Instead, use 'openstack baremetal node show'.
+------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field                  | Value                                                                                                                                                                                                             |
+------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| chassis_uuid           | None                                                                                                                                                                                                              |
| clean_step             | {}                                                                                                                                                                                                                |
| console_enabled        | False                                                                                                                                                                                                             |
| created_at             | 2016-12-23T17:08:16+00:00                                                                                                                                                                                         |
| driver                 | pxe_ssh                                                                                                                                                                                                           |
| driver_info            | {u'ssh_username': u'stack', u'deploy_kernel': u'd77546aa-5936-417a-9dec-43f4f98a057d', u'deploy_ramdisk': u'bc508426-aee4-4d7d-97e5-dcbd6761a896', u'ssh_key_contents': u'******', u'ssh_virt_type': u'virsh',    |
|                        | u'ssh_address': u'192.168.23.1'}                                                                                                                                                                                  |
| driver_internal_info   | {}                                                                                                                                                                                                                |
| extra                  | {u'hardware_swift_object': u'extra_hardware-e14be55d-8dd9-4b08-aa5d-efab2b5a5c01'}                                                                                                                                |
| inspection_finished_at | None                                                                                                                                                                                                              |
| inspection_started_at  | None                                                                                                                                                                                                              |
| instance_info          | {}                                                                                                                                                                                                                |
| instance_uuid          | None                                                                                                                                                                                                              |
| last_error             | None                                                                                                                                                                                                              |
| maintenance            | False                                                                                                                                                                                                             |
| maintenance_reason     | None                                                                                                                                                                                                              |
| name                   | control-0                                                                                                                                                                                                         |
| power_state            | power off                                                                                                                                                                                                         |
| properties             | {u'memory_mb': u'4099', u'cpu_arch': u'x86_64', u'local_gb': u'49', u'cpus': u'2', u'capabilities': u'profile:control,cpu_hugepages:true,boot_option:local'}                                                      |
| provision_state        | manageable                                                                                                                                                                                                        |
| provision_updated_at   | 2016-12-23T17:08:50+00:00                                                                                                                                                                                         |
| reservation            | None                                                                                                                                                                                                              |
| target_power_state     | None                                                                                                                                                                                                              |
| target_provision_state | None                                                                                                                                                                                                              |
| updated_at             | 2016-12-23T17:15:09+00:00                                                                                                                                                                                         |
| uuid                   | e14be55d-8dd9-4b08-aa5d-efab2b5a5c01                                                                                                                                                                              |
+------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

But from nova:

$ openstack hypervisor show e14be55d-8dd9-4b08-aa5d-efab2b5a5c01
+----------------------+--------------------------------------+
| Field                | Value                                |
+----------------------+--------------------------------------+
| aggregates           | []                                   |
| cpu_info             |                                      |
| current_workload     | 0                                    |
| disk_available_least | 0                                    |
| free_disk_gb         | 0                                    |
| free_ram_mb          | 0                                    |
| host_ip              | 192.168.23.43                        |
| hypervisor_hostname  | e14be55d-8dd9-4b08-aa5d-efab2b5a5c01 |
| hypervisor_type      | ironic                               |
| hypervisor_version   | 1                                    |
| id                   | 3                                    |
| local_gb             | 0                                    |
| local_gb_used        | 0                                    |
| memory_mb            | 0                                    |
| memory_mb_used       | 0                                    |
| running_vms          | 0                                    |
| service_host         | undercloud                           |
| service_id           | 4                                    |
| state                | up                                   |
| status               | enabled                              |
| vcpus                | 0                                    |
| vcpus_used           | 0                                    |
+----------------------+--------------------------------------+
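
The mismatch between the two views can be checked mechanically. What follows is a minimal sketch, not part of nova or ironic: the helper `find_mismatches` and the `FIELD_MAP` mapping are assumptions made for illustration, and the two dictionaries only copy values from the command outputs above.

```python
# Hypothetical helper, not an OpenStack API: compares the resources that
# ironic records in a node's `properties` with what nova reports for the
# matching hypervisor.

# Assumed mapping from ironic property name to nova hypervisor field.
FIELD_MAP = {"memory_mb": "memory_mb", "local_gb": "local_gb", "cpus": "vcpus"}


def find_mismatches(ironic_properties, nova_hypervisor):
    """Return {nova_field: (ironic_value, nova_value)} for differing fields."""
    mismatches = {}
    for ironic_key, nova_key in FIELD_MAP.items():
        ironic_value = int(ironic_properties[ironic_key])
        nova_value = int(nova_hypervisor[nova_key])
        if ironic_value != nova_value:
            mismatches[nova_key] = (ironic_value, nova_value)
    return mismatches


# Values copied from the two command outputs above.
ironic_properties = {"memory_mb": "4099", "local_gb": "49", "cpus": "2"}
nova_hypervisor = {"memory_mb": 0, "local_gb": 0, "vcpus": 0}

print(find_mismatches(ironic_properties, nova_hypervisor))
```

With the values above, all three fields differ: ironic knows the node's memory, disk and CPU count, while nova reports zero for each.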


I've waited for it to update, but the data never changes. I see the following messages in nova-compute.log:

$ sudo grep "Final.*e14" /var/log/nova/nova-compute.log
2016-12-23 17:09:49.015 22260 INFO nova.compute.resource_tracker [req-25559752-74e3-4811-9bc2-052267bb2c55 - - - - -] Final resource view: name=e14be55d-8dd9-4b08-aa5d-efab2b5a5c01 phys_ram=0MB used_ram=0MB phys_disk=0GB used_disk=0GB total_vcpus=0 used_vcpus=0 pci_stats=[]
2016-12-23 17:10:49.590 22260 INFO nova.compute.resource_tracker [req-25559752-74e3-4811-9bc2-052267bb2c55 - - - - -] Final resource view: name=e14be55d-8dd9-4b08-aa5d-efab2b5a5c01 phys_ram=0MB used_ram=0MB phys_disk=0GB used_disk=0GB total_vcpus=0 used_vcpus=0 pci_stats=[]
2016-12-23 17:11:51.628 22260 INFO nova.compute.resource_tracker [req-25559752-74e3-4811-9bc2-052267bb2c55 - - - - -] Final resource view: name=e14be55d-8dd9-4b08-aa5d-efab2b5a5c01 phys_ram=0MB used_ram=0MB phys_disk=0GB used_disk=0GB total_vcpus=0 used_vcpus=0 pci_stats=[]
2016-12-23 17:12:52.586 22260 INFO nova.compute.resource_tracker [req-25559752-74e3-4811-9bc2-052267bb2c55 - - - - -] Final resource view: name=e14be55d-8dd9-4b08-aa5d-efab2b5a5c01 phys_ram=0MB used_ram=0MB phys_disk=0GB used_disk=0GB total_vcpus=0 used_vcpus=0 pci_stats=[]
2016-12-23 17:13:54.658 22260 INFO nova.compute.resource_tracker [req-25559752-74e3-4811-9bc2-052267bb2c55 - - - - -] Final resource view: name=e14be55d-8dd9-4b08-aa5d-efab2b5a5c01 phys_ram=0MB used_ram=0MB phys_disk=0GB used_disk=0GB total_vcpus=0 used_vcpus=0 pci_stats=[]
2016-12-23 17:14:55.059 22260 INFO nova.compute.resource_tracker [req-25559752-74e3-4811-9bc2-052267bb2c55 - - - - -] Final resource view: name=e14be55d-8dd9-4b08-aa5d-efab2b5a5c01 phys_ram=0MB used_ram=0MB phys_disk=0GB used_disk=0GB total_vcpus=0 used_vcpus=0 pci_stats=[]
2016-12-23 17:15:55.588 22260 INFO nova.compute.resource_tracker [req-25559752-74e3-4811-9bc2-052267bb2c55 - - - - -] Final resource view: name=e14be55d-8dd9-4b08-aa5d-efab2b5a5c01 phys_ram=0MB used_ram=0MB phys_disk=0GB used_disk=0GB total_vcpus=0 used_vcpus=0 pci_stats=[]
2016-12-23 17:16:55.592 22260 INFO nova.compute.resource_tracker [req-25559752-74e3-4811-9bc2-052267bb2c55 - - - - -] Final resource view: name=e14be55d-8dd9-4b08-aa5d-efab2b5a5c01 phys_ram=0MB used_ram=0MB phys_disk=0GB used_disk=0GB total_vcpus=0 used_vcpus=0 pci_stats=[]
2016-12-23 17:17:56.588 22260 INFO nova.compute.resource_tracker [req-25559752-74e3-4811-9bc2-052267bb2c55 - - - - -] Final resource view: name=e14be55d-8dd9-4b08-aa5d-efab2b5a5c01 phys_ram=0MB used_ram=0MB phys_disk=0GB used_disk=0GB total_vcpus=0 used_vcpus=0 pci_stats=[]
2016-12-23 17:18:58.609 22260 INFO nova.compute.resource_tracker [req-25559752-74e3-4811-9bc2-052267bb2c55 - - - - -] Final resource view: name=e14be55d-8dd9-4b08-aa5d-efab2b5a5c01 phys_ram=0MB used_ram=0MB phys_disk=0GB used_disk=0GB total_vcpus=0 used_vcpus=0 pci_stats=[]
2016-12-23 17:19:59.093 22260 INFO nova.compute.resource_tracker [req-25559752-74e3-4811-9bc2-052267bb2c55 - - - - -] Final resource view: name=e14be55d-8dd9-4b08-aa5d-efab2b5a5c01 phys_ram=0MB used_ram=0MB phys_disk=0GB used_disk=0GB total_vcpus=0 used_vcpus=0 pci_stats=[]
2016-12-23 17:20:59.587 22260 INFO nova.compute.resource_tracker [req-25559752-74e3-4811-9bc2-052267bb2c55 - - - - -] Final resource view: name=e14be55d-8dd9-4b08-aa5d-efab2b5a5c01 phys_ram=0MB used_ram=0MB phys_disk=0GB used_disk=0GB total_vcpus=0 used_vcpus=0 pci_stats=[]
2016-12-23 17:22:00.582 22260 INFO nova.compute.resource_tracker [req-25559752-74e3-4811-9bc2-052267bb2c55 - - - - -] Final resource view: name=e14be55d-8dd9-4b08-aa5d-efab2b5a5c01 phys_ram=0MB used_ram=0MB phys_disk=0GB used_disk=0GB total_vcpus=0 used_vcpus=0 pci_stats=[]
2016-12-23 17:23:02.626 22260 INFO nova.compute.resource_tracker [req-25559752-74e3-4811-9bc2-052267bb2c55 - - - - -] Final resource view: name=e14be55d-8dd9-4b08-aa5d-efab2b5a5c01 phys_ram=0MB used_ram=0MB phys_disk=0GB used_disk=0GB total_vcpus=0 used_vcpus=0 pci_stats=[]
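
To spot affected nodes without reading the log by eye, the "Final resource view" lines can be parsed. This is only a sketch under assumptions: the regular expression is written against the line format shown in the excerpt above, and the function name `zero_resource_nodes` is made up for this example.

```python
import re

# Matches the "Final resource view" lines emitted by the resource tracker,
# in the format shown in the log excerpt above.
FINAL_VIEW = re.compile(
    r"Final resource view: name=(?P<name>\S+) "
    r"phys_ram=(?P<ram>\d+)MB used_ram=\d+MB "
    r"phys_disk=(?P<disk>\d+)GB used_disk=\d+GB "
    r"total_vcpus=(?P<vcpus>\d+)"
)


def zero_resource_nodes(lines):
    """Return the names of nodes whose latest view reports no resources."""
    latest = {}
    for line in lines:
        match = FINAL_VIEW.search(line)
        if match:
            # Later lines overwrite earlier ones, so only the newest counts.
            latest[match["name"]] = tuple(
                int(match[key]) for key in ("ram", "disk", "vcpus")
            )
    return sorted(name for name, view in latest.items() if view == (0, 0, 0))


sample = [
    "2016-12-23 17:23:02.626 22260 INFO nova.compute.resource_tracker "
    "[req-25559752-74e3-4811-9bc2-052267bb2c55 - - - - -] Final resource view: "
    "name=e14be55d-8dd9-4b08-aa5d-efab2b5a5c01 phys_ram=0MB used_ram=0MB "
    "phys_disk=0GB used_disk=0GB total_vcpus=0 used_vcpus=0 pci_stats=[]",
]
print(zero_resource_nodes(sample))
```

In practice the lines would be read from /var/log/nova/nova-compute.log instead of the inline `sample` list.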


I'm adding nova as affected, as the issue may be on the nova side, IIUC.

** Also affects: nova
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1651704

Title:
  Overcloud deployment fails when node introspection is enabled

Status in Ironic:
  New
Status in OpenStack Compute (nova):
  New
Status in tripleo:
  New
Status in ironic-inspector package in Ubuntu:
  New

Bug description:
  Running tripleo using tripleo-quickstart with the minimal profile
  (step_introspect: true) for the master branch, the overcloud deploy
  fails with this error:

      ResourceInError: resources.Controller: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"

  Looking at nova-scheduler.log, the following errors are found:

      https://ci.centos.org/artifacts/rdo/jenkins-tripleo-quickstart-promote-master-delorean-minimal-806/undercloud/var/log/nova/nova-scheduler.log.gz

      2016-12-21 06:45:56.822 17759 DEBUG nova.scheduler.host_manager [req-f889dbc0-1096-4f92-80fc-3c5bdcb1ad29 4f103e0230074c2488b7359bc079d323 f21dbfb3b2c840059ec2a0bba03b7385 - - -] Update host state from compute node: ComputeNode(cpu_allocation_ratio=16.0,cpu_info='',created_at=2016-12-21T06:38:28Z,current_workload=0,deleted=False,deleted_at=None,disk_allocation_ratio=1.0,disk_available_least=0,free_disk_gb=0,free_ram_mb=0,host='undercloud',host_ip=192.168.23.46,hypervisor_hostname='c6f8f4ba-9c7c-4c87-b95a-67a5861b7bec',hypervisor_type='ironic',hypervisor_version=1,id=2,local_gb=0,local_gb_used=0,memory_mb=0,memory_mb_used=0,metrics='[]',numa_topology=None,pci_device_pools=PciDevicePoolList,ram_allocation_ratio=1.0,running_vms=0,service_id=None,stats={boot_option='local',cpu_aes='true',cpu_arch='x86_64',cpu_hugepages='true',cpu_hugepages_1g='true',cpu_vt='true',profile='control'},supported_hv_specs=[HVSpec],updated_at=2016-12-21T06:45:38Z,uuid=ac2742da-39fb-4ca4-9f78-8e04f703c7a6,vcpus=0,vcpus_used=0) _locked_update /usr/lib/python2.7/site-packages/nova/scheduler/host_manager.py:168

      2016-12-21 06:47:48.893 17759 DEBUG nova.scheduler.filters.ram_filter [req-2aece1c8-6d3e-457b-92d7-a3177680c82e 4f103e0230074c2488b7359bc079d323 f21dbfb3b2c840059ec2a0bba03b7385 - - -] (undercloud, c6f8f4ba-9c7c-4c87-b95a-67a5861b7bec) ram: 0MB disk: 0MB io_ops: 0 instances: 0 does not have 8192 MB usable ram before overcommit, it only has 0 MB. host_passes /usr/lib/python2.7/site-packages/nova/scheduler/filters/ram_filter.py:45

      2016-12-21 06:47:48.894 17759 INFO nova.filters [req-2aece1c8-6d3e-457b-92d7-a3177680c82e 4f103e0230074c2488b7359bc079d323 f21dbfb3b2c840059ec2a0bba03b7385 - - -] Filter RamFilter returned 0 hosts
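
  The scheduler messages above follow from a two-step RAM check. The sketch below is a simplified illustration of that kind of check, not nova's actual code: the function name `ram_filter_passes` and its parameters are assumptions, and only the 0 MB and 8192 MB figures come from the log.

```python
# Simplified sketch of a RAM filter check; not nova's actual code.
def ram_filter_passes(total_usable_ram_mb, free_ram_mb, requested_ram_mb,
                      ram_allocation_ratio=1.0):
    """Return True if the host can fit the requested RAM."""
    # First check: total RAM before any overcommit is applied.
    if total_usable_ram_mb < requested_ram_mb:
        return False
    # Second check: RAM still usable once the overcommit ratio is applied.
    limit_mb = total_usable_ram_mb * ram_allocation_ratio
    used_mb = total_usable_ram_mb - free_ram_mb
    return limit_mb - used_mb >= requested_ram_mb


# Values from the log above: the node reports 0 MB, the request needs 8192 MB.
print(ram_filter_passes(0, 0, 8192))
```

  A node that reports 0 MB fails the first check for any non-zero request, which is consistent with the filter returning 0 hosts.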

  My guess is that node introspection is failing to get proper node
  information.

  Full logs can be found in https://ci.centos.org/artifacts/rdo/jenkins-tripleo-quickstart-promote-master-delorean-minimal-806/undercloud/

  We have hit this issue twice in the last runs.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ironic/+bug/1651704/+subscriptions