[Bug 1651678] Re: boot server request randomly hanging at n-cpu side, and didn't get to Ironic

 

Reviewed:  https://review.openstack.org/414214
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=3c217acb9c55d647ca362320d697e80d7cfa5ceb
Submitter: Jenkins
Branch:    master

commit 3c217acb9c55d647ca362320d697e80d7cfa5ceb
Author: Jay Pipes <jaypipes@xxxxxxxxx>
Date:   Thu Dec 22 11:09:15 2016 -0500

    placement: Do not save 0-valued inventory
    
    Ironic nodes that are not available or operable have 0 values for vcpus,
    memory_mb, and local_gb in the returned dict from the Ironic virt driver's
    get_available_resource() call. Don't try to save these 0 values in the
    placement API inventory records, since the placement REST API will return an
    error. Instead, attempt to delete any inventory records for that Ironic node
    resource provider by PUT'ing an empty set of inventory records to the placement
    API.
    
    Closes-bug: #1651678
    
    Change-Id: I10b22606f704abcb970939fb2cd77f026d4d6322
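
For readers skimming the fix, here is a minimal sketch (not the actual nova change) of the logic the commit message describes: if the Ironic virt driver reports all-zero vcpus/memory_mb/local_gb for a node, don't write those zeros as inventory; instead PUT an empty inventory set so placement drops any stale records. The function name, the "client" object (assumed to be a requests-style HTTP session already authenticated against the placement endpoint), and the exact payload keys are illustrative assumptions based on the placement REST API of that era, not the code actually merged in nova.

    # Sketch only: illustrates the commit's intent, not nova's real code.
    def report_ironic_inventory(client, rp_uuid, generation, resources):
        """Report an Ironic node's resources to placement, treating an
        all-zero node as "delete my inventory" instead of erroring out."""
        if all(resources.get(k, 0) == 0
               for k in ('vcpus', 'memory_mb', 'local_gb')):
            # Node is unavailable/inoperable: PUT an empty inventory map so
            # placement removes old records rather than rejecting 0 totals.
            payload = {
                'resource_provider_generation': generation,
                'inventories': {},
            }
        else:
            payload = {
                'resource_provider_generation': generation,
                'inventories': {
                    'VCPU': {'total': resources['vcpus']},
                    'MEMORY_MB': {'total': resources['memory_mb']},
                    'DISK_GB': {'total': resources['local_gb']},
                },
            }
        return client.put('/resource_providers/%s/inventories' % rp_uuid,
                          json=payload)

An empty "inventories" map is how the placement API expresses "this provider currently has nothing to offer", which avoids the error response that zero-valued totals would otherwise trigger.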


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1651678

Title:
  boot server request randomly hanging at n-cpu side, and didn't get to
  Ironic

Status in Ironic:
  Invalid
Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Ironic gate jobs have been randomly timing out over the last few
  weeks.

  An example: http://logs.openstack.org/46/327046/36/check/gate-tempest-dsvm-ironic-ipa-partition-pxe_ipmitool-tinyipa-ubuntu-xenial/48db3ea/console.html

  2016-12-20 23:30:24.418214 |     Traceback (most recent call last):
  2016-12-20 23:30:24.418231 |       File "tempest/test.py", line 99, in wrapper
  2016-12-20 23:30:24.418248 |         return f(self, *func_args, **func_kwargs)
  2016-12-20 23:30:24.418296 |       File "/opt/stack/new/tempest/.tox/tempest/local/lib/python2.7/site-packages/ironic_tempest_plugin/tests/scenario/test_baremetal_basic_ops.py", line 111, in test_baremetal_server_ops
  2016-12-20 23:30:24.418316 |         self.instance, self.node = self.boot_instance()
  2016-12-20 23:30:24.418361 |       File "/opt/stack/new/tempest/.tox/tempest/local/lib/python2.7/site-packages/ironic_tempest_plugin/tests/scenario/baremetal_manager.py", line 173, in boot_instance
  2016-12-20 23:30:24.418375 |         self.wait_node(instance['id'])
  2016-12-20 23:30:24.418417 |       File "/opt/stack/new/tempest/.tox/tempest/local/lib/python2.7/site-packages/ironic_tempest_plugin/tests/scenario/baremetal_manager.py", line 117, in wait_node
  2016-12-20 23:30:24.418441 |         raise lib_exc.TimeoutException(msg)
  2016-12-20 23:30:24.418464 |     tempest.lib.exceptions.TimeoutException: Request timed out
  2016-12-20 23:30:24.418494 |     Details: Timed out waiting to get Ironic node by instance id 50e23a00-5b92-49b7-8dd0-5b8715ba7e26

  Nova compute appears to be stuck at "_do_build_and_run_instance
  /opt/stack/new/nova/nova/compute/manager.py:1754":

  2016-12-21 13:24:24.307 21735 DEBUG oslo_messaging._drivers.amqpdriver [-] received message with unique_id: 3b9dab54da604a8cadc6c854588a1a5d __call__ /usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:196
  2016-12-21 13:24:24.312 21735 DEBUG oslo_concurrency.lockutils [req-7b291e0c-c5b3-4a8a-b4db-e7cef3150b03 tempest-BaremetalBasicOps-1775111554 tempest-BaremetalBasicOps-1775111554] Lock "6376a75b-2970-42f5-9f1b-b34db22a23e4" acquired by "nova.compute.manager._locked_do_build_and_run_instance" :: waited 0.000s inner /usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:270
  2016-12-21 13:24:24.313 21735 DEBUG oslo_messaging._drivers.amqpdriver [req-7b291e0c-c5b3-4a8a-b4db-e7cef3150b03 tempest-BaremetalBasicOps-1775111554 tempest-BaremetalBasicOps-1775111554] CALL msg_id: 92cc73436d164feab727c5b7c81ec179 exchange 'nova' topic 'conductor' _send /usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:442
  2016-12-21 13:24:24.326 21735 DEBUG oslo_messaging._drivers.amqpdriver [-] received reply msg_id: 92cc73436d164feab727c5b7c81ec179 __call__ /usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:299
  2016-12-21 13:24:24.327 21735 DEBUG nova.compute.manager [req-7b291e0c-c5b3-4a8a-b4db-e7cef3150b03 tempest-BaremetalBasicOps-1775111554 tempest-BaremetalBasicOps-1775111554] [instance: 6376a75b-2970-42f5-9f1b-b34db22a23e4] Starting instance... _do_build_and_run_instance /opt/stack/new/nova/nova/compute/manager.py:1754
  2016-12-21 13:24:24.330 21735 DEBUG oslo_messaging._drivers.amqpdriver [req-7b291e0c-c5b3-4a8a-b4db-e7cef3150b03 tempest-BaremetalBasicOps-1775111554 tempest-BaremetalBasicOps-1775111554] CALL msg_id: 15898ce761a143c690ea51c6af5d4f23 exchange 'nova' topic 'conductor' _send /usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:442
  2016-12-21 13:24:24.367 21735 DEBUG nova.compute.resource_tracker [req-f3cfc8fa-df45-4da4-adf2-83688458fa16 - -] Compute_service record updated for ubuntu-xenial-osic-cloud1-s3500-6327285:039bbc98-5123-470c-8e09-74e8f35a1391 _update_available_resource /opt/stack/new/nova/nova/compute/resource_tracker.py:601
  2016-12-21 13:24:24.367 21735 DEBUG oslo_concurrency.lockutils [req-f3cfc8fa-df45-4da4-adf2-83688458fa16 - -] Lock "compute_resources" released by "nova.compute.resource_tracker._update_available_resource" :: held 6.935s inner /usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:282

  Full log available:
  http://logs.openstack.org/39/404239/14/check/gate-tempest-dsvm-ironic-ipa-wholedisk-pxe_snmp-tinyipa-ubuntu-xenial-nv/8f98498/logs/screen-n-cpu.txt.gz#_2016-12-21_13_24_24_307

To manage notifications about this bug go to:
https://bugs.launchpad.net/ironic/+bug/1651678/+subscriptions