← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1651678] Re: boot server request randomly hanging at n-cpu side, and didn't get to Ironic

 

** Changed in: nova
     Assignee: Matt Riedemann (mriedem) => Jay Pipes (jaypipes)

** Changed in: ironic
       Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1651678

Title:
  boot server request randomly hanging at n-cpu side, and didn't get to
  Ironic

Status in Ironic:
  Invalid
Status in OpenStack Compute (nova):
  In Progress

Bug description:
  Ironic gate jobs are randomly timing out during last few weeks:

  
  An example is: http://logs.openstack.org/46/327046/36/check/gate-tempest-dsvm-ironic-ipa-partition-pxe_ipmitool-tinyipa-ubuntu-xenial/48db3ea/console.html

  2016-12-20 23:30:24.418214 |     Traceback (most recent call last):
  2016-12-20 23:30:24.418231 |       File "tempest/test.py", line 99, in wrapper
  2016-12-20 23:30:24.418248 |         return f(self, *func_args, **func_kwargs)
  2016-12-20 23:30:24.418296 |       File "/opt/stack/new/tempest/.tox/tempest/local/lib/python2.7/site-packages/ironic_tempest_plugin/tests/scenario/test_baremetal_basic_ops.py", line 111, in test_baremetal_server_ops
  2016-12-20 23:30:24.418316 |         self.instance, self.node = self.boot_instance()
  2016-12-20 23:30:24.418361 |       File "/opt/stack/new/tempest/.tox/tempest/local/lib/python2.7/site-packages/ironic_tempest_plugin/tests/scenario/baremetal_manager.py", line 173, in boot_instance
  2016-12-20 23:30:24.418375 |         self.wait_node(instance['id'])
  2016-12-20 23:30:24.418417 |       File "/opt/stack/new/tempest/.tox/tempest/local/lib/python2.7/site-packages/ironic_tempest_plugin/tests/scenario/baremetal_manager.py", line 117, in wait_node
  2016-12-20 23:30:24.418441 |         raise lib_exc.TimeoutException(msg)
  2016-12-20 23:30:24.418464 |     tempest.lib.exceptions.TimeoutException: Request timed out
  2016-12-20 23:30:24.418494 |     Details: Timed out waiting to get Ironic node by instance id 50e23a00-5b92-49b7-8dd0-5b8715ba7e26

  Nova compute seems stuck at "_do_build_and_run_instance
  /opt/stack/new/nova/nova/compute/manager.py:1754"

  2016-12-21 13:24:24.307 21735 DEBUG oslo_messaging._drivers.amqpdriver [-] received message with unique_id: 3b9dab54da604a8cadc6c854588a1a5d __call__ /usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:196
  2016-12-21 13:24:24.312 21735 DEBUG oslo_concurrency.lockutils [req-7b291e0c-c5b3-4a8a-b4db-e7cef3150b03 tempest-BaremetalBasicOps-1775111554 tempest-BaremetalBasicOps-1775111554] Lock "6376a75b-2970-42f5-9f1b-b34db22a23e4" acquired by "nova.compute.manager._locked_do_build_and_run_instance" :: waited 0.000s inner /usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:270
  2016-12-21 13:24:24.313 21735 DEBUG oslo_messaging._drivers.amqpdriver [req-7b291e0c-c5b3-4a8a-b4db-e7cef3150b03 tempest-BaremetalBasicOps-1775111554 tempest-BaremetalBasicOps-1775111554] CALL msg_id: 92cc73436d164feab727c5b7c81ec179 exchange 'nova' topic 'conductor' _send /usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:442
  2016-12-21 13:24:24.326 21735 DEBUG oslo_messaging._drivers.amqpdriver [-] received reply msg_id: 92cc73436d164feab727c5b7c81ec179 __call__ /usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:299
  2016-12-21 13:24:24.327 21735 DEBUG nova.compute.manager [req-7b291e0c-c5b3-4a8a-b4db-e7cef3150b03 tempest-BaremetalBasicOps-1775111554 tempest-BaremetalBasicOps-1775111554] [instance: 6376a75b-2970-42f5-9f1b-b34db22a23e4] Starting instance... _do_build_and_run_instance /opt/stack/new/nova/nova/compute/manager.py:1754
  2016-12-21 13:24:24.330 21735 DEBUG oslo_messaging._drivers.amqpdriver [req-7b291e0c-c5b3-4a8a-b4db-e7cef3150b03 tempest-BaremetalBasicOps-1775111554 tempest-BaremetalBasicOps-1775111554] CALL msg_id: 15898ce761a143c690ea51c6af5d4f23 exchange 'nova' topic 'conductor' _send /usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:442
  2016-12-21 13:24:24.367 21735 DEBUG nova.compute.resource_tracker [req-f3cfc8fa-df45-4da4-adf2-83688458fa16 - -] Compute_service record updated for ubuntu-xenial-osic-cloud1-s3500-6327285:039bbc98-5123-470c-8e09-74e8f35a1391 _update_available_resource /opt/stack/new/nova/nova/compute/resource_tracker.py:601
  2016-12-21 13:24:24.367 21735 DEBUG oslo_concurrency.lockutils [req-f3cfc8fa-df45-4da4-adf2-83688458fa16 - -] Lock "compute_resources" released by "nova.compute.resource_tracker._update_available_resource" :: held 6.935s inner /usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:282

  full log available: http://logs.openstack.org/39/404239/14/check/gate-
  tempest-dsvm-ironic-ipa-wholedisk-pxe_snmp-tinyipa-ubuntu-xenial-
  nv/8f98498/logs/screen-n-cpu.txt.gz#_2016-12-21_13_24_24_307

To manage notifications about this bug go to:
https://bugs.launchpad.net/ironic/+bug/1651678/+subscriptions