yahoo-eng-team team mailing list archive

[Bug 1904181] Re: nova-compute fails to start if cell conductor is not running.

 

Reviewed:  https://review.opendev.org/762633
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=433bee58bc8d7d65edb6a0805021e51972e6bed6
Submitter: Zuul
Branch:    master

commit 433bee58bc8d7d65edb6a0805021e51972e6bed6
Author: Balazs Gibizer <balazs.gibizer@xxxxxxxx>
Date:   Fri Nov 13 11:47:51 2020 +0100

    Restore retrying the RPC connection to conductor
    
    Before Ie15ec8299ae52ae8f5334d591ed3944e9585cf71, if the compute was
    started before the conductor, the compute retried the connection
    until the conductor was up. Ie15ec8299ae52ae8f5334d591ed3944e9585cf71
    broke this behavior: the service version check runs before the RPC
    retry mechanism, so the compute simply fails to start, without a
    retry, if no conductor is running.
    
    This patch moves the service version check after the RPC connection
    retry mechanism.
    
    Change-Id: Iad0ba1a02868eebc2f43b1ac843fcc5096cd5c47
    Closes-Bug: #1904181
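
The reordering described in the commit message can be illustrated with a
small, self-contained sketch. This is not the nova code: FakeConductor,
wait_until_ready, check_service_version and both start_compute_* functions
are hypothetical stand-ins, and MessagingTimeout is a local class used in
place of oslo_messaging.exceptions.MessagingTimeout. The retry loop is
bounded and does not sleep, purely to keep the example short.

```python
class MessagingTimeout(Exception):
    """Local stand-in for oslo_messaging.exceptions.MessagingTimeout."""


class FakeConductor:
    """Hypothetical conductor that times out on its first `ready_after` calls."""

    def __init__(self, ready_after):
        self.ready_after = ready_after
        self.calls = 0

    def call(self, method):
        self.calls += 1
        if self.calls <= self.ready_after:
            raise MessagingTimeout("Timed out waiting for a reply")
        return "ok"


def wait_until_ready(conductor, max_attempts=10):
    """Retry a cheap RPC until the conductor answers; return the attempt count."""
    for attempt in range(1, max_attempts + 1):
        try:
            conductor.call("ping")
            return attempt
        except MessagingTimeout:
            continue  # conductor not up yet, try again
    raise MessagingTimeout("conductor never became ready")


def check_service_version(conductor):
    """A single RPC with no retry, standing in for the service version check."""
    return conductor.call("get_minimum_version")


def start_compute_broken(conductor):
    # Regressed ordering: the version check runs first, so the first
    # timeout escapes before the retry loop is ever reached.
    check_service_version(conductor)
    wait_until_ready(conductor)
    return "started"


def start_compute_fixed(conductor):
    # Restored ordering: wait for the conductor, then check the version.
    wait_until_ready(conductor)
    check_service_version(conductor)
    return "started"
```

With a conductor that is not yet answering, start_compute_broken raises
MessagingTimeout on its first call, while start_compute_fixed keeps
retrying and only runs the version check once the conductor has replied.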


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1904181

Title:
  nova-compute fails to start if cell conductor is not running.

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Since https://review.opendev.org/#/c/738482/, if nova-compute is
  started before the cell conductor it belongs to, the nova-compute
  service fails with a MessagingTimeout and exits. Before
  https://review.opendev.org/#/c/738482/, in this scenario nova-compute
  would retry until the conductor was ready.

  Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: CRITICAL nova [None req-a8777aaa-2c27-4770-83d5-63d869124590 None None] Unhandled error: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 5f9597ea9baf4b6fb37ba85d81040b62
  Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova Traceback (most recent call last):
  Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova   File "/usr/local/lib/python3.8/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 405, in get
  Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova     return self._queues[msg_id].get(block=True, timeout=timeout)
  Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova   File "/usr/local/lib/python3.8/dist-packages/eventlet/queue.py", line 322, in get
  Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova     return waiter.wait()
  Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova   File "/usr/local/lib/python3.8/dist-packages/eventlet/queue.py", line 141, in wait
  Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova     return get_hub().switch()
  Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova   File "/usr/local/lib/python3.8/dist-packages/eventlet/hubs/hub.py", line 313, in switch
  Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova     return self.greenlet.switch()
  Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova _queue.Empty
  Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova 
  Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova During handling of the above exception, another exception occurred:
  Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova 
  Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova Traceback (most recent call last):
  Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova   File "/usr/local/bin/nova-compute", line 10, in <module>
  Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova     sys.exit(main())
  Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova   File "/opt/stack/nova/nova/cmd/compute.py", line 58, in main
  Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova     server = service.Service.create(binary='nova-compute',
  Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova   File "/opt/stack/nova/nova/service.py", line 252, in create
  Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova     utils.raise_if_old_compute()
  Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova   File "/opt/stack/nova/nova/utils.py", line 1088, in raise_if_old_compute
  Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova     current_service_version = service.Service.get_minimum_version(
  Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova   File "/usr/local/lib/python3.8/dist-packages/oslo_versionedobjects/base.py", line 175, in wrapper
  Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova     result = cls.indirection_api.object_class_action_versions(
  Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova   File "/opt/stack/nova/nova/conductor/rpcapi.py", line 240, in object_class_action_versions
  Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova     return cctxt.call(context, 'object_class_action_versions',
  Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova   File "/usr/local/lib/python3.8/dist-packages/oslo_messaging/rpc/client.py", line 175, in call
  Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova     self.transport._send(self.target, msg_ctxt, msg,
  Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova   File "/usr/local/lib/python3.8/dist-packages/oslo_messaging/transport.py", line 123, in _send
  Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova     return self._driver.send(target, ctxt, message,
  Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova   File "/usr/local/lib/python3.8/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 652, in send
  Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova     return self._send(target, ctxt, message, wait_for_reply, timeout,
  Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova   File "/usr/local/lib/python3.8/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 641, in _send
  Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova     result = self._waiter.wait(msg_id, timeout,
  Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova   File "/usr/local/lib/python3.8/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 531, in wait
  Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova     message = self.waiters.get(msg_id, timeout=timeout)
  Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova   File "/usr/local/lib/python3.8/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 407, in get
  Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova     raise oslo_messaging.MessagingTimeout(
  Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 5f9597ea9baf4b6fb37ba85d81040b62
  Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova 
  Nov 12 12:59:13.407277 ubuntu-focal-ovh-bhs1-0021738724 systemd[1]: devstack@n-cpu.service: Main process exited, code=exited, status=1/FAILURE
  Nov 12 12:59:13.407306 ubuntu-focal-ovh-bhs1-0021738724 systemd[1]: devstack@n-cpu.service: Failed with result 'exit-code'.

  Example run:
  https://1d1ac8c4d7a38514d020-4dcc482f43713c13ecd75f64a0eb3df3.ssl.cf1.rackcdn.com/762319/1/check/tempest-integrated-compute/0b7fa72/controller/logs/screen-n-cond-cell1.txt

  Signature:
  http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22current_service_version%20%3D%20service.Service.get_minimum_version(%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1904181/+subscriptions

