yahoo-eng-team team mailing list archive
Message #84438
[Bug 1904181] Re: nova-compute fails to start if cell conductor is not running.
Reviewed: https://review.opendev.org/762633
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=433bee58bc8d7d65edb6a0805021e51972e6bed6
Submitter: Zuul
Branch: master
commit 433bee58bc8d7d65edb6a0805021e51972e6bed6
Author: Balazs Gibizer <balazs.gibizer@xxxxxxxx>
Date: Fri Nov 13 11:47:51 2020 +0100
Restore retrying the RPC connection to conductor
Before Ie15ec8299ae52ae8f5334d591ed3944e9585cf71, if the compute was
started before the conductor, the compute retried the connection
until the conductor was up. Ie15ec8299ae52ae8f5334d591ed3944e9585cf71
broke this behavior: the service version check runs before the RPC
retry mechanism, so the compute simply fails to start, without a
retry, if no conductor is running.
This patch moves the service version check after the RPC connection
retry mechanism.
Change-Id: Iad0ba1a02868eebc2f43b1ac843fcc5096cd5c47
Closes-Bug: #1904181
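The reordering described in the commit message can be sketched as follows. This is only an illustration of the start-up order, not nova's actual code: `ConductorUnavailable`, `wait_for_conductor`, `start_compute`, and their parameters are hypothetical names, with `ConductorUnavailable` standing in for an RPC timeout such as MessagingTimeout.

```python
import time


class ConductorUnavailable(Exception):
    """Hypothetical stand-in for an RPC timeout (e.g. MessagingTimeout)."""


def wait_for_conductor(ping, max_attempts=10, delay=1.0, sleep=time.sleep):
    """Call ping() until it succeeds, retrying on ConductorUnavailable.

    Returns the number of attempts made. Re-raises the last error once
    max_attempts is exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            ping()
            return attempt
        except ConductorUnavailable:
            if attempt == max_attempts:
                raise
            sleep(delay)


def start_compute(ping, check_service_version, **retry_kwargs):
    """Start-up order after the fix: retry the connection, then check versions."""
    attempts = wait_for_conductor(ping, **retry_kwargs)
    # Reached only after the conductor has answered, so the RPC-backed
    # version check is not issued against a conductor that is still down.
    check_service_version()
    return attempts
```

With the two steps in the opposite order, the version check would be the first RPC call and its timeout would end the process before any retry could happen, which matches the failure described in this bug.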
** Changed in: nova
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1904181
Title:
nova-compute fails to start if cell conductor is not running.
Status in OpenStack Compute (nova):
Fix Released
Bug description:
Since https://review.opendev.org/#/c/738482/, if nova-compute is
started earlier than the cell conductor it belongs to, the
nova-compute service fails with a MessagingTimeout and exits.
Before https://review.opendev.org/#/c/738482/, nova-compute would
retry in this scenario until the conductor was ready.
Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: CRITICAL nova [None req-a8777aaa-2c27-4770-83d5-63d869124590 None None] Unhandled error: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 5f9597ea9baf4b6fb37ba85d81040b62
Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova Traceback (most recent call last):
Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova File "/usr/local/lib/python3.8/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 405, in get
Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova return self._queues[msg_id].get(block=True, timeout=timeout)
Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova File "/usr/local/lib/python3.8/dist-packages/eventlet/queue.py", line 322, in get
Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova return waiter.wait()
Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova File "/usr/local/lib/python3.8/dist-packages/eventlet/queue.py", line 141, in wait
Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova return get_hub().switch()
Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova File "/usr/local/lib/python3.8/dist-packages/eventlet/hubs/hub.py", line 313, in switch
Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova return self.greenlet.switch()
Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova _queue.Empty
Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova
Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova During handling of the above exception, another exception occurred:
Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova
Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova Traceback (most recent call last):
Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova File "/usr/local/bin/nova-compute", line 10, in <module>
Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova sys.exit(main())
Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova File "/opt/stack/nova/nova/cmd/compute.py", line 58, in main
Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova server = service.Service.create(binary='nova-compute',
Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova File "/opt/stack/nova/nova/service.py", line 252, in create
Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova utils.raise_if_old_compute()
Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova File "/opt/stack/nova/nova/utils.py", line 1088, in raise_if_old_compute
Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova current_service_version = service.Service.get_minimum_version(
Nov 12 12:59:12.962343 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova File "/usr/local/lib/python3.8/dist-packages/oslo_versionedobjects/base.py", line 175, in wrapper
Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova result = cls.indirection_api.object_class_action_versions(
Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova File "/opt/stack/nova/nova/conductor/rpcapi.py", line 240, in object_class_action_versions
Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova return cctxt.call(context, 'object_class_action_versions',
Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova File "/usr/local/lib/python3.8/dist-packages/oslo_messaging/rpc/client.py", line 175, in call
Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova self.transport._send(self.target, msg_ctxt, msg,
Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova File "/usr/local/lib/python3.8/dist-packages/oslo_messaging/transport.py", line 123, in _send
Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova return self._driver.send(target, ctxt, message,
Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova File "/usr/local/lib/python3.8/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 652, in send
Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova return self._send(target, ctxt, message, wait_for_reply, timeout,
Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova File "/usr/local/lib/python3.8/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 641, in _send
Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova result = self._waiter.wait(msg_id, timeout,
Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova File "/usr/local/lib/python3.8/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 531, in wait
Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova message = self.waiters.get(msg_id, timeout=timeout)
Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova File "/usr/local/lib/python3.8/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 407, in get
Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova raise oslo_messaging.MessagingTimeout(
Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 5f9597ea9baf4b6fb37ba85d81040b62
Nov 12 12:59:12.965655 ubuntu-focal-ovh-bhs1-0021738724 nova-compute[81122]: ERROR nova
Nov 12 12:59:13.407277 ubuntu-focal-ovh-bhs1-0021738724 systemd[1]: devstack@n-cpu.service: Main process exited, code=exited, status=1/FAILURE
Nov 12 12:59:13.407306 ubuntu-focal-ovh-bhs1-0021738724 systemd[1]: devstack@n-cpu.service: Failed with result 'exit-code'.
Example run:
https://1d1ac8c4d7a38514d020-4dcc482f43713c13ecd75f64a0eb3df3.ssl.cf1.rackcdn.com/762319/1/check/tempest-integrated-compute/0b7fa72/controller/logs/screen-n-cond-cell1.txt
Signature:
http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22current_service_version%20%3D%20service.Service.get_minimum_version(%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1904181/+subscriptions