yahoo-eng-team team mailing list archive
-
yahoo-eng-team team
-
Mailing list archive
-
Message #87610
[Bug 1777608] Re: Nova compute calls plug_vifs unnecessarily for ironic nodes in init_host
Reviewed: https://review.opendev.org/c/openstack/nova/+/813263
Committed: https://opendev.org/openstack/nova/commit/7f81cf28bf21ad2afa98accfde3087c83b8e269b
Submitter: "Zuul (22348)"
Branch: master
commit 7f81cf28bf21ad2afa98accfde3087c83b8e269b
Author: Julia Kreger <juliaashleykreger@xxxxxxxxx>
Date: Fri Oct 8 14:35:00 2021 -0700
Ignore plug_vifs on the ironic driver
When the nova-compute service starts, by default it attempts to
startup instance configuration states for aspects such as networking.
This is fine in most cases, and makes a lot of sense if the
nova-compute service is just managing virtual machines on a hypervisor.
This is done, one instance at a time.
However, when the compute driver is ironic, the networking is managed
as part of the physical machine lifecycle potentially all the way into
committed switch configurations. As such, there is no need to attempt
to call ``plug_vifs`` on every single instance managed by the
nova-compute process which is backed by Ironic.
Additionally, using ironic tends to manage far more physical machines
per nova-compute service instance then when when operating co-installed
with a hypervisor. Often this means a cluster of a thousand machines,
with three controllers, will see thousands of un-needed API calls upon
service start, which elongates the entire process and negatively
impacts operations.
In essence, nova.virt.ironic's plug_vifs call now does nothing,
and merely issues a debug LOG entry when called.
Closes-Bug: #1777608
Change-Id: Iba87cef50238c5b02ab313f2311b826081d5b4ab
** Changed in: nova
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1777608
Title:
Nova compute calls plug_vifs unnecessarily for ironic nodes in
init_host
Status in OpenStack Compute (nova):
Fix Released
Bug description:
Originally reported there
https://bugzilla.redhat.com/show_bug.cgi?id=1590297#c14 and tracked
there https://bugzilla.redhat.com/show_bug.cgi?id=1592427
@owalsh: Looks like a race in the service startup.
(undercloud) [stack@undercloud ~]$ openstack server list
+--------------------------------------+-------------------------+--------+------------------------+------------------------+--------------+
| ID | Name | Status | Networks | Image | Flavor |
+--------------------------------------+-------------------------+--------+------------------------+------------------------+--------------+
| fcda23f8-7b98-41bf-9e45-7a4579164874 | overcloud-controller-1 | ERROR | ctlplane=192.168.24.7 | overcloud-full_renamed | oooq_control |
| ced0e0db-ad1f-4381-a523-0a79ae0303ff | overcloud-controller-2 | ERROR | ctlplane=192.168.24.21 | overcloud-full_renamed | oooq_control |
| 0731685f-a8fc-4126-868a-bdf9879238f3 | overcloud-controller-0 | ERROR | ctlplane=192.168.24.12 | overcloud-full_renamed | oooq_control |
| c4fd4e28-fe44-40d2-9b40-7196b7c3b6a2 | overcloud-novacompute-2 | ERROR | ctlplane=192.168.24.15 | overcloud-full_renamed | oooq_compute |
| 5bef3afd-4485-4cbc-b76c-f126c85bd015 | overcloud-novacompute-1 | ERROR | ctlplane=192.168.24.8 | overcloud-full_renamed | oooq_compute |
| e4c9da33-c452-446c-82c4-bd55a6b294d8 | overcloud-novacompute-0 | ERROR | ctlplane=192.168.24.9 | overcloud-full_renamed | oooq_compute |
+--------------------------------------+-------------------------+--------+------------------------+------------------------+--------------+
Looking at controller-1...
(undercloud) [stack@undercloud ~]$ openstack server show overcloud-controller-1
+-------------------------------------+---------------------------------------------------------------+
| Field | Value |
+-------------------------------------+---------------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | undercloud |
| OS-EXT-SRV-ATTR:hypervisor_hostname | b6a32fba-b57a-4e7b-a6ce-99941b4d134d |
| OS-EXT-SRV-ATTR:instance_name | instance-00000005 |
| OS-EXT-STS:power_state | Running |
| OS-EXT-STS:task_state | None |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | 2018-06-11T14:56:34.000000 |
| OS-SRV-USG:terminated_at | None |
| accessIPv4 | |
| accessIPv6 | |
| addresses | ctlplane=192.168.24.7 |
| config_drive | True |
| created | 2018-06-11T14:53:09Z |
| flavor | oooq_control (04f8ba26-e9bd-472f-b92a-919e7ec8bed1) |
| hostId | da8848b6f3dc77a51235f70dcf44df197261eca0529173f670c94ce9 |
| id | fcda23f8-7b98-41bf-9e45-7a4579164874 |
| image | overcloud-full_renamed (4d965d80-61fc-45d0-88e1-e6365c4afd57) |
| key_name | default |
| name | overcloud-controller-1 |
| project_id | 3b778414471a47e4b0760bbecc9d4070 |
| properties | |
| security_groups | name='default' |
| status | ERROR |
| updated | 2018-06-15T09:29:17Z |
| user_id | 356b9a5451c643bb8162b9349bc9487b |
| volumes_attached | |
+-------------------------------------+---------------------------------------------------------------+
From /var/log/ironic/ironic-conductor.log I can see that ironic-
conductor started loading extensions after the updated timestamp:
2018-06-15 09:29:18.576 1408 DEBUG oslo_concurrency.lockutils
[req-6ad8cb26-b607-4f65-a8b7-30ac72a7a997 - - - - -] Lock
"extension_manager" acquired by
"ironic.common.driver_factory._init_extension_manager" :: waited
0.000s inner /usr/lib/python2.7/site-
packages/oslo_concurrency/lockutils.py:273
The errors in /var/log/ironic/app.log occurred before this, because
there wasn't a conductor registered that supports ipmi:
2018-06-15 09:29:14.419 1793 DEBUG wsme.api [req-81ce866e-1c15-4499-a452-044a44efa1c0 1a609f3c25c24c45ac30ee2fcc721eac 5909cc46dbad48058daa89d48f07ba71 - default default] Client-side error: No valid host was found. Reason: No conductor service registered which supports driver ipmi. format_exception /usr/lib/python2.7/site-packages/wsme/api.py:222
2018-06-15 09:29:15.304 1792 DEBUG wsme.api [req-be8a2443-12ad-4115-9640-b91b39d15765 1a609f3c25c24c45ac30ee2fcc721eac 5909cc46dbad48058daa89d48f07ba71 - default default] Client-side error: No valid host was found. Reason: No conductor service registered which supports driver ipmi. format_exception /usr/lib/python2.7/site-packages/wsme/api.py:222
2018-06-15 09:29:16.105 1793 DEBUG wsme.api [req-0f2c2010-3bce-4600-8c8a-54a6aa0b8965 1a609f3c25c24c45ac30ee2fcc721eac 5909cc46dbad48058daa89d48f07ba71 - default default] Client-side error: No valid host was found. Reason: No conductor service registered which supports driver ipmi. format_exception /usr/lib/python2.7/site-packages/wsme/api.py:222
2018-06-15 09:29:16.797 1792 DEBUG wsme.api [req-e660c0cd-8a25-4b93-815f-edc26c15ddba 1a609f3c25c24c45ac30ee2fcc721eac 5909cc46dbad48058daa89d48f07ba71 - default default] Client-side error: No valid host was found. Reason: No conductor service registered which supports driver ipmi. format_exception /usr/lib/python2.7/site-packages/wsme/api.py:222
2018-06-15 09:29:17.596 1793 DEBUG wsme.api [req-ffa50a06-3a62-4f3e-937c-4304818dca4c 1a609f3c25c24c45ac30ee2fcc721eac 5909cc46dbad48058daa89d48f07ba71 - default default] Client-side error: No valid host was found. Reason: No conductor service registered which supports driver ipmi. format_exception /usr/lib/python2.7/site-packages/wsme/api.py:222
2018-06-15 09:29:18.277 1793 DEBUG wsme.api [req-c391da42-997e-49ee-b604-11ab81f875e2 1a609f3c25c24c45ac30ee2fcc721eac 5909cc46dbad48058daa89d48f07ba71 - default default] Client-side error: No valid host was found. Reason: No conductor service registered which supports driver ipmi. format_exception /usr/lib/python2.7/site-packages/wsme/api.py:222
And /var/log/nova/nova-compute.log:
2018-06-15 09:29:12.007 2623 INFO nova.service [req-42a10a27-6527-45d3-9b7d-0c4dc2ac2f13 - - - - -] Starting compute node (version 18.0.0-0.20180601221704.f902e0d.el7)
2018-06-15 09:29:12.135 2623 DEBUG nova.servicegroup.drivers.db [req-42a10a27-6527-45d3-9b7d-0c4dc2ac2f13 - - - - -] Seems service nova-compute on host undercloud is down. Last heartbeat was 2018-06-15 09:27:40. Elapsed time is 92.135605 is_up /usr/lib/python2.7/site-packages/nova/servicegroup/drivers/db.py:79
2018-06-15 09:29:12.260 2623 DEBUG nova.compute.manager [req-42a10a27-6527-45d3-9b7d-0c4dc2ac2f13 - - - - -] [instance: 5bef3afd-4485-4cbc-b76c-f126c85bd015] Checking state _get_power_state /usr/lib/python2.7/site-packages/nova/compute/manager.py:1167
2018-06-15 09:29:13.819 2623 DEBUG nova.virt.ironic.driver [req-42a10a27-6527-45d3-9b7d-0c4dc2ac2f13 - - - - -] plug: instance_uuid=5bef3afd-4485-4cbc-b76c-f126c85bd015 vif=[{"profile": {}, "ovs_interfaceid": null, "preserve_on_delete": true, "network": {"bridge": null, "subnets": [{"ips": [{"meta": {}, "version": 4, "type": "fixed", "floating_ips": [], "address": "192.168.24.8"}], "version": 4, "meta": {"dhcp_server": "192.168.24.5"}, "dns": [{"meta": {}, "version": 4, "type": "dns", "address": "192.168.23.1"}], "routes": [{"interface": null, "cidr": "169.254.169.254/32", "meta": {}, "gateway": {"meta": {}, "version": 4, "type": "gateway", "address": "192.168.24.1"}}], "cidr": "192.168.24.0/24", "gateway": {"meta": {}, "version": 4, "type": "gateway", "address": "192.168.24.1"}}], "meta": {"injected": false, "tenant_id": "3b778414471a47e4b0760bbecc9d4070", "mtu": 1500}, "id": "1fe54003-4d8c-4b37-9f20-f04216ac4e26", "label": "ctlplane"}, "devname": "tapcbd2c5be-32", "vnic_type": "baremetal", "qbh_params": null, "meta": {}, "details": {}, "address": "00:1a:24:52:db:70", "active": true, "type": "other", "id": "cbd2c5be-32fd-4c5c-9a14-43e323071912", "qbg_params": null}] _plug_vifs /usr/lib/python2.7/site-packages/nova/virt/ironic/driver.py:1397
2018-06-15 09:29:14.422 2623 ERROR nova.virt.ironic.driver [req-42a10a27-6527-45d3-9b7d-0c4dc2ac2f13 - - - - -] Cannot attach VIF cbd2c5be-32fd-4c5c-9a14-43e323071912 to the node 059ac0f4-3a78-4f67-9f75-57a7a1f9ec5c due to error: No valid host was found. Reason: No conductor service registered which supports driver ipmi. (HTTP 400): BadRequest: No valid host was found. Reason: No conductor service registered which supports driver ipmi. (HTTP 400)
2018-06-15 09:29:14.422 2623 ERROR nova.compute.manager [req-42a10a27-6527-45d3-9b7d-0c4dc2ac2f13 - - - - -] [instance: 5bef3afd-4485-4cbc-b76c-f126c85bd015] Vifs plug failed: VirtualInterfacePlugException: Cannot attach VIF cbd2c5be-32fd-4c5c-9a14-43e323071912 to the node 059ac0f4-3a78-4f67-9f75-57a7a1f9ec5c due to error: No valid host was found. Reason: No conductor service registered which supports driver ipmi. (HTTP 400)
2018-06-15 09:29:14.422 2623 ERROR nova.compute.manager [instance: 5bef3afd-4485-4cbc-b76c-f126c85bd015] Traceback (most recent call last):
2018-06-15 09:29:14.422 2623 ERROR nova.compute.manager [instance: 5bef3afd-4485-4cbc-b76c-f126c85bd015] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 942, in _init_instance
2018-06-15 09:29:14.422 2623 ERROR nova.compute.manager [instance: 5bef3afd-4485-4cbc-b76c-f126c85bd015] self.driver.plug_vifs(instance, net_info)
2018-06-15 09:29:14.422 2623 ERROR nova.compute.manager [instance: 5bef3afd-4485-4cbc-b76c-f126c85bd015] File "/usr/lib/python2.7/site-packages/nova/virt/ironic/driver.py", line 1441, in plug_vifs
2018-06-15 09:29:14.422 2623 ERROR nova.compute.manager [instance: 5bef3afd-4485-4cbc-b76c-f126c85bd015] self._plug_vifs(node, instance, network_info)
2018-06-15 09:29:14.422 2623 ERROR nova.compute.manager [instance: 5bef3afd-4485-4cbc-b76c-f126c85bd015] File "/usr/lib/python2.7/site-packages/nova/virt/ironic/driver.py", line 1408, in _plug_vifs
2018-06-15 09:29:14.422 2623 ERROR nova.compute.manager [instance: 5bef3afd-4485-4cbc-b76c-f126c85bd015] raise exception.VirtualInterfacePlugException(msg)
2018-06-15 09:29:14.422 2623 ERROR nova.compute.manager [instance: 5bef3afd-4485-4cbc-b76c-f126c85bd015] VirtualInterfacePlugException: Cannot attach VIF cbd2c5be-32fd-4c5c-9a14-43e323071912 to the node 059ac0f4-3a78-4f67-9f75-57a7a1f9ec5c due to error: No valid host was found. Reason: No conductor service registered which supports driver ipmi. (HTTP 400)
2018-06-15 09:29:14.422 2623 ERROR nova.compute.manager [instance: 5bef3afd-4485-4cbc-b76c-f126c85bd015]
etc... for the other instances
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1777608/+subscriptions
References