yahoo-eng-team team mailing list archive
Message #90831
[Bug 2000378] [NEW] [OVN] orphaned virtual parent ports break new ports
Public bug reported:
Reproducible on stable/yoga.
If deleting an OVN port fails because of a backend (MariaDB or OVN) connection failure, the logical switch port is left behind in the OVN NB DB.
oslo_db.exception.DBDeadlock: (pymysql.err.OperationalError) (1205, 'Lock wait timeout exceeded; try restarting transaction')
[SQL: DELETE FROM securitygroupportbindings WHERE securitygroupportbindings.port_id = %(port_id)s AND securitygroupportbindings.security_group_id = %(security_group_id)s]
[parameters: {'port_id': '76ff3324-7326-412d-bdc9-df5db5adcf84', 'security_group_id': 'fe1f6c5c-4d49-4ccc-ac2e-20ef23041510'}]
neutron/neutron-server.log:78508:2022-12-12 16:39:15.309 691 ERROR neutron.plugins.ml2.managers [... - default default] Mechanism driver 'ovn' failed in delete_port_postcommit: ovsdbapp.exceptions.TimeoutException: Commands [DelLSwitchPortCommand(lport=76ff3324-7326-412d-bdc9-df5db5adcf84...
Such ports are detected by the maintenance task, but are only reported as warnings in the logs:
neutron/neutron-server.log:76862:2022-12-12 16:35:11.420 712 WARNING neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.maintenance [req-4a4b33c2-85b3-48c1-8b15-ed6d65db3c2d - - - - -] Skip fixing resource 76ff3324-7326-412d-bdc9-df5db5adcf84 (type: ports). Resource does not exist in Neutron database anymore: neutron_lib.exceptions.PortNotFound: Port 76ff3324-7326-412d-bdc9-df5db5adcf84 could not be found.
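An orphan in this sense is a logical switch port that exists in OVN NB but has no matching port in Neutron. The sketch below expresses that as a set difference. It is not the maintenance task's code; find_orphaned_lsps is a hypothetical helper, and both ID lists are assumed to have been fetched already.

```python
def find_orphaned_lsps(ovn_lsp_names, neutron_port_ids):
    """Return OVN logical switch port names with no matching Neutron port."""
    known = set(neutron_port_ids)
    return sorted(name for name in ovn_lsp_names if name not in known)


# IDs taken from this report: 76ff... was deleted from Neutron but not from OVN.
ovn_lsps = [
    "76ff3324-7326-412d-bdc9-df5db5adcf84",
    "79cba8eb-dd1a-455a-873c-0e04f398c8d0",
]
neutron_ports = ["79cba8eb-dd1a-455a-873c-0e04f398c8d0"]

orphans = find_orphaned_lsps(ovn_lsps, neutron_ports)
print(orphans)
```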
When Neutron then creates a new port for a Nova instance in the same network, and the new port's IP address matches the IP of the orphaned port, Neutron turns the new port's switch port into a virtual port with the orphan as its virtual parent. Binding then never completes, and the new port stays DOWN permanently.
For example, here is the OVN-side record of a new virtual port that failed to bind to the compute node:
addresses : ["fa:16:3e:44:8a:d5 10.0.0.29"]
enabled : true
external_ids : {"neutron:cidrs"="10.0.0.29/24", "neutron:device_id"="2098a135-d6a6-4221-a8e9-2584c170dade", "neutron:device_owner"="compute:nova", "neutron:network_name"=neutron-def3de91-2120-47b5-b9f1-6ed51cf0e604, "neutron:port_name"="", "neutron:project_id"="867ba703d19947629e01d800ecdc01c0", "neutron:revision_number"="3", "neutron:security_group_ids"="2ecda920-36a2-44ff-96fa-a652d1cbd6c1 fe1f6c5c-4d49-4ccc-ac2e-20ef23041510"}
name : "79cba8eb-dd1a-455a-873c-0e04f398c8d0"
options : {mcast_flood_reports="true", requested-chassis=cmpt-av-02, virtual-ip="10.0.0.29", virtual-parents="76ff3324-7326-412d-bdc9-df5db5adcf84"}
port_security : ["fa:16:3e:44:8a:d5 10.0.0.29"]
type : virtual
up : false
It was incorrectly bound to the orphaned parent 76ff3324-7326-412d-bdc9-df5db5adcf84:
addresses : ["fa:16:3e:f6:cc:6a 10.0.0.29"]
enabled : true
external_ids : {"neutron:cidrs"="10.0.0.29/24", "neutron:device_id"="91e19b3e-1412-4519-b499-06ae794ee0a3", "neutron:device_owner"="", "neutron:network_name"=neutron-def3de91-2120-47b5-b9f1-6ed51cf0e604, "neutron:port_name"="", "neutron:project_id"="867ba703d19947629e01d800ecdc01c0", "neutron:revision_number"="1", "neutron:security_group_ids"="fe1f6c5c-4d49-4ccc-ac2e-20ef23041510"}
name : "76ff3324-7326-412d-bdc9-df5db5adcf84"
options : {mcast_flood_reports="true", requested-chassis=""}
port_security : ["fa:16:3e:f6:cc:6a 10.0.0.29"]
type : ""
up : false
As we can see, the only values the two ports have in common are the (IP, network_id) pair, which may indicate that the problem lies in the usage of the function
def get_virtual_port_parents(self, virtual_ip, port):
in
neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_client.py:303
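To illustrate the suspicion, here is a minimal sketch of a parent lookup that matches only on network and IP address. It is not the real get_virtual_port_parents implementation; the Lsp class, candidate_parents, and the known_ports parameter are all hypothetical, and the data is copied from the two records in this report.

```python
from dataclasses import dataclass


@dataclass
class Lsp:
    name: str
    network: str
    addresses: list  # entries of the form "MAC IP [IP ...]"


def candidate_parents(lsps, virtual_ip, network, own_name, known_ports=None):
    """Return names of ports in `network` that carry `virtual_ip`.

    If `known_ports` is given, ports missing from it are skipped
    (a hypothetical guard against orphans, not an existing Neutron check).
    """
    parents = []
    for lsp in lsps:
        if lsp.name == own_name or lsp.network != network:
            continue
        # Drop the MAC (first field) and keep the IP addresses.
        ips = {ip for addr in lsp.addresses for ip in addr.split()[1:]}
        if virtual_ip not in ips:
            continue
        if known_ports is not None and lsp.name not in known_ports:
            continue
        parents.append(lsp.name)
    return parents


NET = "neutron-def3de91-2120-47b5-b9f1-6ed51cf0e604"
orphan = Lsp("76ff3324-7326-412d-bdc9-df5db5adcf84", NET,
             ["fa:16:3e:f6:cc:6a 10.0.0.29"])
new = Lsp("79cba8eb-dd1a-455a-873c-0e04f398c8d0", NET,
          ["fa:16:3e:44:8a:d5 10.0.0.29"])

# Matching on (IP, network) alone selects the orphan as a parent.
unguarded = candidate_parents([orphan, new], "10.0.0.29", NET, new.name)
# Restricting to ports that still exist in Neutron excludes it.
guarded = candidate_parents([orphan, new], "10.0.0.29", NET, new.name,
                            known_ports={new.name})
print(unguarded, guarded)
```

Under these assumptions, any stale switch port that still carries the requested IP becomes a parent candidate, which matches the virtual-parents value seen on the new port.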
Manual workaround:
delete the orphaned port from the OVN NB DB (ovn-nbctl lsp-del) and remove its entry from the Neutron ovn_revision_numbers table.
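The two workaround steps could be prepared with a small helper like the one below. It only builds the strings and executes nothing. workaround_commands is a hypothetical name, and the resource_uuid and resource_type column names are assumptions about the ovn_revision_numbers schema that should be checked against the deployed database before running the statement.

```python
import shlex
import uuid


def workaround_commands(port_id):
    """Build the ovn-nbctl command and SQL statement for one orphaned port."""
    # Normalise and validate the ID; raises ValueError if it is not a UUID.
    port_id = str(uuid.UUID(port_id))
    lsp_del = "ovn-nbctl lsp-del " + shlex.quote(port_id)
    # Column names below are assumed, not taken from the bug report.
    sql = (
        "DELETE FROM ovn_revision_numbers "
        "WHERE resource_uuid = '{}' AND resource_type = 'ports';".format(port_id)
    )
    return lsp_del, sql


cmd, stmt = workaround_commands("76ff3324-7326-412d-bdc9-df5db5adcf84")
print(cmd)
print(stmt)
```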
** Affects: neutron
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2000378