yahoo-eng-team team mailing list archive
Message #93212
[Bug 2008943] Re: OVN DB Sync utility cannot find NB DB Port Group
A new package version with this fix has been uploaded to the focal
unapproved queue and victoria/wallaby/xena staging PPAs.
** Changed in: neutron (Ubuntu)
Status: New => Fix Released
** Changed in: cloud-archive
Status: New => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2008943
Title:
OVN DB Sync utility cannot find NB DB Port Group
Status in Ubuntu Cloud Archive:
Fix Released
Status in Ubuntu Cloud Archive ussuri series:
Triaged
Status in Ubuntu Cloud Archive victoria series:
Triaged
Status in Ubuntu Cloud Archive wallaby series:
Triaged
Status in Ubuntu Cloud Archive xena series:
Triaged
Status in neutron:
In Progress
Status in neutron package in Ubuntu:
Fix Released
Status in neutron source package in Focal:
Triaged
Bug description:
Runtime exception:
ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Port_Group
with name=pg_aa9f203b_ec51_4893_9bda_cfadbff9f800
can occur while performing a database sync between the Neutron DB and the OVN NB DB using neutron-ovn-db-sync-util.
This exception occurs when the `sync_networks_ports_and_dhcp_opts()` function ends up implicitly creating a new default security group for a tenant/project id. This is normally OK, but the problem is that `sync_port_groups` was already called, so the corresponding port group does not exist in the NB DB. When `sync_acls()` is called later, no port group is found and the exception occurs.
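The ordering problem can be illustrated with a small toy model. This is only a sketch, not the neutron code: plain dicts and sets stand in for the Neutron DB and the OVN NB DB, `RowNotFound` is a local stand-in for the ovsdbapp exception, and `make_state`, `pg_name`, `run_sync` and the `sg-<project>` id are hypothetical names made up for the illustration. Only the three `sync_*` names and their call order come from the description above.

```python
# Toy model of the sync ordering; dicts/sets replace the real databases.

class RowNotFound(Exception):
    """Local stand-in for ovsdbapp.backend.ovs_idl.idlutils.RowNotFound."""


def pg_name(sg_id):
    # Mimics the name shape seen in the traceback (pg_ prefix, underscores).
    return "pg_" + sg_id.replace("-", "_")


def make_state():
    # One network whose project has no default security group yet.
    neutron_db = {
        "networks": [{"name": "test_network", "project": "test_project"}],
        "security_groups": [],
        "default_sg": {},
    }
    nb_db = {"port_groups": set()}
    return neutron_db, nb_db


def sync_port_groups(neutron_db, nb_db):
    # Mirrors only the security groups that exist at this moment.
    for sg_id in neutron_db["security_groups"]:
        nb_db["port_groups"].add(pg_name(sg_id))


def sync_networks_ports_and_dhcp_opts(neutron_db, nb_db):
    # Creating the missing metadata port implicitly creates the project's
    # default security group, after port groups were already synced.
    for net in neutron_db["networks"]:
        project = net["project"]
        if project not in neutron_db["default_sg"]:
            sg_id = "sg-" + project  # hypothetical id for the toy model
            neutron_db["default_sg"][project] = sg_id
            neutron_db["security_groups"].append(sg_id)


def sync_acls(neutron_db, nb_db):
    # ACLs reference a port group per security group; a missing one fails.
    for sg_id in neutron_db["security_groups"]:
        name = pg_name(sg_id)
        if name not in nb_db["port_groups"]:
            raise RowNotFound("Cannot find Port_Group with name=%s" % name)


def run_sync(neutron_db, nb_db):
    # Same call order as described in the bug.
    sync_port_groups(neutron_db, nb_db)
    sync_networks_ports_and_dhcp_opts(neutron_db, nb_db)
    sync_acls(neutron_db, nb_db)
```

Calling `run_sync(*make_state())` raises the toy `RowNotFound`, because the security group is created only after the port group pass has finished.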
Quick way to reproduce on ML2/OVN:
- openstack project create test_project
- openstack network create --project test_project test_network
- openstack port delete $(openstack port list --network test_network -c ID -f value) # since this is an empty network only the metadata port should get listed and subsequently deleted
- openstack security group delete test_project
So now you have a network without a metadata port in it and no
default security group for the project/tenant that this network
belongs to. Run:
neutron-ovn-db-sync-util --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf.ini --ovn-neutron_sync_mode migrate
The exception should occur.
Here is a more realistic scenario of how we can run into this during
an ML2/OVS to ML2/OVN migration. I am also including why the code runs
into it.
1. ML2/OVS environment with a network but no default security group for the project/tenant associated with the network.
2. Perform the ML2/OVS to ML2/OVN migration. This migration process will run neutron-ovn-db-sync-util in migrate sync mode.
3. During the sync we first sync port groups [1] from the Neutron DB to the OVN DB.
4. Then we sync network ports [2]. The process will detect that the network in question is not part of OVN NB. It will create that network in the OVN NB DB and, along with it, a metadata port (an OVN network requires a metadata port). The Port_create call will implicitly notify _ensure_default_security_group_handler [3], which will not find a security group for that tenant/project id and will create one. Now you have a new security group with 4 new default security group rules.
5. When sync_acls [4] runs, it will pick up those 4 new rules, but the commit to the NB DB will fail since the port group (aka security group) does not exist in the NB DB.
[1] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L104
[2] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L10
[3] https://opendev.org/openstack/neutron/src/branch/master/neutron/db/securitygroups_db.py#L915
[4] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L107
===== Ubuntu SRU Details =====
[Impact]
See bug description.
[Test Case]
Deploy openstack with OVN. Follow steps in "Quick way to reproduce on ML2/OVN" from bug description.
[Where problems could occur]
The fix mitigates the occurrence of the runtime exception. However, it works by retrying the port group sync only one more time, so there is still potential for the same runtime exception to be raised.
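A rough sketch of why a single retry narrows the window without closing it. This is an assumption about the general shape of a retry, not the actual patch: `run_sync_with_retry`, `create_ports` and `late_writer` are hypothetical names, and a list and a set stand in for the Neutron DB and the NB DB.

```python
# Toy model: a single extra port group pass before the ACL sync.

class RowNotFound(Exception):
    """Local stand-in for ovsdbapp's RowNotFound."""


def sync_port_groups(security_groups, port_groups):
    # Copy every currently known security group into the "NB DB".
    port_groups.update(security_groups)


def sync_acls(security_groups, port_groups):
    # Fails if any security group has no matching port group.
    missing = sorted(set(security_groups) - port_groups)
    if missing:
        raise RowNotFound("Cannot find Port_Group with name=%s" % missing[0])


def run_sync_with_retry(security_groups, create_ports, late_writer=None):
    port_groups = set()
    sync_port_groups(security_groups, port_groups)
    create_ports(security_groups)  # may implicitly add a default group
    sync_port_groups(security_groups, port_groups)  # the single retry
    if late_writer is not None:
        # Hypothetical: something creates a group after the retry.
        late_writer(security_groups)
    sync_acls(security_groups, port_groups)
    return port_groups
```

With only the implicit default group, the second pass picks it up and the ACL sync succeeds. If another group appears after the retry (modelled by `late_writer`), the toy `RowNotFound` is raised again, which matches the residual risk noted above.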
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/2008943/+subscriptions