[Bug 2095000] [NEW] ovs-agent: Failed to store metadata for trunk ...: Cannot find Bridge with name=tbr-...

Public bug reported:

We have found a recurring error message and traceback in ovs-agent logs,
for example:

jan 10 14:58:30 devstack0 neutron-openvswitch-agent[1025]: ERROR ovsdbapp.backend.ovs_idl.command [-] Error executing command (ListPortsCommand): ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Bridge with name=tbr-78e94a46-1
jan 10 14:58:30 devstack0 neutron-openvswitch-agent[1025]: ERROR ovsdbapp.backend.ovs_idl.command Traceback (most recent call last):
jan 10 14:58:30 devstack0 neutron-openvswitch-agent[1025]: ERROR ovsdbapp.backend.ovs_idl.command   File "/opt/stack/data/venv/lib/python3.12/site-packages/ovsdbapp/backend/ovs_idl/command.py", line 47, in execute
jan 10 14:58:30 devstack0 neutron-openvswitch-agent[1025]: ERROR ovsdbapp.backend.ovs_idl.command     self.run_idl(None)
jan 10 14:58:30 devstack0 neutron-openvswitch-agent[1025]: ERROR ovsdbapp.backend.ovs_idl.command   File "/opt/stack/data/venv/lib/python3.12/site-packages/ovsdbapp/schema/open_vswitch/commands.py", line 303, in run_idl
jan 10 14:58:30 devstack0 neutron-openvswitch-agent[1025]: ERROR ovsdbapp.backend.ovs_idl.command     br = idlutils.row_by_value(self.api.idl, 'Bridge', 'name', self.bridge)
jan 10 14:58:30 devstack0 neutron-openvswitch-agent[1025]: ERROR ovsdbapp.backend.ovs_idl.command          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
jan 10 14:58:30 devstack0 neutron-openvswitch-agent[1025]: ERROR ovsdbapp.backend.ovs_idl.command   File "/opt/stack/data/venv/lib/python3.12/site-packages/ovsdbapp/backend/ovs_idl/idlutils.py", line 114, in row_by_value
jan 10 14:58:30 devstack0 neutron-openvswitch-agent[1025]: ERROR ovsdbapp.backend.ovs_idl.command     raise RowNotFound(table=table, col=column, match=match)
jan 10 14:58:30 devstack0 neutron-openvswitch-agent[1025]: ERROR ovsdbapp.backend.ovs_idl.command ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Bridge with name=tbr-78e94a46-1
jan 10 14:58:30 devstack0 neutron-openvswitch-agent[1025]: ERROR ovsdbapp.backend.ovs_idl.command·
jan 10 14:58:30 devstack0 neutron-openvswitch-agent[1025]: ERROR neutron.services.trunk.drivers.openvswitch.agent.ovsdb_handler [-] Failed to store metadata for trunk 78e94a46-1fe1-4b81-be34-3112c079a9eb: Cannot find Bridge with name=tbr-78e94a46-1: ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Bridge with name=tbr-78e94a46-1

We can reliably trigger this by:

openstack network create tnet0
openstack subnet create --network tnet0 --subnet-range 10.0.100.0/24 tsubnet0
openstack port create --network tnet0 tport0
openstack network trunk create --parent-port tport0 trunk0

openstack server create --flavor cirros256 --image cirros-0.6.3-x86_64-disk --nic port-id=tport0 tvm0 --wait
# Deleting the server triggers the error message:
openstack server delete tvm0 --wait

# Instead of deleting, rebuilding the server triggers it just as well:
openstack server rebuild --image cirros-0.6.3-x86_64-disk tvm0 --wait

We log the same error message from two places: wire_subports_for_trunk() and unwire_subports_for_trunk(). This one, of course, comes from unwiring:
https://opendev.org/openstack/neutron/src/commit/585ea689d5d26356e28d8eb47f6d0511d21806cf/neutron/services/trunk/drivers/openvswitch/agent/ovsdb_handler.py#L321-L322
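
For reference, the failure mode is easy to reproduce directly with
ovsdbapp: once the Bridge row is gone, any per-bridge command (like the
ListPortsCommand in the traceback above) raises RowNotFound before the
handler can store the trunk metadata. A minimal sketch, not the neutron
code path itself (the connection string and bridge name below are
assumptions for illustration only):

from ovsdbapp.backend.ovs_idl import connection, idlutils
from ovsdbapp.schema.open_vswitch import impl_idl

# Connect to the local ovsdb-server (assumed endpoint).
idl = connection.OvsdbIdl.from_server('tcp:127.0.0.1:6640', 'Open_vSwitch')
api = impl_idl.OvsdbIdl(connection.Connection(idl=idl, timeout=10))

try:
    # Same ListPortsCommand as in the traceback above.
    api.list_ports('tbr-78e94a46-1').execute(check_error=True)
except idlutils.RowNotFound:
    # The trunk bridge is already gone (deleted by os-vif), so there is
    # nothing left to list and nothing to store metadata on.
    pass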

Clearly it can happen that neutron calls unwire...() after nova/os-vif
has already deleted the trunk bridge. For example, see these logs
captured during a rebuild:

$ sudo journalctl -o short-precise -u devstack@n-cpu.service -u devstack@q-agt.service -S -20m | egrep '(plugged vif|Failed to store metadata for trunk).*tbr-dc729e53-3'
jan 14 14:05:38.558024 devstack0 nova-compute[1018]: INFO os_vif [None req-27af63bd-23d9-45d4-aa73-f70e357a38e0 admin admin] Successfully plugged vif VIFOpenVSwitch(active=False,address=fa:16:3e:cd:66:ed,bridge_name='tbr-dc729e53-3',has_traffic_filtering=True,id=d10f23a7-ba4a-463a-b57b-6ded51e5949c,network=Network(1b2e288d-24d4-4068-9ee1-1e80417e2d19),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=True,vif_name='tapd10f23a7-ba')
jan 14 14:05:48.878403 devstack0 nova-compute[1018]: INFO os_vif [None req-91dd8c76-77d0-4463-9674-324fff56f881 admin admin] Successfully unplugged vif VIFOpenVSwitch(active=False,address=fa:16:3e:cd:66:ed,bridge_name='tbr-dc729e53-3',has_traffic_filtering=True,id=d10f23a7-ba4a-463a-b57b-6ded51e5949c,network=Network(1b2e288d-24d4-4068-9ee1-1e80417e2d19),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=True,vif_name='tapd10f23a7-ba')
jan 14 14:05:49.285754 devstack0 neutron-openvswitch-agent[108540]: ERROR neutron.services.trunk.drivers.openvswitch.agent.ovsdb_handler [-] unwire: Failed to store metadata for trunk dc729e53-3902-4ef7-a596-ce358456edc0: Cannot find Bridge with name=tbr-dc729e53-3: ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Bridge with name=tbr-dc729e53-3
jan 14 14:05:51.197047 devstack0 nova-compute[1018]: INFO os_vif [None req-91dd8c76-77d0-4463-9674-324fff56f881 admin admin] Successfully plugged vif VIFOpenVSwitch(active=False,address=fa:16:3e:cd:66:ed,bridge_name='tbr-dc729e53-3',has_traffic_filtering=True,id=d10f23a7-ba4a-463a-b57b-6ded51e5949c,network=Network(1b2e288d-24d4-4068-9ee1-1e80417e2d19),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=True,vif_name='tapd10f23a7-ba')

My first guess is that this is an unintended side effect of the last
proper fix for bug https://bugs.launchpad.net/neutron/+bug/1869244, which
moved trunk bridge deletion from ovs-agent to os-vif.

So far I have not found any leaked resources, so we probably just have a
spurious error message that we could suppress.
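
If we go the suppression route, one possible shape would be to treat a
missing trunk bridge as already unwired and log at DEBUG instead of
ERROR. This is only a sketch of the idea, not a proposed patch; the
helper name and its arguments are hypothetical:

from oslo_log import log as logging
from ovsdbapp.backend.ovs_idl import idlutils

LOG = logging.getLogger(__name__)

def set_trunk_metadata_tolerant(set_metadata, trunk_id):
    """Run the existing metadata update, tolerating a deleted bridge."""
    try:
        set_metadata()
    except idlutils.RowNotFound:
        # os-vif has already removed the trunk bridge during unplug, so
        # there is nothing left to annotate; this is not an error.
        LOG.debug("Trunk bridge for trunk %s is already gone, "
                  "skipping the metadata update.", trunk_id)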

devstack 2f3440dc
neutron 8cca47f2e7

** Affects: neutron
     Importance: High
     Assignee: Bence Romsics (bence-romsics)
         Status: Confirmed


** Tags: ovs trunk

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2095000

Title:
  ovs-agent: Failed to store metadata for trunk ...: Cannot find Bridge
  with name=tbr-...

Status in neutron:
  Confirmed

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2095000/+subscriptions