[Bug 1874239] [NEW] [l2]ovs-agent lost connection to ovsdb-server lead to physical br reconfig

 

Public bug reported:

This bug is similar to https://bugs.launchpad.net/neutron/+bug/1803919.

After upgrading neutron with kolla, traffic from VMs to destinations outside the host was dropped by the physical bridge.

WARNING ovsdbapp.backend.ovs_idl.vlog [-] tcp:127.0.0.1:6640: send error: Broken pipe
WARNING ovsdbapp.backend.ovs_idl.vlog [-] tcp:127.0.0.1:6640: connection dropped (Broken pipe)
DEBUG neutron.agent.linux.utils [req-58ba8845-e217-4237-ab0a-d614d0849e57 - - - - -] Running command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip6tables-save'] create_process /var/lib/kolla/venv/lib/python2.7/site-packages/neutron/agent/linux/utils.py:87
DEBUG neutron.agent.linux.utils [req-58ba8845-e217-4237-ab0a-d614d0849e57 - - - - -] Running command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip6tables-restore', '-n'] create_process /var/lib/kolla/venv/lib/python2.7/site-packages/neutron/agent/linux/utils.py:87
ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [-] Failed reporting state!: MessagingTimeout: Timed out waiting for a reply to message ID 99ff95277e314c0da7873a5650bf004a
ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent Traceback (most recent call last):
ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/var/lib/kolla/venv/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py", line 354, in _report_state
ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     True)
ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/var/lib/kolla/venv/lib/python2.7/site-packages/neutron/agent/rpc.py", line 101, in report_state
ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     return method(context, 'report_state', **kwargs)
ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 178, in call
ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     retry=self.retry)
ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_messaging/transport.py", line 128, in _send
ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     retry=retry)
ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 645, in send
ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     call_monitor_timeout, retry=retry)
ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 634, in _send
ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     call_monitor_timeout)
ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 520, in wait
ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     message = self.waiters.get(msg_id, timeout=timeout)
ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 397, in get
ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     'to message ID %s' % msg_id)
ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent MessagingTimeout: Timed out waiting for a reply to message ID 99ff95277e314c0da7873a5650bf004a
ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent
WARNING oslo.service.loopingcall [-] Function 'neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent.OVSNeutronAgent._report_state' run outlasted interval by 50.64 sec
DEBUG ovsdbapp.backend.ovs_idl.event [-] BridgeCreateEvent : Matched Bridge, ('create',), None None matches /var/lib/kolla/venv/lib/python2.7/site-packages/ovsdbapp/backend/ovs_idl/event.py:40
DEBUG ovsdbapp.backend.ovs_idl.event [-] BridgeCreateEvent : Matched Bridge, ('create',), None None matches /var/lib/kolla/venv/lib/python2.7/site-packages/ovsdbapp/backend/ovs_idl/event.py:40
DEBUG ovsdbapp.backend.ovs_idl.event [-] BridgeCreateEvent : Matched Bridge, ('create',), None None matches /var/lib/kolla/venv/lib/python2.7/site-packages/ovsdbapp/backend/ovs_idl/event.py:40
DEBUG neutron.agent.ovsdb.native.connection [-] BridgeCreateEvent, bridge name: br-tun run /var/lib/kolla/venv/lib/python2.7/site-packages/neutron/agent/ovsdb/native/connection.py:72
DEBUG neutron.agent.ovsdb.native.connection [-] BridgeCreateEvent, bridge name: br-int run /var/lib/kolla/venv/lib/python2.7/site-packages/neutron/agent/ovsdb/native/connection.py:72
DEBUG neutron.agent.ovsdb.native.connection [-] BridgeCreateEvent, bridge name: br-ex run /var/lib/kolla/venv/lib/python2.7/site-packages/neutron/agent/ovsdb/native/connection.py:72

I think RabbitMQ blocked the ovs-agent for a long time, which led to the ovs-agent losing its connection to ovsdb-server.
After reconnecting, the ovs-agent received BridgeCreateEvents for the physical bridges, then ran _reconfigure_physical_bridges and setup_physical_bridges, which overwrote the DVR flows.
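
In rough outline, the chain after the reconnection looks like this (a simplified sketch, not the actual agent code; only the event and method names mentioned above come from the agent, the surrounding loop structure and the idl_monitor attribute are my assumptions):

    # Simplified sketch of one agent loop iteration after the ovsdb reconnection.
    def rpc_loop_iteration(self):
        # After the IDL reconnects, br-tun/br-int/br-ex are reported again
        # through BridgeCreateEvent and show up as "added" bridges.
        added_bridges = self.ovs.ovsdb.idl_monitor.bridges_added  # attribute name assumed

        # The agent treats br-ex as a newly created physical bridge and rebuilds
        # it; setup_physical_bridges() reinstalls only the base flows, so the
        # DVR resubmit(,1) rule on phy-br-ex is gone and the drop rule wins.
        self._reconfigure_physical_bridges(added_bridges)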

DVR flows before the overwrite:
(openvswitch-vswitchd)[root@openstack598 /]# ovs-ofctl dump-flows br-ex
 cookie=0x780b5a4b6ba48bc4, duration=40085.386s, table=0, n_packets=8634983, n_bytes=1101027259, priority=2,in_port="phy-br-ex" actions=resubmit(,1)
 cookie=0xd7991c8d1f9cbe1d, duration=34358.062s, table=0, n_packets=52656008, n_bytes=6328139917, priority=2,in_port="phy-br-ex" actions=drop
After the overwrite:
(openvswitch-vswitchd)[root@openstack598 /]# ovs-ofctl dump-flows br-ex
 cookie=0xd7991c8d1f9cbe1d, duration=34358.062s, table=0, n_packets=52656008, n_bytes=6328139917, priority=2,in_port="phy-br-ex" actions=drop

A simple thought: after reconfiguring the physical bridges, call self.dvr_agent.setup_dvr_flows() to restore the DVR flows.
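
A minimal sketch of that idea (illustrative only, not a real patch; the wrapper body, the bridge_mappings filtering and the enable_distributed_routing guard are assumptions):

    # Illustrative sketch: after rebuilding the physical bridges, re-install
    # the DVR flows so the resubmit(,1) rule on phy-br-ex comes back.
    def _reconfigure_physical_bridges(self, bridges):
        # existing behaviour (simplified): rebuild the bridges that were
        # reported as (re)created by the BridgeCreateEvent handler
        bridge_mappings = {physnet: br for physnet, br in self.bridge_mappings.items()
                           if br in bridges}
        self.setup_physical_bridges(bridge_mappings)

        # proposed addition: setup_physical_bridges() has just overwritten the
        # DVR flows, so put them back
        if self.enable_distributed_routing:
            self.dvr_agent.setup_dvr_flows()

Any comments?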

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1874239

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1874239/+subscriptions