yahoo-eng-team team mailing list archive

[Bug 1783965] [NEW] Openvswitch agent breaks the existing data plane when the server is unstable

 

Public bug reported:

The current openvswitch agent needs to be more robust in failure cases.

Please see [1].

This line cleans up all stale OVS flows. Consider the case where the OVS
agent restarts and tries to fetch the details of the devices it holds
(via RPC to the server, storing them in its local cache if possible).
After a restart, the agent can only get these details from the server
once it has scanned the existing OVS bridge. If the neutron server is
unstable or rabbitmq hangs at that moment, the details of some devices
cannot be fetched, and those devices fail to sync. The next step is [1],
which cleans up the previous OVS flows, some of which may still be
carrying user traffic. That breaks the existing data plane, which is a
terrible situation.
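
One possible guard against this, as a minimal sketch only: skip the
stale-flow cleanup while any device found on the bridge has not synced.
All names below (sync_devices, fetch_details,
safe_to_cleanup_stale_flows) are hypothetical and are not part of the
neutron agent code; fetch_details stands in for the RPC to the server.

```python
def sync_devices(devices_on_bridge, fetch_details):
    """Try to fetch details for each device; return the set that succeeded.

    fetch_details is a hypothetical stand-in for the RPC to the neutron
    server and is expected to raise on a timeout or server error.
    """
    synced = set()
    for device in devices_on_bridge:
        try:
            fetch_details(device)
        except Exception:
            # The server or message queue did not answer for this device.
            continue
        synced.add(device)
    return synced


def unsynced_devices(devices_on_bridge, synced_devices):
    """Return devices seen on the bridge whose details were never fetched."""
    return set(devices_on_bridge) - set(synced_devices)


def safe_to_cleanup_stale_flows(devices_on_bridge, synced_devices):
    """Allow stale-flow cleanup only after every local device has synced."""
    return not unsynced_devices(devices_on_bridge, synced_devices)
```

With a guard like this, the agent would retry the failed devices in a
later iteration and run the cleanup only once nothing is left unsynced,
so flows that still carry traffic are not removed early.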

For private cloud providers, this situation can be very frequent, for
example when they troubleshoot a problem in production or need to
upgrade servers. So once they hit this issue, the impact is quite large.


[1]  http://git.openstack.org/cgit/openstack/neutron/tree/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#n2158

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1783965
