← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic

 

Reviewed:  https://review.opendev.org/733568
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=90212b12cdf62e92d811997ebba699cab431d696
Submitter: Zuul
Branch:    master

commit 90212b12cdf62e92d811997ebba699cab431d696
Author: shenjiatong <yshxxsjt715@xxxxxxxxx>
Date:   Thu Jun 18 15:33:13 2020 +0800

    Do not block connection between br-int and br-phys on startup
    
    Block traffic between br-int and br-physical is over kill
    and will at least
    
    1. interrupt vlan flow during startup, and is particularly
    so if dvr enabled
    2. if let's rabbitmq is not stable, it is possible data plane
    will be affected and vlan will never work.
    
    Using openstack on k8s particularly amplifies the problem
    because pod could be killed pretty easily by liveness
    probes.
    
    Change-Id: I51050c600ba7090fea71213687d94340bac0674a
    Closes-Bug: #1869808


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1869808

Title:
  reboot neutron-ovs-agent introduces a short interrupt of vlan traffic

Status in neutron:
  Fix Released

Bug description:
  We are using Openstack Neutron 13.0.6 and it is deployed using
  OpenStack-helm.

  I test ping servers in the same vlan while rebooting neutron-ovs-
  agent. The result shows

  root@mgt01:~# openstack server list
  +--------------------------------------+-----------------+--------+------------------------------------------+------------------------------+-----------+
  | ID                                   | Name            | Status | Networks                                 | Image                        | Flavor    |
  +--------------------------------------+-----------------+--------+------------------------------------------+------------------------------+-----------+
  | 22d55077-b1b5-452e-8eba-cbcd2d1514a8 | test-1-1        | ACTIVE | vlan105=172.31.10.4                      | Cirros 0.4.0 64-bit          | m1.tiny   |
  | 726bc888-7767-44bc-b68a-7a1f3a6babf1 | test-1-2        | ACTIVE | vlan105=172.31.10.18                     | Cirros 0.4.0 64-bit          | m1.tiny   |

  $ ping 172.31.10.4
  PING 172.31.10.4 (172.31.10.4): 56 data bytes
  ......
  64 bytes from 172.31.10.4: seq=59 ttl=64 time=0.465 ms
  64 bytes from 172.31.10.4: seq=60 ttl=64 time=0.510 ms <--------
  64 bytes from 172.31.10.4: seq=61 ttl=64 time=0.446 ms
  64 bytes from 172.31.10.4: seq=63 ttl=64 time=0.744 ms
  64 bytes from 172.31.10.4: seq=64 ttl=64 time=0.477 ms
  64 bytes from 172.31.10.4: seq=65 ttl=64 time=0.441 ms
  64 bytes from 172.31.10.4: seq=66 ttl=64 time=0.376 ms
  64 bytes from 172.31.10.4: seq=67 ttl=64 time=0.481 ms

  As one can see, packet seq 62 is lost, I believe, during rebooting ovs
  agent.

  Right now, I am suspecting
  https://github.com/openstack/neutron/blob/6d619ea7c13e89ec575295f04c63ae316759c50a/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ofswitch.py#L229
  this code is refreshing flow table rules even though it is not
  necessary.

  Because when I dump flows on phys bridge, I can see duration is
  rewinding to 0 which suggests flow has been deleted and created again

  """       duration=secs
                The  time,  in  seconds,  that  the entry has been in the table.
                secs includes as much precision as the switch provides, possibly
                to nanosecond resolution.
  """

  root@compute01:~# ovs-ofctl dump-flows br-floating
  ...
   cookie=0x673522f560f5ca4f, duration=323.852s, table=2, n_packets=1100, n_bytes=103409, 
                              ^------ this value resets
  priority=4,in_port="phy-br-floating",dl_vlan=2 actions=mod_vlan_vid:105,NORMAL
  ...

  IMO, rebooting ovs-agent should not affecting data plane.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1869808/+subscriptions


References