yahoo-eng-team mailing list archive, Message #83145
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
Reviewed: https://review.opendev.org/733568
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=90212b12cdf62e92d811997ebba699cab431d696
Submitter: Zuul
Branch: master
commit 90212b12cdf62e92d811997ebba699cab431d696
Author: shenjiatong <yshxxsjt715@xxxxxxxxx>
Date: Thu Jun 18 15:33:13 2020 +0800
Do not block connection between br-int and br-phys on startup
Blocking traffic between br-int and br-physical is overkill
and will at least:
1. interrupt vlan flow during startup, particularly
if dvr is enabled
2. if, say, rabbitmq is not stable, possibly affect the
data plane so that vlan never works.
Running openstack on k8s particularly amplifies the problem
because a pod can be killed pretty easily by liveness
probes.
Change-Id: I51050c600ba7090fea71213687d94340bac0674a
Closes-Bug: #1869808
** Changed in: neutron
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1869808
Title:
reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
Status in neutron:
Fix Released
Bug description:
We are using OpenStack Neutron 13.0.6, deployed using
OpenStack-Helm.
I pinged servers in the same vlan while rebooting
neutron-ovs-agent. The result shows:
root@mgt01:~# openstack server list
+--------------------------------------+-----------------+--------+------------------------------------------+------------------------------+-----------+
| ID | Name | Status | Networks | Image | Flavor |
+--------------------------------------+-----------------+--------+------------------------------------------+------------------------------+-----------+
| 22d55077-b1b5-452e-8eba-cbcd2d1514a8 | test-1-1 | ACTIVE | vlan105=172.31.10.4 | Cirros 0.4.0 64-bit | m1.tiny |
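| 726bc888-7767-44bc-b68a-7a1f3a6babf1 | test-1-2 | ACTIVE | vlan105=172.31.10.18 | Cirros 0.4.0 64-bit | m1.tiny |
+--------------------------------------+-----------------+--------+------------------------------------------+------------------------------+-----------+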
$ ping 172.31.10.4
PING 172.31.10.4 (172.31.10.4): 56 data bytes
......
64 bytes from 172.31.10.4: seq=59 ttl=64 time=0.465 ms
64 bytes from 172.31.10.4: seq=60 ttl=64 time=0.510 ms <--------
64 bytes from 172.31.10.4: seq=61 ttl=64 time=0.446 ms
64 bytes from 172.31.10.4: seq=63 ttl=64 time=0.744 ms
64 bytes from 172.31.10.4: seq=64 ttl=64 time=0.477 ms
64 bytes from 172.31.10.4: seq=65 ttl=64 time=0.441 ms
64 bytes from 172.31.10.4: seq=66 ttl=64 time=0.376 ms
64 bytes from 172.31.10.4: seq=67 ttl=64 time=0.481 ms
As one can see, the packet with seq 62 is lost, I believe while the
ovs agent was rebooting.
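For longer runs, spotting a dropped reply by eye gets tedious. A minimal Python sketch that lists missing sequence numbers from captured ping output is shown here; the function name `missing_sequences` is hypothetical, and the pattern assumes the `seq=N` (BusyBox) or `icmp_seq=N` (iputils) reply format.

```python
import re
from typing import List

# Matches both "seq=62" (BusyBox ping) and "icmp_seq=62" (iputils ping).
SEQ_RE = re.compile(r"(?:icmp_)?seq=(\d+)")


def missing_sequences(ping_output: str) -> List[int]:
    """Return sequence numbers absent between the first and last reply seen."""
    seen = [int(m.group(1)) for m in SEQ_RE.finditer(ping_output)]
    if not seen:
        return []
    present = set(seen)
    return [n for n in range(min(seen), max(seen) + 1) if n not in present]


# Reply lines copied from the ping output in this report.
sample = """\
64 bytes from 172.31.10.4: seq=59 ttl=64 time=0.465 ms
64 bytes from 172.31.10.4: seq=60 ttl=64 time=0.510 ms
64 bytes from 172.31.10.4: seq=61 ttl=64 time=0.446 ms
64 bytes from 172.31.10.4: seq=63 ttl=64 time=0.744 ms
64 bytes from 172.31.10.4: seq=64 ttl=64 time=0.477 ms
"""

print(missing_sequences(sample))
```

Only gaps between the first and last observed reply are reported, so replies lost before the capture starts or after it ends are not counted.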
Right now, I suspect that this code
https://github.com/openstack/neutron/blob/6d619ea7c13e89ec575295f04c63ae316759c50a/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ofswitch.py#L229
is refreshing flow table rules even though it is not
necessary.
When I dump flows on the phys bridge, I can see the duration
rewinding to 0, which suggests the flow has been deleted and created again:
""" duration=secs
The time, in seconds, that the entry has been in the table.
secs includes as much precision as the switch provides, possibly
to nanosecond resolution.
"""
root@compute01:~# ovs-ofctl dump-flows br-floating
...
cookie=0x673522f560f5ca4f, duration=323.852s, table=2, n_packets=1100, n_bytes=103409,
^------ this value resets
priority=4,in_port="phy-br-floating",dl_vlan=2 actions=mod_vlan_vid:105,NORMAL
...
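The duration check can be automated by comparing two saved `ovs-ofctl dump-flows` outputs, one taken before and one after the agent restart. The following is a rough Python sketch, not part of Neutron or Open vSwitch: `parse_flows` and `reinstalled` are hypothetical helpers, each flow is assumed to be on a single line, and the token just before ` actions=` is assumed to be the priority/match field. The "after" duration in the sample is made up for illustration.

```python
import re
from typing import Dict, List, Tuple

DURATION_RE = re.compile(r"duration=([\d.]+)s")
TABLE_RE = re.compile(r"table=(\d+)")

FlowKey = Tuple[int, str]  # (table number, priority/match string)


def parse_flows(dump: str) -> Dict[FlowKey, float]:
    """Map (table, match) to the flow's duration in seconds."""
    flows: Dict[FlowKey, float] = {}
    for line in dump.splitlines():
        head, sep, _actions = line.partition(" actions=")
        duration = DURATION_RE.search(head)
        table = TABLE_RE.search(head)
        if not (sep and duration and table):
            continue  # skip headers, "..." and other non-flow lines
        # Assumption: the last token before " actions=" is the match field.
        match = head.split()[-1]
        flows[(int(table.group(1)), match)] = float(duration.group(1))
    return flows


def reinstalled(before: Dict[FlowKey, float],
                after: Dict[FlowKey, float]) -> List[FlowKey]:
    """Flows present in both dumps whose duration went backwards."""
    return sorted(k for k in before if k in after and after[k] < before[k])


before_dump = (
    'cookie=0x673522f560f5ca4f, duration=323.852s, table=2, n_packets=1100, '
    'n_bytes=103409, priority=4,in_port="phy-br-floating",dl_vlan=2 '
    'actions=mod_vlan_vid:105,NORMAL'
)
# Hypothetical second dump taken shortly after the agent restart.
after_dump = before_dump.replace("duration=323.852s", "duration=1.204s")

for table, match in reinstalled(parse_flows(before_dump), parse_flows(after_dump)):
    print(f"table {table}: {match} was deleted and re-created")
```

A flow whose duration decreases between the two dumps must have been removed and re-added in between, which is the symptom described above.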
IMO, rebooting ovs-agent should not affect the data plane.