yahoo-eng-team team mailing list archive

[Bug 1837252] Re: IFLA_BR_AGEING_TIME of 0 causes flooding across bridges

 

Triaging as High, as flooding could lead to network disruption for guests
on multiple hosts.

I have root-caused this to combining the code into a single shared
codepath between the OVS and Linux bridge plugins.

For OVS hybrid plug we set the ageing to 0 to prevent packet loss during
live migration:

https://github.com/openstack/os-vif/commit/fa4ff64b86e6e1b6399f7250eadbee9775c22d32#diff-f55bc78ffb4c10000bbf81b88bf68673

However, this is not valid for Linux bridge in general.

https://github.com/openstack/os-vif/commit/1f6fed6a69e9fd386e421f3cacae97c11cdd7c75#diff-010d1833da7ca175fffc8c41a38497c2

The commit above, which replaced the use of brctl in the Linux bridge
driver, reused the common code I introduced in

https://github.com/openstack/os-vif/commit/5027ce833c6fccaa80b5ddc8544d262c0bf99dbd#diff-cec1a2ac6413663c344b607129c39fab

and as a result it picked up the OVS ageing code, which was not
intentional.

I'll fix this shortly and backport it.
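
As a rough sketch of the direction such a fix could take (this is not the
actual os-vif patch), the shared helper can leave the ageing time alone
unless the caller asks for it explicitly. The function name
bridge_link_args is hypothetical, and the br_ageing_time key is assumed to
be the pyroute2-style keyword corresponding to IFLA_BR_AGEING_TIME.

```python
from typing import Any, Dict, Optional


def bridge_link_args(name: str, ageing: Optional[int] = None) -> Dict[str, Any]:
    """Build keyword arguments for creating a bridge (hypothetical helper).

    The ageing time is only included when the caller passes one, so a
    Linux bridge caller that omits it keeps the kernel default, while
    the OVS hybrid plug path can still request 0 explicitly.
    """
    args: Dict[str, Any] = {"ifname": name, "kind": "bridge"}
    if ageing is not None:
        # Assumed key name; 0 is a valid explicit value, hence the
        # "is not None" check rather than a truthiness check.
        args["br_ageing_time"] = ageing
    return args


# Linux bridge caller: no ageing override is sent.
linux_bridge_args = bridge_link_args("brq-example")
# OVS hybrid plug caller: ageing of 0 is requested explicitly.
ovs_hybrid_args = bridge_link_args("qbr-example", ageing=0)
```

The point of the design is that "not specified" and "0" are kept distinct,
so the OVS-specific behaviour no longer leaks into other callers.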

** Changed in: os-vif
   Importance: Undecided => High

** Changed in: os-vif
       Status: New => Confirmed

** Changed in: os-vif
     Assignee: (unassigned) => sean mooney (sean-k-mooney)

** Changed in: nova
       Status: New => Invalid

** Changed in: neutron
       Status: Incomplete => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1837252

Title:
  IFLA_BR_AGEING_TIME of 0 causes flooding across bridges

Status in neutron:
  Invalid
Status in OpenStack Compute (nova):
  Invalid
Status in os-vif:
  Confirmed

Bug description:
  Release: OpenStack Stein
  Driver: LinuxBridge

  Using Stein w/ the LinuxBridge mech driver/agent, we have found that
  traffic is being flooded across bridges. Using tcpdump inside an
  instance, you can see unicast traffic for other instances.
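
  To illustrate why an ageing time of 0 would produce this symptom, here
  is a toy model of a MAC-learning bridge. It is a deliberate
  simplification and not the kernel's implementation; the class and
  method names are made up for the example. A learned entry is usable
  only while it is younger than the ageing time, so with an ageing time
  of 0 no learned entry is ever usable and every unicast frame is flooded.

```python
from typing import Dict, List, Tuple


class LearningBridge:
    """Toy model of a MAC-learning bridge (illustrative only)."""

    def __init__(self, ports: List[int], ageing_time: float) -> None:
        self.ports = ports
        self.ageing_time = ageing_time
        # mac -> (port, time the entry was last refreshed)
        self.fdb: Dict[str, Tuple[int, float]] = {}

    def forward(self, src: str, dst: str, in_port: int, now: float) -> List[int]:
        """Return the ports a unicast frame is sent out of."""
        # Learn (or refresh) the source address on the ingress port.
        self.fdb[src] = (in_port, now)
        entry = self.fdb.get(dst)
        # An entry is usable only while it is younger than the ageing time.
        if entry is not None and now - entry[1] < self.ageing_time:
            return [entry[0]]
        # Unknown or expired destination: flood to every other port.
        return [p for p in self.ports if p != in_port]


flooding = LearningBridge(ports=[1, 2, 3], ageing_time=0.0)
flooding.forward("aa", "bb", in_port=1, now=0.0)
reply_with_zero = flooding.forward("bb", "aa", in_port=2, now=1.0)

learning = LearningBridge(ports=[1, 2, 3], ageing_time=300.0)
learning.forward("aa", "bb", in_port=1, now=0.0)
reply_with_default = learning.forward("bb", "aa", in_port=2, now=1.0)
```

  In this model the reply frame goes only to port 1 when the ageing time
  is 300, but is flooded to ports 1 and 3 when the ageing time is 0.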

  We have confirmed the MAC table shows the ageing timer set to 0 for
  permanent entries, and the bridge is NOT learning new MACs:

  root@lab-compute01:~# brctl showmacs brqd0084ac0-f7
  port no	mac addr		is local?	ageing timer
    5	24:be:05:a3:1f:e1	yes		   0.00
    5	24:be:05:a3:1f:e1	yes		   0.00
    1	fe:16:3e:02:62:18	yes		   0.00
    1	fe:16:3e:02:62:18	yes		   0.00
    7	fe:16:3e:07:65:47	yes		   0.00
    7	fe:16:3e:07:65:47	yes		   0.00
    4	fe:16:3e:1d:d6:33	yes		   0.00
    4	fe:16:3e:1d:d6:33	yes		   0.00
    9	fe:16:3e:2b:2f:f0	yes		   0.00
    9	fe:16:3e:2b:2f:f0	yes		   0.00
    8	fe:16:3e:3c:42:64	yes		   0.00
    8	fe:16:3e:3c:42:64	yes		   0.00
   10	fe:16:3e:5c:a6:6c	yes		   0.00
   10	fe:16:3e:5c:a6:6c	yes		   0.00
    2	fe:16:3e:86:9c:dd	yes		   0.00
    2	fe:16:3e:86:9c:dd	yes		   0.00
    6	fe:16:3e:91:9b:45	yes		   0.00
    6	fe:16:3e:91:9b:45	yes		   0.00
   11	fe:16:3e:b3:30:00	yes		   0.00
   11	fe:16:3e:b3:30:00	yes		   0.00
    3	fe:16:3e:dc:c3:3e	yes		   0.00
    3	fe:16:3e:dc:c3:3e	yes		   0.00

  root@lab-compute01:~# bridge fdb show | grep brqd0084ac0-f7
  01:00:5e:00:00:01 dev brqd0084ac0-f7 self permanent
  fe:16:3e:02:62:18 dev tap74af38f9-2e master brqd0084ac0-f7 permanent
  fe:16:3e:02:62:18 dev tap74af38f9-2e vlan 1 master brqd0084ac0-f7 permanent
  fe:16:3e:86:9c:dd dev tapb00b3c18-b3 master brqd0084ac0-f7 permanent
  fe:16:3e:86:9c:dd dev tapb00b3c18-b3 vlan 1 master brqd0084ac0-f7 permanent
  fe:16:3e:dc:c3:3e dev tap7284d235-2b master brqd0084ac0-f7 permanent
  fe:16:3e:dc:c3:3e dev tap7284d235-2b vlan 1 master brqd0084ac0-f7 permanent
  fe:16:3e:1d:d6:33 dev tapbeb9441a-99 vlan 1 master brqd0084ac0-f7 permanent
  fe:16:3e:1d:d6:33 dev tapbeb9441a-99 master brqd0084ac0-f7 permanent
  24:be:05:a3:1f:e1 dev eno1.102 vlan 1 master brqd0084ac0-f7 permanent
  24:be:05:a3:1f:e1 dev eno1.102 master brqd0084ac0-f7 permanent
  fe:16:3e:91:9b:45 dev tapc8ad2cec-90 master brqd0084ac0-f7 permanent
  fe:16:3e:91:9b:45 dev tapc8ad2cec-90 vlan 1 master brqd0084ac0-f7 permanent
  fe:16:3e:07:65:47 dev tap86e2c412-24 master brqd0084ac0-f7 permanent
  fe:16:3e:07:65:47 dev tap86e2c412-24 vlan 1 master brqd0084ac0-f7 permanent
  fe:16:3e:3c:42:64 dev tap37bcb70e-9e master brqd0084ac0-f7 permanent
  fe:16:3e:3c:42:64 dev tap37bcb70e-9e vlan 1 master brqd0084ac0-f7 permanent
  fe:16:3e:2b:2f:f0 dev tap40f6be7c-2d vlan 1 master brqd0084ac0-f7 permanent
  fe:16:3e:2b:2f:f0 dev tap40f6be7c-2d master brqd0084ac0-f7 permanent
  fe:16:3e:b3:30:00 dev tap6548bacb-c0 vlan 1 master brqd0084ac0-f7 permanent
  fe:16:3e:b3:30:00 dev tap6548bacb-c0 master brqd0084ac0-f7 permanent
  fe:16:3e:5c:a6:6c dev tap61107236-1e vlan 1 master brqd0084ac0-f7 permanent
  fe:16:3e:5c:a6:6c dev tap61107236-1e master brqd0084ac0-f7 permanent

  The ageing time for the bridge is set to 0:

  root@lab-compute01:~# brctl showstp brqd0084ac0-f7
  brqd0084ac0-f7
   bridge id		8000.24be05a31fe1
   designated root	8000.24be05a31fe1
   root port		   0			path cost		   0
   max age		  20.00			bridge max age		  20.00
   hello time		   2.00			bridge hello time	   2.00
   forward delay		   0.00			bridge forward delay	   0.00
   ageing time		   0.00
   hello timer		   0.00			tcn timer		   0.00
   topology change timer	   0.00			gc timer		   0.00
   flags
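
  A check like the one above could be scripted. The following is a
  minimal sketch that extracts the ageing time from text in the
  brctl showstp format shown here; the function name parse_ageing_time
  is hypothetical, and the parser assumes the "ageing time" line looks
  like the one in this report.

```python
import re
from typing import Optional


def parse_ageing_time(showstp_output: str) -> Optional[float]:
    """Return the bridge ageing time in seconds, or None if not found."""
    match = re.search(
        r"^\s*ageing time\s+([0-9]+(?:\.[0-9]+)?)",
        showstp_output,
        re.MULTILINE,
    )
    return float(match.group(1)) if match else None


# Excerpt in the same layout as the output in this report.
sample = (
    "brqd0084ac0-f7\n"
    " forward delay\t\t   0.00\t\t\tbridge forward delay\t   0.00\n"
    " ageing time\t\t   0.00\n"
    " hello timer\t\t   0.00\t\t\ttcn timer\t\t   0.00\n"
)
ageing = parse_ageing_time(sample)
if ageing == 0:
    print("ageing time is 0: learned MAC entries will not be retained")
```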

  The default ageing time of 300 is being overridden by the value set
  here:

  Stein: https://github.com/openstack/os-vif/blob/stable/stein/os_vif/internal/command/ip/linux/impl_pyroute2.py#L89

  Master: https://github.com/openstack/os-vif/blob/master/os_vif/internal/ip/linux/impl_pyroute2.py#L89

  I am not sure of the behavior in OVS environments using the iptables
  firewall, but I have confirmed the 'qbr' bridges also have an ageing
  time of 0 (formerly 300).

  Please let me know if you have any questions.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1837252/+subscriptions

