yahoo-eng-team team mailing list archive

[Bug 1789878] [NEW] bgpvpn router fallback broken by change in neutron openvswitch firewall

 

Public bug reported:

This issue impacts current master, stable/rocky and stable/queens.

The first symptom is that we have seen failures of many tests from
legacy-tempest-dsvm-networking-bgpvpn-bagpipe since the merge of [1] in
neutron code (August 14th).


Background:

networking-bagpipe code for BGPVPN has a "router fallback" mechanism: in
cases where a network is both connected to a Router and associated with
a BGPVPN, the traffic sent by a VM to its gateway is redirected to
br-mpls to attempt BGPVPN route matching, before eventually being sent,
as a fallback, to the neutron netns router if no VPN route was matched
in br-mpls.

For this mechanism to work, a rule is in place in table 91 to override
the NORMAL action (which would result in flood/learn) for the traffic
destined for the gateway MAC address, with a higher priority rule that
sends the traffic to br-tun instead (br-tun is where the redirection to
br-mpls takes place):

 cookie=0x8b0cf47ac991c941, duration=5371.870s, table=91, n_packets=217, n_bytes=21266, priority=2,reg6=0x18,dl_dst=fa:16:3e:c5:89:72 actions=mod_vlan_vid:24,output:"patch-tun"
 cookie=0x89f8a81c314f2696, duration=71265.896s, table=91, n_packets=338, n_bytes=27094, priority=1 actions=NORMAL

(above, fa:16:3e:c5:89:72 is the gateway MAC address for the network
with vlan_id 24)
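
To illustrate how the priority=2 rule takes precedence over the
priority=1 NORMAL rule, here is a minimal Python sketch. It is a toy
model, not OVS code: it assumes exact string matching on the match
fields (no masks or wildcards), and the helper names parse_flow and
select_flow are hypothetical. The flow text is the two rules quoted
above with cookies and counters trimmed.

```python
import re

# The two table 91 flows quoted above, with cookies and counters trimmed.
FLOWS = """\
table=91, priority=2,reg6=0x18,dl_dst=fa:16:3e:c5:89:72 actions=mod_vlan_vid:24,output:"patch-tun"
table=91, priority=1 actions=NORMAL
"""


def parse_flow(line):
    """Split one flow line into (priority, match dict, actions string)."""
    match_part, actions = line.split(" actions=", 1)
    fields = {}
    for item in re.split(r",\s*", match_part.strip()):
        key, _, value = item.partition("=")
        fields[key] = value
    priority = int(fields.pop("priority"))
    fields.pop("table", None)  # the table number is not a packet match field
    return priority, fields, actions.strip()


def select_flow(flows, packet):
    """Return the actions of the highest-priority flow matching the packet.

    Simplification: a flow matches when every one of its match fields is
    equal, as a string, to the same field in the packet dict.
    """
    candidates = [
        (priority, actions)
        for priority, match, actions in flows
        if all(packet.get(key) == value for key, value in match.items())
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda c: c[0])[1]


flows = [parse_flow(line) for line in FLOWS.splitlines()]

# Traffic to the gateway MAC hits the priority=2 rule and goes to patch-tun.
to_gateway = {"reg6": "0x18", "dl_dst": "fa:16:3e:c5:89:72"}
print(select_flow(flows, to_gateway))

# Any other destination (hypothetical MAC) falls through to NORMAL.
to_other = {"reg6": "0x18", "dl_dst": "fa:16:3e:00:00:01"}
print(select_flow(flows, to_other))
```

The override only works for packets that actually traverse table 91,
which is the assumption the rest of this report is about.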


Analysis of the issue:

Change [1] replaced some rules that were resubmitting to table 91 with a
NORMAL action, with the result that only the first packets (from a
conntrack standpoint) reach table 91.

This prevents the redirection of traffic to br-tun and then br-mpls.

The tricky thing is that the issue does not always occur: when there is
no entry in the MAC learning table (ovs-appctl fdb/show br-int) for the
gateway MAC, the traffic is flooded and eventually reaches br-tun and
br-mpls. This explains why some tests, but not all tests, fail.
(Note also that the tests where no Router is used in the destination
network do not seem to fail.)
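
When reproducing this, it can help to check whether the gateway MAC is
currently present in the learning table, since that decides whether the
traffic is flooded (and reaches br-tun) or forwarded directly. The
following Python sketch parses the text printed by
"ovs-appctl fdb/show br-int". It assumes the usual four-column layout
(port, VLAN, MAC, Age); the helper name fdb_has_mac is hypothetical, and
the sample text is made up for illustration, not captured from an
affected node.

```python
def fdb_has_mac(fdb_output, mac, vlan=None):
    """Return True if the fdb/show text lists `mac` (optionally on `vlan`)."""
    for line in fdb_output.splitlines():
        cols = line.split()
        # Data rows are assumed to be: port, VLAN, MAC, age.
        # Skip the header row and anything with a different shape.
        if len(cols) != 4 or cols[0] == "port":
            continue
        _port, row_vlan, row_mac, _age = cols
        if row_mac.lower() == mac.lower() and (
            vlan is None or row_vlan == str(vlan)
        ):
            return True
    return False


# Hypothetical sample output, for illustration only.
SAMPLE = """\
 port  VLAN  MAC                Age
    5    24  fa:16:3e:c5:89:72    3
    7    24  fa:16:3e:11:22:33   12
"""

GATEWAY_MAC = "fa:16:3e:c5:89:72"

if fdb_has_mac(SAMPLE, GATEWAY_MAC, vlan=24):
    print("gateway MAC learned: traffic handled by NORMAL may bypass br-tun")
else:
    print("gateway MAC not learned: traffic is flooded and reaches br-tun")
```

In practice the text would come from running the ovs-appctl command on
the compute node; the sketch deliberately takes a string so it can be
tried without an OVS installation.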


[1]
https://review.openstack.org/#/q/Ib6ced838a7ec6d5c459a8475318556001c31bdf

** Affects: networking-bagpipe
     Importance: High
         Status: Confirmed

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1789878

Title:
  bgpvpn router fallback broken by change in neutron openvswitch
  firewall

Status in BaGPipe:
  Confirmed
Status in neutron:
  New


