yahoo-eng-team team mailing list archive

[Bug 2002316] [NEW] ha router - vxlan - incorrect ovs flow ?

 

Public bug reported:

Hi all,

* Summary:

ha router - vxlan - incorrect ovs flow ?

* High level description:

When creating an HA router, there is sometimes a flow on the server
running the active neutron-l3-agent for that router that points to an
l3-agent node that is not hosting the router. As a result, packets are
sent to a host that has no use for them, generating unnecessary
network/CPU processing.
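
For reference, this is roughly how I check which tunnel peers a network
gets flooded to (a sketch only, output omitted; the table number is the
standard neutron-openvswitch-agent layout, as also visible in the trace
below):

# on the node where the router is active, table 22 of br-tun holds the
# BUM/flood flows; the output ports listed there are VXLAN tunnel ports
ovs-ofctl dump-flows br-tun table=22

The tunnel ports listed there can then be compared against the agents
returned by "openstack network agent list --router <router-id>".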

* Pre-conditions:

12 x neutron-l3-agent on dedicated servers
DVR disabled
no DPDK
routers in HA mode with 2 agents (active/standby); the corresponding settings are sketched just below
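
The settings behind these pre-conditions are roughly the following (a
sketch with assumed stock file locations and illustrative values, not
copied verbatim from the nodes):

# /etc/neutron/neutron.conf, [DEFAULT] section (illustrative):
#   l3_ha = true
#   max_l3_agents_per_router = 2
#   router_distributed = false
# quick check on a controller node:
grep -E '^(l3_ha|max_l3_agents_per_router|router_distributed)' /etc/neutron/neutron.conf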


* Step-by-step reproduction steps

Simply create a router:

openstack router create router1

* Actual output

openstack network agent list --router 67d62ea8-e944-4204-b7b7-3b6b85f789b5 --long
+--------------------------------------+------------+------------------------------+-------------------+-------+-------+------------------+----------+
| ID                                   | Agent Type | Host                         | Availability Zone | Alive | State | Binary           | HA State |
+--------------------------------------+------------+------------------------------+-------------------+-------+-------+------------------+----------+
| 0b506db2-dd4c-401d-ba6b-a2b5908408aa | L3 agent   | network-node-7               | nova              | :-)   | UP    | neutron-l3-agent | active   |
| 53d24a34-2226-4fbd-b125-5351777048b4 | L3 agent   | network-node-5               | nova              | :-)   | UP    | neutron-l3-agent | standby  |
+--------------------------------------+------------+------------------------------+-------------------+-------+-------+------------------+----------+

network-node-7>_ ~ # ip netns exec qrouter-67d62ea8-e944-4204-b7b7-3b6b85f789b5 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
235: ha-3172bc80-9b: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:88:7c:33 brd ff:ff:ff:ff:ff:ff
    inet 169.254.194.212/18 brd 169.254.255.255 scope global ha-3172bc80-9b
       valid_lft forever preferred_lft forever
    inet 169.254.0.248/24 scope global ha-3172bc80-9b
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe88:7c33/64 scope link
       valid_lft forever preferred_lft forever

network-node-7>_ ~ # ovs-ofctl dump-ports-desc br-int | grep ha-3172bc80-9b
 227(ha-3172bc80-9b): addr:fa:16:3e:88:7c:33

network-node-7>_ ~ # ovs-appctl ofproto/trace br-int in_port=227,dl_src=fa:16:3e:88:7c:33
Flow: in_port=227,vlan_tci=0x0000,dl_src=fa:16:3e:88:7c:33,dl_dst=00:00:00:00:00:00,dl_type=0x0000

bridge("br-int")
----------------
 0. priority 0, cookie 0x67f75f52fb46d598
    goto_table:60
60. priority 3, cookie 0x67f75f52fb46d598
    NORMAL
     -> no learned MAC for destination, flooding

    bridge("br-ex")
    ---------------
         0. in_port=2, priority 2, cookie 0x43bc653b06d7eccb
            drop

bridge("br-tun")
----------------
 0. in_port=1, priority 1, cookie 0xd40beb3f0a036588
    goto_table:2
 2. dl_dst=00:00:00:00:00:00/01:00:00:00:00:00, priority 0, cookie 0xd40beb3f0a036588
    goto_table:20
20. priority 0, cookie 0xd40beb3f0a036588
    goto_table:22
22. dl_vlan=158, priority 1, cookie 0xd40beb3f0a036588
    pop_vlan
    set_field:0x24be->tun_id
    output:155
     -> output to kernel tunnel
    output:156
     -> output to kernel tunnel

Final flow: unchanged
Megaflow: pkt_mark=0,recirc_id=0,eth,in_port=227,dl_src=fa:16:3e:88:7c:33,dl_dst=00:00:00:00:00:00,dl_type=0x0000
Datapath actions: push_vlan(vid=158,pcp=0),3,set(tunnel(tun_id=0x24be,src=1.2.3.8,dst=1.2.3.12,ttl=64,tp_dst=4789,flags(df|key))),pop_vlan,9,set(tunnel(tun_id=0x24be,src=1.2.3.8,dst=1.2.3.2,ttl=64,tp_dst=4789,flags(df|key))),9
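
For context, tun_id 0x24be is VNI 9406, which should be the
segmentation id of this router's HA network (usually named "HA network
tenant <project-id>"); a sketch of how to cross-check it, assuming
admin credentials:

# compare provider:segmentation_id with the tun_id seen in the trace
openstack network show "HA network tenant <project-id>"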


I don't understand this flow:
1.2.3.8 = active l3-agent
1.2.3.2 = standby l3-agent
1.2.3.12 = another l3-agent host, not part of this router's configuration and never was

Why is 1.2.3.12 listed here?
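
To map the output ports from the trace back to hosts, I use something
like this (sketch, output omitted; 155/156 are the OpenFlow port
numbers shown in table 22 above):

# which interfaces sit behind OpenFlow ports 155 and 156 on br-tun
ovs-ofctl dump-ports-desc br-tun
# and which remote VTEP each VXLAN interface points at (options:remote_ip)
ovs-vsctl --columns=name,options list Interface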

network-node-7>_ ~ # ovs-dpctl dump-flows -m | grep 0x24be
ufid:17870c16-c303-4cb1-b3e7-e164160c1013, recirc_id(0),dp_hash(0/0),skb_priority(0/0),tunnel(tun_id=0x24be,src=1.2.3.12,dst=1.2.3.8,ttl=0/0,flags(-df-csum+key)),in_port(vxlan_sys_4789),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=fa:16:3e:3b:0d:23,dst=01:00:5e:00:00:12),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:424, bytes:22896, used:0.102s, dp:ovs, actions:push_vlan(vid=158,pcp=0),br-int,pop_vlan,ha-3172bc80-9b
ufid:08137467-da7d-4a15-bb5c-08f80bfe03d2, recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(ha-3172bc80-9b),skb_mark(0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=fa:16:3e:88:7c:33,dst=01:00:5e:00:00:12),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0x3,ttl=0/0,frag=no), packets:419, bytes:22626, used:1.702s, dp:ovs, actions:push_vlan(vid=158,pcp=0),br-int,set(tunnel(tun_id=0x24be,src=1.2.3.8,dst=1.2.3.12,ttl=64,tp_dst=4789,flags(df|key))),pop_vlan,vxlan_sys_4789,set(tunnel(tun_id=0x24be,src=1.2.3.8,dst=1.2.3.2,ttl=64,tp_dst=4789,flags(df|key))),vxlan_sys_4789
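
If I read those datapath flows right, the destination MAC
01:00:5e:00:00:12 is the multicast MAC for 224.0.0.18, i.e. this is the
keepalived VRRP traffic on the HA interface. The traffic towards the
unexpected peer can be confirmed on the underlay with something like
(sketch; the NIC name is deployment specific):

# watch VXLAN-encapsulated packets leaving for the peer that should not receive them
tcpdump -ni <underlay-nic> 'udp port 4789 and host 1.2.3.12'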


* Version:

dpkg -l | grep neutron
ii  neutron-common                       2:17.4.1-0+deb11u2~bpo11+1                all          OpenStack virtual network service - common files
ii  neutron-dhcp-agent                   2:17.4.1-0+deb11u2~bpo11+1                all          OpenStack virtual network service - DHCP agent
ii  neutron-dynamic-routing-common       2:17.0.0-2                                all          OpenStack Neutron Dynamic Routing - common files
ii  neutron-l3-agent                     2:17.4.1-0+deb11u2~bpo11+1                all          OpenStack virtual network service - l3 agent
ii  neutron-metadata-agent               2:17.4.1-0+deb11u2~bpo11+1                all          OpenStack virtual network service - metadata agent
ii  neutron-openvswitch-agent            2:17.4.1-0+deb11u2~bpo11+1                all          OpenStack virtual network service - Open vSwitch agent
ii  python3-neutron                      2:17.4.1-0+deb11u2~bpo11+1                all          OpenStack virtual network service - Python library
ii  python3-neutron-dynamic-routing      2:17.0.0-2                                all          OpenStack Neutron Dynamic Routing - Python library
ii  python3-neutron-lib                  2.6.2-1~bpo11+1                           all          Neutron shared routines and utilities - Python 3.x
ii  python3-neutronclient                1:7.2.1-2                                 all          client API library for Neutron - Python 3.x


kernel: 5.10.0-19-amd64

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2002316

Title:
  ha router - vxlan - incorrect ovs flow ?

Status in neutron:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2002316/+subscriptions