yahoo-eng-team team mailing list archive

[Bug 1587831] [NEW] [SNAT][HA]snat traffic broken after restarting network nodes

 

Public bug reported:

After restarting both network nodes (l3 agent_mode=dvr_snat) at the same
time, the snat namespaces on the two nodes can't talk to each other, and
each one promotes itself to active. As a result, there are 2 active snat
namespaces.

Then, once the node that actually carries the SNAT traffic goes down, the
other one does not take over.


[root@zk22-01 ~]# neutron router-list
+--------------------------------------+------+---------------------------------------------------------------------------------------------------------------------------------+-------------+------+
| id                                   | name | external_gateway_info                                                                                                           | distributed | ha   |
+--------------------------------------+------+---------------------------------------------------------------------------------------------------------------------------------+-------------+------+
| c497892b-8ff4-441d-9f4e-43fd30401930 | rt   | {"network_id": "c892d21d-fea9-4d4b-b5f6-276345c7901f", "enable_snat": true, "external_fixed_ips": [{"subnet_id": "129df259-0104 | True        | True |
|                                      |      | -400e-8c76-a4d9250eb9c9", "ip_address": "192.168.122.4"}]}                                                                      |             |      |
+--------------------------------------+------+---------------------------------------------------------------------------------------------------------------------------------+-------------+------+

[root@zk22-01 ~]# neutron l3-agent-list-hosting-router c497892b-8ff4-441d-9f4e-43fd30401930
+--------------------------------------+---------+----------------+-------+----------+
| id                                   | host    | admin_state_up | alive | ha_state |
+--------------------------------------+---------+----------------+-------+----------+
| be5526ce-ad40-46af-9dc8-898cf08ebe9b | zk22-01 | True           | :-)   | active   |
| dcdfc230-c5d1-4dd3-b541-a6abac6531ba | zk22-02 | True           | :-)   | active   |
+--------------------------------------+---------+----------------+-------+----------+
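The double-active state in the table above can also be checked mechanically. The following is only a sketch: it assumes the plain-text table layout printed by the neutron CLI in this report, and the helper name active_hosts is made up for illustration.

```python
# Sketch: list the hosts whose ha_state column reads "active".
# Assumes the ASCII table layout shown in this report; active_hosts is a
# hypothetical helper, not part of neutron or its client.

SAMPLE = """\
+--------------------------------------+---------+----------------+-------+----------+
| id                                   | host    | admin_state_up | alive | ha_state |
+--------------------------------------+---------+----------------+-------+----------+
| be5526ce-ad40-46af-9dc8-898cf08ebe9b | zk22-01 | True           | :-)   | active   |
| dcdfc230-c5d1-4dd3-b541-a6abac6531ba | zk22-02 | True           | :-)   | active   |
+--------------------------------------+---------+----------------+-------+----------+
"""


def active_hosts(table_text):
    """Return the hosts whose ha_state is 'active', in table order."""
    header = None
    hosts = []
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the +---+ border lines and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first data-style row holds the column names
            continue
        row = dict(zip(header, cells))
        if row.get("ha_state") == "active":
            hosts.append(row["host"])
    return hosts


if __name__ == "__main__":
    hosts = active_hosts(SAMPLE)
    if len(hosts) > 1:
        print("more than one active HA instance:", ", ".join(hosts))
```

For a healthy HA router one would expect at most one host in the returned list.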


[root@zk22-01 ~]# ip netns exec snat-c497892b-8ff4-441d-9f4e-43fd30401930 tcpdump -nn -i ha-004331fc-9f
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ha-004331fc-9f, link-type EN10MB (Ethernet), capture size 65535 bytes
18:59:03.574554 IP 169.254.192.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
18:59:05.575500 IP 169.254.192.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
18:59:07.576432 IP 169.254.192.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
18:59:09.577361 IP 169.254.192.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
18:59:11.578293 IP 169.254.192.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
18:59:13.579243 IP 169.254.192.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20

[root@zk22-02 ~]# ip netns exec snat-c497892b-8ff4-441d-9f4e-43fd30401930 tcpdump -nn -i ha-dda33de1-3e
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ha-dda33de1-3e, link-type EN10MB (Ethernet), capture size 65535 bytes
18:59:15.918725 IP 169.254.192.1 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
18:59:17.919038 IP 169.254.192.1 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
18:59:19.920036 IP 169.254.192.1 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
18:59:21.921004 IP 169.254.192.1 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
18:59:23.922007 IP 169.254.192.1 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
18:59:25.923017 IP 169.254.192.1 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
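In the two captures above, each node sees advertisements for vrid 1 from only one source address, and the source differs between the nodes (169.254.192.2 on zk22-01, 169.254.192.1 on zk22-02). The sketch below compares two such captures and reports the vrids whose advertising sources do not overlap. It assumes the tcpdump line format shown in this report; the function names are hypothetical, and captures taken at different times could differ for legitimate reasons such as a failover in between.

```python
# Sketch: compare VRRP advertisement sources seen on two nodes.
# Assumes tcpdump -nn output in the format shown in this report.
import re
from collections import defaultdict

ADVERT_RE = re.compile(
    r"IP (\d+\.\d+\.\d+\.\d+) > 224\.0\.0\.18: VRRPv2, Advertisement, vrid (\d+)"
)

NODE1 = """\
18:59:03.574554 IP 169.254.192.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
18:59:05.575500 IP 169.254.192.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
"""

NODE2 = """\
18:59:15.918725 IP 169.254.192.1 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
18:59:17.919038 IP 169.254.192.1 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
"""


def senders_by_vrid(capture_text):
    """Map each vrid to the set of source IPs advertising it in one capture."""
    senders = defaultdict(set)
    for match in ADVERT_RE.finditer(capture_text):
        senders[int(match.group(2))].add(match.group(1))
    return dict(senders)


def split_brain_vrids(capture_a, capture_b):
    """Return vrids advertised in both captures with no source in common."""
    a = senders_by_vrid(capture_a)
    b = senders_by_vrid(capture_b)
    return sorted(v for v in a.keys() & b.keys() if a[v].isdisjoint(b[v]))


if __name__ == "__main__":
    for vrid in split_brain_vrids(NODE1, NODE2):
        print("vrid", vrid, "has disjoint advertisers on the two nodes")
```

A non-empty result is only a hint that the nodes are not receiving each other's advertisements; it does not by itself say why.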

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1587831
