yahoo-eng-team team mailing list archive
Message #73952
[Bug 1783470] [NEW] get_subnet_for_dvr returns SNAT mac instead of gateway in subnet_info
Public bug reported:
On our dvr_snat host, "install_dvr_to_src_mac" installs its rules in
br-int with the SNAT MAC instead of the DVR MAC address (the subnet's
gateway, i.e. the network:router_interface_distributed port). For
example, the subnet's gateway is 172.16.0.1, with MAC fa:16:3e:42:a2:ec.
On most hosts, we see the following rules in br-int:
[root@stan ~]# ovs-ofctl dump-flows br-int | grep fa:16:3e:42:a2:ec
cookie=0x77f69fee58f51737, duration=11872.801s, table=2, n_packets=0, n_bytes=0, idle_age=65534, priority=4,dl_vlan=795,dl_dst=fa:16:3e:22:eb:8b actions=mod_dl_src:fa:16:3e:42:a2:ec,resubmit(,60)
cookie=0x77f69fee58f51737, duration=11872.790s, table=2, n_packets=0, n_bytes=0, idle_age=65534, priority=4,dl_vlan=795,dl_dst=fa:16:3e:cd:71:e1 actions=mod_dl_src:fa:16:3e:42:a2:ec,resubmit(,60)
cookie=0x77f69fee58f51737, duration=11865.953s, table=2, n_packets=0, n_bytes=0, idle_age=65534, priority=4,dl_vlan=795,dl_dst=fa:16:3e:20:77:00 actions=mod_dl_src:fa:16:3e:42:a2:ec,resubmit(,60)
cookie=0x77f69fee58f51737, duration=11865.933s, table=2, n_packets=0, n_bytes=0, idle_age=65534, priority=4,dl_vlan=795,dl_dst=fa:16:3e:ab:2d:1a actions=mod_dl_src:fa:16:3e:42:a2:ec,resubmit(,60)
cookie=0x77f69fee58f51737, duration=11860.735s, table=2, n_packets=0, n_bytes=0, idle_age=65534, priority=4,dl_vlan=795,dl_dst=fa:16:3e:76:e9:ae actions=mod_dl_src:fa:16:3e:42:a2:ec,resubmit(,60)
cookie=0x77f69fee58f51737, duration=11859.335s, table=2, n_packets=0, n_bytes=0, idle_age=65534, priority=4,dl_vlan=795,dl_dst=fa:16:3e:cb:48:27 actions=mod_dl_src:fa:16:3e:42:a2:ec,resubmit(,60)
However, on our dvr_snat host, no rules rewrite dl_src to this MAC. Instead, the rules are installed with the MAC of the network:router_centralized_snat port:
root@krusty:~# ovs-ofctl dump-flows br-int | grep fa:16:3e:84:0b:42
cookie=0xbb5ebbfa2dfadb74, duration=5351.368s, table=2, n_packets=2976001, n_bytes=362273213, idle_age=65534, priority=4,dl_vlan=795,dl_dst=fa:16:3e:84:0b:42 actions=mod_dl_src:fa:16:3e:84:0b:42,resubmit(,60)
cookie=0xbb5ebbfa2dfadb74, duration=5195.362s, table=2, n_packets=0, n_bytes=0, idle_age=65534, priority=4,dl_vlan=795,dl_dst=fa:16:3e:86:91:e2 actions=mod_dl_src:fa:16:3e:84:0b:42,resubmit(,60)
cookie=0xbb5ebbfa2dfadb74, duration=5195.349s, table=2, n_packets=0, n_bytes=0, idle_age=65534, priority=4,dl_vlan=795,dl_dst=fa:16:3e:a2:04:d3 actions=mod_dl_src:fa:16:3e:84:0b:42,resubmit(,60)
cookie=0xbb5ebbfa2dfadb74, duration=5195.336s, table=2, n_packets=0, n_bytes=0, idle_age=65534, priority=4,dl_vlan=795,dl_dst=fa:16:3e:82:ef:3b actions=mod_dl_src:fa:16:3e:84:0b:42,resubmit(,60)
cookie=0xbb5ebbfa2dfadb74, duration=5195.325s, table=2, n_packets=24, n_bytes=2044, idle_age=65534, priority=4,dl_vlan=795,dl_dst=fa:16:3e:e4:d9:f3 actions=mod_dl_src:fa:16:3e:84:0b:42,resubmit(,60)
cookie=0xbb5ebbfa2dfadb74, duration=5195.272s, table=2, n_packets=0, n_bytes=0, idle_age=65534, priority=4,dl_vlan=795,dl_dst=fa:16:3e:b9:a0:fe actions=mod_dl_src:fa:16:3e:84:0b:42,resubmit(,60)
cookie=0xbb5ebbfa2dfadb74, duration=5194.118s, table=2, n_packets=0, n_bytes=0, idle_age=65534, priority=4,dl_vlan=795,dl_dst=fa:16:3e:1a:42:fa actions=mod_dl_src:fa:16:3e:84:0b:42,resubmit(,60)
cookie=0xbb5ebbfa2dfadb74, duration=5194.098s, table=2, n_packets=56, n_bytes=4792, idle_age=65534, priority=4,dl_vlan=795,dl_dst=fa:16:3e:84:33:df actions=mod_dl_src:fa:16:3e:84:0b:42,resubmit(,60)
cookie=0xbb5ebbfa2dfadb74, duration=5193.995s, table=2, n_packets=0, n_bytes=0, idle_age=65534, priority=4,dl_vlan=795,dl_dst=fa:16:3e:34:e1:92 actions=mod_dl_src:fa:16:3e:84:0b:42,resubmit(,60)
cookie=0xbb5ebbfa2dfadb74, duration=5193.509s, table=2, n_packets=0, n_bytes=0, idle_age=65534, priority=4,dl_vlan=795,dl_dst=fa:16:3e:6d:3e:f3 actions=mod_dl_src:fa:16:3e:84:0b:42,resubmit(,60)
cookie=0xbb5ebbfa2dfadb74, duration=5191.408s, table=2, n_packets=0, n_bytes=0, idle_age=65534, priority=4,dl_vlan=795,dl_dst=fa:16:3e:30:97:8f actions=mod_dl_src:fa:16:3e:84:0b:42,resubmit(,60)
cookie=0xbb5ebbfa2dfadb74, duration=5188.895s, table=2, n_packets=0, n_bytes=0, idle_age=65534, priority=4,dl_vlan=795,dl_dst=fa:16:3e:57:e5:ad actions=mod_dl_src:fa:16:3e:84:0b:42,resubmit(,60)
cookie=0xbb5ebbfa2dfadb74, duration=5351.361s, table=60, n_packets=0, n_bytes=0, idle_age=65534, priority=4,dl_vlan=795,dl_dst=fa:16:3e:84:0b:42 actions=strip_vlan,output:951
root@krusty:~#
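The comparison above can be automated. The following is a minimal sketch, assuming the text output of `ovs-ofctl dump-flows br-int` has already been captured into a string. The helper names `rewrite_macs` and `unexpected_macs` are hypothetical and not part of Neutron or Open vSwitch. The sample lines are copied from the dumps in this report.

```python
import re
from collections import Counter

# Matches the source-MAC rewrite action of a table 2 flow, for example
# "table=2, ... actions=mod_dl_src:fa:16:3e:42:a2:ec,resubmit(,60)".
MOD_DL_SRC = re.compile(
    r"\btable=2,.*actions=mod_dl_src:((?:[0-9a-f]{2}:){5}[0-9a-f]{2})"
)


def rewrite_macs(dump_flows_output):
    """Count the MACs that table 2 flows rewrite dl_src to."""
    counts = Counter()
    for line in dump_flows_output.splitlines():
        match = MOD_DL_SRC.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def unexpected_macs(dump_flows_output, gateway_mac):
    """Return the rewrite MACs that differ from the expected gateway MAC."""
    found = rewrite_macs(dump_flows_output)
    return sorted(mac for mac in found if mac != gateway_mac)


# Two flows taken from the dvr_snat host dump in this report.
SAMPLE = """\
cookie=0xbb5ebbfa2dfadb74, duration=5195.362s, table=2, n_packets=0, n_bytes=0, idle_age=65534, priority=4,dl_vlan=795,dl_dst=fa:16:3e:86:91:e2 actions=mod_dl_src:fa:16:3e:84:0b:42,resubmit(,60)
cookie=0xbb5ebbfa2dfadb74, duration=5351.361s, table=60, n_packets=0, n_bytes=0, idle_age=65534, priority=4,dl_vlan=795,dl_dst=fa:16:3e:84:0b:42 actions=strip_vlan,output:951
"""

if __name__ == "__main__":
    # The expected gateway MAC for this subnet is fa:16:3e:42:a2:ec.
    print(unexpected_macs(SAMPLE, "fa:16:3e:42:a2:ec"))
```

Any MAC this reports is one that table 2 uses as the rewritten source address in place of the subnet gateway's MAC.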
I have traced this to the get_subnet_for_dvr call: the gateway_mac
returned in subnet_info is incorrect. When the OVS agent restarts, its
dvr_local_map is initially empty, so the agent calls
get_subnet_for_dvr to populate the local subnet info map. On good
hosts, it queries with fixed_ip = subnet gateway (172.16.0.1). On
the snat host, it first queries with fixed_ip = 172.16.0.3.
Either that query is incorrect, or, even when querying with the SNAT port,
the gateway_mac in the subnet info should be the DVR MAC, not the SNAT MAC:
Good host:
root@barney:~# cat ovs.log | grep get_subnet_for_dvr | grep "172.16"
2018-07-24 19:42:24.454 15840 DEBUG neutron.api.rpc.handlers.dvr_rpc [req-5ece05d6-f2cd-46a4-b81e-7e579e61990b - - - - -] neutron.api.rpc.handlers.dvr_rpc.DVRServerRpcApi method get_subnet_for_dvr called with arguments (<neutron_lib.context.ContextBase object at 0x7f52f1983150>, '3707b250-b6f5-4701-9b17-01a8f288c17a') {'fixed_ips': [{'subnet_id': '3707b250-b6f5-4701-9b17-01a8f288c17a', 'ip_address': '172.16.0.1'}]} wrapper /opt/pf9/pf9-neutron/lib/python2.7/site-packages/oslo_log/helpers.py:66
2018-07-24 19:42:24.820 15840 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.ovs_dvr_neutron_agent [req-5ece05d6-f2cd-46a4-b81e-7e579e61990b - - - - -] get_subnet_for_dvr for subnet 3707b250-b6f5-4701-9b17-01a8f288c17a returned with {u'shared': True, u'service_types': [], u'description': None, u'enable_dhcp': True, u'tags': [], u'network_id': u'3f6ec232-7649-4639-b828-c3af9960481b', u'tenant_id': u'f175f441ebbb4c2b8fedf6469d6415fc', u'dns_nameservers': [u'10.1.10.19', u'8.8.8.8', u'8.8.4.4'], u'gateway_ip': u'172.16.0.1', u'ipv6_ra_mode': None, u'allocation_pools': [{u'start': u'172.16.0.2', u'end': u'172.16.255.254'}], u'host_routes': [], u'revision_number': 2, u'ipv6_address_mode': None, u'ip_version': 4, u'gateway_mac': u'fa:16:3e:42:a2:ec', u'cidr': u'172.16.0.0/16', u'project_id': u'f175f441ebbb4c2b8fedf6469d6415fc', u'id': u'3707b250-b6f5-4701-9b17-01a8f288c17a', u'subnetpool_id': None, u'name': u'172.16.0.0/16'} _bind_distributed_router_interface_port /opt/pf9/pf9-neutron/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_dvr_neutron_agent.py:371
2018-07-24 19:42:25.686 15840 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.ovs_dvr_neutron_agent [req-5ece05d6-f2cd-46a4-b81e-7e579e61990b - - - - -] get_subnet_for_dvr for subnet 98d2750d-60ce-4b53-88ef-423b77d5f5f5 returned with {u'shared': True, u'service_types': [], u'description': None, u'enable_dhcp': True, u'tags': [], u'network_id': u'655c3eb4-b9f5-4e30-92de-2262d6e87c92', u'tenant_id': u'f175f441ebbb4c2b8fedf6469d6415fc', u'dns_nameservers': [], u'gateway_ip': u'10.100.0.1', u'ipv6_ra_mode': None, u'allocation_pools': [{u'start': u'10.100.0.2', u'end': u'10.100.255.254'}], u'host_routes': [{u'destination': u'0.0.0.0/0', u'nexthop': u'172.16.0.1'}], u'revision_number': 0, u'ipv6_address_mode': None, u'ip_version': 4, u'gateway_mac': u'fa:16:3e:13:61:98', u'cidr': u'10.100.0.0/16', u'project_id': u'f175f441ebbb4c2b8fedf6469d6415fc', u'id': u'98d2750d-60ce-4b53-88ef-423b77d5f5f5', u'subnetpool_id': None, u'name': u'dogfood-vxlan-8000-sub'} _bind_distributed_router_interface_port /opt/pf9/pf9-neutron/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_dvr_neutron_agent.py:371
Bad host:
root@krusty:~# cat ovs.log | grep get_subnet_for_dvr | grep "172.16"
2018-07-24 19:44:44.135 31138 DEBUG neutron.api.rpc.handlers.dvr_rpc [req-6d269f17-c49c-4f64-93f8-139639020c5d - - - - -] neutron.api.rpc.handlers.dvr_rpc.DVRServerRpcApi method get_subnet_for_dvr called with arguments (<neutron_lib.context.ContextBase object at 0x7f1c09d3b410>, '3707b250-b6f5-4701-9b17-01a8f288c17a') {'fixed_ips': [{'subnet_id': '3707b250-b6f5-4701-9b17-01a8f288c17a', 'ip_address': '172.16.0.3'}]} wrapper /opt/pf9/pf9-neutron/lib/python2.7/site-packages/oslo_log/helpers.py:66
2018-07-24 19:44:44.369 31138 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.ovs_dvr_neutron_agent [req-6d269f17-c49c-4f64-93f8-139639020c5d - - - - -] get_subnet_for_dvr for subnet 3707b250-b6f5-4701-9b17-01a8f288c17a returned with {u'shared': True, u'service_types': [], u'description': None, u'enable_dhcp': True, u'tags': [], u'network_id': u'3f6ec232-7649-4639-b828-c3af9960481b', u'tenant_id': u'f175f441ebbb4c2b8fedf6469d6415fc', u'dns_nameservers': [u'10.1.10.19', u'8.8.8.8', u'8.8.4.4'], u'gateway_ip': u'172.16.0.1', u'ipv6_ra_mode': None, u'allocation_pools': [{u'start': u'172.16.0.2', u'end': u'172.16.255.254'}], u'host_routes': [], u'revision_number': 2, u'ipv6_address_mode': None, u'ip_version': 4, u'gateway_mac': u'fa:16:3e:84:0b:42', u'cidr': u'172.16.0.0/16', u'project_id': u'f175f441ebbb4c2b8fedf6469d6415fc', u'id': u'3707b250-b6f5-4701-9b17-01a8f288c17a', u'subnetpool_id': None, u'name': u'172.16.0.0/16'} _bind_centralized_snat_port_on_dvr_subnet /opt/pf9/pf9-neutron/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_dvr_neutron_agent.py:553
2018-07-24 19:44:51.786 31138 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.ovs_dvr_neutron_agent [req-6d269f17-c49c-4f64-93f8-139639020c5d - - - - -] get_subnet_for_dvr for subnet 98d2750d-60ce-4b53-88ef-423b77d5f5f5 returned with {u'shared': True, u'service_types': [], u'description': None, u'enable_dhcp': True, u'tags': [], u'network_id': u'655c3eb4-b9f5-4e30-92de-2262d6e87c92', u'tenant_id': u'f175f441ebbb4c2b8fedf6469d6415fc', u'dns_nameservers': [], u'gateway_ip': u'10.100.0.1', u'ipv6_ra_mode': None, u'allocation_pools': [{u'start': u'10.100.0.2', u'end': u'10.100.255.254'}], u'host_routes': [{u'destination': u'0.0.0.0/0', u'nexthop': u'172.16.0.1'}], u'revision_number': 0, u'ipv6_address_mode': None, u'ip_version': 4, u'gateway_mac': u'fa:16:3e:b1:bd:33', u'cidr': u'10.100.0.0/16', u'project_id': u'f175f441ebbb4c2b8fedf6469d6415fc', u'id': u'98d2750d-60ce-4b53-88ef-423b77d5f5f5', u'subnetpool_id': None, u'name': u'dogfood-vxlan-8000-sub'} _bind_centralized_snat_port_on_dvr_subnet /opt/pf9/pf9-neutron/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_dvr_neutron_agent.py:553
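The behaviour these logs suggest can be modelled in a few lines. This is a toy sketch, not Neutron's implementation: the port table, `subnet_info_by_fixed_ip` and `subnet_info_by_gateway` are hypothetical names invented for illustration. The assignment of 172.16.0.3 to the SNAT port is inferred from the log output above. The first function mirrors what the logs show (gateway_mac follows whichever fixed IP was queried). The second shows the alternative proposed in this report (always resolve the MAC of the subnet's gateway_ip).

```python
# Toy port table built from the addresses in this report.
# These are plain dicts, not real Neutron objects.
PORTS = [
    {"device_owner": "network:router_interface_distributed",
     "ip_address": "172.16.0.1", "mac_address": "fa:16:3e:42:a2:ec"},
    {"device_owner": "network:router_centralized_snat",
     "ip_address": "172.16.0.3", "mac_address": "fa:16:3e:84:0b:42"},
]

SUBNET = {"id": "3707b250-b6f5-4701-9b17-01a8f288c17a",
          "gateway_ip": "172.16.0.1"}


def subnet_info_by_fixed_ip(subnet, ports, fixed_ips=None):
    """Model of the observed behaviour: gateway_mac follows the queried IP."""
    if fixed_ips:
        ip_address = fixed_ips[0]["ip_address"]
    else:
        ip_address = subnet["gateway_ip"]
    port = next(p for p in ports if p["ip_address"] == ip_address)
    return dict(subnet, gateway_mac=port["mac_address"])


def subnet_info_by_gateway(subnet, ports, fixed_ips=None):
    """Proposed alternative: ignore fixed_ips and use the gateway_ip port."""
    port = next(p for p in ports if p["ip_address"] == subnet["gateway_ip"])
    return dict(subnet, gateway_mac=port["mac_address"])


if __name__ == "__main__":
    snat_query = [{"subnet_id": SUBNET["id"], "ip_address": "172.16.0.3"}]
    print(subnet_info_by_fixed_ip(SUBNET, PORTS, snat_query)["gateway_mac"])
    print(subnet_info_by_gateway(SUBNET, PORTS, snat_query)["gateway_mac"])
```

With the SNAT port's IP as input, the first function yields the SNAT MAC and the second yields the gateway MAC, which matches the difference between the bad and good hosts.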
This causes a whole slew of problems: packets are sent into the network
infrastructure with a src MAC of the local DVR MAC, causing this MAC to
flap on remote hosts' br-int between patch cable and qr interface. If we
shut the snat host's interfaces or bring the host down, the DVR MAC
stops flapping on br-int on other hosts, and network connectivity is
restored.
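To illustrate what "flapping" means here, the sketch below models a MAC learning table and counts how often a source MAC is relearned on a different port. It is a generic toy model, not Open vSwitch code. The function name, the port labels "qr" and "patch", and the choice of MAC in the example are assumptions made for illustration only. The report does not include output showing the flapping itself.

```python
def count_mac_moves(frames):
    """Count how often each source MAC is relearned on a different port.

    frames is an iterable of (src_mac, in_port) pairs in arrival order.
    """
    learned = {}  # src_mac -> port it was last seen on
    moves = {}    # src_mac -> number of times its port changed
    for src_mac, in_port in frames:
        previous = learned.get(src_mac)
        if previous is not None and previous != in_port:
            moves[src_mac] = moves.get(src_mac, 0) + 1
        learned[src_mac] = in_port
    return moves


if __name__ == "__main__":
    mac = "fa:16:3e:42:a2:ec"  # illustrative choice, see the note above
    # The same source MAC arrives alternately on the local qr interface
    # and on the patch port leading to the rest of the network.
    frames = [(mac, "qr"), (mac, "patch"), (mac, "qr"), (mac, "patch")]
    print(count_mac_moves(frames))
```

A MAC that is only ever seen on one port produces no entry in the result, which corresponds to the state after the snat host is taken down.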
** Affects: neutron
Importance: Undecided
Status: New
** Tags: l3-dvr-backlog ovs
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1783470