yahoo-eng-team mailing list archive
Message #29841
[Bug 1179223] Re: Retired GRE and VXLAN tunnels persist in neutron db
** Changed in: neutron
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1179223
Title:
Retired GRE and VXLAN tunnels persist in neutron db
Status in OpenStack Neutron (virtual network service):
Fix Released
Bug description:
The setup is multi-node, with per-tenant routers and GRE or VXLAN tunneling;
both the OVS and ML2 plugins are affected.
SYMPTOM:
VMs are reachable from the external network for about 1-2 minutes,
after which point the connection times out and cannot be re-
established unless traffic is generated from the VM console. VMs with
DHCP-configured interfaces periodically and temporarily come back
online after requesting new leases.
When I attempt to ping from the external network, I can trace the
traffic all the way to the tap interface on the compute node, where
the VM responds to the ARP request sent by the tenant router (which is
on a separate network node). However, this ARP reply never makes it
back to the tenant router. It seems to die at the GRE terminus on
bridge br-tun.
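For reference, this tracing can be reproduced with tcpdump on the compute
node; the commands below are only a hypothetical sketch with placeholder
interface names, not output captured for this report:
# Watch the VM answer the router's ARP request on its tap device
tcpdump -n -e -i <tap-interface> arp
# Watch for the ARP reply leaving the VM-network NIC inside GRE (IP protocol 47)
tcpdump -n -i <vm-network-nic> 'ip proto 47'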
CAUSE:
* My network node has three NICs. VM traffic goes out the 1st NIC on
192.168.239.99/24 to the compute nodes, management traffic goes out
the 2nd NIC on 192.168.241.99, and the 3rd NIC is external and has no
IP address.
* I have four GRE endpoints on the VM network, one at the network node
(192.168.239.99) and three on compute nodes
(192.168.239.{110,114,115}), all with IDs 2-5.
* I have a fifth GRE endpoint, with id 1, pointing at 192.168.241.99, the
network node's management interface, on each of the compute nodes. This
was the first tunnel created when I deployed the network node, because
that is how I had set the remote_ip in the ovs plugin ini. I corrected
the setting later, but the 192.168.241.99 endpoint persists (see the
config sketch after this list):
mysql> select * from ovs_tunnel_endpoints;
+-----------------+----+
| ip_address | id |
+-----------------+----+
| 192.168.239.110 | 3 |
| 192.168.239.114 | 4 |
| 192.168.239.115 | 5 |
| 192.168.239.99 | 2 |
| 192.168.241.99 | 1 | <======== HERE
+-----------------+----+
5 rows in set (0.00 sec)
* Thus, after plugin restarts or reboots, this endpoint is re-created
every time.
* The effect is that traffic from the VM has two possible flows from
which to make a routing/switching decision. I was unable to determine
how this decision is made, but obviously this is not a working
configuration. Traffic that originates from the VM always seems to use
the correct flow initially, but traffic that originates from the
network node is never returned via the right flow unless the
connection has been active within the previous 1-2 minutes. In both
cases, successful connections time out after 1-2 minutes of inactivity.
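For reference, the tunnel endpoint each node registers in
ovs_tunnel_endpoints comes from the local_ip value in the [OVS] section of
the ovs plugin ini. The snippet below is only a sketch of the corrected
network-node setting, assuming the grizzly-era layout of that file, not the
actual configuration from this deployment:
[OVS]
enable_tunneling = True
# local_ip must sit on the VM/data network (the 1st NIC); pointing it at the
# management interface (192.168.241.99) is what registered the stale endpoint
# with id 1.
local_ip = 192.168.239.99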
SOLUTION:
mysql> delete from ovs_tunnel_endpoints where id = 1;
Query OK, 1 row affected (0.00 sec)
mysql> select * from ovs_tunnel_endpoints;
+-----------------+----+
| ip_address | id |
+-----------------+----+
| 192.168.239.110 | 3 |
| 192.168.239.114 | 4 |
| 192.168.239.115 | 5 |
| 192.168.239.99 | 2 |
+-----------------+----+
4 rows in set (0.00 sec)
* After doing that, I simply restarted the Quantum OVS agents on the
network and compute nodes. The old GRE tunnel is not re-created, and
VM traffic to and from the external network thereafter proceeds
without incident.
* I wonder whether these tables should be cleaned up as well (see the
cross-check sketched after the output below):
mysql> select * from ovs_network_bindings;
+--------------------------------------+--------------+------------------+-----------------+
| network_id | network_type | physical_network | segmentation_id |
+--------------------------------------+--------------+------------------+-----------------+
| 4e8aacca-8b38-40ac-a628-18cac3168fe6 | gre | NULL | 2 |
| af224f3f-8de6-4e0d-b043-6bcd5cb014c5 | gre | NULL | 1 |
+--------------------------------------+--------------+------------------+-----------------+
2 rows in set (0.00 sec)
mysql> select * from ovs_tunnel_allocations where allocated != 0;
+-----------+-----------+
| tunnel_id | allocated |
+-----------+-----------+
| 1 | 1 |
| 2 | 1 |
+-----------+-----------+
2 rows in set (0.00 sec)
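Two hypothetical follow-up queries, my own suggestions rather than part of
the original report: the first confirms that no endpoint remains on the
management subnet once the agents have been restarted; the second
cross-checks the tables above, which track tunnel segmentation IDs rather
than endpoint IDs, for allocations that no GRE network binding still
references (an empty result would mean nothing needs cleaning):
mysql> /* hypothetical check: should return the empty set after the fix */
    -> select * from ovs_tunnel_endpoints where ip_address like '192.168.241.%';
mysql> /* list allocated tunnel ids that no gre network binding references */
    -> select a.tunnel_id from ovs_tunnel_allocations a
    -> left join ovs_network_bindings b
    ->   on b.segmentation_id = a.tunnel_id and b.network_type = 'gre'
    -> where a.allocated != 0 and b.network_id is null;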
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1179223/+subscriptions