yahoo-eng-team team mailing list archive
Message #31329
[Bug 1438969] Re: Newly created DVR router as a result of new VM does not get ARP neighbors update, new VM has no connectivity
** Changed in: neutron
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1438969
Title:
Newly created DVR router as a result of new VM does not get ARP
neighbors update, new VM has no connectivity
Status in OpenStack Neutron (virtual network service):
Fix Released
Bug description:
Create a DVR router, connect it to subnet 'red'. Create a new VM
connected to said subnet on a hypervisor that is currently not hosting
any ports on the subnet. Before creation of the VM, the DVR qrouter
namespace does not exist. After the creation of the VM, it does.
However, observing
neutron/db/l3_dvrscheduler_db._notify_l3_agent_new_port, we can see
that L3 agents are first notified of new VMs going up for ARP purposes
(l3plugin.dvr_vmarp_table_update), and then L3 agents are notified of
the new VM (l3plugin.dvr_update_router_addvm, which also schedules the
router on the new VM's node). This means that in the ARP notifier, the
router is not scheduled yet on the new hypervisor, so the ARP entries
for that subnet will not be sent to it. We can confirm this by seeing
that the newly created router has no permanent entries in 'ip neigh'.
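The ordering problem described above can be illustrated with a toy model.
This is only a sketch: ToyL3Plugin, arp_update, add_vm and
notify_new_port are hypothetical names that loosely mirror the two
notifications, and none of this is actual Neutron code. It assumes, as
the description says, that ARP entries are only sent to hosts where the
router is already scheduled.

```python
class ToyL3Plugin:
    """Hypothetical stand-in for the L3 plugin; not real Neutron code."""

    def __init__(self):
        self.scheduled = set()   # (router_id, host) pairs hosting the router
        self.arp_delivered = []  # (host, ip) ARP entries that reached an agent

    def arp_update(self, router_id, hosts, ip):
        # ARP entries go only to hosts where the router is already scheduled.
        for host in hosts:
            if (router_id, host) in self.scheduled:
                self.arp_delivered.append((host, ip))

    def add_vm(self, router_id, host):
        # Schedules the router on the new VM's host.
        self.scheduled.add((router_id, host))


def notify_new_port(plugin, router_id, host, ip):
    # Order described in the report: ARP first, scheduling second.
    plugin.arp_update(router_id, [host], ip)
    plugin.add_vm(router_id, host)


plugin = ToyL3Plugin()
notify_new_port(plugin, "r1", "compute-2", "10.0.0.5")
print(plugin.arp_delivered)  # [] : the new host never received the ARP entry
```

In this model the router does end up scheduled on the new host, but the
ARP update has already been skipped by then.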
Reversing the order of notifications (first notify L3 agents of the
new VM, then send the ARP notification) is not a good fix because it is
racy. It only guarantees that the notification to configure the
router is sent before the ARP RPC message, not that the router is
actually configured by then. As a result, the ARP RPC message can
fail to complete the 'ip neigh' command because the router doesn't
exist yet.
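A similar toy model shows why sending the messages in the reversed order
is still not enough. ToyAgent and its methods are hypothetical and are
not the real L3 agent. The only assumption is that receiving a router
update queues the work, and the namespace is created some time later.

```python
class ToyAgent:
    """Hypothetical L3 agent model; not real Neutron code."""

    def __init__(self):
        self.namespaces = set()    # routers whose namespace exists
        self.pending_routers = []  # router updates received but not yet applied
        self.neigh = []            # (router_id, ip, mac) entries installed
        self.failed_arp = []       # ARP updates that arrived too early

    def on_router_update(self, router_id):
        # Receiving the message only queues the work.
        self.pending_routers.append(router_id)

    def on_arp_update(self, router_id, ip, mac):
        # Stands in for running 'ip neigh' inside the router namespace.
        if router_id not in self.namespaces:
            self.failed_arp.append((router_id, ip))
            return
        self.neigh.append((router_id, ip, mac))

    def process_pending(self):
        # The namespace is only created here, after the ARP message arrived.
        while self.pending_routers:
            self.namespaces.add(self.pending_routers.pop(0))


agent = ToyAgent()
agent.on_router_update("r1")                                # sent first
agent.on_arp_update("r1", "10.0.0.5", "fa:16:3e:00:00:01")  # sent second
agent.process_pending()
print(agent.failed_arp)  # [('r1', '10.0.0.5')]
```

The messages are sent in the "right" order, yet the ARP entry is still
lost because the router is configured after the ARP message is handled.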
Apparently the ARP entries in the qrouter namespace are not an
optimization, they're mandatory. If a distributed router doesn't have
an ARP entry for a remote VM, and it sends an ARP request, it won't be
answered. I found out that the lack of static ARP entries in the
distributed qrouter namespaces was an issue in the scenario above: The
first VMs on compute nodes connected via distributed routers won't be
able to ping each other.
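To check for the missing static entries, the 'ip neigh' output from the
qrouter namespace can be filtered for permanent entries. The helper
below is a minimal sketch: permanent_neighbors is a hypothetical name,
and the sample text is made up in the usual 'ip neigh' layout rather
than taken from a real node.

```python
def permanent_neighbors(ip_neigh_output):
    """Return (ip, mac) pairs for entries whose state is PERMANENT."""
    entries = []
    for line in ip_neigh_output.splitlines():
        fields = line.split()
        # The neighbour state is the last field on each line.
        if not fields or fields[-1] != "PERMANENT":
            continue
        if "lladdr" in fields:
            mac = fields[fields.index("lladdr") + 1]
            entries.append((fields[0], mac))
    return entries


# Made-up sample; the addresses and device name are invented.
sample = """\
10.0.0.5 dev qr-example lladdr fa:16:3e:00:00:01 PERMANENT
10.0.0.6 dev qr-example lladdr fa:16:3e:00:00:02 REACHABLE
10.0.0.7 dev qr-example FAILED
"""
print(permanent_neighbors(sample))  # [('10.0.0.5', 'fa:16:3e:00:00:01')]
```

An empty result for a freshly created qrouter namespace would match the
symptom described in this report.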
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1438969/+subscriptions