
yahoo-eng-team team mailing list archive

[Bug 1813787] [NEW] [L3] DVR router in compute node was not up but nova port needs its functionality

 

Public bug reported:

There is a race condition between nova-compute booting an instance and the l3-agent processing the DVR (local) router on the compute node.
This issue can be seen when a large number of instances are booted on the same host and the instances are attached to different DVR routers.
The l3-agent then has to process all of these DVR routers on that host concurrently.
Although we have a green pool with 8 greenlets for the router ResourceProcessingQueue,
https://github.com/openstack/neutron/blob/master/neutron/agent/l3/agent.py#L642
some of these routers can still be left waiting. Even worse, the router processing procedure includes time-consuming actions,
for instance installing ARP entries, iptables rules, route rules, etc.
So when the VM is up, it tries to fetch metadata via the local proxy hosted by the DVR router, but the router is not ready yet on that host.
As a result, those instances are not able to set up some configuration in the guest OS.
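
To illustrate the queueing effect described above, here is a minimal sketch that models a fixed-size worker pool draining a queue of routers. It does not use the real l3-agent code; the function names, the per-router processing time and the VM boot time are all made-up values for illustration only.

```python
import heapq


def router_ready_times(num_routers, pool_size, process_seconds):
    """Return the finish time of each queued router.

    Hypothetical model: `pool_size` workers take routers from a FIFO
    queue and each router takes `process_seconds` to set up.
    """
    workers = [0.0] * pool_size  # time at which each worker becomes free
    heapq.heapify(workers)
    ready = []
    for _ in range(num_routers):
        start = heapq.heappop(workers)   # earliest free worker
        finish = start + process_seconds
        ready.append(finish)
        heapq.heappush(workers, finish)
    return ready


def count_unready(ready_times, vm_boot_seconds):
    """Count routers that are still not ready when the VMs come up."""
    return sum(1 for t in ready_times if t > vm_boot_seconds)


# Assumed numbers: 40 routers, 5 s of setup each, VMs up after 10 s.
ready = router_ready_times(num_routers=40, pool_size=8, process_seconds=5.0)
print(count_unready(ready, vm_boot_seconds=10.0))  # prints 24
```

With these assumed numbers, only the first two batches of 8 routers are ready by the time the VMs start asking for metadata. The remaining 24 routers are still queued or being processed.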

Some potential solutions:
(1) increase the size of that green pool
(2) keep a provisioning block on the VM port so that it is not set to ACTIVE until the DVR router is up on that host, at least for the first such port.
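
A rough sketch of the idea behind option (2), written as a standalone toy model rather than against neutron's real provisioning-block code. The class, the method names and the entity names ("L2", "DVR_ROUTER") are hypothetical and exist only to show the intended behaviour: the port goes ACTIVE only after every registered component has reported completion.

```python
class PortProvisioning:
    """Toy model of per-port provisioning blocks (hypothetical names)."""

    def __init__(self):
        self._blocks = {}  # port_id -> set of entities still pending
        self.status = {}   # port_id -> "DOWN" or "ACTIVE"

    def add_block(self, port_id, entity):
        # Register a component that must finish before the port goes ACTIVE.
        self._blocks.setdefault(port_id, set()).add(entity)
        self.status[port_id] = "DOWN"

    def complete(self, port_id, entity):
        # Clear one component; flip to ACTIVE once nothing is pending.
        pending = self._blocks.get(port_id, set())
        pending.discard(entity)
        if not pending:
            self.status[port_id] = "ACTIVE"


prov = PortProvisioning()
prov.add_block("port-1", "L2")
prov.add_block("port-1", "DVR_ROUTER")  # assumed extra block for the router

prov.complete("port-1", "L2")
print(prov.status["port-1"])  # prints DOWN, router not ready yet

prov.complete("port-1", "DVR_ROUTER")
print(prov.status["port-1"])  # prints ACTIVE
```

In this model nova would not see the port as ACTIVE until the l3-agent side has reported the router as ready, which is the ordering option (2) asks for.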

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1813787

Title:
  [L3] DVR router in compute node was not up but nova port needs its
  functionality

Status in neutron:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1813787/+subscriptions