yahoo-eng-team team mailing list archive

[Bug 1513678] Re: At scale router scheduling takes a long time with DVR routers with multiple compute nodes hosting thousands of VMs

 

Reviewed:  https://review.openstack.org/242286
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=411e6ff1570f9508424eb985201943e881084d7a
Submitter: Jenkins
Branch:    master

commit 411e6ff1570f9508424eb985201943e881084d7a
Author: Swaminathan Vasudevan <swaminathan.vasudevan@xxxxxx>
Date:   Thu Nov 5 17:00:49 2015 -0800

    Tune _get_candidates for faster scheduling in dvr
    
    We have seen performance issues when DVR routers are
    scheduled on multiple compute nodes with thousands of
    VMs on the routed subnets.
    
    _get_candidates calls get_l3_agent_candidates with the
    complete list of agents, regardless of whether an agent
    already hosts the router.
    
    This fix reduces the number of iterations that
    get_l3_agent_candidates has to perform over the agents,
    which should improve control plane performance.
    
    Closes-Bug: #1513678
    Change-Id: I8f781d4cbc996ce13441303c9296e4f6ec822b94


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1513678

Title:
  At scale router scheduling takes a long time with DVR routers with
  multiple compute nodes hosting thousands of VMs

Status in neutron:
  Fix Released

Bug description:
  At scale, with hundreds of compute nodes and thousands of VMs in networks routed by a Distributed Virtual Router (DVR), we are seeing a control plane performance issue.
  It takes a long time for all the routers to be scheduled on the nodes.

  The _schedule_router calls _get_candidates, and it internally calls
  get_l3_agent_candidates. In the case of the DVR Routers, all the
  active agents are passed to the get_l3_agent_candidates which iterates
  through the agents and for each agent it tries to find out if there
  are any dvr_service ports available in the routed subnet.

  This per-agent check may account for much of the extra time.

  So we need to figure out the issue and reduce the time taken for the
  scheduling.
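
The idea described above, skipping agents that already host the router before running the expensive per-agent check, can be sketched roughly as follows. This is an illustration only and not the Neutron code: the Agent class, expensive_candidate_check, and get_candidates are hypothetical stand-ins, and the real check in get_l3_agent_candidates does more than this placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical stand-in for an L3 agent record (not Neutron's model)."""
    host: str
    hosted_router_ids: set = field(default_factory=set)


def expensive_candidate_check(agent, router_id, calls):
    """Placeholder for the costly per-agent work, such as looking up
    ports on the routed subnets. It records each call so the number
    of checks can be inspected."""
    calls.append(agent.host)
    return True


def get_candidates(agents, router_id, calls):
    """Run the expensive check only for agents not yet hosting the router."""
    # Assumption: each agent record already knows which routers it hosts.
    unscheduled = [a for a in agents if router_id not in a.hosted_router_ids]
    return [a for a in unscheduled
            if expensive_candidate_check(a, router_id, calls)]
```

With three agents of which two already host router "r1", the placeholder check runs once instead of three times. The saving grows with the number of agents that already host the router.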

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1513678/+subscriptions

