yahoo-eng-team team mailing list archive

[Bug 1483601] Re: l2 population failed when bulk live migrate VMs

 

Reviewed:  https://review.openstack.org/215467
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=c5fa665de3173f3ad82cc3e7624b5968bc52c08d
Submitter: Jenkins
Branch:    master

commit c5fa665de3173f3ad82cc3e7624b5968bc52c08d
Author: shihanzhang <shihanzhang@xxxxxxxxxx>
Date:   Fri Aug 21 09:51:59 2015 +0800

    ML2: update port's status to DOWN if its binding info has changed
    
    This fixes the problem that when two or more ports in a network
    are migrated to a host that did not previously have any ports in
    the same network, the new host is sometimes not told about the
    IP/MAC addresses of all the other ports in the network. In other
    words, initial L2population does not work for the new host.
    
    This is because the l2pop mechanism driver only sends catch-up
    information to the host when it thinks it is dealing with the first
    active port on that host; and currently, when multiple ports are
    migrated to a new host, there is always more than one active port so
    the condition above is never triggered.
    
    The fix is for the ML2 plugin to set a port's status to DOWN when
    its binding info changes.
    
    This patch also fixes the bug when nova thinks it should not wait
    for any events from neutron because all ports are already active.
    
    Closes-bug: #1483601
    Closes-bug: #1443421
    Closes-Bug: #1522824
    Related-Bug: #1450604
    
    Change-Id: I342ad910360b21085316c25df2154854fd1001b2


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1483601

Title:
  l2 population failed when bulk live migrate VMs

Status in neutron:
  Fix Released

Bug description:
  When we bulk live migrate VMs, l2 population may (not always) fail on the destination compute node. When nova migrates a VM to the destination compute node, it only updates the port's binding:host; the port's status stays active. From neutron's perspective, the port status then progresses active -> build -> active.
  In the case below, l2 population will fail:
  1. nova successfully live migrates VM A and VM B from compute A to compute B.
  2. port A and port B both have status active, and binding:host is compute B.
  3. The l2 agent scans these two ports, then handles them one by one.
  4. neutron-server handles port A first; its status becomes build (remember that port B's status is still active). The l2 population
  code then performs the check below, and this check fails:

  def _update_port_up(self, context):
      ......
      if agent_active_ports == 1 or (
              self.get_agent_uptime(agent) < cfg.CONF.l2pop.agent_boot_time):
          # First port activated on current agent in this network,
          # we have to provide it with the whole list of fdb entries
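
The scenario above can be reproduced with a small standalone simulation. This is only a sketch, not neutron code: `Port`, `needs_full_fdb_sync`, `migrate` and `simulate` are hypothetical names, the agent uptime clause is left out, and the port lifecycle is reduced to the status strings mentioned in this report. It assumes the check counts the active ports bound to the destination host, including the port that has just become active.

```python
from dataclasses import dataclass

ACTIVE, BUILD, DOWN = "ACTIVE", "BUILD", "DOWN"


@dataclass
class Port:
    name: str
    host: str
    status: str


def needs_full_fdb_sync(ports, host):
    """Simplified form of the 'first active port on this agent' check."""
    active_on_host = sum(
        1 for p in ports if p.host == host and p.status == ACTIVE)
    return active_on_host == 1


def migrate(ports, dest, reset_status):
    """Rebind every port to dest; optionally reset its status to DOWN."""
    for p in ports:
        p.host = dest
        if reset_status:
            # Behaviour described by the fix: binding change => status DOWN.
            p.status = DOWN


def simulate(reset_status):
    """Return True if the destination host would receive the full fdb list."""
    ports = [Port("A", "compute A", ACTIVE), Port("B", "compute A", ACTIVE)]
    migrate(ports, "compute B", reset_status)
    synced = False
    for p in ports:  # the agent handles the ports one by one
        p.status = BUILD
        p.status = ACTIVE  # port comes up on the destination host
        if needs_full_fdb_sync(ports, "compute B"):
            synced = True
    return synced


if __name__ == "__main__":
    print("status left active:", simulate(reset_status=False))
    print("status reset to down:", simulate(reset_status=True))
```

When the status is left active, both ports count as active each time the check runs, so the count is never 1 and the full list is never sent. When the status is reset on the binding change, port A is the only active port at the moment it comes up, so the check passes once.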

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1483601/+subscriptions

