yahoo-eng-team team mailing list archive

[Bug 1691602] Re: live migration generates several network-changed events which lock up refreshing the nw info cache

 

Reviewed:  https://review.openstack.org/465783
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=bf8e6007cfa50d461790be325e9e97b8b396ae47
Submitter: Jenkins
Branch:    master

commit bf8e6007cfa50d461790be325e9e97b8b396ae47
Author: Matt Riedemann <mriedem.os@xxxxxxxxx>
Date:   Wed May 17 22:18:54 2017 -0400

    Send port ID in network-changed event to Nova
    
    When Nova gets a network-changed event, it rebuilds the
    entire network info cache for the instance if it does not
    have a specific port ID. This can be costly and redundant
    when performing something like a live migration with multiple
    ports attached to the same instance.
    
    This change simply adds the port ID to the network-changed event
    since we have it in scope. Nova can use it or not, but at least
    the information is provided for context.
    
    Change-Id: Ifdaef05208d09ddd9587fed6214cf388e5265ba4
    Closes-Bug: #1691602


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1691602

Title:
  live migration generates several network-changed events which lock up
  refreshing the nw info cache

Status in neutron:
  Fix Released
Status in OpenStack Compute (nova):
  In Progress

Bug description:
  Chris Friesen has reported that in Newton, when live-migrating an
  instance with ~16 ports, the "network-changed" events that neutron
  generates when the vifs are unplugged from the source host can
  effectively block the network info cache refresh that's called at the
  end of the live migration operation. Details are in the IRC logs:

  http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2017-05-17.log.html#t2017-05-17T22:50:31

  But this stands out:

  <cfriesen> mriedem: so it looks like _build_network_info_model()
  costs about 200ms plus about 125ms per port since we query each port
  separately from neutron.  and the refresh_cache lock is held the whole
  time
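
  To put those figures in perspective, here is a rough back-of-the-envelope
  sketch. The 200ms and 125ms numbers come from the quote above; the
  assumptions that every port produces one network-changed event and that
  the resulting full refreshes run one after another under the lock are
  illustrative only, and the function names are made up for this example.

```python
# Rough cost model from the figures quoted above (illustrative only).
BASE_MS = 200        # fixed cost of one _build_network_info_model() call
PER_PORT_MS = 125    # additional cost per port queried from Neutron


def refresh_cost_ms(num_ports):
    """Estimated time for one full cache refresh, in milliseconds."""
    return BASE_MS + PER_PORT_MS * num_ports


def total_lock_time_ms(num_ports, num_events):
    """Estimated lock time if every event triggers a full, serialized
    refresh (an assumption for illustration, not a measurement)."""
    return num_events * refresh_cost_ms(num_ports)


ports = 16
one_refresh = refresh_cost_ms(ports)               # 200 + 125 * 16 = 2200
all_refreshes = total_lock_time_ms(ports, ports)   # 16 * 2200 = 35200
print(one_refresh, all_refreshes)
```

  Under those assumptions a single refresh takes about 2.2 seconds, and
  16 back-to-back refreshes would hold the lock for roughly 35 seconds.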

  In Nova the 'network-changed' event is handled generically because
  there is no port ID in the event, so Nova refreshes the entire network
  info cache for the instance. That can be expensive and redundant: it
  issues many queries to Neutron to build up information about ports,
  fixed IPs, floating IPs, subnets and networks, and Neutron doesn't
  have bulk query APIs or allow OR filters in the API for bulk queries
  on things like floating IPs.

  https://github.com/openstack/nova/blob/8d492c76d53f3fcfacdd945a277446bdfe6797b0/nova/compute/manager.py#L6854

  Looking at the neutron code that sends the network-changed event, a
  port is in scope; its ID is just not sent with the event the way it is
  for network-vif-deleted events.

  We should be able to scope the network-changed event to a specific
  port on the neutron side and check for that on the nova side so we
  don't have to refresh the entire network info cache, but just the vif
  that was updated.
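
  The idea can be sketched roughly as follows. This is not Nova's actual
  code: handle_network_changed, the dict-based cache, and the two fetcher
  callables are hypothetical stand-ins, and the sketch assumes the port ID
  arrives in the event's 'tag' field, as it does for network-vif-deleted.

```python
# Hypothetical sketch; none of these names are Nova's actual API.

def handle_network_changed(cache, event, fetch_port, fetch_all_ports):
    """Update a per-instance cache {port_id: vif_info} for one event.

    Returns the list of port IDs that were re-queried.
    """
    port_id = event.get('tag')  # assumed: the port ID travels in 'tag'
    if port_id is not None and port_id in cache:
        # Port-scoped event: refresh only the vif that changed.
        cache[port_id] = fetch_port(port_id)
        return [port_id]

    # No usable port ID: fall back to rebuilding the whole cache.
    refreshed = fetch_all_ports(event['server_uuid'])
    cache.clear()
    cache.update(refreshed)
    return sorted(refreshed)


# Usage with stand-in fetchers instead of real Neutron queries.
neutron_ports = {'port-a': {'state': 'new'}, 'port-b': {'state': 'new'}}
cache = {'port-a': {'state': 'old'}, 'port-b': {'state': 'old'}}

scoped = handle_network_changed(
    cache,
    {'name': 'network-changed', 'server_uuid': 'inst-1', 'tag': 'port-a'},
    fetch_port=lambda pid: neutron_ports[pid],
    fetch_all_ports=lambda uuid: dict(neutron_ports),
)
```

  With a port ID present only one port is re-queried; an event without
  one (or for a port not yet in the cache) still falls back to the full
  refresh, so older senders keep working.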

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1691602/+subscriptions

