
yahoo-eng-team team mailing list archive

[Bug 1837075] Re: Evacuation takes too long when destination host has a large number of NICs

 

Reviewed:  https://review.opendev.org/671471
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=30d8159d4ee51a26a03de1cb134ea64c6c07ffb2
Submitter: Zuul
Branch:    master

commit 30d8159d4ee51a26a03de1cb134ea64c6c07ffb2
Author: Artom Lifshitz <alifshit@xxxxxxxxxx>
Date:   Fri Jul 19 11:35:24 2019 -0400

    libvirt: move checking CONF.my_ip to init_host()
    
    Migrations use the libvirt driver's get_host_ip_addr() method to
    determine the dest_host field of the migration object.
    get_host_ip_addr() checks whether CONF.my_ip is actually assigned to
    one of the host's interfaces. It does so by calling
    get_machine_ips(), which iterates over all of the host's interfaces.
    If the host has many interfaces, this can take a long time, and
    introduces needless delays in processing the migration.
    get_machine_ips() is only used to print a warning, so this patch moves
    the get_machine_ips() call to a single method in init_host(). This
    way, a warning is still emitted at compute service startup, and
    migration progress is not needlessly slowed down.
    
    This patch also has a chicken and egg problem with the patch on top of
    it, which poisons use of netifaces.interfaces() in tests. While this
    patch fixes all the tests that break with that poison, it starts
    breaking different tests because of the move of get_machine_ips() into
    init_host(). Therefore, while not directly related to the bug, this
    patch also preventatively mocks or stubs out any use of
    get_machine_ips() that will get poisoned with the subsequent patch.
    
    Closes-bug: 1837075
    Change-Id: I58a4038b04d5a9c28927d914e71609e4deea3d9f
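
The change described in the commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not nova's actual code: SketchDriver, its list_machine_ips argument, and fake_list_machine_ips are hypothetical names, and the addresses are documentation-range placeholders. It only shows the idea of running the expensive interface scan once at startup so that the per-migration call stays cheap.

```python
# Illustrative sketch only; this is NOT nova's actual implementation.
import logging

LOG = logging.getLogger(__name__)


class SketchDriver:
    """Hypothetical stand-in for a compute driver."""

    def __init__(self, my_ip, list_machine_ips):
        self.my_ip = my_ip
        # Callable that enumerates every address on the host; this is the
        # expensive part when the host has many interfaces.
        self._list_machine_ips = list_machine_ips

    def init_host(self):
        # Run the expensive check once, at service startup, and only warn.
        if self.my_ip not in self._list_machine_ips():
            LOG.warning("my_ip address (%s) was not found on any interface",
                        self.my_ip)

    def get_host_ip_addr(self):
        # Called once per migration: no interface scan happens here.
        return self.my_ip


calls = []  # records how many times the interface scan ran


def fake_list_machine_ips():
    # Hypothetical replacement for the real interface enumeration.
    calls.append(1)
    return ["192.0.2.10", "192.0.2.11"]


driver = SketchDriver("192.0.2.10", fake_list_machine_ips)
driver.init_host()
# Simulate 109 migrations, each asking for the host address.
addrs = [driver.get_host_ip_addr() for _ in range(109)]
```

In this sketch the scan runs exactly once regardless of how many migrations follow, which is the behaviour the commit message describes for the warning.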


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1837075

Title:
  Evacuation takes too long when destination host has a large number of
  NICs

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Description
  ===========

  Evacuation takes a long time if the destination host has a large
  number of network interfaces.

  Steps to reproduce
  ==================

  1. Have a host down, or force it down.

  2. Evacuate instances to a host with a large number of network
  interfaces.

  Expected result
  ===============

  Evacuation completes in a reasonable time frame.

  Actual result
  =============

  Evacuation takes too long.

  Additional info
  ===============

  This was initially reported against OSP10/Newton [1]. In that case,
  based on the included sosreports, the compute host has 1324 network
  interfaces, and 109 instances are being evacuated. That means that,
  in total, there are 109 * 1324 = 144316 iterations of the loop in
  get_machine_ips().

  [1] https://bugzilla.redhat.com/show_bug.cgi?id=1709400
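
The arithmetic above can be checked with a short sketch. The figures are the ones from the report; the "after" value is an assumption that the interface scan runs only once, at compute service startup, as the fix intends.

```python
# Figures taken from the bug report.
interfaces = 1324   # network interfaces on the destination host
instances = 109     # instances being evacuated

# Before the fix: one full pass over the interfaces per evacuated instance.
iterations_before = instances * interfaces

# After the fix (assumption: a single scan at service startup).
iterations_after = interfaces

print(iterations_before, iterations_after)
```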

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1837075/+subscriptions

