yahoo-eng-team team mailing list archive
Message #79346
[Bug 1837075] Re: Evacuation takes too long when destination host has a large number of NICs
Reviewed: https://review.opendev.org/671471
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=30d8159d4ee51a26a03de1cb134ea64c6c07ffb2
Submitter: Zuul
Branch: master
commit 30d8159d4ee51a26a03de1cb134ea64c6c07ffb2
Author: Artom Lifshitz <alifshit@xxxxxxxxxx>
Date: Fri Jul 19 11:35:24 2019 -0400
libvirt: move checking CONF.my_ip to init_host()
Migrations use the libvirt driver's get_host_ip_addr() method to
determine the dest_host field of the migration object.
get_host_ip_addr() checks whether CONF.my_ip is actually assigned to
one of the host's interfaces. It does so by calling
get_machine_ips(), which iterates over all of the host's interfaces.
If the host has many interfaces, this can take a long time, and
introduces needless delays in processing the migration.
get_machine_ips() is only used to print a warning, so this patch moves
the get_machine_ips() call to a single method in init_host(). This
way, a warning is still emitted at compute service startup, and
migration progress is not needlessly slowed down.
This patch also has a chicken and egg problem with the patch on top of
it, which poisons use of netifaces.interfaces() in tests. While this
patch fixes all the tests that break with that poison, it starts
breaking different tests because of the move of get_machine_ips() into
init_host(). Therefore, while not directly related to the bug, this
patch also preventatively mocks or stubs out any use of
get_machine_ips() that will get poisoned with the subsequent patch.
Closes-bug: 1837075
Change-Id: I58a4038b04d5a9c28927d914e71609e4deea3d9f
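The restructuring described in the commit message can be sketched roughly as follows. This is a minimal illustration, not nova's actual code: the class SketchDriver, the helper _check_my_ip(), the injected list_machine_ips callable (standing in for get_machine_ips()), and the wording of the warning are all assumptions made for the example. Only the names init_host() and get_host_ip_addr() come from the commit message.

```python
import logging

LOG = logging.getLogger(__name__)


class SketchDriver:
    """Illustrative stand-in for a compute driver (hypothetical, not nova code)."""

    def __init__(self, my_ip, list_machine_ips):
        self.my_ip = my_ip
        # Hypothetical callable standing in for get_machine_ips(); it is
        # assumed to walk every interface on the host, which is the slow part.
        self._list_machine_ips = list_machine_ips

    def _check_my_ip(self):
        # Hypothetical helper: warn if the configured address is not
        # assigned to any interface. The message wording is illustrative.
        if self.my_ip not in self._list_machine_ips():
            LOG.warning("my_ip address (%s) was not found on any interface",
                        self.my_ip)

    def init_host(self):
        # Runs once at service startup, so the interface scan happens once.
        self._check_my_ip()

    def get_host_ip_addr(self):
        # Called for each migration; it no longer scans the interfaces.
        return self.my_ip
```

With this shape, the cost of scanning the interfaces is paid once per service start instead of once per migration, while the warning is still logged.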
** Changed in: nova
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1837075
Title:
Evacuation takes too long when destination host has a large number of
NICs
Status in OpenStack Compute (nova):
Fix Released
Bug description:
Description
===========
Evacuation takes a long time if the destination host has a large number of
network interfaces.
Steps to reproduce
==================
1. Have a host down, or force it down.
2. Evacuate instances to a host with a large number of network
interfaces.
Expected result
===============
Evacuation completes in a reasonable time frame.
Actual result
=============
Evacuation takes too long.
Additional info
===============
This was initially reported against OSP10/Newton [1]. In that case,
based on the included sosreports, the compute host has 1324 network
interfaces, and 109 instances are being evacuated. That means in
total, there are 109 * 1324 = 144316 iterations over the loop in
get_machine_ips().
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1709400
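The iteration count quoted in the report can be checked with a few lines of Python. The function name loop_iterations is hypothetical, and the "after" figure assumes a single interface scan at compute service startup, as the fix describes; it is an estimate of loop iterations, not a measured timing.

```python
def loop_iterations(scans, interfaces):
    """Total loop iterations if each scan visits every interface once."""
    return scans * interfaces


INTERFACES = 1324  # interfaces on the destination host, from the report
INSTANCES = 109    # instances being evacuated, from the report

# Before the fix: one full scan per evacuated instance.
before = loop_iterations(INSTANCES, INTERFACES)

# After the fix (assumption: exactly one scan at service startup).
after = loop_iterations(1, INTERFACES)

print(before)  # 144316
print(after)   # 1324
```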