yahoo-eng-team team mailing list archive
Message #74476
[Bug 1746863] Re: scheduler affinity doesn't work with multiple cells
Reviewed: https://review.openstack.org/540258
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=14f4c502f92b10b669044e5069ac3b3555a42ee0
Submitter: Zuul
Branch: master
commit 14f4c502f92b10b669044e5069ac3b3555a42ee0
Author: melanie witt <melwittt@xxxxxxxxx>
Date: Fri Feb 2 05:41:20 2018 +0000
Make scheduler.utils.setup_instance_group query all cells
To check affinity and anti-affinity policies for scheduling instances,
we use the RequestSpec.instance_group.hosts field to check the hosts
that have group members on them. Access of the 'hosts' field calls
InstanceGroup.get_hosts during a lazy-load and get_hosts does a query
for all instances that are members of the group and returns their hosts
after removing duplicates. The InstanceList query isn't targeting any
cells, so it will return [] in a multi-cell environment in both the
instance create case and the instance move case. In the move case, we
do have a cell-targeted RequestContext when setup_instance_group is
called *but* the RequestSpec.instance_group object is queried early in
compute/api before we're targeted to a cell, so a call of
RequestSpec.instance_group.get_hosts() will result in [] still, even
for move operations.
This makes setup_instance_group query all cells for instances that are
members of the instance group if the RequestContext is untargeted, else
it queries the targeted cell for the instances.
Closes-Bug: #1746863
Change-Id: Ia5f5a0d75953b1154a8de3e1eaa15f8042e32d77
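
The branching described in the commit message can be sketched with a toy model. This is not nova's code and uses none of its APIs: ToyContext, CellDB, hosts_in_cell and group_hosts are hypothetical names, and each "cell database" is just a dict mapping instance UUID to host. It only shows the idea of querying every cell when the context is untargeted and a single cell otherwise.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set

# Hypothetical stand-in: one "cell database" maps instance UUID -> host.
CellDB = Dict[str, Optional[str]]


@dataclass
class ToyContext:
    """Toy request context; cell is None when the context is untargeted."""
    cell: Optional[str] = None


def hosts_in_cell(db: CellDB, members: Set[str]) -> Set[str]:
    """Hosts of group members found in one cell, skipping unset hosts."""
    return {host for uuid, host in db.items() if uuid in members and host}


def group_hosts(ctxt: ToyContext, cells: Dict[str, CellDB],
                members: Set[str]) -> List[str]:
    """Collect member hosts from all cells, or only the targeted cell."""
    if ctxt.cell is None:
        # Untargeted context: look in every cell.
        targets = list(cells.values())
    else:
        # Targeted context: look only in that cell.
        targets = [cells[ctxt.cell]]
    found: Set[str] = set()
    for db in targets:
        found |= hosts_in_cell(db, members)
    return sorted(found)


cells = {
    "cell1": {"uuid-a": "host1", "uuid-x": "host9"},
    "cell2": {"uuid-b": "host2", "uuid-c": None},
}
members = {"uuid-a", "uuid-b", "uuid-c"}

all_hosts = group_hosts(ToyContext(), cells, members)
cell1_hosts = group_hosts(ToyContext(cell="cell1"), cells, members)
print(all_hosts)    # ['host1', 'host2']
print(cell1_hosts)  # ['host1']
```

In this model the untargeted call sees members in both cells, while the targeted call sees only the members that live in the targeted cell.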
** Changed in: nova
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1746863
Title:
scheduler affinity doesn't work with multiple cells
Status in OpenStack Compute (nova):
Fix Released
Status in OpenStack Compute (nova) pike series:
Confirmed
Status in OpenStack Compute (nova) queens series:
Confirmed
Status in OpenStack Compute (nova) rocky series:
New
Bug description:
I happened upon this while hacking on my WIP CellDatabases fixture
patch.
Some of the nova/tests/functional/test_server_group.py tests started
failing with multiple cells. I found that the affinity check relies on
a database query, 'objects.InstanceList.get_by_filters', for all
instances that are members of the server group. The query for instances
doesn't check all cells, so it fails to return any hosts that group
members are currently on.
This makes the ServerGroup[Anti|]AffinityFilter a no-op for multiple
cells. Affinity is checked again via the late-affinity check in
compute, but compute uses the same InstanceGroup.get_hosts method
and will only find group members' hosts that are in its own cell.
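
To illustrate why an empty hosts list turns both filters into a no-op, here is a minimal sketch. The two functions are simplified stand-ins written for this example, not nova's filter classes, and the host names are made up. The assumption is that the anti-affinity check rejects hosts that already hold a group member, and the affinity check accepts only such hosts once the group has any.

```python
from typing import Callable, Iterable, List


def anti_affinity_passes(host: str, group_hosts: Iterable[str]) -> bool:
    """Pass only hosts that do not already hold a group member."""
    return host not in set(group_hosts)


def affinity_passes(host: str, group_hosts: Iterable[str]) -> bool:
    """Pass any host if the group is empty, else only hosts with members."""
    hosts = set(group_hosts)
    return not hosts or host in hosts


def filter_hosts(check: Callable[[str, Iterable[str]], bool],
                 candidates: List[str],
                 group_hosts: List[str]) -> List[str]:
    """Apply one check to every candidate host, preserving order."""
    return [h for h in candidates if check(h, group_hosts)]


candidates = ["host1", "host2", "host3"]

# Correct lookup: a group member is already on host1.
anti_ok = filter_hosts(anti_affinity_passes, candidates, ["host1"])
aff_ok = filter_hosts(affinity_passes, candidates, ["host1"])

# Buggy lookup: the query found no members, so the hosts list is empty.
anti_bug = filter_hosts(anti_affinity_passes, candidates, [])
aff_bug = filter_hosts(affinity_passes, candidates, [])

print(anti_ok, aff_ok)    # ['host2', 'host3'] ['host1']
print(anti_bug, aff_bug)  # every candidate passes both checks
```

With an empty list, neither check can reject a candidate, which matches the no-op behaviour described above.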
This is the code that populates the RequestSpec.instance_group.hosts via a
lazy-load on first access:
nova/objects/instance_group.py:

    def obj_load_attr(self, attrname):
        ...
        self.hosts = self.get_hosts()
        self.obj_reset_changes(['hosts'])
        ...

    @base.remotable
    def get_hosts(self, exclude=None):
        """Get a list of hosts for non-deleted instances in the group

        This method allows you to get a list of the hosts where instances in
        this group are currently running. There's also an option to exclude
        certain instance UUIDs from this calculation.
        """
        filter_uuids = self.members
        if exclude:
            filter_uuids = set(filter_uuids) - set(exclude)
        filters = {'uuid': filter_uuids, 'deleted': False}
        instances = objects.InstanceList.get_by_filters(self._context,
                                                        filters=filters)
        return list(set([instance.host for instance in instances
                         if instance.host]))

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1746863/+subscriptions