yahoo-eng-team team mailing list archive

[Bug 1746863] Re: scheduler affinity doesn't work with multiple cells

 

Reviewed:  https://review.openstack.org/540258
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=14f4c502f92b10b669044e5069ac3b3555a42ee0
Submitter: Zuul
Branch:    master

commit 14f4c502f92b10b669044e5069ac3b3555a42ee0
Author: melanie witt <melwittt@xxxxxxxxx>
Date:   Fri Feb 2 05:41:20 2018 +0000

    Make scheduler.utils.setup_instance_group query all cells
    
    To check affinity and anti-affinity policies for scheduling instances,
    we use the RequestSpec.instance_group.hosts field to check the hosts
    that have group members on them. Access of the 'hosts' field calls
    InstanceGroup.get_hosts during a lazy-load and get_hosts does a query
    for all instances that are members of the group and returns their hosts
    after removing duplicates. The InstanceList query isn't targeting any
    cells, so it will return [] in a multi-cell environment in both the
    instance create case and the instance move case. In the move case, we
    do have a cell-targeted RequestContext when setup_instance_group is
    called *but* the RequestSpec.instance_group object is queried early in
    compute/api before we're targeted to a cell, so a call of
    RequestSpec.instance_group.get_hosts() will result in [] still, even
    for move operations.
    
    This makes setup_instance_group query all cells for instances that are
    members of the instance group if the RequestContext is untargeted, else
    it queries the targeted cell for the instances.
    
    Closes-Bug: #1746863
    
    Change-Id: Ia5f5a0d75953b1154a8de3e1eaa15f8042e32d77


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1746863

Title:
  scheduler affinity doesn't work with multiple cells

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) pike series:
  Confirmed
Status in OpenStack Compute (nova) queens series:
  Confirmed
Status in OpenStack Compute (nova) rocky series:
  New

Bug description:
  I happened upon this while hacking on my WIP CellDatabases fixture
  patch.

  Some of the nova/tests/functional/test_server_group.py tests started
  failing with multiple cells. The cause is the database query
  'objects.InstanceList.get_by_filters', which looks up all instances
  that are members of the server group in order to do the affinity
  check. The query doesn't check all cells, so it fails to return the
  hosts that group members are currently on.

  This makes the ServerGroup[Anti|]AffinityFilter a no-op with multiple
  cells. Affinity is checked again via the late-affinity check in
  compute, but compute uses the same InstanceGroup.get_hosts method
  and will only find group members' hosts that are in its own cell.
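  To illustrate why an empty host list turns the filter into a no-op,
  here is a minimal standalone sketch. It is not nova's filter code;
  the function name anti_affinity_passes and the host names are
  hypothetical and exist only for this example.

```python
def anti_affinity_passes(candidate_host, group_hosts):
    """Return True if the candidate host may receive a new group member.

    Toy model of an anti-affinity check: a host is rejected only when a
    group member is already known to be running on it.
    """
    return candidate_host not in set(group_hosts)


candidates = ["host1", "host2"]

# Correct lookup: a group member is known to be on host1, so only host2 passes.
allowed_correct = [h for h in candidates if anti_affinity_passes(h, ["host1"])]

# Buggy lookup: the member's host was not found (empty list), so every
# candidate passes and the policy is silently not enforced.
allowed_buggy = [h for h in candidates if anti_affinity_passes(h, [])]
```

  With the empty list, host1 is accepted even though a group member
  already runs there, which is the behaviour described above.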

  This is the code that populates the RequestSpec.instance_group.hosts via a
  lazy-load on first access:

  nova/objects/instance_group.py:

      def obj_load_attr(self, attrname):
          ...
          self.hosts = self.get_hosts()
          self.obj_reset_changes(['hosts'])

      ...

      @base.remotable
      def get_hosts(self, exclude=None):
          """Get a list of hosts for non-deleted instances in the group
          This method allows you to get a list of the hosts where instances in
          this group are currently running.  There's also an option to exclude
          certain instance UUIDs from this calculation.
          """
          filter_uuids = self.members
          if exclude:
              filter_uuids = set(filter_uuids) - set(exclude)
          filters = {'uuid': filter_uuids, 'deleted': False}
          instances = objects.InstanceList.get_by_filters(self._context,
                                                          filters=filters)
          return list(set([instance.host for instance in instances
                           if instance.host]))
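  The idea of the fix (query every cell when the context is untargeted,
  otherwise query only the targeted cell) can be sketched in a
  self-contained toy model. This is not the nova implementation: the
  CELLS mapping, get_hosts_in_cell, and get_hosts_all_cells are
  hypothetical names, and plain dicts stand in for the per-cell
  databases.

```python
# Hypothetical stand-in for per-cell instance tables:
# cell name -> {instance uuid: host (None if not yet scheduled)}
CELLS = {
    "cell1": {"uuid-a": "host1"},
    "cell2": {"uuid-b": "host2", "uuid-c": None},
}


def get_hosts_in_cell(cell, members, exclude=None):
    """Return the set of hosts in one cell that run group members."""
    wanted = set(members) - set(exclude or [])
    return {
        host
        for uuid, host in CELLS[cell].items()
        if uuid in wanted and host
    }


def get_hosts_all_cells(members, exclude=None, target_cell=None):
    """Collect group member hosts, de-duplicated across cells.

    If target_cell is given, only that cell is queried (the "targeted
    context" case); otherwise every cell is queried.
    """
    cells = [target_cell] if target_cell else list(CELLS)
    hosts = set()
    for cell in cells:
        hosts |= get_hosts_in_cell(cell, members, exclude)
    return sorted(hosts)
```

  Calling get_hosts_all_cells(["uuid-a", "uuid-b", "uuid-c"]) gathers
  hosts from both cells, while passing target_cell="cell2" restricts
  the lookup to that cell, mirroring the targeted versus untargeted
  behaviour described in the commit message.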

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1746863/+subscriptions

