yahoo-eng-team team mailing list archive

[Bug 1652335] Re: The scheduler returns limits even if the instance was boot using AZ forced flag

To: yahoo-eng-team@xxxxxxxxxxxxxxxxxxx
From: Matt Riedemann <mriedem.os@xxxxxxxxx>
Date: Tue, 23 Apr 2019 22:02:31 -0000
Reply-to: Bug 1652335 <1652335@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx

If you're forcing the host you're bypassing the scheduler filters and
that's why you're risking the over-subscription (and failure) of
resources.
https://specs.openstack.org/openstack/nova-specs/specs/train/approved/add-host-and-hypervisor-hostname-flag-to-create-server.html
will solve this, but there are other ways to still
request a specific host and run through the filters:

https://docs.openstack.org/nova/latest/admin/configuration/schedulers.html#jsonfilter
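
As a rough illustration of the JsonFilter route, the sketch below builds a
query that matches a single hypervisor. It assumes the JsonFilter is enabled
in the scheduler and that `$hypervisor_hostname` is one of the variables it
accepts (check the linked documentation for your release); the helper name
and the hostname "bar" are made up for the example.

```python
import json


def specific_host_query(hypervisor_hostname):
    """Build a JsonFilter query string matching one hypervisor hostname.

    JsonFilter queries are JSON lists of the form [operator, operand, ...].
    This helper is hypothetical and only serializes the query.
    """
    return json.dumps(["=", "$hypervisor_hostname", hypervisor_hostname])


query = specific_host_query("bar")
print(query)
```

The resulting string would be passed as the `query` scheduler hint when
creating the server, so the request still runs through the other enabled
filters instead of bypassing them.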

Anyway, I don't really consider this a bug worth fixing since there are
better alternatives than hacking the forced host mess we already have
(that's the problem that needs fixing, which is what that spec proposes
to do).

** Changed in: nova
       Status: In Progress => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1652335

Title:
  The scheduler returns limits even if the instance was boot using AZ
  forced flag

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  We don't have a consistent behaviour when we're using the AZ hack for
  forcing a destination.

  To be clear, when calling the scheduler by adding a forced
  destination, we're not verifying the filters but we still return the
  limits from the HostState. Unfortunately, given those limits are only
  set by the corresponding filter, we return what we have in memory that
  doesn't really correspond to the usage request.

  Example:
  Say I boot an instance normally first, and then boot another instance with the AZ flag. In that case, the tuple returned by the scheduler will include the existing limits. For example, with an allocation ratio for VCPUs of 1.0, I get:

  [{'host': u'foo', 'nodename': u'bar', 'limits': {'vcpu': 1.0}}]

  Now, restart the scheduler service (so the corresponding HostState is
  recreated) and just boot an instance using the AZ hack, then we'll
  return :

  [{'host': u'foo', 'nodename': u'bar', 'limits': {}}]

  That's very inconsistent, as two identical requests can behave very differently. For example, say I'm running a compute node with only one pCPU and a CPU allocation ratio of 1.0:
   - in the first case above, the forced instance will ERROR
   - in the second case (i.e. restart the scheduler and issue the exact same command), it will boot.
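
  The two cases can be reproduced with a toy model. This is only a sketch,
  not the nova code: HostState, vcpu_filter and select_destination below are
  simplified, hypothetical stand-ins, and the assumption is that limits are
  written as a side effect of running a filter.

```python
class HostState:
    """Hypothetical stand-in for the scheduler's in-memory view of a host."""

    def __init__(self, host, nodename, total_vcpus):
        self.host = host
        self.nodename = nodename
        self.total_vcpus = total_vcpus
        self.limits = {}  # stays empty until a filter records something


def vcpu_filter(host_state, allocation_ratio=1.0):
    """Illustrative filter: records the vCPU limit as a side effect."""
    host_state.limits['vcpu'] = host_state.total_vcpus * allocation_ratio
    return True


def select_destination(host_state, forced=False):
    """Return the selection; forced requests skip the filters entirely."""
    if not forced:
        vcpu_filter(host_state)
    return [{'host': host_state.host,
             'nodename': host_state.nodename,
             'limits': dict(host_state.limits)}]


# Case 1: a regular boot populates the limits, then a forced boot reuses them.
state = HostState('foo', 'bar', total_vcpus=1)
select_destination(state)
stale = select_destination(state, forced=True)

# Case 2: a "restarted scheduler" starts from a fresh HostState.
fresh = select_destination(HostState('foo', 'bar', total_vcpus=1), forced=True)

print(stale[0]['limits'])
print(fresh[0]['limits'])
```

  The same forced request returns different limits depending only on
  whether a filtered request happened to touch the host state beforehand.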

  Honestly, I think we should simply not return limits if the instance
  is booted using the AZ hack.
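
  One way to picture that proposal, as a sketch only (build_selection is a
  hypothetical helper, not an existing nova function): whatever limits are
  held in memory for the host, a forced request gets an empty dict back.

```python
def build_selection(host, nodename, host_limits, forced):
    """Hypothetical helper: never return limits for a forced destination."""
    # Forced requests skipped the filters, so any in-memory limits were not
    # computed for this request and are dropped.
    limits = {} if forced else dict(host_limits)
    return {'host': host, 'nodename': nodename, 'limits': limits}


in_memory_limits = {'vcpu': 1.0}
forced_result = build_selection('foo', 'bar', in_memory_limits, forced=True)
normal_result = build_selection('foo', 'bar', in_memory_limits, forced=False)
print(forced_result)
print(normal_result)
```

  With this, the forced case gives the same answer before and after a
  scheduler restart.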

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1652335/+subscriptions

References

[Bug 1652335] [NEW] The scheduler returns limits even if the instance was boot using AZ forced flag
From: Sylvain Bauza, 2016-12-23