yahoo-eng-team team mailing list archive
Message #90385
[Bug 1996732] [NEW] Late affinity check failure counted as failed build
Public bug reported:
The late anti-affinity check runs in the compute manager to prevent
parallel scheduling requests from violating the anti-affinity server
group policy. When the check fails, the instance is re-scheduled. However,
this failure is counted as a real instance boot failure of the compute
host[1][2][3] and can lead to de-prioritization of the compute host in
the scheduler via BuildFailureWeigher[4]. As a failed late anti-affinity
check does not indicate any fault of the compute host itself, it
should not be counted towards the build failure counter.
[1] https://github.com/openstack/nova/blob/2eb358cdcec36fcfe5388ce6982d2961ca949d0a/nova/compute/manager.py#L2496
[2] https://github.com/openstack/nova/blob/2eb358cdcec36fcfe5388ce6982d2961ca949d0a/nova/compute/manager.py#L1808
[3] https://github.com/openstack/nova/blob/2eb358cdcec36fcfe5388ce6982d2961ca949d0a/nova/compute/manager.py#L2265
[4] https://docs.openstack.org/nova/latest/configuration/config.html#compute.consecutive_build_service_disable_threshold
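A minimal sketch of the requested behavior, in case it helps the discussion. This is not nova's actual code: the exception classes, the BuildFailureStats helper and its record() method are hypothetical names made up for illustration. The only idea taken from the report is that a reschedule caused by a server group policy violation should be told apart from a real build failure, so that only the latter increments the counter the scheduler looks at.

```python
from dataclasses import dataclass


class RescheduledException(Exception):
    """Hypothetical: the build failed on this host and is retried elsewhere."""


class RescheduledByPolicyException(RescheduledException):
    """Hypothetical: the reschedule was caused by a server group policy
    violation (late anti-affinity check), not by a fault of the host."""


@dataclass
class BuildFailureStats:
    """Hypothetical stand-in for the per-host build failure counter."""

    failed_builds: int = 0

    def record(self, exc: Exception) -> None:
        # A policy-driven reschedule says nothing about host health,
        # so it is deliberately left out of the counter.
        if isinstance(exc, RescheduledByPolicyException):
            return
        self.failed_builds += 1


if __name__ == "__main__":
    stats = BuildFailureStats()
    stats.record(RescheduledException("driver failed to spawn the guest"))
    stats.record(RescheduledByPolicyException("anti-affinity policy violated"))
    print(stats.failed_builds)
```

With a split like this, a host that only lost a scheduling race would keep its weight in the scheduler, while real spawn failures would still be counted as before.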
** Affects: nova
Importance: Undecided
Status: New
** Tags: compute scheduler
** Tags added: compute scheduler
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1996732