yahoo-eng-team team mailing list archive
Message #92477
[Bug 1996732] Re: Late affinity check failure counted as failed build
Reviewed: https://review.opendev.org/c/openstack/nova/+/873216
Committed: https://opendev.org/openstack/nova/commit/56d320a203a13f262a2e94e491af222032e453d3
Submitter: "Zuul (22348)"
Branch: master
commit 56d320a203a13f262a2e94e491af222032e453d3
Author: Yusuke Okada <okada.yusuke@xxxxxxxxxxx>
Date: Wed Feb 8 22:10:31 2023 -0500
Fix failed count for anti-affinity check
The late anti-affinity check runs in the compute manager to prevent
parallel scheduling requests from invalidating the anti-affinity server
group policy. When the check fails, the instance is re-scheduled.
However, this failure is counted as a real instance boot failure of the
compute host and can lead to de-prioritization of the compute host
in the scheduler via BuildFailureWeigher. As the late anti-affinity
check does not indicate any fault of the compute host itself, it
should not be counted towards the build failure counter.
This patch adds new build results to handle this case.
Closes-Bug: #1996732
Change-Id: I2ba035c09ace20e9835d9d12a5c5bee17d616718
Signed-off-by: Yusuke Okada <okada.yusuke@xxxxxxxxxxx>
** Changed in: nova
Status: In Progress => Fix Released
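The idea in the commit message above can be sketched roughly as follows.
This is not nova's code: the result names, the HostBuildStats class and
the rule that a successful build resets the counter are assumptions made
for illustration. It only shows how a separate set of policy-related
build results lets the failure counter ignore anti-affinity rejections.

```python
# Illustrative sketch only; not taken from nova. All names are hypothetical.
ACTIVE = "active"
FAILED = "failed"
RESCHEDULED = "rescheduled"
# Assumed additional results for builds rejected by a server group policy.
FAILED_BY_POLICY = "failed_by_policy"
RESCHEDULED_BY_POLICY = "rescheduled_by_policy"

# Only these results are treated as a fault of the compute host itself.
HOST_FAULT_RESULTS = frozenset({FAILED, RESCHEDULED})


class HostBuildStats:
    """Tracks a build failure counter for one compute host."""

    def __init__(self):
        self.failed_builds = 0

    def record(self, result):
        if result in HOST_FAULT_RESULTS:
            self.failed_builds += 1
        elif result == ACTIVE:
            # Assumption: a successful build resets the counter.
            self.failed_builds = 0
        # Policy-related results leave the counter untouched.


stats = HostBuildStats()
for result in (RESCHEDULED, RESCHEDULED_BY_POLICY, FAILED_BY_POLICY):
    stats.record(result)
print(stats.failed_builds)  # 1
```

Only the plain RESCHEDULED result increments the counter here; the two
policy results pass through without penalizing the host.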
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1996732
Title:
Late affinity check failure counted as failed build
Status in OpenStack Compute (nova):
Fix Released
Bug description:
The late anti-affinity check runs in the compute manager to prevent
parallel scheduling requests from invalidating the anti-affinity server
group policy. When the check fails, the instance is re-scheduled.
However, this failure is counted as a real instance boot failure of the
compute host[1][2][3] and can lead to de-prioritization of the compute
host in the scheduler via BuildFailureWeigher[4]. As the late
anti-affinity check does not indicate any fault of the compute host
itself, it should not be counted towards the build failure counter.
[1] https://github.com/openstack/nova/blob/2eb358cdcec36fcfe5388ce6982d2961ca949d0a/nova/compute/manager.py#L2496
[2] https://github.com/openstack/nova/blob/2eb358cdcec36fcfe5388ce6982d2961ca949d0a/nova/compute/manager.py#L1808
[3] https://github.com/openstack/nova/blob/2eb358cdcec36fcfe5388ce6982d2961ca949d0a/nova/compute/manager.py#L2265
[4] https://docs.openstack.org/nova/latest/configuration/config.html#compute.consecutive_build_service_disable_threshold
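To illustrate why a miscounted failure matters, here is a simplified
sketch of failure-based weighing. It assumes a host's weight is just its
failure counter multiplied by a negative factor. The rank_hosts function
and its multiplier parameter are hypothetical, and nova's real weigher
may normalize and combine weights differently.

```python
# Simplified illustration; not nova's BuildFailureWeigher implementation.
def rank_hosts(failed_builds_by_host, multiplier=1.0):
    """Return host names ordered best-first.

    Each host is weighed as -multiplier * failed_builds, so a host with
    more recorded build failures ends up lower in the ordering.
    """
    weights = {
        host: -multiplier * failures
        for host, failures in failed_builds_by_host.items()
    }
    return sorted(weights, key=lambda host: weights[host], reverse=True)


hosts = {"host-a": 0, "host-b": 3, "host-c": 1}
print(rank_hosts(hosts))  # ['host-a', 'host-c', 'host-b']
```

If host-b's three failures were all late anti-affinity rejections, it
would be ranked last despite having no fault of its own.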
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1996732/+subscriptions