yahoo-eng-team team mailing list archive

[Bug 1996732] Re: Late affinity check failure counted as failed build

 

Reviewed:  https://review.opendev.org/c/openstack/nova/+/873216
Committed: https://opendev.org/openstack/nova/commit/56d320a203a13f262a2e94e491af222032e453d3
Submitter: "Zuul (22348)"
Branch:    master

commit 56d320a203a13f262a2e94e491af222032e453d3
Author: Yusuke Okada <okada.yusuke@xxxxxxxxxxx>
Date:   Wed Feb 8 22:10:31 2023 -0500

    Fix failed count for anti-affinity check
    
    The late anti-affinity check runs in the compute manager to prevent
    parallel scheduling requests from invalidating the anti-affinity
    server group policy. When the check fails, the instance is
    re-scheduled. However, this failure is counted as a real instance
    boot failure of the compute host and can lead to de-prioritization
    of the compute host in the scheduler via BuildFailureWeigher. As the
    late anti-affinity check does not indicate any fault of the compute
    host itself, it should not be counted towards the build failure
    counter. This patch adds new build results to handle this case.
    
    Closes-Bug: #1996732
    Change-Id: I2ba035c09ace20e9835d9d12a5c5bee17d616718
    Signed-off-by: Yusuke Okada <okada.yusuke@xxxxxxxxxxx>


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1996732

Title:
  Late affinity check failure counted as failed build

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  The late anti-affinity check runs in the compute manager to prevent
  parallel scheduling requests from invalidating the anti-affinity
  server group policy. When the check fails, the instance is
  re-scheduled. However, this failure is counted as a real instance boot
  failure of the compute host[1][2][3] and can lead to de-prioritization
  of the compute host in the scheduler via BuildFailureWeigher[4]. As
  the late anti-affinity check does not indicate any fault of the
  compute host itself, it should not be counted towards the build
  failure counter.

  [1] https://github.com/openstack/nova/blob/2eb358cdcec36fcfe5388ce6982d2961ca949d0a/nova/compute/manager.py#L2496
  [2] https://github.com/openstack/nova/blob/2eb358cdcec36fcfe5388ce6982d2961ca949d0a/nova/compute/manager.py#L1808
  [3] https://github.com/openstack/nova/blob/2eb358cdcec36fcfe5388ce6982d2961ca949d0a/nova/compute/manager.py#L2265
  [4] https://docs.openstack.org/nova/latest/configuration/config.html#compute.consecutive_build_service_disable_threshold
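
The behaviour requested above can be illustrated with a minimal,
self-contained sketch. It is not nova's actual code: the BuildResult
enum, the HostBuildStats class, and the two "by policy" result names
are hypothetical stand-ins chosen for illustration, and the reset of
the counter on a successful build is an assumption of the sketch. The
only point it demonstrates is that a re-schedule caused by a
server group policy check leaves the host's failure counter untouched.

```python
from enum import Enum


class BuildResult(Enum):
    """Possible outcomes of a build attempt (illustrative names only)."""
    ACTIVE = "active"
    FAILED = "failed"
    RESCHEDULED = "rescheduled"
    # Hypothetical results for failures caused by a server group policy
    # check rather than by a fault of the compute host itself.
    FAILED_BY_POLICY = "failed_by_policy"
    RESCHEDULED_BY_POLICY = "rescheduled_by_policy"


class HostBuildStats:
    """Tracks the build failure count of one compute host (a sketch)."""

    def __init__(self):
        self.failed_builds = 0

    def record(self, result):
        if result in (BuildResult.FAILED, BuildResult.RESCHEDULED):
            # A real build failure: the host becomes less attractive
            # to a failure-based weigher.
            self.failed_builds += 1
        elif result is BuildResult.ACTIVE:
            # Assumption in this sketch: a successful build resets
            # the counter.
            self.failed_builds = 0
        # Policy-related results fall through: the counter is unchanged.
```

With this shape, recording RESCHEDULED increments failed_builds, while
recording RESCHEDULED_BY_POLICY afterwards leaves the value as it was.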

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1996732/+subscriptions
