yahoo-eng-team team mailing list archive

[Bug 2119578] [NEW] _validate_instance_group_policy() no longer raises RescheduledException

 

Public bug reported:

Noticed this while working on unrelated patches.

The _validate_instance_group_policy() method used for the late affinity
check on compute hosts originally raised RescheduledException when
affinity policy was violated.

This was however changed in 2023.2 (Bobcat) and backported to Yoga as
part of a different bug fix [1][2]:

-                    msg = _("Affinity instance group policy was violated.")
-                    raise exception.RescheduledException(
-                            instance_uuid=instance.uuid,
-                            reason=msg)
+                    raise exception.GroupAffinityViolation(
+                        instance_uuid=instance.uuid, policy='Affinity')

A lot of error handling code in the compute manager assumes that
_validate_instance_group_policy() raises RescheduledException. That code
no longer works as intended because it does not handle the new
GroupAffinityViolation exception.
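
To make the failure mode concrete, here is a minimal, self-contained sketch.
It does not use nova's real classes: the exception classes,
validate_group_policy(), build_instance() and try_build() are hypothetical
stand-ins. It also assumes that GroupAffinityViolation is not a subclass of
RescheduledException, which should be checked against nova/exception.py.

```python
class NovaException(Exception):
    """Stand-in base class (hypothetical, not nova's real class)."""


class RescheduledException(NovaException):
    """Stand-in for the exception the existing handlers were written for."""


class GroupAffinityViolation(NovaException):
    """Stand-in for the new exception.

    Assumption: it does not inherit from RescheduledException.
    """


def validate_group_policy(violated):
    """Mimic the late affinity check after the change."""
    if violated:
        raise GroupAffinityViolation("Affinity policy violated")


def build_instance(violated):
    """Hypothetical caller that still only expects RescheduledException."""
    try:
        validate_group_policy(violated)
    except RescheduledException:
        # Old behaviour: a violation lands here and triggers a reschedule.
        return "reschedule"
    return "built"


def try_build(violated):
    """Show what reaches the outer caller in each case."""
    try:
        return build_instance(violated)
    except GroupAffinityViolation:
        # The handler above never sees the new exception, so it escapes.
        return "unhandled GroupAffinityViolation"


print(try_build(False))
print(try_build(True))
```

In this sketch the violation bypasses the "reschedule" branch entirely, which
is the kind of silent behaviour change the handlers in the compute manager
would need to be checked for.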

I'm opening this bug to capture the issue. I think an audit is needed to
identify all of the places that need to be fixed. Test coverage should
also be added, since the existing tests did not catch this problem.


[1] https://github.com/openstack/nova/commit/56d320a203a13f262a2e94e491af222032e453d3
[2] https://review.opendev.org/q/I2ba035c09ace20e9835d9d12a5c5bee17d616718

** Affects: nova
     Importance: Undecided
         Status: New


** Tags: compute

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2119578

Title:
  _validate_instance_group_policy() no longer raises
  RescheduledException

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2119578/+subscriptions