yahoo-eng-team team mailing list archive
Message #79462
[Bug 1838541] Re: Spurious warnings in compute logs while building/unshelving an instance: Instance cf1dc8a6-48fe-42fd-90a7-d352c58e1454 is not being actively managed by this compute host but has allocations referencing this compute host: {u'resources': {u'VCPU': 1, u'MEMORY_MB': 64}}. Skipping heal of allocation because we do not know what to do.
Technically this goes back to Pike, but I'm not sure we care about fixing
it there at this point since Pike is in Extended Maintenance mode
upstream. Someone can backport it to stable/pike if they care to.
** Also affects: nova/stein
Importance: Undecided
Status: New
** Also affects: nova/queens
Importance: Undecided
Status: New
** Also affects: nova/rocky
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1838541
Title:
Spurious warnings in compute logs while building/unshelving an
instance: Instance cf1dc8a6-48fe-42fd-90a7-d352c58e1454 is not being
actively managed by this compute host but has allocations referencing
this compute host: {u'resources': {u'VCPU': 1, u'MEMORY_MB': 64}}.
Skipping heal of allocation because we do not know what to do.
Status in OpenStack Compute (nova):
In Progress
Status in OpenStack Compute (nova) queens series:
Confirmed
Status in OpenStack Compute (nova) rocky series:
Confirmed
Status in OpenStack Compute (nova) stein series:
Confirmed
Bug description:
This warning from the ResourceTracker is logged quite a bit in CI
runs:
http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22is%20not%20being%20actively%20managed%20by%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22&from=7d
2601 hits in 7 days.
Looking at one of these, the warning shows up while spawning the
instance during an unshelve operation. There is a possible race in the
rt.instance_claim call: the instance.host/node are set here:
https://github.com/openstack/nova/blob/619c0c676aae5359225c54bc27ce349e138e420e/nova/compute/resource_tracker.py#L208
before the instance is added to the rt.tracked_instances dict here:
https://github.com/openstack/nova/blob/619c0c676aae5359225c54bc27ce349e138e420e/nova/compute/resource_tracker.py#L217
If the update_available_resource periodic task runs between those two
points, we'll call _remove_deleted_instances_allocations with the
instance; the instance will have allocations on the node, created by
the scheduler, but may not yet be in tracked_instances, so we don't
short-circuit here:
https://github.com/openstack/nova/blob/619c0c676aae5359225c54bc27ce349e138e420e/nova/compute/resource_tracker.py#L1339
and instead hit the warning here:
https://github.com/openstack/nova/blob/619c0c676aae5359225c54bc27ce349e138e420e/nova/compute/resource_tracker.py#L1397
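As a minimal sketch of that window (the names below are illustrative, not nova's actual code): the claim publishes instance.host/node before registering the instance in tracked_instances, so a periodic task that runs in between sees an instance that "belongs" to this host, with scheduler-created allocations, but is not yet tracked:

```python
# Hypothetical, simplified model of the race described above.
tracked_instances = {}

def instance_claim(instance):
    # Step 1: the instance is pointed at this host/node; from this moment
    # it is visible to the periodic task as "belonging" here.
    instance['host'] = 'compute-1'
    instance['node'] = 'compute-1'
    # <-- if update_available_resource runs here, the instance already has
    #     allocations (created by the scheduler) but is not yet in
    #     tracked_instances, so the warning fires.
    # Step 2: only now does the instance become tracked.
    tracked_instances[instance['uuid']] = instance

def remove_deleted_instances_allocations(instance, allocations):
    # Simplified version of the check at resource_tracker.py#L1339/#L1397.
    if instance['uuid'] in tracked_instances:
        return 'short-circuit'   # normal case: instance is tracked
    if allocations:
        return 'warn'            # the spurious warning in this bug
    return 'noop'
```

Running remove_deleted_instances_allocations between step 1 and step 2 returns 'warn' even though the instance is mid-claim; after step 2 it short-circuits as expected.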
We should probably downgrade that warning to DEBUG if the instance's
task_state is set, since the instance is clearly undergoing some state
transition. We should log the task_state, and only log the message as a
warning if the instance has no task_state set but is also not tracked
on the host.
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1838541/+subscriptions