yahoo-eng-team team mailing list archive

[Bug 1838541] Re: Spurious warnings in compute logs while building/unshelving an instance: Instance cf1dc8a6-48fe-42fd-90a7-d352c58e1454 is not being actively managed by this compute host but has allocations referencing this compute host: {u'resources': {u'VCPU': 1, u'MEMORY_MB': 64}}. Skipping heal of allocation because we do not know what to do.

 

Reviewed:  https://review.opendev.org/673873
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=ad4ef5af8bf7dc34caef21b0062ef9cc504bc216
Submitter: Zuul
Branch:    master

commit ad4ef5af8bf7dc34caef21b0062ef9cc504bc216
Author: Matt Riedemann <mriedem.os@xxxxxxxxx>
Date:   Wed Jul 31 12:41:00 2019 -0400

    rt: soften warning case in _remove_deleted_instances_allocations
    
    During an instance_claim during initial server create or unshelve,
    the instance.host/node values will be set before the resource
    tracker has the instance in the tracked_instances dict. If
    _remove_deleted_instances_allocations is called with the instance
    before it's being tracked, a warning like this is logged:
    
      Jul 31 13:12:57.455904 ubuntu-bionic-rax-iad-0009534722
      nova-compute[31951]: WARNING nova.compute.resource_tracker
      [None req-d6f2ae97-d8f7-46f6-8974-b42aeb58302d None None]
      Instance 227c23cd-aeb2-4b5a-b001-21bd920a5e77 is not being actively
      managed by this compute host but has allocations referencing this
      compute host: {u'resources': {u'MEMORY_MB': 64, u'VCPU': 1,
      u'DISK_GB': 1}}. Skipping heal of allocation because we do not know
      what to do.
    
    This shows up quite frequently in CI runs (see the bug report for
    a logstash query) which means it should not be a warning.
    
    This change checks the instance task_state and if set then we only
    log a debug message rather than the warning since we can assume we
    are racing and the task will correct itself upon completion.
    
    Change-Id: I6db8bea6761b68c39e6332d4698d1f8312863758
    Closes-Bug: #1838541


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1838541

Title:
  Spurious warnings in compute logs while building/unshelving an
  instance: Instance cf1dc8a6-48fe-42fd-90a7-d352c58e1454 is not being
  actively managed by this compute host but has allocations referencing
  this compute host: {u'resources': {u'VCPU': 1, u'MEMORY_MB': 64}}.
  Skipping heal of allocation because we do not know what to do.

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) queens series:
  Confirmed
Status in OpenStack Compute (nova) rocky series:
  Confirmed
Status in OpenStack Compute (nova) stein series:
  Confirmed

Bug description:
  This warning log from the ResourceTracker is logged quite a bit in CI
  runs:

  http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22is%20not%20being%20actively%20managed%20by%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22&from=7d

  2601 hits in 7 days.

  Looking at one of these, the warning shows up while spawning the
  instance during an unshelve operation. This is a possible race in the
  rt.instance_claim call because the instance.host/node are set here:

  https://github.com/openstack/nova/blob/619c0c676aae5359225c54bc27ce349e138e420e/nova/compute/resource_tracker.py#L208

  before the instance is added to the rt.tracked_instances dict, which
  starts here:

  https://github.com/openstack/nova/blob/619c0c676aae5359225c54bc27ce349e138e420e/nova/compute/resource_tracker.py#L217

  If the update_available_resource periodic task runs between those two
  points, we'll call _remove_deleted_instances_allocations with the
  instance. The instance will have allocations on the node, created by
  the scheduler, but may not be in tracked_instances yet, so we don't
  short-circuit here:

  https://github.com/openstack/nova/blob/619c0c676aae5359225c54bc27ce349e138e420e/nova/compute/resource_tracker.py#L1339

  Instead, we hit the log condition here:

  https://github.com/openstack/nova/blob/619c0c676aae5359225c54bc27ce349e138e420e/nova/compute/resource_tracker.py#L1397
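
The ordering described above can be sketched with a toy model. This is not nova code: ToyTracker, its method names, and the dict-based instance are hypothetical stand-ins, and the periodic task is called by hand between the two halves of the claim to make the window visible.

```python
# Toy model of the interleaving; all names here are hypothetical, not nova's API.
class ToyTracker:
    def __init__(self, host):
        self.host = host
        self.tracked_instances = {}

    def claim_step_set_host(self, instance):
        # First half of the claim: the instance is pointed at this host.
        instance["host"] = self.host

    def claim_step_track(self, instance):
        # Second half of the claim: the tracker starts tracking the instance.
        self.tracked_instances[instance["uuid"]] = instance

    def untracked_with_allocations(self, allocations):
        # Stand-in for the periodic check: consumers that hold allocations
        # against this host but are not in tracked_instances.
        return [uuid for uuid in allocations
                if uuid not in self.tracked_instances]


def run_interleaving():
    rt = ToyTracker("compute1")
    instance = {"uuid": "inst-1", "host": None, "task_state": "spawning"}
    # Allocations already exist because the scheduler created them.
    allocations = {"inst-1": {"resources": {"VCPU": 1, "MEMORY_MB": 64}}}

    rt.claim_step_set_host(instance)
    # The periodic task fires inside the window: host is set, not yet tracked.
    during = rt.untracked_with_allocations(allocations)
    rt.claim_step_track(instance)
    # Once the claim completes, the instance is no longer flagged.
    after = rt.untracked_with_allocations(allocations)
    return during, after
```

In this sketch the instance is flagged only while the claim is half done, which matches the claim that the condition clears up once the task completes.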

  We should probably downgrade that warning to DEBUG if the instance
  task_state is set since clearly the instance is undergoing some state
  transition. We should log the task_state and only log the message as a
  warning if the instance does not have a task_state set but is also not
  tracked on the host.
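
A minimal sketch of the proposed rule, assuming a helper that only decides the log level. The function name and its arguments are hypothetical and are not taken from the nova source; the actual change lives in _remove_deleted_instances_allocations.

```python
import logging


def heal_skip_log_level(is_tracked, task_state):
    """Pick a log level for the 'not being actively managed' message.

    Hypothetical helper: returns None when nothing should be logged.
    """
    if is_tracked:
        # Tracked instances never reach the message at all.
        return None
    if task_state is not None:
        # A task is in flight, so assume we are racing with it.
        return logging.DEBUG
    # Untracked and idle: this is the case that still deserves a warning.
    return logging.WARNING
```

For example, an untracked instance with task_state "spawning" would map to DEBUG, while an untracked instance with no task_state would still map to WARNING.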

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1838541/+subscriptions

