
yahoo-eng-team team mailing list archive

[Bug 1587386] Re: Unshelve results in duplicated resource deallocated

 

Reviewed:  https://review.openstack.org/323269
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=f1320a7c2debf127a93773046adffb80563fd20b
Submitter: Jenkins
Branch:    master

commit f1320a7c2debf127a93773046adffb80563fd20b
Author: Stephen Finucane <stephen.finucane@xxxxxxxxx>
Date:   Mon May 30 16:03:35 2016 +0100

    Evaluate 'task_state' in resource (de)allocation
    
    There are two types of VM states associated with shelving. The first,
    'shelved', indicates that the VM has been powered off but the resources
    remain allocated on the hypervisor. The second, 'shelved_offloaded',
    indicates that the VM has been powered off and the resources freed.
    When "unshelving" VMs in the latter state, the VM state does not change
    from 'shelved_offloaded' until some time after the VM has been
    "unshelved".
    
    Change I83a5f06 allowed resources to be deallocated from VMs when
    the VMs were set to the 'shelved_offloaded' state. However,
    the resource (de)allocation code path assumes any VM with a state of
    'shelved_offloaded' should have resources deallocated from it, rather
    than allocated to it. As the VM state has not changed when this code
    path is executed, resources are incorrectly deallocated from the
    instance twice.
    
    Enhance the aforementioned check to account for task state in addition to
    VM state. This ensures a VM that's still in 'shelved_offloaded' state,
    but is in fact being unshelved, does not trigger deallocation.
    
    Change-Id: Ie2e7b91937fc3d61bb1197fffc3549bebc65e8aa
    Signed-off-by: Stephen Finucane <stephen.finucane@xxxxxxxxx>
    Resolves-bug: #1587386
    Related-bug: #1545675
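
The check described in the commit message can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual nova code: the function names should_free_resources and usage_sign are hypothetical, and the state strings are assumed to mirror the values of nova's VM state and task state constants.

```python
# Hypothetical constants; assumed to mirror nova's vm_states/task_states values.
SHELVED_OFFLOADED = "shelved_offloaded"
UNSHELVING = "unshelving"


def should_free_resources(vm_state, task_state):
    """Return True if the instance's resources should be deallocated.

    Per the commit message, consulting vm_state alone is not enough: an
    instance that is being unshelved still reports 'shelved_offloaded',
    so the task state has to be checked as well.
    """
    offloaded = vm_state == SHELVED_OFFLOADED
    being_unshelved = task_state == UNSHELVING
    return offloaded and not being_unshelved


def usage_sign(vm_state, task_state):
    """Return -1 to subtract the instance's usage from the host, +1 to add it."""
    return -1 if should_free_resources(vm_state, task_state) else 1
```

With this sketch, an offloaded instance with no task in progress yields a sign of -1, while the same instance with an unshelve in progress yields +1, so its resources are claimed rather than freed a second time.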


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1587386

Title:
  Unshelve results in duplicated resource deallocated

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Description
  ===========

  Shelve/unshelve operations fail when using "NFV flavors". This was
  reported on the mailing list initially.

  http://lists.openstack.org/pipermail/openstack-dev/2016-May/095631.html

  Steps to reproduce
  ==================

  1. Create a flavor with 'hw:numa_nodes=2', 'hw:cpu_policy=dedicated' and 'hw:mempage_size=large'
  2. Configure Tempest to use this new flavor
  3. Run Tempest tests

  Expected result
  ===============

  All tests will pass.

  Actual result
  =============

  The shelve/unshelve Tempest tests always fail with a timeout exception
  similar to the following, from [1]:

      Traceback (most recent call last):
        File "tempest/api/compute/base.py", line 166, in server_check_teardown
          cls.server_id, 'ACTIVE')
        File "tempest/common/waiters.py", line 95, in wait_for_server_status
          raise exceptions.TimeoutException(message)
      2016-05-22 22:25:30.697 13974 ERROR tempest.api.compute.base TimeoutException: Request timed out
      Details: (ServerActionsTestJSON:tearDown) Server cae6fd47-0968-4922-a03e-3f2872e4eb52 failed to reach ACTIVE status and task state "None" within the required time (196 s). Current status: SHELVED_OFFLOADED. Current task state: None.

  The following errors are raised in the compute logs:

      Traceback (most recent call last):
        File "/opt/stack/new/nova/nova/compute/manager.py", line 4230, in _unshelve_instance
          with rt.instance_claim(context, instance, limits):
        File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 271, in inner
          return f(*args, **kwargs)
        File "/opt/stack/new/nova/nova/compute/resource_tracker.py", line 151, in instance_claim
          self._update_usage_from_instance(context, instance_ref)
        File "/opt/stack/new/nova/nova/compute/resource_tracker.py", line 827, in _update_usage_from_instance
          self._update_usage(instance, sign=sign)
        File "/opt/stack/new/nova/nova/compute/resource_tracker.py", line 666, in _update_usage
          self.compute_node, usage, free)
        File "/opt/stack/new/nova/nova/virt/hardware.py", line 1482, in get_host_numa_usage_from_instance
          host_numa_topology, instance_numa_topology, free=free))
        File "/opt/stack/new/nova/nova/virt/hardware.py", line 1348, in numa_usage_from_instances
          newcell.unpin_cpus(pinned_cpus)
        File "/opt/stack/new/nova/nova/objects/numa.py", line 94, in unpin_cpus
          pinned=list(self.pinned_cpus))
      CPUPinningInvalid: Cannot pin/unpin cpus [6] from the following pinned set [0, 2, 4]
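
  The final error in the traceback above is what a second deallocation looks like from the pinning side: the CPU was already released once, so it is no longer in the pinned set. A minimal toy model follows. ToyNUMACell is a hypothetical stand-in, not nova's NUMA cell object, and the CPU numbers are chosen only to match the logged message.

```python
class CPUPinningInvalid(Exception):
    """Toy stand-in for the exception named in the traceback."""


class ToyNUMACell:
    """Hypothetical, simplified model of a host NUMA cell's pinned CPU set."""

    def __init__(self, cpuset):
        self.cpuset = set(cpuset)
        self.pinned_cpus = set()

    def _fail(self, cpus):
        raise CPUPinningInvalid(
            "Cannot pin/unpin cpus %s from the following pinned set %s"
            % (sorted(cpus), sorted(self.pinned_cpus)))

    def pin_cpus(self, cpus):
        cpus = set(cpus)
        # Only CPUs that belong to the cell and are currently free can be pinned.
        if not cpus <= self.cpuset or cpus & self.pinned_cpus:
            self._fail(cpus)
        self.pinned_cpus |= cpus

    def unpin_cpus(self, cpus):
        cpus = set(cpus)
        # Only CPUs that are currently pinned can be released.
        if not cpus <= self.pinned_cpus:
            self._fail(cpus)
        self.pinned_cpus -= cpus
```

  In this model, pinning CPUs 0, 2, 4 and 6 and then unpinning CPU 6 twice succeeds the first time and raises CPUPinningInvalid the second time, which corresponds to the duplicated deallocation described in this bug.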

  [1] http://intel-openstack-ci-logs.ovh/86/319686/1/check/tempest-dsvm-full-nfv/b463722/testr_results.html.gz

  Environment
  ===========

  1. Exact version of OpenStack you are running: commit '25fdf64'.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1587386/+subscriptions

