yahoo-eng-team team mailing list archive

[Bug 1641750] [NEW] PCI devices are sometimes not freed after a migration

 

Public bug reported:

Description
===========

During stress testing of cold migration, it has been observed that
sometimes the PCI devices are not freed by the resource tracker on the
source node.

If the periodic resource audit on the source node runs in the middle of
the migration, the instance uuid is moved from tracked_migrations to
tracked_instances.  When that happens the PCI devices are not freed,
because the current logic only checks tracked_migrations (see
https://github.com/openstack/nova/blob/master/nova/compute/resource_tracker.py#L355).
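
The race described above can be illustrated with a toy model.  This is
only a sketch of the bookkeeping as the report describes it, not nova's
actual resource tracker: the class ToyResourceTracker, its methods, the
claimed_pci and free_pci attributes, and the PCI address are all
hypothetical.  Only the names tracked_migrations, tracked_instances and
confirm_resize come from the report.

```python
class ToyResourceTracker:
    """Toy model of the source-node bookkeeping; not nova's real class."""

    def __init__(self):
        self.tracked_migrations = {}  # instance uuid -> migration record
        self.tracked_instances = {}   # instance uuid -> instance record
        self.claimed_pci = {}         # instance uuid -> claimed PCI addresses
        self.free_pci = []            # PCI addresses available for new claims

    def start_migration(self, uuid, pci_addrs):
        # The outgoing instance is tracked as a migration and keeps its devices.
        self.tracked_migrations[uuid] = {"uuid": uuid}
        self.claimed_pci[uuid] = list(pci_addrs)

    def periodic_audit(self):
        # Assumed from the report: an audit that runs mid-migration re-files
        # the instance uuid under tracked_instances.
        for uuid in list(self.tracked_migrations):
            self.tracked_instances[uuid] = self.tracked_migrations.pop(uuid)

    def confirm_resize(self, uuid):
        # Mirrors the reported logic: only tracked_migrations is consulted,
        # so an instance moved by the audit keeps its PCI claim.
        if uuid in self.tracked_migrations:
            del self.tracked_migrations[uuid]
            self.free_pci.extend(self.claimed_pci.pop(uuid))


def run(audit_in_the_middle):
    """Return the devices freed by confirm_resize in one toy migration."""
    rt = ToyResourceTracker()
    rt.start_migration("uuid-1", ["0000:81:10.0"])
    if audit_in_the_middle:
        rt.periodic_audit()
    rt.confirm_resize("uuid-1")
    return rt.free_pci
```

In this model run(False) frees the device at confirm time, while
run(True) leaves it claimed, which is the behaviour reported here.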

Steps to reproduce
==================

1) Boot a guest with an SR-IOV device.
2) Migrate the guest and confirm the migration.
3) Repeat step 2 over and over.
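
The loop in steps 2 and 3 could be driven by a small harness like the
one below.  It is a sketch under assumptions, not the script used for
this report: stress_cold_migration and its three callables (migrate,
confirm, free_pci_count) are hypothetical hooks that the caller would
implement against their own deployment, with free_pci_count returning
the total number of free PCI devices across the hosts involved.

```python
def stress_cold_migration(migrate, confirm, free_pci_count, iterations=100):
    """Repeat migrate + confirm and report iterations that leaked devices.

    All three callables are hypothetical hooks supplied by the caller.
    The guest is assumed to be booted already, so the free-device total
    measured up front is the value expected after every confirmed
    migration (the source frees what the destination claims).
    """
    baseline = free_pci_count()
    leaked = []
    for i in range(iterations):
        migrate()
        confirm()
        # Fewer free devices than the baseline right after the confirm
        # means the source node is still holding its claim.
        if free_pci_count() < baseline:
            leaked.append(i)
    return leaked
```

Because a leaked device stays claimed until something releases it, every
iteration after the first leak is also reported unless the count
recovers in between.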

Expected result
===============

The PCI devices should be freed right away, during confirm_resize.  PCI
resources such as PCI passthrough devices are limited in number, so they
should not stay allocated until the next periodic audit, which is what
happens in this case.

Actual result
=============

The PCI devices are not freed during the confirm_resize stage.

Environment
===========

$ git log -1
commit 633c817de5a67e798d8610d0df1135e5a568fd8a
Author: Matt Riedemann <mriedem@xxxxxxxxxx>
Date:   Sat Nov 12 11:59:13 2016 -0500

    api-ref: fix server_id in metadata docs
    
    The api-ref was saying that the server_id was in the body of the
    server metadata requests but it's actually in the path for all
    of the requests.
    
    Change-Id: Icdecd980767f89ee5fcc5bdd4802b2c263268a26
    Closes-Bug: #1641331

** Affects: nova
     Importance: Undecided
     Assignee: Ludovic Beliveau (ludovic-beliveau)
         Status: In Progress

https://bugs.launchpad.net/bugs/1641750
