yahoo-eng-team team mailing list archive

[Bug 1512880] [NEW] Failed cold migration with SR-IOV

 

Public bug reported:

Cold migration of an instance that has an SR-IOV interface fails
because nova on the destination compute tries to use the PCI
device/address that was allocated on the source compute. This fails
because that PCI device is not present on the destination compute.

See the error "libvirtError: Device 0000:83:10.6 not found: could not
access /sys/bus/pci/devices/0000:83:10.6/config: No such file or
directory" in the log in the attachment.
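
The libvirt error above comes down to a missing sysfs entry, so the same
condition can be checked by hand on either compute. The following is only a
minimal sketch: the function name pci_device_present and the SYSFS_PCI_ROOT
override are hypothetical, added here for illustration and testing, and are
not part of nova or libvirt.

```shell
# Hypothetical helper: succeed if a PCI address has a config file in sysfs.
# SYSFS_PCI_ROOT is an assumed override so the check can be pointed at a
# test directory; by default the real sysfs path from the error is used.
pci_device_present() {
    pci_root="${SYSFS_PCI_ROOT:-/sys/bus/pci/devices}"
    [ -e "$pci_root/$1/config" ]
}

# Address taken from the error message in this report.
if pci_device_present 0000:83:10.6; then
    echo "0000:83:10.6 present"
else
    echo "0000:83:10.6 missing"
fi
```

Running this on the compute that raised the error should report the device
as missing, since that address only exists on the other server.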

Nova should allocate a new PCI device based on the hardware configuration
of the compute to which the instance is being migrated, and this PCI
device should be used to create the instance XML.
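
To see which VF addresses are actually available on a given compute, the
virtfnN symlinks that the kernel creates under the physical function in
sysfs can be listed. This is a sketch only: list_vf_addresses and
SYSFS_PCI_ROOT are hypothetical names used for illustration, and nova's own
PCI tracking does not work this way.

```shell
# Hypothetical helper: print the PCI addresses of the VFs under a PF.
# SYSFS_PCI_ROOT is an assumed override so the function can be aimed at a
# test directory instead of the real sysfs tree.
list_vf_addresses() {
    pf_dir="${SYSFS_PCI_ROOT:-/sys/bus/pci/devices}/$1"
    for link in "$pf_dir"/virtfn*; do
        # Skip the unexpanded glob when the PF has no VFs.
        [ -L "$link" ] || continue
        # Each virtfnN is a symlink to the VF's own device directory.
        basename "$(readlink "$link")"
    done
}

# PF address of the controller/compute server in the setup below.
list_vf_addresses 0000:85:00.0
```

The addresses printed on the destination compute are the only ones that can
validly appear in the instance XML generated there.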

Nova version:
commit 2397d636ff6ea3767fe62ee681d609fce4fc98ca
Author: OpenStack Proposal Bot <openstack-infra@xxxxxxxxxxxxxxxxxxx>
Date:   Tue Oct 27 06:30:34 2015 +0000

    Imported Translations from Zanata
   
    For more information about this automatic import see:
    https://wiki.openstack.org/wiki/Translations/Infrastructure
   
    Change-Id: I38f537e37972e5ddae13d388021412d85f6be898

Devstack setup:

* One server configured with controller and compute functions
    * Intel 10G port configured with 8 VFs: $ echo 8 > /sys/bus/pci/devices/0000\:85\:00.0/sriov_numvfs
    * /etc/nova/nova.conf: pci_passthrough_whitelist = {"address":"*:85:10.*","physical_network":"default"}
* One server configured with the compute function only
    * Intel 10G port configured with 8 VFs: $ echo 8 > /sys/bus/pci/devices/0000\:83\:00.0/sriov_numvfs
    * /etc/nova/nova.conf: pci_passthrough_whitelist = {"address":"*:83:10.*","physical_network":"default"}
* Note: it is important for this test that the PCI addresses of the SR-IOV interfaces differ between the two servers, so that we can validate that new PCI devices are claimed/allocated on the destination compute.
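
The effect of the two whitelists can be illustrated with plain shell glob
matching. This is only an approximation under stated assumptions:
pci_addr_matches is a hypothetical helper, and nova parses and matches
pci_passthrough_whitelist entries with its own logic, not with shell globs.

```shell
# Hypothetical helper: glob-match a PCI address against an address pattern.
#   $1 = PCI address, $2 = pattern such as "*:85:10.*"
pci_addr_matches() {
    case "$1" in
        $2) return 0 ;;   # unquoted so the pattern is treated as a glob
        *)  return 1 ;;
    esac
}

# Address from the error message, tested against both whitelist patterns.
addr="0000:83:10.6"
for pattern in "*:85:10.*" "*:83:10.*"; do
    if pci_addr_matches "$addr" "$pattern"; then
        echo "$addr matches $pattern"
    else
        echo "$addr does not match $pattern"
    fi
done
```

The address from the error matches only the "*:83:10.*" pattern, so it
belongs to the compute-only server and cannot be valid on the other one.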

Steps to reproduce:

1) Boot an instance with an SR-IOV interface:

$ NETID=`neutron net-list | grep default | awk '{print $2}'`
$ neutron port-create $NETID --binding:vnic-type direct --name p-direct
$ PORTID=`neutron port-list | grep "p-direct" | awk '{print $2}'`
$ nova boot test --image=ubuntu --nic port-id=$PORTID --flavor=m1.small

2) Migrate the instance to the other compute:

$ nova migrate test

Expected result:

The instance is successfully migrated to the other server.

Actual result:

The instance fails to migrate and is left stuck in the error state.  See
the attached log for more information.

** Affects: nova
     Importance: Undecided
         Status: New

** Attachment added: "n-cpu.log"
   https://bugs.launchpad.net/bugs/1512880/+attachment/4512352/+files/n-cpu.log

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1512880

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1512880/+subscriptions

