yahoo-eng-team team mailing list archive

[Bug 1853445] [NEW] pci dev_type type-PCI passthrough failed

 

Public bug reported:

Description
===========
Passthrough of a PCI device with dev_type type-PCI fails.

Steps to reproduce
==================
The compute node has two PCI devices, 04:00.0 and 04:00.1, both belonging to a Mellanox network card. 04:00.0 has no SR-IOV configured; 04:00.1 has SR-IOV configured, with a maximum of 8 VFs.
nova-scheduler config:
alias={"vendor_id":"15b3","product_id":"1015","device_type":"type-PCI","numa_policy":"preferred","name":"mellanox-0"}
openstack flavor create --ram 1024 --disk 32 --vcpus 2 --property "pci_passthrough:alias"="mellanox-0:1" passthrough-flavor
openstack server create --flavor passthrough-flavor --image centos75-raw --network provider --wait passthrough-server
Instance creation fails.

Analysis
========
virsh #  nodedev-dumpxml pci_0000_04_00_0
<device>
  <name>pci_0000_04_00_0</name>
  <path>/sys/devices/pci0000:00/0000:00:02.0/0000:04:00.0</path>
  <parent>pci_0000_00_02_0</parent>
  <driver>
    <name>vfio-pci</name>
  </driver>
  <capability type='pci'>
    <domain>0</domain>
    <bus>4</bus>
    <slot>0</slot>
    <function>0</function>
    <product id='0x1015'>MT27710 Family [ConnectX-4 Lx]</product>
    <vendor id='0x15b3'>Mellanox Technologies</vendor>
    <capability type='virt_functions' maxCount='8'/>
    <iommuGroup number='48'>
      <address domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </iommuGroup>
    <numa node='0'/>
    <pci-express>
      <link validity='cap' port='0' speed='8' width='8'/>
      <link validity='sta' speed='8' width='8'/>
    </pci-express>
  </capability>
</device>


virsh #  nodedev-dumpxml pci_0000_04_00_1
<device>
  <name>pci_0000_04_00_1</name>
  <path>/sys/devices/pci0000:00/0000:00:02.0/0000:04:00.1</path>
  <parent>pci_0000_00_02_0</parent>
  <driver>
    <name>mlx5_core</name>
  </driver>
  <capability type='pci'>
    <domain>0</domain>
    <bus>4</bus>
    <slot>0</slot>
    <function>1</function>
    <product id='0x1015'>MT27710 Family [ConnectX-4 Lx]</product>
    <vendor id='0x15b3'>Mellanox Technologies</vendor>
    <capability type='virt_functions' maxCount='8'>
      <address domain='0x0000' bus='0x04' slot='0x01' function='0x2'/>
      <address domain='0x0000' bus='0x04' slot='0x01' function='0x3'/>
      <address domain='0x0000' bus='0x04' slot='0x01' function='0x4'/>
      <address domain='0x0000' bus='0x04' slot='0x01' function='0x5'/>
      <address domain='0x0000' bus='0x04' slot='0x01' function='0x6'/>
      <address domain='0x0000' bus='0x04' slot='0x01' function='0x7'/>
      <address domain='0x0000' bus='0x04' slot='0x02' function='0x0'/>
      <address domain='0x0000' bus='0x04' slot='0x02' function='0x1'/>
    </capability>
    <iommuGroup number='49'>
      <address domain='0x0000' bus='0x04' slot='0x00' function='0x1'/>
    </iommuGroup>
    <numa node='0'/>
    <pci-express>
      <link validity='cap' port='0' speed='8' width='8'/>
      <link validity='sta' speed='8' width='8'/>
    </pci-express>
  </capability>
</device>


virsh #  nodedev-dumpxml pci_0000_04_01_2
<device>
  <name>pci_0000_04_01_2</name>
  <path>/sys/devices/pci0000:00/0000:00:02.0/0000:04:01.2</path>
  <parent>pci_0000_00_02_0</parent>
  <driver>
    <name>mlx5_core</name>
  </driver>
  <capability type='pci'>
    <domain>0</domain>
    <bus>4</bus>
    <slot>1</slot>
    <function>2</function>
    <product id='0x1016'>MT27710 Family [ConnectX-4 Lx Virtual Function]</product>
    <vendor id='0x15b3'>Mellanox Technologies</vendor>
    <capability type='phys_function'>
      <address domain='0x0000' bus='0x04' slot='0x00' function='0x1'/>
    </capability>
    <iommuGroup number='58'>
      <address domain='0x0000' bus='0x04' slot='0x01' function='0x2'/>
    </iommuGroup>
    <numa node='0'/>
    <pci-express>
      <link validity='cap' port='0' speed='8' width='8'/>
      <link validity='sta' width='0'/>
    </pci-express>
  </capability>
</device>


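Both the plain PCI device (04:00.0) and the PF (04:00.1) report a capability of type virt_functions. The difference is that the plain PCI device's <capability> element contains no <address> entries, whereas the PF's <capability> element lists an <address> for each of its VFs.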
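The difference between the two dumps can be checked mechanically. The following is a minimal sketch, not nova code: it parses trimmed copies of the nodedev XML above with the standard library and counts the <address> entries under the virt_functions capability. The helper name count_vf_addresses and the shortened XML strings are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copies of the nodedev dumps above; only the elements that
# matter for the comparison are kept.
PLAIN_PCI_XML = """
<device>
  <name>pci_0000_04_00_0</name>
  <capability type='pci'>
    <capability type='virt_functions' maxCount='8'/>
  </capability>
</device>
"""

PF_XML = """
<device>
  <name>pci_0000_04_00_1</name>
  <capability type='pci'>
    <capability type='virt_functions' maxCount='8'>
      <address domain='0x0000' bus='0x04' slot='0x01' function='0x2'/>
      <address domain='0x0000' bus='0x04' slot='0x01' function='0x3'/>
    </capability>
  </capability>
</device>
"""


def count_vf_addresses(nodedev_xml):
    """Count <address> entries under the virt_functions capability.

    Returns None when the device has no virt_functions capability at all.
    (Hypothetical helper, not part of nova or libvirt.)
    """
    root = ET.fromstring(nodedev_xml)
    cap = root.find(
        "./capability[@type='pci']/capability[@type='virt_functions']")
    if cap is None:
        return None
    return len(cap.findall("address"))


if __name__ == "__main__":
    print("04:00.0 VF addresses:", count_vf_addresses(PLAIN_PCI_XML))
    print("04:00.1 VF addresses:", count_vf_addresses(PF_XML))
```

Both devices carry the capability, so a check on the capability type alone cannot tell them apart; only the address count differs.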

Now look at the _get_device_type function in nova/virt/libvirt/driver.py:
        def _get_device_type(cfgdev, pci_address):
            """Get a PCI device's device type.

            An assignable PCI device can be a normal PCI device,
            a SR-IOV Physical Function (PF), or a SR-IOV Virtual
            Function (VF). Only normal PCI devices or SR-IOV VFs
            are assignable, while SR-IOV PFs are always owned by
            hypervisor.
            """
            for fun_cap in cfgdev.pci_capability.fun_capability:
                if fun_cap.type == 'virt_functions':
                    return {
                        'dev_type': fields.PciDeviceType.SRIOV_PF,
                    }
                if (fun_cap.type == 'phys_function' and
                    len(fun_cap.device_addrs) != 0):
                    phys_address = "%04x:%02x:%02x.%01x" % (
                        fun_cap.device_addrs[0][0],
                        fun_cap.device_addrs[0][1],
                        fun_cap.device_addrs[0][2],
                        fun_cap.device_addrs[0][3])
                    result = {
                        'dev_type': fields.PciDeviceType.SRIOV_VF,
                        'parent_addr': phys_address,
                    }
                    parent_ifname = None
                    try:
                        parent_ifname = pci_utils.get_ifname_by_pci_address(
                            pci_address, pf_interface=True)
                    except exception.PciDeviceNotFoundById:
                        # NOTE(sean-k-mooney): we ignore this error as it
                        # is expected when the virtual function is not a NIC.
                        pass
                    if parent_ifname:
                        result['parent_ifname'] = parent_ifname
                    return result

            return {'dev_type': fields.PciDeviceType.STANDARD}


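I think this is the key point:

                if fun_cap.type == 'virt_functions':
                    return {
                        'dev_type': fields.PciDeviceType.SRIOV_PF,
                    }

The plain PCI device and the PF both have a capability of type virt_functions, so both are reported as SRIOV_PF, and the type-PCI alias presumably cannot match 04:00.0. I suggest also checking whether the capability lists any VF addresses:

                if fun_cap.type == 'virt_functions':
                    if len(fun_cap.device_addrs) != 0:
                        return {
                            'dev_type': fields.PciDeviceType.SRIOV_PF,
                        }
                    else:
                        return {'dev_type': fields.PciDeviceType.STANDARD}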
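To show the effect of the proposed check in isolation, here is a small standalone sketch. It is not the nova implementation: FunCap is a hypothetical stand-in for the libvirt config objects, classify is a made-up name, and the three string constants are assumed stand-ins for the fields.PciDeviceType values.

```python
from collections import namedtuple

# Hypothetical stand-in for a function capability parsed from nodedev XML.
FunCap = namedtuple("FunCap", ["type", "device_addrs"])

# Assumed stand-ins for fields.PciDeviceType.{STANDARD,SRIOV_PF,SRIOV_VF}.
STANDARD = "type-PCI"
SRIOV_PF = "type-PF"
SRIOV_VF = "type-VF"


def classify(fun_caps):
    """Classify a device from its function capabilities.

    A virt_functions capability only marks the device as a PF when it
    actually lists VF addresses; with an empty list the device is
    treated as a standard PCI device, as proposed in this report.
    """
    for fun_cap in fun_caps:
        if fun_cap.type == "virt_functions":
            if len(fun_cap.device_addrs) != 0:
                return SRIOV_PF
            return STANDARD
        if (fun_cap.type == "phys_function" and
                len(fun_cap.device_addrs) != 0):
            return SRIOV_VF
    return STANDARD


if __name__ == "__main__":
    # 04:00.0: capability present, no VF addresses.
    print(classify([FunCap("virt_functions", [])]))
    # 04:00.1: capability present with VF addresses.
    print(classify([FunCap("virt_functions", [(0, 4, 1, 2), (0, 4, 1, 3)])]))
    # 04:01.2: a VF pointing back at its PF.
    print(classify([FunCap("phys_function", [(0, 4, 0, 1)])]))
```

With this check, a device that is SR-IOV capable but has no VFs enabled is classified as a standard device, while devices with VFs enabled keep their PF classification.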

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1853445

Title:
  pci dev_type type-PCI passthrough failed

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1853445/+subscriptions

