yahoo-eng-team team mailing list archive

[Bug 1628301] Re: SR-IOV not working in Mitaka and Intel X series NIC

 

Adding Neutron, since I believe the issue is that the
neutron-sriov-nic-agent is not building the port, which prevents nova
from allocating it to the instance.

** Also affects: neutron
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1628301

Title:
  SR-IOV not working in Mitaka and Intel X series NIC

Status in neutron:
  New
Status in OpenStack Compute (nova):
  New

Bug description:
  The SR-IOV functionality in Mitaka seems broken; all configuration
  options we evaluated lead to

   NovaException: Unexpected vif_type=binding_failed

  errors (stack trace below).
  We are currently using the following code base, along with the SR-IOV
  configuration posted below:

  Nova SHA 611efbe77c712d9ac35904f659d28dd0f0c1b3ff # HEAD of "stable/mitaka" as of 08.09.2016
  Neutron SHA c73269fa480a8a955f440570fc2fa6c347e3bb3c # HEAD of "stable/mitaka" as of 08.09.2016

  Stack trace:

  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] Traceback (most recent call last):
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]   File "/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/compute/manager.py", line 2218, in _build_resources
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]     yield resources
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]   File "/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/compute/manager.py", line 2064, in _build_and_run_instance
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]     block_device_info=block_device_info)
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]   File "/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2776, in spawn
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]     write_to_disk=True)
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]   File "/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4729, in _get_guest_xml
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]     context)
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]   File "/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4595, in _get_guest_config
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]     flavor, virt_type, self._host)
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]   File "/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/virt/libvirt/vif.py", line 447, in get_config
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]     _("Unexpected vif_type=%s") % vif_type)
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] NovaException: Unexpected vif_type=binding_failed
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]

  Interestingly, the nova resource tracker seems to be able to create a
  list of all available SR-IOV devices, and they show up correctly in
  the database as pci_device table entries:

  2016-09-27 16:13:52.175 10248 INFO nova.compute.resource_tracker [req-284a7832-3794-4597-b939-273ea75d45f7 - - - - -] Total usable vcpus: 32, total allocated vcpus: 0
  2016-09-27 16:13:52.175 10248 INFO nova.compute.resource_tracker [req-284a7832-3794-4597-b939-273ea75d45f7 - - - - -] Final resource view: name=compute01 phys_ram=257777MB used_ram=2048MB phys_disk=1935GB used_disk=2GB total_vcpus=32 used_vcpus=0 pci_stats=[PciDevicePool(count=15,numa_node=None,product_id='10ed',tags={dev_type='type-VF',physical_network='physnet1'},vendor_id='8086'), PciDevicePool(count=2,numa_node=None,product_id='10fb',tags={dev_type='type-PF',physical_network='physnet1'},vendor_id='8086')]

  Available ports inside DB:
  +-----------------+--------------+------------+-----------+----------+------------------+-----------+
  | compute_node_id | address      | product_id | vendor_id | dev_type | dev_id           | status    |
  +-----------------+--------------+------------+-----------+----------+------------------+-----------+
  |               5 | 0000:88:10.1 | 10ed       | 8086      | type-VF  | pci_0000_88_10_1 | available |
  |               5 | 0000:88:10.3 | 10ed       | 8086      | type-VF  | pci_0000_88_10_3 | available |
  |               5 | 0000:88:10.5 | 10ed       | 8086      | type-VF  | pci_0000_88_10_5 | available |
  |               5 | 0000:88:10.7 | 10ed       | 8086      | type-VF  | pci_0000_88_10_7 | available |
  |               5 | 0000:88:11.1 | 10ed       | 8086      | type-VF  | pci_0000_88_11_1 | available |
  |               5 | 0000:88:11.3 | 10ed       | 8086      | type-VF  | pci_0000_88_11_3 | available |
  |               5 | 0000:88:11.5 | 10ed       | 8086      | type-VF  | pci_0000_88_11_5 | available |
  |               5 | 0000:88:11.7 | 10ed       | 8086      | type-VF  | pci_0000_88_11_7 | available |
  |               5 | 0000:88:12.1 | 10ed       | 8086      | type-VF  | pci_0000_88_12_1 | available |
  |               5 | 0000:88:12.3 | 10ed       | 8086      | type-VF  | pci_0000_88_12_3 | available |
  |               5 | 0000:88:12.5 | 10ed       | 8086      | type-VF  | pci_0000_88_12_5 | available |
  |               5 | 0000:88:12.7 | 10ed       | 8086      | type-VF  | pci_0000_88_12_7 | available |
  |               5 | 0000:88:13.1 | 10ed       | 8086      | type-VF  | pci_0000_88_13_1 | available |
  |               5 | 0000:88:13.3 | 10ed       | 8086      | type-VF  | pci_0000_88_13_3 | available |
  |               5 | 0000:88:13.5 | 10ed       | 8086      | type-VF  | pci_0000_88_13_5 | available |
  |               5 | 0000:88:00.0 | 10fb       | 8086      | type-PF  | pci_0000_88_00_0 | available |
  |               5 | 0000:88:00.1 | 10fb       | 8086      | type-PF  | pci_0000_88_00_1 | available |
  |               2 | 0000:88:11.5 | 10ed       | 8086      | type-VF  | pci_0000_88_11_5 | available |
  |               2 | 0000:88:10.5 | 10ed       | 8086      | type-VF  | pci_0000_88_10_5 | available |
  |               2 | 0000:88:11.1 | 10ed       | 8086      | type-VF  | pci_0000_88_11_1 | available |
  |               2 | 0000:88:10.7 | 10ed       | 8086      | type-VF  | pci_0000_88_10_7 | available |
  |               2 | 0000:88:10.1 | 10ed       | 8086      | type-VF  | pci_0000_88_10_1 | available |
  |               2 | 0000:88:10.3 | 10ed       | 8086      | type-VF  | pci_0000_88_10_3 | available |
  |               2 | 0000:88:11.3 | 10ed       | 8086      | type-VF  | pci_0000_88_11_3 | available |
  +-----------------+--------------+------------+-----------+----------+------------------+-----------+
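  One way to cross-check the table against the pci_stats counts logged
  by the resource tracker is to tally the rows per compute node and
  device type. The following is only a rough sketch: it assumes the
  table has been saved as text in the MySQL client format shown above,
  and the helper names parse_rows and count_by_node_and_type are made
  up for illustration. Only a few rows are reproduced here.

```python
from collections import Counter

# A few rows copied from the table above; the real table has more entries.
TABLE = """\
+-----------------+--------------+------------+-----------+----------+------------------+-----------+
| compute_node_id | address      | product_id | vendor_id | dev_type | dev_id           | status    |
+-----------------+--------------+------------+-----------+----------+------------------+-----------+
|               5 | 0000:88:10.1 | 10ed       | 8086      | type-VF  | pci_0000_88_10_1 | available |
|               5 | 0000:88:00.0 | 10fb       | 8086      | type-PF  | pci_0000_88_00_0 | available |
|               2 | 0000:88:11.5 | 10ed       | 8086      | type-VF  | pci_0000_88_11_5 | available |
+-----------------+--------------+------------+-----------+----------+------------------+-----------+
"""


def parse_rows(text):
    """Turn a MySQL-style ASCII table into a list of dicts keyed by column name."""
    # Keep only header/data lines; the "+----+" separator lines are skipped.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip().startswith("|")]
    header = [cell.strip() for cell in lines[0].strip("|").split("|")]
    rows = []
    for ln in lines[1:]:
        cells = [cell.strip() for cell in ln.strip("|").split("|")]
        rows.append(dict(zip(header, cells)))
    return rows


def count_by_node_and_type(rows):
    """Count available devices per (compute_node_id, dev_type) pair."""
    return Counter(
        (row["compute_node_id"], row["dev_type"])
        for row in rows
        if row["status"] == "available"
    )


rows = parse_rows(TABLE)
counts = count_by_node_and_type(rows)
for (node, dev_type), n in sorted(counts.items()):
    print(node, dev_type, n)
```

  Run against the full table, the per-node totals can be compared with
  the count= values in the PciDevicePool entries of the log line above.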

  The NICs also seem to be available:

  # lspci -nnnn | grep 'Virtual Function'
  88:10.1 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
  88:10.3 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
  88:10.5 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
  88:10.7 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
  88:11.1 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
  88:11.3 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
  88:11.5 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
  88:11.7 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
  88:12.1 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
  88:12.3 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
  88:12.5 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
  88:12.7 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
  88:13.1 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
  88:13.3 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
  88:13.5 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)

  VFs:
  ip link show dev p4p2
  11: p4p2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
      link/ether 14:02:ec:68:49:65 brd ff:ff:ff:ff:ff:ff
      vf 0 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
      vf 1 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
      vf 2 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
      vf 3 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
      vf 4 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
      vf 5 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
      vf 6 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
      vf 7 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
      vf 8 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
      vf 9 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
      vf 10 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
      vf 11 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
      vf 12 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
      vf 13 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
      vf 14 MAC 00:00:00:00:00:00, spoof checking on, link-state auto

  iommu:
  [    0.000000] ACPI: DMAR 000000007b7fd000 000332 (v01 INTEL  INTEL ID 00000001    ? 00000001)
  [    0.096499] dmar: Host address width 46
  [    0.096502] dmar: DRHD base: 0x000000fbffc000 flags: 0x0
  [    0.096508] dmar: IOMMU 0: reg_base_addr fbffc000 ver 1:0 cap d2078c106f0466 ecap f020df
  [    0.096510] dmar: DRHD base: 0x000000c7ffc000 flags: 0x1
  [    0.096515] dmar: IOMMU 1: reg_base_addr c7ffc000 ver 1:0 cap d2078c106f0466 ecap f020df
  [    0.096517] dmar: RMRR base: 0x0000007916d000 end: 0x0000007916ffff
  [    0.096518] dmar: RMRR base: 0x000000791eb000 end: 0x000000791eefff
  [    0.096520] dmar: RMRR base: 0x000000791db000 end: 0x000000791eafff
  [    0.096521] dmar: RMRR base: 0x000000791c8000 end: 0x000000791d8fff
  [    0.096523] dmar: RMRR base: 0x000000791d9000 end: 0x000000791dafff
  [    0.096524] dmar: RMRR base: 0x0000005a7a1000 end: 0x0000005a7e0fff
  [    1.488485] DMAR: No ATSR found

  grep -i "Enabled IRQ" /var/log/dmesg
  [    0.097126] Enabled IRQ remapping in x2apic mode

  The neutron-sriov-nic-agent seems to be only doing

  2016-09-27 18:15:01.951 14081 DEBUG neutron.agent.linux.utils [req-0b259e93-97fc-45fa-8d2f-d0e408cc08dd - - - - -] Running command: ['sudo', '/openstack/venvs/neutron-13.3.4/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'link', 'show'] create_process /openstack/venvs/neutron-13.3.4/lib/python2.7/site-packages/neutron/agent/linux/utils.py:84
  2016-09-27 18:15:02.003 14081 DEBUG neutron.agent.linux.utils [req-0b259e93-97fc-45fa-8d2f-d0e408cc08dd - - - - -] Exit code: 0 execute /openstack/venvs/neutron-13.3.4/lib/python2.7/site-packages/neutron/agent/linux/utils.py:142

  in a loop, which may be OK, considering that the code mostly contains
  condition checks.
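  To see what information that repeated "ip link show" call exposes,
  the VF lines can be parsed into a mapping of VF index to MAC address.
  This is a minimal sketch and not the agent's actual code: the regular
  expression and the helper names parse_vfs and unassigned are
  assumptions chosen to match the p4p2 output pasted above.

```python
import re

# Shortened copy of the "ip link show dev p4p2" output from this report.
SAMPLE = """\
11: p4p2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 14:02:ec:68:49:65 brd ff:ff:ff:ff:ff:ff
    vf 0 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
    vf 1 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
"""

# Hypothetical pattern for lines such as "vf 3 MAC aa:bb:cc:dd:ee:ff, ...".
VF_LINE = re.compile(r"^\s*vf\s+(?P<index>\d+)\s+MAC\s+(?P<mac>[0-9a-fA-F:]{17})")


def parse_vfs(output):
    """Return {vf_index: mac} for every VF line found in the output."""
    vfs = {}
    for line in output.splitlines():
        match = VF_LINE.match(line)
        if match:
            vfs[int(match.group("index"))] = match.group("mac").lower()
    return vfs


def unassigned(vfs):
    """Return the sorted indexes of VFs whose MAC is still all zeros."""
    return sorted(i for i, mac in vfs.items() if mac == "00:00:00:00:00:00")


vfs = parse_vfs(SAMPLE)
print("VFs found:", len(vfs), "unassigned:", unassigned(vfs))
```

  In the output above every VF still carries an all-zero MAC, which is
  consistent with no port having been bound to a VF yet.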

  /etc/nova/nova.conf:
  [DEFAULT]
  pci_passthrough_whitelist = { "address":"*:88:*", "physical_network": "physnet1" }

  /etc/neutron/plugins/ml2/ml2_conf.ini:
   [ml2]
   type_drivers = flat,vlan,vxlan,local
   tenant_network_types = vxlan,vlan
   mechanism_drivers = linuxbridge,l2population,sriovnicswitch
   extension_drivers = port_security

   [m2_sriov]
   supported_pci_vendor_devs = 8086:10ed
   agent_required = True

  /etc/neutron/plugins/ml2/sriov_agent.ini:
   [DEFAULT]
   verbose = True
   debug = True

   [securitygroup]
   firewall_driver = neutron.agent.firewall.NoopFirewallDriver

   [sriov_nic]
   physical_device_mappings = physnet1:p4p2
   exclude_devices =

   [m2_sriov]
   supported_pci_vendor_devs = 8086:10ed
   agent_required = True
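  Since options in a misnamed section are usually ignored without any
  error, it can be worth listing the section names that the services
  are not expected to read. The sketch below is only an illustration:
  the set of expected group names is an assumption (in particular, it
  assumes the SR-IOV mechanism driver reads supported_pci_vendor_devs
  and agent_required from a group named ml2_sriov, as in the Mitaka
  networking guide), and unknown_sections is a made-up helper.

```python
import configparser

# Assumed group names; adjust to the options your release actually registers.
EXPECTED_GROUPS = {"ml2", "ml2_sriov", "securitygroup", "sriov_nic"}

# Excerpt of the sriov_agent.ini content from this report.
SAMPLE_INI = """\
[DEFAULT]
verbose = True
debug = True

[securitygroup]
firewall_driver = neutron.agent.firewall.NoopFirewallDriver

[sriov_nic]
physical_device_mappings = physnet1:p4p2
exclude_devices =

[m2_sriov]
supported_pci_vendor_devs = 8086:10ed
agent_required = True
"""


def unknown_sections(ini_text, expected=EXPECTED_GROUPS):
    """Return the sorted section names that are not in the expected set."""
    # strict=False tolerates repeated sections; interpolation is disabled
    # so that '%' characters in values are left alone.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(ini_text)
    # sections() never includes DEFAULT, so it needs no special handling.
    return sorted(name for name in parser.sections() if name not in expected)


print(unknown_sections(SAMPLE_INI))
```

  With the assumed group names, the check flags the m2_sriov section in
  both files above, which may be worth ruling out as a cause of the
  binding failure.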

  Sample network configuration:

  neutron net-create --provider:physical_network=physnet1 \
    --provider:network_type=vlan --provider:segmentation_id=123 --shared \
    INSIDE_NET

  neutron subnet-create INSIDE_NET 1.2.3.0/22 --name INSIDE_SUBNET1 \
    --gateway=1.2.3.1 --allocation-pool start=1.2.3.11,end=1.2.6.254 \
    --dns-nameservers list=true 8.8.8.8 8.8.4.4

  port_id=`neutron port-create INSIDE_NET --name sriov_port1 \
    --binding:vnic_type direct --device_owner nova-compute \
    | awk '/ id / { print $4 }'`

  Afterwards we tried to boot a simple instance with the --nic port-id
  $port_id parameter, which led to the error above.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1628301/+subscriptions

