yahoo-eng-team mailing list archive, Message #57265
[Bug 1628301] Re: SR-IOV not working in Mitaka and Intel X series NIC
Adding Neutron, since I believe the issue is that the neutron-sriov-nic-agent is not binding the port so that Nova can allocate it for the instance.
** Also affects: neutron
Importance: Undecided
Status: New
https://bugs.launchpad.net/bugs/1628301
Title:
SR-IOV not working in Mitaka and Intel X series NIC
Status in neutron:
New
Status in OpenStack Compute (nova):
New
Bug description:
The SR-IOV functionality in Mitaka seems broken; every configuration
option we evaluated leads to
NovaException: Unexpected vif_type=binding_failed
errors (full stack trace follows).
We are currently using this code base, along with the SR-IOV configuration posted below:
Nova SHA 611efbe77c712d9ac35904f659d28dd0f0c1b3ff # HEAD of "stable/mitaka" as of 08.09.2016
Neutron SHA c73269fa480a8a955f440570fc2fa6c347e3bb3c # HEAD of "stable/mitaka" as of 08.09.2016
Stack :
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] Traceback (most recent call last):
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] File "/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/compute/manager.py", line 2218, in _build_resources
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] yield resources
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] File "/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/compute/manager.py", line 2064, in _build_and_run_instance
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] block_device_info=block_device_info)
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] File "/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2776, in spawn
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] write_to_disk=True)
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] File "/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4729, in _get_guest_xml
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] context)
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] File "/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4595, in _get_guest_config
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] flavor, virt_type, self._host)
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] File "/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/virt/libvirt/vif.py", line 447, in get_config
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] _("Unexpected vif_type=%s") % vif_type)
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] NovaException: Unexpected vif_type=binding_failed
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]
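The exception itself comes from the libvirt vif driver, which only knows how to build interface XML for vif types it recognizes. A minimal, hypothetical re-creation of that dispatch (simplified names; not the actual Nova code) looks like:

```python
# Hypothetical sketch of the dispatch in nova/virt/libvirt/vif.py
# (heavily simplified; the real code handles many more vif types).

class NovaException(Exception):
    pass

def get_config(vif_type):
    # Each recognized vif_type maps to a libvirt <interface> builder.
    handlers = {
        'hw_veb': lambda: '<interface type="hostdev"/>',  # SR-IOV direct path
        'bridge': lambda: '<interface type="bridge"/>',
    }
    if vif_type not in handlers:
        # Neutron hands back vif_type=binding_failed when no mechanism
        # driver could bind the port; Nova has no handler for that value.
        raise NovaException("Unexpected vif_type=%s" % vif_type)
    return handlers[vif_type]()
```

In other words, the traceback above is Nova surfacing a Neutron-side port binding failure, not a PCI allocation problem on the compute host.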
Interestingly, the Nova resource tracker seems able to build a list of
all available SR-IOV devices, and they show up correctly in the
database as pci_devices table entries:
2016-09-27 16:13:52.175 10248 INFO nova.compute.resource_tracker [req-284a7832-3794-4597-b939-273ea75d45f7 - - - - -] Total usable vcpus: 32, total allocated vcpus: 0
2016-09-27 16:13:52.175 10248 INFO nova.compute.resource_tracker [req-284a7832-3794-4597-b939-273ea75d45f7 - - - - -] Final resource view: name=compute01 phys_ram=257777
MB used_ram=2048MB phys_disk=1935GB used_disk=2GB total_vcpus=32 used_vcpus=0 pci_stats=[PciDevicePool(count=15,numa_node=None,product_id='10ed',tags={dev_type='type-VF',physical_network='physnet1'},vendor
_id='8086'), PciDevicePool(count=2,numa_node=None,product_id='10fb',tags={dev_type='type-PF',physical_network='physnet1'},vendor_id='8086')]
Available ports inside DB:
+-----------------+--------------+------------+-----------+----------+------------------+-----------+
| compute_node_id | address | product_id | vendor_id | dev_type | dev_id | status |
+-----------------+--------------+------------+-----------+----------+------------------+-----------+
| 5 | 0000:88:10.1 | 10ed | 8086 | type-VF | pci_0000_88_10_1 | available |
| 5 | 0000:88:10.3 | 10ed | 8086 | type-VF | pci_0000_88_10_3 | available |
| 5 | 0000:88:10.5 | 10ed | 8086 | type-VF | pci_0000_88_10_5 | available |
| 5 | 0000:88:10.7 | 10ed | 8086 | type-VF | pci_0000_88_10_7 | available |
| 5 | 0000:88:11.1 | 10ed | 8086 | type-VF | pci_0000_88_11_1 | available |
| 5 | 0000:88:11.3 | 10ed | 8086 | type-VF | pci_0000_88_11_3 | available |
| 5 | 0000:88:11.5 | 10ed | 8086 | type-VF | pci_0000_88_11_5 | available |
| 5 | 0000:88:11.7 | 10ed | 8086 | type-VF | pci_0000_88_11_7 | available |
| 5 | 0000:88:12.1 | 10ed | 8086 | type-VF | pci_0000_88_12_1 | available |
| 5 | 0000:88:12.3 | 10ed | 8086 | type-VF | pci_0000_88_12_3 | available |
| 5 | 0000:88:12.5 | 10ed | 8086 | type-VF | pci_0000_88_12_5 | available |
| 5 | 0000:88:12.7 | 10ed | 8086 | type-VF | pci_0000_88_12_7 | available |
| 5 | 0000:88:13.1 | 10ed | 8086 | type-VF | pci_0000_88_13_1 | available |
| 5 | 0000:88:13.3 | 10ed | 8086 | type-VF | pci_0000_88_13_3 | available |
| 5 | 0000:88:13.5 | 10ed | 8086 | type-VF | pci_0000_88_13_5 | available |
| 5 | 0000:88:00.0 | 10fb | 8086 | type-PF | pci_0000_88_00_0 | available |
| 5 | 0000:88:00.1 | 10fb | 8086 | type-PF | pci_0000_88_00_1 | available |
| 2 | 0000:88:11.5 | 10ed | 8086 | type-VF | pci_0000_88_11_5 | available |
| 2 | 0000:88:10.5 | 10ed | 8086 | type-VF | pci_0000_88_10_5 | available |
| 2 | 0000:88:11.1 | 10ed | 8086 | type-VF | pci_0000_88_11_1 | available |
| 2 | 0000:88:10.7 | 10ed | 8086 | type-VF | pci_0000_88_10_7 | available |
| 2 | 0000:88:10.1 | 10ed | 8086 | type-VF | pci_0000_88_10_1 | available |
| 2 | 0000:88:10.3 | 10ed | 8086 | type-VF | pci_0000_88_10_3 | available |
| 2 | 0000:88:11.3 | 10ed | 8086 | type-VF | pci_0000_88_11_3 | available |
+-----------------+--------------+------------+-----------+----------+------------------+-----------+
Also, the NICs seem to be available just fine:
# lspci -nnnn | grep 'Virtual Function'
88:10.1 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
88:10.3 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
88:10.5 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
88:10.7 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
88:11.1 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
88:11.3 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
88:11.5 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
88:11.7 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
88:12.1 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
88:12.3 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
88:12.5 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
88:12.7 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
88:13.1 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
88:13.3 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
88:13.5 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
VFs:
ip link show dev p4p2
11: p4p2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 14:02:ec:68:49:65 brd ff:ff:ff:ff:ff:ff
vf 0 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 1 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 2 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 3 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 4 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 5 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 6 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 7 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 8 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 9 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 10 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 11 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 12 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 13 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 14 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
iommu:
[ 0.000000] ACPI: DMAR 000000007b7fd000 000332 (v01 INTEL INTEL ID 00000001 ? 00000001)
[ 0.096499] dmar: Host address width 46
[ 0.096502] dmar: DRHD base: 0x000000fbffc000 flags: 0x0
[ 0.096508] dmar: IOMMU 0: reg_base_addr fbffc000 ver 1:0 cap d2078c106f0466 ecap f020df
[ 0.096510] dmar: DRHD base: 0x000000c7ffc000 flags: 0x1
[ 0.096515] dmar: IOMMU 1: reg_base_addr c7ffc000 ver 1:0 cap d2078c106f0466 ecap f020df
[ 0.096517] dmar: RMRR base: 0x0000007916d000 end: 0x0000007916ffff
[ 0.096518] dmar: RMRR base: 0x000000791eb000 end: 0x000000791eefff
[ 0.096520] dmar: RMRR base: 0x000000791db000 end: 0x000000791eafff
[ 0.096521] dmar: RMRR base: 0x000000791c8000 end: 0x000000791d8fff
[ 0.096523] dmar: RMRR base: 0x000000791d9000 end: 0x000000791dafff
[ 0.096524] dmar: RMRR base: 0x0000005a7a1000 end: 0x0000005a7e0fff
[ 1.488485] DMAR: No ATSR found
grep -i "Enabled IRQ" /var/log/dmesg
[ 0.097126] Enabled IRQ remapping in x2apic mode
The neutron-sriov-nic-agent seems to be doing only
2016-09-27 18:15:01.951 14081 DEBUG neutron.agent.linux.utils [req-0b259e93-97fc-45fa-8d2f-d0e408cc08dd - - - - -] Running command: ['sudo', '/openstack/venvs/neutron-13.3.4/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'link', 'show'] create_process /openstack/venvs/neutron-13.3.4/lib/python2.7/site-packages/neutron/agent/linux/utils.py:84
2016-09-27 18:15:02.003 14081 DEBUG neutron.agent.linux.utils [req-0b259e93-97fc-45fa-8d2f-d0e408cc08dd - - - - -] Exit code: 0 execute /openstack/venvs/neutron-13.3.4/lib/python2.7/site-packages/neutron/agent/linux/utils.py:142
in a loop, which may be OK, considering that the code mostly contains
check conditions.
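For what it's worth, the per-VF lines in the `ip link show` output above can be scanned with something like the following (a hypothetical helper, much simplified from what the agent's embedded-switch manager actually does):

```python
import re

# Hypothetical parser for the "vf N MAC xx:xx:...," lines that
# `ip link show dev p4p2` prints for each virtual function.
VF_LINE = re.compile(r'^\s*vf (\d+) MAC (\S+),')

def parse_vf_macs(ip_link_output):
    """Return {vf_index: mac} for every per-VF line in the output."""
    vfs = {}
    for line in ip_link_output.splitlines():
        m = VF_LINE.match(line)
        if m:
            vfs[int(m.group(1))] = m.group(2)
    return vfs
```

All fifteen VFs above still carry the all-zero MAC, i.e. none has been assigned to an instance yet, which is consistent with the binding never succeeding.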
/etc/nova/nova.conf:
[DEFAULT]
pci_passthrough_whitelist = { "address":"*:88:*", "physical_network": "physnet1" }
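As a sanity check, the wildcard address in that whitelist should match every VF and PF listed above. Nova's real matching lives in its PCI devspec code and is more involved, but an illustrative glob matcher (hypothetical helper, not Nova code) behaves like this:

```python
from fnmatch import fnmatch

# Hypothetical illustration of how a wildcard whitelist address such as
# "*:88:*" matches PCI addresses of the form "domain:bus:slot.function".
def whitelist_matches(pattern, pci_address):
    domain, bus, rest = pci_address.split(':')
    p_domain, p_bus, p_rest = pattern.split(':')
    return (fnmatch(domain, p_domain)
            and fnmatch(bus, p_bus)
            and fnmatch(rest, p_rest))
```

With "*:88:*" every device on bus 88 (all the 10ed VFs and 10fb PFs above) is accepted, which matches the pci_stats the resource tracker reports.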
/etc/neutron/plugins/ml2/ml2_conf.ini:
[ml2]
type_drivers = flat,vlan,vxlan,local
tenant_network_types = vxlan,vlan
mechanism_drivers = linuxbridge,l2population,sriovnicswitch
extension_drivers = port_security
[m2_sriov]
supported_pci_vendor_devs = 8086:10ed
agent_required = True
/etc/neutron/plugins/ml2/sriov_agent.ini
[DEFAULT]
verbose = True
debug = True
[securitygroup]
firewall_driver = neutron.agent.firewall.NoopFirewallDriver
[sriov_nic]
physical_device_mappings = physnet1:p4p2
exclude_devices =
[m2_sriov]
supported_pci_vendor_devs = 8086:10ed
agent_required = True
Sample network configuration:
neutron net-create --provider:physical_network=physnet1 \
  --provider:network_type=vlan --provider:segmentation_id=123 --shared \
  INSIDE_NET
neutron subnet-create INSIDE_NET 1.2.3.0/22 --name INSIDE_SUBNET1 \
  --gateway=1.2.3.1 --allocation-pool start=1.2.3.11,end=1.2.6.254 \
  --dns-nameservers list=true 8.8.8.8 8.8.4.4
port_id=`neutron port-create INSIDE_NET --name sriov_port1 \
  --binding:vnic_type direct --device_owner nova-compute \
  | awk '/ id / { print $4 }'`
Afterwards we tried to boot a simple instance with the --nic
port-id=$port_id parameter, which leads to the error above.
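One way to catch this earlier than boot time is to inspect the port's binding attributes right after port-create: a port that the sriovnicswitch driver failed to bind comes back with binding:vif_type=binding_failed. A hypothetical checker over the port dict returned by the Neutron API (the `binding:*` keys are the standard ML2 portbindings extension fields):

```python
# Hypothetical diagnostic over a Neutron port dict; only the standard
# 'binding:vif_type' / 'binding:vnic_type' attributes are assumed.
def diagnose_binding(port):
    vif_type = port.get('binding:vif_type', 'unbound')
    if vif_type == 'binding_failed':
        return ('binding failed: no mechanism driver claimed '
                'vnic_type=%s' % port.get('binding:vnic_type'))
    return 'bound as vif_type=%s' % vif_type
```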
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1628301/+subscriptions