yahoo-eng-team team mailing list archive

[Bug 1556549] [NEW] too many qbr or qvo entries on compute node even though I have 7-8 instances on that compute node

 

Public bug reported:

I am seeing this weird behavior in our production environment. Launching an instance fails because the compute node and neutron are not cleaning up the qbr or qvo devices they created, even after we terminate the failed instance. Here are the logs from nova-conductor:
2016-03-08 01:35:49.478 14041 ERROR nova.scheduler.utils [req-6ec7ee4b-9663-4f1b-910a-a87d99ac941c c665814ae07a4f71b666d04fcb99c2e9 a0288bedbb884e07bc0c602e7a343de8 - - -] [instance: fa9c27b4-06dd-4c04-9647-44e1fb8c1a81] Error from last host: compute-42 (node compute-42): [u'Traceback (most recent call last):\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2254, in _do_build_and_run_instance\n    filter_properties)\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2400, in _build_and_run_instance\n    instance_uuid=instance.uuid, reason=six.text_type(e))\n', u"RescheduledException: Build of instance fa9c27b4-06dd-4c04-9647-44e1fb8c1a81 was re-scheduled: Error during following call to agent: ['ovs-vsctl', '--timeout=120', '--', '--if-exists', 'del-port', u'qvo3e44fa11-05', '--', 'add-port', 'br-int', u'qvo3e44fa11-05', '--', 'set', 'Interface', u'qvo3e44fa11-05', u'external-ids:iface-id=3e44fa11-05b5-44dc-8c0c-6b937fe7abe0', 'external-ids:iface-status=active', u'external-ids:attached-mac=fa:16:3e:60:aa:5e', 'external-ids:vm-uuid=fa9c27b4-06dd-4c04-9647-44e1fb8c1a81']\n"]

This qvo still exists on the compute node:-
[root@compute-42 rahul]# ifconfig | grep qvo3e44fa11-05
qvo3e44fa11-05: flags=4419<UP,BROADCAST,RUNNING,PROMISC,MULTICAST>  mtu 9000   <----- this still exists
[root@compute-42 rahul]# ifconfig | grep qvo | wc -l
392                   <------------------------ 392 qvo entries on this node
[root@compute-42 rahul]# ifconfig | grep tap | wc -l
8                        <----------------------- the compute node is running only 8 instances, yet it has 392 qvo-XX entries
[root@compute-42 rahul]#
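
To narrow down which of those qvo devices are leftovers, here is a minimal sketch that pairs devices by their shared port suffix. It assumes the usual hybrid OVS naming, where tap, qbr, qvb and qvo devices for one port all end in the first 11 characters of the port ID (as with qvo3e44fa11-05 and iface-id 3e44fa11-05b5-... in the log above). The helper name find_orphan_qvo is hypothetical and is not part of nova or neutron.

```python
import re

# Device names under hybrid plugging: a 3-letter prefix followed by the
# first 11 characters of the neutron port ID (8 hex, a dash, 2 hex).
_DEV = re.compile(r"^(tap|qbr|qvb|qvo)([0-9a-f]{8}-[0-9a-f]{2})$")


def find_orphan_qvo(interface_names):
    """Return qvo devices whose port suffix has no matching tap device.

    Hypothetical helper for illustration only. A qvo without a tap is
    only a *candidate* leftover: an instance that is still being built
    may not have its tap device yet.
    """
    by_prefix = {"tap": set(), "qbr": set(), "qvb": set(), "qvo": set()}
    for name in interface_names:
        # Tolerate ifconfig-style names that carry a trailing colon.
        match = _DEV.match(name.strip().rstrip(":"))
        if match:
            by_prefix[match.group(1)].add(match.group(2))
    orphans = by_prefix["qvo"] - by_prefix["tap"]
    return sorted("qvo" + suffix for suffix in orphans)


# Example use on a Linux compute node (interface names read from sysfs):
#   import os
#   print(find_orphan_qvo(os.listdir("/sys/class/net")))
```

This only lists candidates; it does not delete anything. Each reported device should be checked against the running instances before any manual cleanup.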

I am running the Kilo release with the RHEL 7 OpenStack RPMs.

Expected:-
Shouldn't the qvo and qvb devices be deleted if instance creation fails?

** Affects: nova
     Importance: Undecided
         Status: New

** Summary changed:

- too many qbr or qvo entries on compute node even though I have 2-3 instances on that compute node
+ too many qbr or qvo entries on compute node even though I have 7-8 instances on that compute node

** Description changed:

- I am seeing this wierd behavior in our production environment. Right now, we are seeing an issue where launching of an instance is failing since the compute node and neutron is not cleaning up the qbr or qvo it had created even after we try to terminate the failed instance. Here are the logs from nova-conductor:-
+ I am seeing this weird behavior in our production environment. Right now, we are seeing an issue where launching of an instance is failing since the compute node and neutron is not cleaning up the qbr or qvo it had created even after we try to terminate the failed instance. Here are the logs from nova-conductor:-
  2016-03-08 01:35:49.478 14041 ERROR nova.scheduler.utils [req-6ec7ee4b-9663-4f1b-910a-a87d99ac941c c665814ae07a4f71b666d04fcb99c2e9 a0288bedbb884e07bc0c602e7a343de8 - - -] [instance: fa9c27b4-06dd-4c04-9647-44e1fb8c1a81] Error from last host: compute-42 (node compute-42): [u'Traceback (most recent call last):\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2254, in _do_build_and_run_instance\n    filter_properties)\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2400, in _build_and_run_instance\n    instance_uuid=instance.uuid, reason=six.text_type(e))\n', u"RescheduledException: Build of instance fa9c27b4-06dd-4c04-9647-44e1fb8c1a81 was re-scheduled: Error during following call to agent: ['ovs-vsctl', '--timeout=120', '--', '--if-exists', 'del-port', u'qvo3e44fa11-05', '--', 'add-port', 'br-int', u'qvo3e44fa11-05', '--', 'set', 'Interface', u'qvo3e44fa11-05', u'external-ids:iface-id=3e44fa11-05b5-44dc-8c0c-6b937fe7abe0', 'external-ids:iface-status=active', u'external-ids:attached-mac=fa:16:3e:60:aa:5e', 'external-ids:vm-uuid=fa9c27b4-06dd-4c04-9647-44e1fb8c1a81']\n"]
  
  This qvo still exists on the compute node:-
  [root@compute-42 rahul]# ifconfig | grep qvo3e44fa11-05
  qvo3e44fa11-05: flags=4419<UP,BROADCAST,RUNNING,PROMISC,MULTICAST>  mtu 9000   <----- this still exists
  [root@compute-42 rahul]# ifconfig | grep qvo | wc -l
  392                   <------------------------ there are about 350+ such entries
  [root@compute-42 rahul]# ifconfig | grep tap | wc -l
  8                        <----------------------- the compute is running only 8 instances, still more than 350+ entries for qvo-XX alone
- [root@compute-42 rahul]# 
+ [root@compute-42 rahul]#
  
  I am running Kilo release and RHEL 7 Openstack rpms.
  
  Expected:-
  Shouldn't the qvo and qvb be deleted if creation of instance has failed?

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1556549

Title:
  too many qbr or qvo entries on compute node even though I have 7-8
  instances on that compute node

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1556549/+subscriptions

