yahoo-eng-team team mailing list archive

[Bug 1750777] Re: openvswitch agent eating CPU, time spent in ip_conntrack.py

 

Reviewed:  https://review.openstack.org/548976
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=4c8b97eca32c9c2beadf95fef14ed5b7d8981c5a
Submitter: Zuul
Branch:    master

commit 4c8b97eca32c9c2beadf95fef14ed5b7d8981c5a
Author: Brian Haley <bhaley@xxxxxxxxxx>
Date:   Thu Mar 1 15:42:59 2018 +0000

    Do not start conntrack worker thread from __init__
    
    Instead, start it when the first entry is being added to
    the queue.  Also, log any exceptions just in case get()
    throws something so we can do further debugging.
    
    The class was changed from Queue to LightQueue after going
    through the eventlet.queue code and looking at its usage, since
    LightQueue is a little smaller and should be faster.
    
    Change-Id: Ie84be88382f327ebe312bf17ec2dc5c80a8de35f
    Closes-bug: 1750777
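
The pattern the commit message describes (no worker started in __init__, worker started when the first entry is queued, exceptions logged) can be sketched roughly as follows. This is only an illustration, not the neutron code: it uses the standard library's threading and queue modules in place of eventlet, and the names LazyWorker, add and _handle are hypothetical.

```python
import logging
import queue
import threading

LOG = logging.getLogger(__name__)


class LazyWorker:
    """Hypothetical stand-in for a queue-backed worker (not neutron's class)."""

    def __init__(self):
        self._queue = queue.Queue()
        self._lock = threading.Lock()
        self._thread = None   # deliberately NOT started here
        self.processed = []

    def add(self, entry):
        # Start the worker lazily, the first time something is queued.
        with self._lock:
            if self._thread is None:
                self._thread = threading.Thread(target=self._run, daemon=True)
                self._thread.start()
        self._queue.put(entry)

    def _handle(self, entry):
        # Placeholder for the real work; None simulates a bad entry.
        if entry is None:
            raise ValueError("bad entry")
        self.processed.append(entry)

    def _run(self):
        while True:
            entry = self._queue.get()   # blocks until an entry is available
            try:
                self._handle(entry)
            except Exception:
                # Log and keep going so one failure does not kill the worker.
                LOG.exception("Failed to process queue entry %r", entry)
            finally:
                self._queue.task_done()


worker = LazyWorker()
started_in_init = worker._thread is not None   # no thread exists yet
for item in ("a", None, "b"):
    worker.add(item)
worker._queue.join()                            # wait until all entries are handled
```

Because the loop blocks in get(), the worker sits idle while the queue is empty, and an object that never receives an entry never starts a thread at all.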


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1750777

Title:
  openvswitch agent eating CPU, time spent in ip_conntrack.py

Status in neutron:
  Fix Released

Bug description:
  We just ran into a case where the openvswitch agent (local dev
  devstack, current master branch) eats 100% of CPU time.

  Pyflame profiling shows the time being largely spent in
  neutron.agent.linux.ip_conntrack, line 95.

  https://github.com/openstack/neutron/blob/master/neutron/agent/linux/ip_conntrack.py#L95

  The code around this line is:

          while True:
              pool.spawn_n(self._process_queue)

  The documentation of eventlet.spawn_n says: "The same as spawn(), but
  it’s not possible to know how the function terminated (i.e. no return
  value or exceptions). This makes execution faster. See spawn_n for
  more details."  I suspect that GreenPool.spawn_n may behave similarly.

  It seems plausible that spawn_n is returning very quickly because of
  some error, so that the while loop never blocks and simply spins,
  consuming all available CPU time.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1750777/+subscriptions

