yahoo-eng-team team mailing list archive
Message #58470
[Bug 1518430] Re: liberty: ~busy loop on epoll_wait being called with zero timeout
Reviewed: https://review.openstack.org/386656
Committed: https://git.openstack.org/cgit/openstack/oslo.messaging/commit/?id=a6c193f3eba62cdcbfe04d0fa93e95352bcfb1c3
Submitter: Jenkins
Branch: master
commit a6c193f3eba62cdcbfe04d0fa93e95352bcfb1c3
Author: John Eckersberg <jeckersb@xxxxxxxxxx>
Date: Fri Oct 14 11:02:47 2016 -0400
rabbit: Avoid busy loop on epoll_wait with heartbeat+eventlet
Calling threading.Event.wait() when using eventlet results in a busy
loop calling epoll_wait, because the Python 2.x
threading.Condition.wait() implementation busy-waits by calling
sleep() with very small values (0.0005..0.05s). Because sleep() is
monkey-patched by eventlet, this results in many very short timers
being added to the eventlet hub, and forces eventlet to constantly
epoll_wait looking for new data unnecessarily.
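To illustrate the paragraph above, here is a minimal sketch that reproduces the sleep schedule of a Python 2.x threading.Condition.wait(timeout) call that is never notified. The function name py2_condition_wait_delays is hypothetical, and the clock is simulated (each sleep is assumed to take exactly its requested duration), so the numbers are indicative rather than measured.

```python
def py2_condition_wait_delays(timeout):
    """Return the sleep() durations a Python 2.x Condition.wait(timeout)
    would request if the condition is never notified.

    The clock is simulated, so nothing actually sleeps here.
    """
    delays = []
    elapsed = 0.0
    delay = 0.0005  # initial value; doubled before the first sleep
    while True:
        remaining = timeout - elapsed
        if remaining <= 0:
            break
        # Exponential backoff, capped at 0.05 s and at the time left.
        delay = min(delay * 2, remaining, 0.05)
        delays.append(delay)
        elapsed += delay
    return delays


delays = py2_condition_wait_delays(1.0)
print(len(delays), "sleeps for a 1 s wait; longest:", max(delays))
```

Once the delay reaches its 0.05 s cap, this amounts to about 20 short timers per second for every waiter, each of which becomes an eventlet hub timer when sleep() is monkey-patched.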
This utilizes a new Event from eventletutils which conditionalizes the
event primitive depending on whether or not eventlet is being used.
If it is, eventlet.event.Event is used instead of threading.Event.
The eventlet.event.Event implementation does not suffer from the same
busy-wait sleep problem. If eventlet is not used, the previous
behavior is retained.
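The selection logic described above could look roughly like the following. This is a sketch under assumptions, not the actual eventletutils code: the names GreenEvent and make_event are hypothetical, and the "is eventlet in use" check is assumed to be whether the thread module has been monkey-patched.

```python
import threading

try:
    import eventlet
    import eventlet.event
    import eventlet.patcher
    import eventlet.timeout
except ImportError:  # eventlet not installed: always use threading
    eventlet = None


class GreenEvent(object):
    """Hypothetical threading.Event look-alike on top of eventlet.event.Event."""

    def __init__(self):
        self._set = False
        self._event = eventlet.event.Event()

    def is_set(self):
        return self._set

    def set(self):
        if not self._set:
            self._set = True
            self._event.send(True)  # wakes every greenthread waiting on it

    def wait(self, timeout=None):
        # Timeout(timeout, False) leaves the block silently on expiry,
        # so the waiter blocks on the hub instead of polling with sleep().
        with eventlet.timeout.Timeout(timeout, False):
            self._event.wait()
        return self._set


def make_event():
    """Return a green event under eventlet, else a plain threading.Event."""
    if eventlet is not None and eventlet.patcher.is_monkey_patched("thread"):
        return GreenEvent()
    return threading.Event()


stop = make_event()
stop.set()
print(stop.wait(timeout=1))
```

Callers only use set(), is_set() and wait(), so the heartbeat code does not need to know which primitive it was given.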
Change-Id: I5c211092d282e724d1c87ce4d06b6c44b592e764
Depends-On: Id33c9f8c17102ba1fe24c12b053c336b6d265501
Closes-bug: #1518430
** Changed in: oslo.messaging
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1518430
Title:
liberty: ~busy loop on epoll_wait being called with zero timeout
Status in OpenStack Compute (nova):
Invalid
Status in oslo.messaging:
Fix Released
Status in nova package in Ubuntu:
Invalid
Status in python-oslo.messaging package in Ubuntu:
New
Bug description:
Context: openstack juju/maas deploy using 1510 charms release
on trusty, with:
openstack-origin: "cloud:trusty-liberty"
source: "cloud:trusty-updates/liberty"
* Several openstack nova- and neutron- services, at least:
nova-compute, neutron-server, nova-conductor,
neutron-openvswitch-agent, neutron-vpn-agent
show almost busy looping on epoll_wait() calls, with zero timeout set
most frequently.
- nova-compute (chosen because it is single-process) strace and ltrace captures:
http://paste.ubuntu.com/13371248/ (ltrace, strace)
As comparison, this is how it looks on a kilo deploy:
- http://paste.ubuntu.com/13371635/
* 'top' sample from a nova-cloud-controller unit from
this completely idle stack:
http://paste.ubuntu.com/13371809/
FYI *not* seeing this behavior on keystone, glance, cinder,
ceilometer-api.
As this issue is present on several components, it likely comes
from common libraries (oslo concurrency?). FYI, I filed the bug against
nova itself as a starting point for debugging.
Note: The description in the following bug gives a good overview of
the issue and points to a possible fix for oslo.messaging:
https://bugs.launchpad.net/mos/+bug/1380220