yahoo-eng-team team mailing list archive

[Bug 1518430] Re: liberty: ~busy loop on epoll_wait being called with zero timeout

 

This has been uploaded to Ubuntu Zesty, Yakkety, and Xenial, and is awaiting
SRU team review for Yakkety and Xenial.  It has also been uploaded to
kilo-staging and liberty-staging for the Ubuntu Cloud Archive.  These
should all be available in -proposed soon for testing.

** Also affects: cloud-archive/kilo
   Importance: Undecided
       Status: New

** Changed in: cloud-archive/kilo
       Status: New => Fix Committed

** Changed in: cloud-archive/liberty
       Status: New => Fix Committed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1518430

Title:
  liberty: ~busy loop on epoll_wait being called with zero timeout

Status in Ubuntu Cloud Archive:
  Fix Committed
Status in Ubuntu Cloud Archive kilo series:
  Fix Committed
Status in Ubuntu Cloud Archive liberty series:
  Fix Committed
Status in Ubuntu Cloud Archive mitaka series:
  Fix Committed
Status in Ubuntu Cloud Archive newton series:
  Fix Committed
Status in OpenStack Compute (nova):
  Invalid
Status in oslo.messaging:
  Fix Released
Status in nova package in Ubuntu:
  Invalid
Status in python-oslo.messaging package in Ubuntu:
  Fix Released
Status in python-oslo.messaging source package in Xenial:
  New
Status in python-oslo.messaging source package in Yakkety:
  New
Status in python-oslo.messaging source package in Zesty:
  Fix Released

Bug description:
  Context: openstack juju/maas deploy using 1510 charms release
  on trusty, with:
    openstack-origin: "cloud:trusty-liberty"
    source: "cloud:trusty-updates/liberty"

  * Several OpenStack nova- and neutron- services, at least
  nova-compute, neutron-server, nova-conductor,
  neutron-openvswitch-agent, and neutron-vpn-agent,
  are close to busy-looping on epoll_wait() calls, most frequently
  with a zero timeout.
  - nova-compute (chosen because it is single-process) strace and ltrace captures:
    http://paste.ubuntu.com/13371248/ (ltrace, strace)
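
  To illustrate why a zero timeout matters, here is a minimal sketch of
  the symptom only; it is not oslo.messaging or nova code. It assumes a
  Linux host, where Python's selectors.DefaultSelector is backed by
  epoll, and the helper name count_wakeups is hypothetical. It polls an
  idle pipe for a short period and counts how often the call returns:

```python
import os
import selectors
import time


def count_wakeups(timeout, duration=0.2):
    """Count how often an idle poll loop returns within `duration` seconds."""
    # Nothing is ever written to the pipe, so the read end never becomes ready.
    read_fd, write_fd = os.pipe()
    selector = selectors.DefaultSelector()
    selector.register(read_fd, selectors.EVENT_READ)
    wakeups = 0
    deadline = time.monotonic() + duration
    try:
        while time.monotonic() < deadline:
            # With timeout=0 this returns immediately even though no
            # event is pending; with a positive timeout it blocks.
            selector.select(timeout)
            wakeups += 1
    finally:
        selector.close()
        os.close(read_fd)
        os.close(write_fd)
    return wakeups


busy = count_wakeups(timeout=0)
idle = count_wakeups(timeout=0.05)
print("zero timeout:", busy, "wakeups; 50 ms timeout:", idle, "wakeups")
```

  With a zero timeout the loop spins and burns CPU even though there is
  nothing to do, while the 50 ms timeout should wake only a handful of
  times over the same period. The exact counts depend on the machine.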

  For comparison, this is how it looks on a kilo deploy:
  - http://paste.ubuntu.com/13371635/

  * 'top' sample from a nova-cloud-controller unit from
     this completely idle stack:
    http://paste.ubuntu.com/13371809/

  FYI, I am *not* seeing this behavior on keystone, glance, cinder, or
  ceilometer-api.

  As this issue is present in several components, it likely comes
  from a common library (oslo.concurrency?). I filed the bug against
  nova itself as a starting point for debugging.

  Note: The description in the following bug gives a good overview of
  the issue and points to a possible fix for oslo.messaging:
  https://bugs.launchpad.net/mos/+bug/1380220

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1518430/+subscriptions