yahoo-eng-team team mailing list archive

[Bug 1607461] [NEW] nova-compute hangs while executing a blocking call to librbd

 

Public bug reported:

While executing a call to librbd, nova-compute may hang for a while and
eventually be reported as down in the nova service-list output.

strace shows that the process is stuck waiting to acquire a mutex:

root@node-153:~# strace -p 16675
Process 16675 attached
futex(0x7fff084ce36c, FUTEX_WAIT_PRIVATE, 1, NULL

gdb shows the traceback:

http://paste.openstack.org/show/542534/

This means that calls to librbd (a C library) are not monkey-patched,
so they block instead of switching the execution context to another
green thread in an eventlet-based process.

To avoid blocking the whole nova-compute process on calls to librbd,
we should wrap them with tpool.execute()
(http://eventlet.net/doc/threading.html#eventlet.tpool.execute).

** Affects: nova
     Importance: Undecided
     Assignee: Roman Podoliaka (rpodolyaka)
         Status: New


** Tags: ceph compute

** Changed in: nova
     Assignee: (unassigned) => Roman Podoliaka (rpodolyaka)

** Tags added: ceph compute

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1607461

Title:
  nova-compute hangs while executing a blocking call to librbd

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1607461/+subscriptions

