← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1459726] [NEW] api servers hang with 100% CPU if syslog restarted

 

Public bug reported:

Affected:

glance-api
glance-registry
neutron-server
nova-api

If service was configured to use rsyslog and rsyslog was restarted after
API server started, it hangs on next log line with 100% CPU. If server
have few workers, each worker will eat own 100% CPU share.

Steps to reproduce:
1. Configure syslog:
use_syslog=true
syslog_log_facility=LOG_LOCAL4
2. restart api service
3. restart rsyslog

Execute some command to force logging. F.e.: neutron net-create foo,
nova boot, etc.

Expected result: normal operation

Actual result:
with some chance (about 30-50%) api server will hung with 100% CPU usage and will not reply to request.

Strace on hung service:

gettimeofday({1432827199, 745141}, NULL) = 0
poll([{fd=3, events=POLLOUT|POLLERR|POLLHUP}, {fd=5, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 2, 60000) = 1 ([{fd=3, revents=POLLOUT}])
sendto(3, "<151>keystonemiddleware.auth_token[12502]: DEBUG Authenticating user token __call__ /usr/lib/python2.7/dist-packages/keystonemiddleware/auth_token.py:650\0", 154, 0, NULL, 0) = -1 ENOTCONN (Transport endpoint is not connected)
gettimeofday({1432827199, 745226}, NULL) = 0
poll([{fd=3, events=POLLOUT|POLLERR|POLLHUP}, {fd=5, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 2, 60000) = 1 ([{fd=3, revents=POLLOUT}])
sendto(3, "<151>keystonemiddleware.auth_token[12502]: DEBUG Authenticating user token __call__ /usr/lib/python2.7/dist-packages/keystonemiddleware/auth_token.py:650\0", 154, 0, NULL, 0) = -1 ENOTCONN (Transport endpoint is not connected)
gettimeofday({1432827199, 745325}, NULL) = 0

Tested on:
nova, glance, neutron:  1:2014.2.3, Ubuntu version.

** Affects: glance
     Importance: Undecided
         Status: New

** Affects: neutron
     Importance: Undecided
         Status: New

** Affects: nova
     Importance: Undecided
         Status: New

** Description changed:

  Affected:
  
  glance-api
  glance-registry
  neutron-server
  nova-api
  
  If service was configured to use rsyslog and rsyslog was restarted after
  API server started, it hangs on next log line with 100% CPU. If server
  have few workers, each worker will eat own 100% CPU share.
  
  Steps to reproduce:
- 1. Configure syslog: 
+ 1. Configure syslog:
  use_syslog=true
  syslog_log_facility=LOG_LOCAL4
  2. restart api service
  3. restart rsyslog
  
  Execute some command to force logging. F.e.: neutron net-create foo,
  nova boot, etc.
  
  Expected result: normal operation
  
  Actual result:
  with some chance (about 30-50%) api server will hung with 100% CPU usage and will not reply to request.
  
  Strace on hung service:
  
- 
  gettimeofday({1432827199, 745141}, NULL) = 0
  poll([{fd=3, events=POLLOUT|POLLERR|POLLHUP}, {fd=5, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 2, 60000) = 1 ([{fd=3, revents=POLLOUT}])
  sendto(3, "<151>keystonemiddleware.auth_token[12502]: DEBUG Authenticating user token __call__ /usr/lib/python2.7/dist-packages/keystonemiddleware/auth_token.py:650\0", 154, 0, NULL, 0) = -1 ENOTCONN (Transport endpoint is not connected)
  gettimeofday({1432827199, 745226}, NULL) = 0
  poll([{fd=3, events=POLLOUT|POLLERR|POLLHUP}, {fd=5, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 2, 60000) = 1 ([{fd=3, revents=POLLOUT}])
  sendto(3, "<151>keystonemiddleware.auth_token[12502]: DEBUG Authenticating user token __call__ /usr/lib/python2.7/dist-packages/keystonemiddleware/auth_token.py:650\0", 154, 0, NULL, 0) = -1 ENOTCONN (Transport endpoint is not connected)
  gettimeofday({1432827199, 745325}, NULL) = 0
+ 
+ Tested on:
+ nova, glance, neutron:  1:2014.2.3, Ubuntu version.

** Also affects: neutron
   Importance: Undecided
       Status: New

** Also affects: glance
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1459726

Title:
  api servers hang with 100% CPU if syslog restarted

Status in OpenStack Image Registry and Delivery Service (Glance):
  New
Status in OpenStack Neutron (virtual network service):
  New
Status in OpenStack Compute (Nova):
  New

Bug description:
  Affected:

  glance-api
  glance-registry
  neutron-server
  nova-api

  If service was configured to use rsyslog and rsyslog was restarted
  after API server started, it hangs on next log line with 100% CPU. If
  server have few workers, each worker will eat own 100% CPU share.

  Steps to reproduce:
  1. Configure syslog:
  use_syslog=true
  syslog_log_facility=LOG_LOCAL4
  2. restart api service
  3. restart rsyslog

  Execute some command to force logging. F.e.: neutron net-create foo,
  nova boot, etc.

  Expected result: normal operation

  Actual result:
  with some chance (about 30-50%) api server will hung with 100% CPU usage and will not reply to request.

  Strace on hung service:

  gettimeofday({1432827199, 745141}, NULL) = 0
  poll([{fd=3, events=POLLOUT|POLLERR|POLLHUP}, {fd=5, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 2, 60000) = 1 ([{fd=3, revents=POLLOUT}])
  sendto(3, "<151>keystonemiddleware.auth_token[12502]: DEBUG Authenticating user token __call__ /usr/lib/python2.7/dist-packages/keystonemiddleware/auth_token.py:650\0", 154, 0, NULL, 0) = -1 ENOTCONN (Transport endpoint is not connected)
  gettimeofday({1432827199, 745226}, NULL) = 0
  poll([{fd=3, events=POLLOUT|POLLERR|POLLHUP}, {fd=5, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 2, 60000) = 1 ([{fd=3, revents=POLLOUT}])
  sendto(3, "<151>keystonemiddleware.auth_token[12502]: DEBUG Authenticating user token __call__ /usr/lib/python2.7/dist-packages/keystonemiddleware/auth_token.py:650\0", 154, 0, NULL, 0) = -1 ENOTCONN (Transport endpoint is not connected)
  gettimeofday({1432827199, 745325}, NULL) = 0

  Tested on:
  nova, glance, neutron:  1:2014.2.3, Ubuntu version.

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1459726/+subscriptions


Follow ups

References