yahoo-eng-team team mailing list archive
-
yahoo-eng-team team
-
Mailing list archive
-
Message #33216
[Bug 1459726] [NEW] api servers hang with 100% CPU if syslog restarted
Public bug reported:
Affected:
glance-api
glance-registry
neutron-server
nova-api
If service was configured to use rsyslog and rsyslog was restarted after
API server started, it hangs on next log line with 100% CPU. If server
have few workers, each worker will eat own 100% CPU share.
Steps to reproduce:
1. Configure syslog:
use_syslog=true
syslog_log_facility=LOG_LOCAL4
2. restart api service
3. restart rsyslog
Execute some command to force logging. F.e.: neutron net-create foo,
nova boot, etc.
Expected result: normal operation
Actual result:
with some chance (about 30-50%) api server will hung with 100% CPU usage and will not reply to request.
Strace on hung service:
gettimeofday({1432827199, 745141}, NULL) = 0
poll([{fd=3, events=POLLOUT|POLLERR|POLLHUP}, {fd=5, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 2, 60000) = 1 ([{fd=3, revents=POLLOUT}])
sendto(3, "<151>keystonemiddleware.auth_token[12502]: DEBUG Authenticating user token __call__ /usr/lib/python2.7/dist-packages/keystonemiddleware/auth_token.py:650\0", 154, 0, NULL, 0) = -1 ENOTCONN (Transport endpoint is not connected)
gettimeofday({1432827199, 745226}, NULL) = 0
poll([{fd=3, events=POLLOUT|POLLERR|POLLHUP}, {fd=5, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 2, 60000) = 1 ([{fd=3, revents=POLLOUT}])
sendto(3, "<151>keystonemiddleware.auth_token[12502]: DEBUG Authenticating user token __call__ /usr/lib/python2.7/dist-packages/keystonemiddleware/auth_token.py:650\0", 154, 0, NULL, 0) = -1 ENOTCONN (Transport endpoint is not connected)
gettimeofday({1432827199, 745325}, NULL) = 0
Tested on:
nova, glance, neutron: 1:2014.2.3, Ubuntu version.
** Affects: glance
Importance: Undecided
Status: New
** Affects: neutron
Importance: Undecided
Status: New
** Affects: nova
Importance: Undecided
Status: New
** Description changed:
Affected:
glance-api
glance-registry
neutron-server
nova-api
If service was configured to use rsyslog and rsyslog was restarted after
API server started, it hangs on next log line with 100% CPU. If server
have few workers, each worker will eat own 100% CPU share.
Steps to reproduce:
- 1. Configure syslog:
+ 1. Configure syslog:
use_syslog=true
syslog_log_facility=LOG_LOCAL4
2. restart api service
3. restart rsyslog
Execute some command to force logging. F.e.: neutron net-create foo,
nova boot, etc.
Expected result: normal operation
Actual result:
with some chance (about 30-50%) api server will hung with 100% CPU usage and will not reply to request.
Strace on hung service:
-
gettimeofday({1432827199, 745141}, NULL) = 0
poll([{fd=3, events=POLLOUT|POLLERR|POLLHUP}, {fd=5, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 2, 60000) = 1 ([{fd=3, revents=POLLOUT}])
sendto(3, "<151>keystonemiddleware.auth_token[12502]: DEBUG Authenticating user token __call__ /usr/lib/python2.7/dist-packages/keystonemiddleware/auth_token.py:650\0", 154, 0, NULL, 0) = -1 ENOTCONN (Transport endpoint is not connected)
gettimeofday({1432827199, 745226}, NULL) = 0
poll([{fd=3, events=POLLOUT|POLLERR|POLLHUP}, {fd=5, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 2, 60000) = 1 ([{fd=3, revents=POLLOUT}])
sendto(3, "<151>keystonemiddleware.auth_token[12502]: DEBUG Authenticating user token __call__ /usr/lib/python2.7/dist-packages/keystonemiddleware/auth_token.py:650\0", 154, 0, NULL, 0) = -1 ENOTCONN (Transport endpoint is not connected)
gettimeofday({1432827199, 745325}, NULL) = 0
+
+ Tested on:
+ nova, glance, neutron: 1:2014.2.3, Ubuntu version.
** Also affects: neutron
Importance: Undecided
Status: New
** Also affects: glance
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1459726
Title:
api servers hang with 100% CPU if syslog restarted
Status in OpenStack Image Registry and Delivery Service (Glance):
New
Status in OpenStack Neutron (virtual network service):
New
Status in OpenStack Compute (Nova):
New
Bug description:
Affected:
glance-api
glance-registry
neutron-server
nova-api
If service was configured to use rsyslog and rsyslog was restarted
after API server started, it hangs on next log line with 100% CPU. If
server have few workers, each worker will eat own 100% CPU share.
Steps to reproduce:
1. Configure syslog:
use_syslog=true
syslog_log_facility=LOG_LOCAL4
2. restart api service
3. restart rsyslog
Execute some command to force logging. F.e.: neutron net-create foo,
nova boot, etc.
Expected result: normal operation
Actual result:
with some chance (about 30-50%) api server will hung with 100% CPU usage and will not reply to request.
Strace on hung service:
gettimeofday({1432827199, 745141}, NULL) = 0
poll([{fd=3, events=POLLOUT|POLLERR|POLLHUP}, {fd=5, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 2, 60000) = 1 ([{fd=3, revents=POLLOUT}])
sendto(3, "<151>keystonemiddleware.auth_token[12502]: DEBUG Authenticating user token __call__ /usr/lib/python2.7/dist-packages/keystonemiddleware/auth_token.py:650\0", 154, 0, NULL, 0) = -1 ENOTCONN (Transport endpoint is not connected)
gettimeofday({1432827199, 745226}, NULL) = 0
poll([{fd=3, events=POLLOUT|POLLERR|POLLHUP}, {fd=5, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 2, 60000) = 1 ([{fd=3, revents=POLLOUT}])
sendto(3, "<151>keystonemiddleware.auth_token[12502]: DEBUG Authenticating user token __call__ /usr/lib/python2.7/dist-packages/keystonemiddleware/auth_token.py:650\0", 154, 0, NULL, 0) = -1 ENOTCONN (Transport endpoint is not connected)
gettimeofday({1432827199, 745325}, NULL) = 0
Tested on:
nova, glance, neutron: 1:2014.2.3, Ubuntu version.
To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1459726/+subscriptions
Follow ups
References