yahoo-eng-team team mailing list archive

[Bug 1408612] Re: HTTP Keep-alive connections prevent keystone from terminating

 

*** This bug is a duplicate of bug 1361360 ***
    https://bugs.launchpad.net/bugs/1361360

This is related to a known bug in greenlet/eventlet. The general
solution is either to disable keepalives or to deploy under Apache. I
will mark this as a duplicate of the larger eventlet bug, but in short
there is relatively little that can be done. The answer is: do not
deploy keystone under eventlet.

** This bug has been marked a duplicate of bug 1361360
   Eventlet green threads not released back to the pool leading to choking of new requests

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Keystone.
https://bugs.launchpad.net/bugs/1408612

Title:
  HTTP Keep-alive connections prevent keystone from terminating

Status in OpenStack Identity (Keystone):
  New

Bug description:
  Seen on RDO Juno, running on CentOS 7.

  Steps to reproduce:

  - Set admin_workers=1 and public_workers=1 in /etc/keystone/keystone.conf
  - Start the keystone service: `systemctl start openstack-keystone`
  - Start a 'persistent' TCP connection to keystone: `telnet localhost 5000 &`
  - Stop the service: `systemctl stop openstack-keystone`

  The final systemctl invocation will hang, as the process fails to
  terminate. Eventually systemd will time out and forcefully kill the
  process.

  Output of `systemctl status openstack-keystone`:

  Jan 08 05:07:38 mgoddard systemd[1]: openstack-keystone.service stopping timed out. Killing.
  Jan 08 05:07:38 mgoddard systemd[1]: openstack-keystone.service: main process exited, code=killed, status=9/KILL
  Jan 08 05:07:38 mgoddard systemd[1]: Stopped OpenStack Identity Service.
  Jan 08 05:07:38 mgoddard systemd[1]: Unit openstack-keystone.service entered failed state.

  The use of telnet here is just to demonstrate the problem. The same
  effect can be seen when OpenStack services maintain persistent
  connections to keystone.
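
The idle connection opened with telnet in the steps above can also be held open from a script. The following is a minimal sketch, not part of the original report: the helper name `hold_idle_connection` and its `timeout` parameter are hypothetical, while the host and port are the ones used in the telnet step.

```python
import socket


def hold_idle_connection(host="localhost", port=5000, timeout=5.0):
    """Open a TCP connection and send nothing, like `telnet localhost 5000`.

    Hypothetical helper for illustration only. The returned socket must be
    kept referenced (and open) for as long as the connection should persist.
    """
    # The timeout applies to connection setup only.
    sock = socket.create_connection((host, port), timeout=timeout)
    # Switch back to blocking mode so the idle socket never times out locally.
    sock.settimeout(None)
    return sock
```

Calling `sock = hold_idle_connection()` while keystone is running, and then stopping the service from another terminal, should exercise the same path as the telnet step; calling `sock.close()` releases the connection.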

  With multiple worker processes, the issue is not observed. This is
  speculation, but it is believed that because systemd is able to kill
  the parent process, it also kills the child process holding the
  persistent connection.

  When this issue was first observed, multiple workers were used and
  systemd was not in use. Rather, we used init scripts in /etc/init.d/.
  In this case the result was worse, as the `service openstack-keystone
  stop` command would exit successfully, but fail to terminate any child
  processes with persistent HTTP connections open. Subsequent attempts
  to start the keystone service would fail due to the lingering stale
  process.

  
  During the investigation of the issue, some root cause analysis was performed which will be presented below.

  - When a keystone process receives SIGTERM, it ends up waiting for all greenthreads in the greenpool to finish at https://github.com/eventlet/eventlet/blob/8d2474197de4827a7bca9c33e71a82573b6fc721/eventlet/wsgi.py#L267.
  - Persistent connections, when between HTTP requests, end up waiting at https://github.com/eventlet/eventlet/blob/8d2474197de4827a7bca9c33e71a82573b6fc721/eventlet/wsgi.py#L267 for the next request. The greenthread will not terminate until the connection is closed.

  The process will therefore not terminate until all connections have
  closed. It seems sensible to me to finish servicing individual
  requests for a graceful shutdown, but there needs to be a mechanism to
  close persistent connections between requests.

  This issue could (should?) be solved in eventlet.wsgi by a mechanism
  to trigger disconnection of persistent connections between requests
  when the server is stopped.
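
The mechanism proposed above can be sketched as follows. This is a toy model under stated assumptions, not eventlet's implementation or API: OS threads stand in for greenthreads, and the names `ConnectionPool`, `serve`, `stop` and `close_idle` are hypothetical. Each worker records when its connection is idle between requests, and a stop request shuts down only the read side of those idle sockets, so a request already in progress can still send its response.

```python
import socket
import threading


class ConnectionPool:
    """Toy stand-in for a WSGI server's pool of per-connection workers."""

    def __init__(self):
        self._lock = threading.Lock()
        self._idle = set()        # sockets waiting for their next request
        self._stopping = False
        self._workers = []

    def serve(self, conn):
        """Handle one persistent connection in its own worker thread."""
        worker = threading.Thread(target=self._handle, args=(conn,), daemon=True)
        self._workers.append(worker)
        worker.start()

    def _handle(self, conn):
        with conn:
            while True:
                with self._lock:
                    if self._stopping:
                        return              # do not wait for another request
                    self._idle.add(conn)    # between requests
                data = conn.recv(1024)      # blocks on an idle connection
                with self._lock:
                    self._idle.discard(conn)
                if not data:
                    return                  # peer closed, or read side shut down
                conn.sendall(b"ok:" + data)  # "service" the request

    def stop(self, close_idle, timeout=0.5):
        """Wait for workers; return True if they all finished in time."""
        if close_idle:
            with self._lock:
                self._stopping = True
                idle = list(self._idle)
            for conn in idle:
                try:
                    # Wakes the blocked recv() with an end-of-stream result.
                    conn.shutdown(socket.SHUT_RD)
                except OSError:
                    pass                    # worker already closed the socket
        for worker in self._workers:
            worker.join(timeout)
        return not any(worker.is_alive() for worker in self._workers)
```

With an idle client attached, `stop(close_idle=False)` models the behaviour described in this report: the wait times out because the worker is still blocked waiting for the next request. `stop(close_idle=True)` lets the worker exit promptly.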

To manage notifications about this bug go to:
https://bugs.launchpad.net/keystone/+bug/1408612/+subscriptions

