
openstack team mailing list archive

Re: [SWIFT] Proxies Sizing for 90.000 / 200.000 RPM

 

John brought his concern about the auth_token middleware to me directly.

I don't know of anyone that's driven the keystone middleware to these rates and determined where the bottlenecks are other than folks deploying swift and driving high performance numbers. 
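For reference, here is a rough sketch of what "these rates" work out to per proxy, using only the figures quoted further down the thread (90.000 RPM average, 200.000 RPM peak, 6 proxies, 10% authorized PUTs). The function names are illustrative. Note that 200.000 RPM is about 3333 req/s, which the thread rounds up to 3500.

```python
PROXIES = 6            # proxy count given in the thread
AUTH_FRACTION = 0.10   # share of authorized PUTs given in the thread


def per_second(rpm):
    """Convert requests per minute to requests per second."""
    return rpm / 60.0


def summarize(rpm, proxies=PROXIES, auth_fraction=AUTH_FRACTION):
    """Return total, per-proxy and token-carrying request rates in req/s."""
    total = per_second(rpm)
    return {
        "total_rps": total,
        "rps_per_proxy": total / proxies,
        "auth_rps": total * auth_fraction,
    }


for label, rpm in (("average", 90000), ("peak", 200000)):
    s = summarize(rpm)
    print("%-7s %7.1f req/s total, %6.1f per proxy, %6.1f with a token"
          % (label, s["total_rps"], s["rps_per_proxy"], s["auth_rps"]))
```

Only the last column ever reaches the token validation path, assuming the public GETs really do carry no token.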

The concern that John detailed to me is how the middleware handles memcache connections, which is directly impacted by how you're deploying it. From John:

"Specifically, I'm concerned with the way auth_token handles memcache connections. I'm not sure how well it will work in swift with eventlet. If the memcache module being used caches sockets, then concurrency in eventlet (different greenthreads) will cause problems. Eventlet detects and prevents concurrent access to the same socket (for good reason--data from the socket may be delivered to the wrong listener)."

I haven't driven any system this hard to suss out the issues, but there's the nut of it - how to keep from cascading that load out to validation of authorization tokens. The middleware is assuming that eventlet and any needed patching has already been done when it's invoked (i.e. no monkeypatching in there), and loads the "memcache" module and uses whatever it has in there directly. 
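To make the socket-sharing concern concrete, here is a minimal sketch of the usual workaround: each caller checks a connection out of a pool, uses it exclusively, and hands it back, so no two greenthreads ever touch the same socket at once. This is not swift's or keystone's actual code; `ConnectionPool`, `reserve` and the `factory` argument are names made up for illustration, and the stdlib `queue` is used only to keep the example self-contained (under eventlet you would want its green equivalent).

```python
import queue
from contextlib import contextmanager


def _close(conn):
    """Close a connection if it exposes a close() method."""
    close = getattr(conn, "close", None)
    if close is not None:
        close()


class ConnectionPool(object):
    """Give each caller exclusive use of one connection at a time."""

    def __init__(self, factory, max_idle=2):
        self._factory = factory                       # creates a new connection
        self._idle = queue.LifoQueue(maxsize=max_idle)

    @contextmanager
    def reserve(self):
        # Reuse an idle connection if there is one, otherwise open a new one.
        try:
            conn = self._idle.get_nowait()
        except queue.Empty:
            conn = self._factory()
        try:
            yield conn
        except Exception:
            # The connection may be in an unknown state; drop it.
            _close(conn)
            raise
        else:
            # Hand the connection back, or close it if the pool is full.
            try:
                self._idle.put_nowait(conn)
            except queue.Full:
                _close(conn)
```

A caller would wrap each memcache round trip in `with pool.reserve() as conn:`, with `factory` opening whatever client or socket the deployment actually uses.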

This all assumes you're using the current standard of UUID-based tokens. Keystone also supports PKI-based tokens, which remove the need to constantly make the validation call, but at the computational cost of verifying the signature on the token. I don't know of any load numbers or analysis with that setup at this time, and I would expect any initial analysis to point to some clear performance optimizations that may be needed.
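As a rough illustration of why caching matters with UUID tokens, the sketch below remembers each validation result for a fixed time so that repeated requests with the same token do not each trigger a call to keystone. It is an in-process stand-in for what the middleware does with memcache, not the middleware's code; `TokenCache`, the `validate` callable and the 300 second TTL are all assumptions made for the example.

```python
import time


class TokenCache(object):
    """Cache token validation results for a limited time."""

    def __init__(self, validate, ttl=300.0, clock=time.monotonic):
        self._validate = validate   # hypothetical call out to keystone
        self._ttl = ttl             # seconds a cached result stays valid
        self._clock = clock
        self._entries = {}          # token -> (expires_at, result)

    def get(self, token):
        now = self._clock()
        hit = self._entries.get(token)
        if hit is not None and hit[0] > now:
            return hit[1]           # still fresh: no remote call
        result = self._validate(token)
        self._entries[token] = (now + self._ttl, result)
        return result
```

With a cache like this, the load that reaches keystone depends on the number of distinct tokens per TTL window rather than on the raw request rate.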

- joe


On Oct 24, 2012, at 1:20 PM, Alejandro Comisario <alejandro.comisario@xxxxxxxxxxxxxxxx> wrote:
> Thanks Josh, and Thanks John.
> I know it was an exciting Summit! Congrats to everyone !
> 
> John, let me give you some extra data, plus something I've already said that might be wrong.
> 
> First, the requests that will make up the 90.000 to 200.000 RPM will be 90% 20K objects and 10% 150/200K objects.
> Second, all the "GET" requests are going to be "public", configured through ACLs. So, if the GET requests are public (no X-Auth-Token is passed), why should I be worried about the keystone middleware?
> 
> Just to clarify, because i really want to understand what my real metrics are so i can know where to tune in case i need to.
> Thanks !
> 
> ---
> Alejandrito
> 
> 
> On Wed, Oct 24, 2012 at 3:28 PM, John Dickinson <me@xxxxxx> wrote:
> Sorry for the delay. You've got an interesting problem, and we were all quite busy last week with the summit.
> 
> First, the standard caveat: Your performance is going to be highly dependent on your particular workload and your particular hardware deployment. 3500 req/sec in two different deployments may be very different based on the size of the requests, the spread of the data requested, and the type of requests. Your experience may vary, etc, etc.
> 
> However, for an attempt to answer your question...
> 
> 6 proxies for 3500 req/sec doesn't sound unreasonable. It's in line with other numbers I've seen from people and what I've seen from other large scale deployments. You are basically looking at about 600 req/sec/proxy.
> 
> My first concern is not the swift workload, but how keystone handles the authentication of the tokens. A quick glance at the keystone source seems to indicate that keystone's auth_token middleware is using a standard memcached module that may not play well with concurrent connections in eventlet. Specifically, sockets cannot be reused concurrently by different greenthreads. You may find that the token validation in the auth_token middleware fails under any sort of load. This would need to be verified by your testing or an examination of the memcache module being used. An alternative would be to look at the way swift implements its memcache connections in an eventlet-friendly way (see swift/common/memcache.py:_get_conns() in the swift codebase).
> 
> --John
> 
> 
> 
> On Oct 11, 2012, at 4:28 PM, Alejandro Comisario <alejandro.comisario@xxxxxxxxxxxxxxxx> wrote:
> 
> > Hi Stackers !
> > Here is the situation: today we have 24 datanodes (3 copies, 90TB usable). Each datanode has 2 Intel hexacore CPUs with HT and 96GB of RAM, and we have 6 proxies with the same hardware configuration, running swift 1.4.8 with keystone.
> > Regarding the networking, each proxy and datanode has a dual 1Gb NIC, bonded in LACP mode 4, and all of the proxies are behind an F5 BigIP load balancer (so, no worries over there).
> >
> > Today we are receiving 5000 RPM (requests per minute), with 660 RPM per proxy. I know it's low, but with a new product migration we are soon (really soon) expecting a total of about 90.000 RPM on average (1500 req/s), with weekly peaks of 200.000 RPM (3500 req/s), to the swift API. That will be 90% public GETs (no keystone auth) and 10% authorized PUTs (keystone in the middle; it's worth knowing that we have a pool of 10 keystone VMs connected to a 5-node Galera MySQL cluster, so no worries there either).
> >
> > So, 3500 req/s divided by 6 proxy nodes doesn't sound like too much, but it's a number that we can't ignore.
> > What do you think about these numbers? Do these 6 proxies sound good, or should we double or triple the proxies? Does anyone handle this volume of requests and can share their configs?
> >
> > Thanks a lot, hoping to hear from you guys!
> >
> > -----
> > alejandrito
> > _______________________________________________
> > Mailing list: https://launchpad.net/~openstack
> > Post to     : openstack@xxxxxxxxxxxxxxxxxxx
> > Unsubscribe : https://launchpad.net/~openstack
> > More help   : https://help.launchpad.net/ListHelp
> 
> 

