openstack team mailing list archive

Re: Queue Service Implementation Thoughts

To: Sandy Walsh <sandy.walsh@xxxxxxxxxxxxx>
From: Eric Day <eday@xxxxxxxxxxxx>
Date: Tue, 8 Mar 2011 15:01:13 -0800
Cc: "openstack@xxxxxxxxxxxxxxxxxxx" <openstack@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <9588_1299624938_p28MtVPY026171_60A3427EF882A54BA0A1971AE6EF038807B5D4FC@SAT2EXD02.RACKSPACE.CORP>
User-agent: Mutt/1.5.20 (2009-06-14)
Yup, those are options we can try when we're ready to optimize.

-Eric

On Tue, Mar 08, 2011 at 10:55:34PM +0000, Sandy Walsh wrote:
> I'm sure you've seen this: http://nichol.as/benchmark-of-python-web-servers?source=g
> 
> -S
> ________________________________________
> From: openstack-bounces+sandy.walsh=rackspace.com@xxxxxxxxxxxxxxxxxxx [openstack-bounces+sandy.walsh=rackspace.com@xxxxxxxxxxxxxxxxxxx] on behalf of Eric Day [eday@xxxxxxxxxxxx]
> Sent: Tuesday, March 08, 2011 5:58 PM
> To: Todd Willey
> Cc: openstack@xxxxxxxxxxxxxxxxxxx
> Subject: Re: [Openstack] Queue Service Implementation Thoughts
> 
> On Tue, Mar 08, 2011 at 04:47:38PM -0500, Todd Willey wrote:
> > With this switch to python, does it make sense to revisit the concept
> > of openstack-common for things like logging, flag parsing, etc?  What
> > components would you like to just be able to drop in from nova,
> > glance, or swift?
> 
> Yes, I'm planning on putting as much as possible into openstack-common,
> picking the best from nova/swift/glance as we move along. Nova, swift,
> and glance can start using those modules as they see fit.
> 
> -Eric
> 
> >
> > -todd[1]
> >
> > On Tue, Mar 8, 2011 at 4:05 PM, Eric Day <eday@xxxxxxxxxxxx> wrote:
> > > Hi everyone,
> > >
> > > I added a sqlite backend to the prototype and ran some tests. Initially
> > > things were very slow, but after some further testing I was able
> > > to figure out where the time was being spent. In order to do this I
> > > added a very simple binary protocol interface to insert only. These
> > > tests are with a single server process and multiple client processes
> > > (so don't compare with the numbers in my previous email, which were two-process). The
> > > numbers given are requests/second.
> > >
> > > echo (no parsing) - 17k
> > >
> > > binary - 13k
> > > binary+insert msg into dict - 11k
> > > binary+insert msg into sqlite - 8.7k
> > >
> > > wsgi - 4.9k
> > > wsgi+webob - 3.5k
> > > wsgi+webob+insert msg into dict - 3.4k
> > > wsgi+webob+insert msg into sqlite - 2.8k
> > >
> > > wsgi+webob+routes - 1.9k
> > > wsgi+webob+routes+insert msg into dict - 1.8k
> > > wsgi+webob+routes+insert msg into sqlite - 1.5k
> > >
> > > This shows that without wsgi/webob/routes, the performance is pretty
> > > decent. This would be the case when using an efficient binary protocol
> > > or perhaps a more efficient HTTP parser.
> > >
> > > Next, it shows WSGI adds significant overhead. The webob and routes
> > > modules also add a fair amount.
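
To make the layering cost in these figures concrete, the rates can be turned into an average time per request. This is only a rough sketch: it uses the rounded numbers quoted above and treats 1/rate as the per-request cost of a single server process, which is an approximation because several clients were in flight at once. The function names are illustrative.

```python
# Requests/second copied from the list above (rounded, "k" = 1000).
RATES = {
    "echo": 17000,
    "binary": 13000,
    "wsgi": 4900,
    "wsgi+webob": 3500,
    "wsgi+webob+routes": 1900,
}


def per_request_us(rate):
    """Average time per request in microseconds for a given req/s rate."""
    return 1e6 / rate


def added_cost_us(base, layered):
    """Extra microseconds per request when going from `base` to `layered`."""
    return per_request_us(RATES[layered]) - per_request_us(RATES[base])


if __name__ == "__main__":
    steps = [
        ("echo", "binary"),
        ("echo", "wsgi"),
        ("wsgi", "wsgi+webob"),
        ("wsgi+webob", "wsgi+webob+routes"),
    ]
    for base, layered in steps:
        print("%-12s -> %-18s +%.0f us/request"
              % (base, layered, added_cost_us(base, layered)))
```

Read this way, each layer's cost can be compared directly instead of through ratios of throughput.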
> > >
> > > I'm going to rework the current code in the prototype into a proper
> > > project in the burrow trunk with modular front-ends and back-ends so
> > > we can easily test new options. I'll stick with the current wsgi code
> > > for now just to get things functioning and we can look at optimizing
> > > later. For the proxy-server communication, we'll definitely need to
> > > use something more efficient than stock wsgi modules in the long run.
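
One way to picture "modular front-ends and back-ends" is a small storage interface that any front-end calls after parsing a request. The sketch below is hypothetical: the class names, method names, and registry are invented for illustration and are not taken from the burrow code.

```python
class Backend:
    """Hypothetical storage interface; the method names are illustrative."""

    def insert(self, key, body):
        raise NotImplementedError

    def count(self):
        raise NotImplementedError


class DictBackend(Backend):
    """In-memory backend that keeps messages in a plain dict."""

    def __init__(self):
        self._messages = {}

    def insert(self, key, body):
        self._messages[key] = body

    def count(self):
        return len(self._messages)


# Registry so a backend can be chosen by name (for example from a flag).
BACKENDS = {"dict": DictBackend}


def make_backend(name):
    """Instantiate the backend registered under `name`."""
    return BACKENDS[name]()


def handle_insert(backend, key, body):
    """What a front-end (wsgi, binary, ...) would call once a request is parsed."""
    backend.insert(key, body)
    return backend.count()


if __name__ == "__main__":
    backend = make_backend("dict")
    print(handle_insert(backend, "msg-1", b"hello"))
```

With this split, swapping the dict for sqlite or replacing wsgi with a binary front-end touches only one side of the interface, which is what makes the per-layer benchmarks easy to repeat.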
> > >
> > > No red flags yet with Python, and we're in the ballpark for efficiency
> > > with a binary protocol. A quick test with other servers showed
> > > rabbitmq at about 9k messages/sec (binary protocol, Erlang server)
> > > and Gearman at about 13k messages/sec (binary protocol, C server).
> > >
> > > -Eric
> > >
> > > On Mon, Mar 07, 2011 at 01:32:55PM -0800, Eric Day wrote:
> > >> I ran the tests again to verify:
> > >>
> > >> 500k requests - 10 processes each running 50k requests.
> > >>
> > >>                 time req/s     cs us sy id
> > >> 2 thread/proc
> > >>   echo c++      7.19 69541 142182 23 77  0
> > >>   echo erlang   9.53 52465 105871 39 61  0
> > >>   echo python   9.58 52192 108420 42 58  0
> > >> 2 thread/proc
> > >>   wsgi python  58.74 8512   18132 86 14  0
> > >>   webob python 78.75 6349   13678 89 11  0
> > >>   webmachine*  63.50 7874   11834 89 11  0
> > >>   openstack    20.48 24414  49897 68 32  0
> > >>
> > >> cs/us/sy/id (context switches, user CPU %, system CPU %, idle CPU %)
> > >> are from vmstat while running the tests.
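
The req/s column in the table follows directly from the elapsed time and the fixed total of 500k requests. A small check, assuming the reported rates were truncated to whole numbers (the truncation is an assumption, though it reproduces every row):

```python
TOTAL_REQUESTS = 500000  # 10 client processes x 50k requests each

# (label, elapsed seconds, reported req/s) copied from the table above.
ROWS = [
    ("echo c++", 7.19, 69541),
    ("echo erlang", 9.53, 52465),
    ("echo python", 9.58, 52192),
    ("wsgi python", 58.74, 8512),
    ("webob python", 78.75, 6349),
    ("webmachine", 63.50, 7874),
    ("openstack", 20.48, 24414),
]


def requests_per_second(elapsed_seconds):
    """Throughput for the full run, truncated to a whole number."""
    return int(TOTAL_REQUESTS / elapsed_seconds)


if __name__ == "__main__":
    for label, elapsed, reported in ROWS:
        computed = requests_per_second(elapsed)
        print("%-13s computed=%6d reported=%6d" % (label, computed, reported))
```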
> > >>
> > >> * webmachine degrades over time with long-lived, multi-request
> > >>   connections. This number was estimated with 1000 requests per
> > >>   connection. At 50k requests per connection, the rate dropped to
> > >>   2582 req/s.
> > >>
> > >> As you can see I was able to reproduce the same numbers. If
> > >> anyone would like to do the same, you can grab lp:burrow for the
> > >> "openstack" Erlang application (compile and run ./start), webmachine
> > >> is at https://github.com/basho/webmachine (you need to create a demo
> > >> app and make sure you set nodelay for the socket options), and I've
> > >> attached the python server and client (start 10 client processes when
> > >> testing). Find me on irc (eday in #openstack) if you have questions.
> > >>
> > >> If we hit performance issues with this type of application, we'll
> > >> probably hit them around the same time with both Erlang and Python
> > >> (then we'll look to C/C++). Since most OpenStack developers are a lot
> > >> more comfortable with Python, I suggest we make the switch. Please
> > >> respond with thoughts on this. I'll add a sqlite backend to the
> > >> Python prototype and run some performance tests against that to see
> > >> if any red flags come up.
> > >>
> > >> -Eric
> > >>
> > >> On Sat, Mar 05, 2011 at 10:39:18PM -0700, ksankar@xxxxxxxxxxxxxx wrote:
> > >> >    Eric,
> > >> >       Thanks for your experimentation and analysis. It somewhat illustrates the
> > >> >    point about premature optimization. As a first cut, I have to agree with you and
> > >> >    conclude that the Python implementation is effective overall. As you said, if
> > >> >    we find performance bottlenecks, especially as the payload size increases
> > >> >    (or if we require any complex processing at the HTTP server layer),
> > >> >    we can optimize specific areas.
> > >> >        Just to be sure, it might be better for someone else to recheck. That way
> > >> >    we have done our due diligence.
> > >> >    Cheers
> > >> >    <k/>
> > >> >
> > >> >      -------- Original Message --------
> > >> >      Subject: [Openstack] Queue Service Implementation Thoughts
> > >> >      From: Eric Day <eday@xxxxxxxxxxxx>
> > >> >      Date: Sat, March 05, 2011 4:07 pm
> > >> >      To: openstack@xxxxxxxxxxxxxxxxxxx
> > >> >
> > >> >      Hi everyone,
> > >> >
> > >> >      When deciding to move forward with Erlang, I first tried out the Erlang
> > >> >      REST framework webmachine (it is built on top of mochiweb and used
> > >> >      by projects like Riak). After some performance testing, I decided to
> > >> >      write a simple wrapper over the HTTP packet parsing built into Erlang
> > >> >      (also used by mochiweb/webmachine) to see if I could make things a
> > >> >      bit more efficient. Here are the results:
> > >> >
> > >> >      Erlang (2 threads)
> > >> >      echo - 58823 reqs/sec
> > >> >      webmachine - 7782 reqs/sec
> > >> >      openstack - 24154 reqs/sec
> > >> >
> > >> >      The test consists of four concurrent connections focused on packet
> > >> >      parsing speed and framework overhead. A simple echo test was also
> > >> >      done for a baseline (no parsing, just a simple recv/send loop). As
> > >> >      you can see, the simple request/response wrapper I wrote did get some
> > >> >      gains, although it's a little more hands-on to use (looks more like
> > >> >      wsgi+webob in python).
> > >> >
> > >> >      I decided to run the same tests against Python just for comparison. I
> > >> >      ran echo, wsgi, and wsgi+webob decorators all using eventlet. I ran
> > >> >      both single process and two process in order to compare with Erlang
> > >> >      which was running with two threads.
> > >> >
> > >> >      Python (eventlet)
> > >> >      echo (1 proc) - 17857 reqs/sec
> > >> >      echo (2 proc) - 52631 reqs/sec
> > >> >      wsgi (1 proc) - 4859 reqs/sec
> > >> >      wsgi (2 proc) - 8695 reqs/sec
> > >> >      wsgi webob (1 proc) - 3430 reqs/sec
> > >> >      wsgi webob (2 proc) - 6142 reqs/sec
> > >> >
> > >> >      As you can see, the two process Python echo server was not too far
> > >> >      behind the two thread Erlang echo server. The wsgi overhead was
> > >> >      significant, especially with the webob decorators/objects. It was
> > >> >      still on par with webmachine, but a factor of three less than my
> > >> >      simple request/response wrapper.
> > >> >
> > >> >      A multi-process python server does have the drawback of not being
> > >> >      able to share resources between processes unless incurring the
> > >> >      overhead of IPC. When thinking about a horizontally scalable service,
> > >> >      where scaling-out is much more important than scaling-up, I think
> > >> >      this becomes much less of a factor. Regardless of language choice,
> > >> >      we will need a proxy to efficiently hash to a set of queue servers in
> > >> >      any large deployment (or the clients will hash), but if that set is a
> > >> >      larger number of single-process python servers (some running on the
> > >> >      same machine) instead of a smaller number of multi-threaded Erlang
> > >> >      servers, I don't think it will make too much of a difference (each
> > >> >      proxy server will need to maintain more connections). In previous
> > >> >      queue service threads I was much more concerned about this and was
> > >> >      leaning away from Python, but I think I may be coming around.
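
The "hash to a set of queue servers" step could look roughly like the sketch below, whether it runs in a proxy or in the client. It is a minimal illustration using plain modulo hashing; the server addresses and the function name are made up, and nothing here is taken from the prototype.

```python
import hashlib


def pick_server(key, servers):
    """Map a queue key to one server using a hash that is stable across runs."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


if __name__ == "__main__":
    # Hypothetical deployment: several single-process servers, some of
    # them sharing a machine and differing only by port.
    servers = [
        "10.0.0.1:8001",
        "10.0.0.1:8002",
        "10.0.0.2:8001",
        "10.0.0.2:8002",
    ]
    for queue in ("orders", "emails", "thumbnails"):
        print(queue, "->", pick_server(queue, servers))
```

Modulo hashing remaps most keys whenever the server list changes; a consistent-hashing ring would limit that, at the cost of more code in the proxy.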
> > >> >
> > >> >      Another aspect I took a look at is options for message storage. For
> > >> >      the fast, in-memory, unreliable queue type, here are some numbers
> > >> >      for options in Python and Erlang:
> > >> >
> > >> >      Raw message = key(16) + ttl(8) + hide(8) + body(100) = 132 bytes
> > >> >      Python list/dict - 248 bytes/msg (88% overhead)
> > >> >      Python sqlite3 - 168 bytes/msg (27% overhead)
> > >> >      Erlang ets - 300 bytes/msg (127% overhead)
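
The overhead percentages above are the extra bytes per message relative to the 132-byte raw message. The sketch below reproduces them and also shows how the percentage shrinks for a larger body, assuming the per-message overhead stays fixed as the body grows (the 1000-byte body is a made-up example).

```python
RAW_BYTES = 16 + 8 + 8 + 100  # key + ttl + hide + body = 132


def overhead_percent(stored_bytes, raw_bytes=RAW_BYTES):
    """Extra storage per message as a percentage of the raw message size."""
    return round((stored_bytes - raw_bytes) * 100.0 / raw_bytes)


if __name__ == "__main__":
    measured = [
        ("Python list/dict", 248),
        ("Python sqlite3", 168),
        ("Erlang ets", 300),
    ]
    for name, stored in measured:
        print("%-17s %3d%% overhead" % (name, overhead_percent(stored)))

    # Same fixed overhead as list/dict (248 - 132 = 116 bytes), 1000-byte body.
    big_raw = 16 + 8 + 8 + 1000
    print("list/dict, 1000-byte body: %d%%"
          % overhead_percent(big_raw + 116, big_raw))
```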
> > >> >
> > >> >      The example raw message has no surrounding data structure, so it is
> > >> >      obviously never possible to get down to 132 bytes. As the body grows,
> > >> >      the overhead becomes less significant since they all grow the same
> > >> >      amount. The best Python option is probably an in-memory sqlite table,
> > >> >      which is also an option for disk-based storage.
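
An in-memory sqlite table for messages might be set up along these lines. This is a sketch under assumptions: the column names follow the raw message layout above, but the schema, the function names, and the choice to store ttl and hide as absolute timestamps are illustrative rather than taken from the prototype.

```python
import sqlite3
import time


def open_store(path=":memory:"):
    """Open a message store; pass a file path instead for disk-based storage."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        " key BLOB PRIMARY KEY,"
        " ttl INTEGER NOT NULL,"   # absolute expiry time (assumed meaning)
        " hide INTEGER NOT NULL,"  # hidden until this time (assumed meaning)
        " body BLOB NOT NULL)"
    )
    return db


def insert_message(db, key, body, ttl_seconds=60, hide_seconds=0, now=None):
    """Store one message with expiry and visibility times relative to `now`."""
    now = int(time.time()) if now is None else now
    db.execute(
        "INSERT INTO messages (key, ttl, hide, body) VALUES (?, ?, ?, ?)",
        (key, now + ttl_seconds, now + hide_seconds, body),
    )


def visible_messages(db, now=None):
    """Return (key, body) pairs that are neither expired nor hidden."""
    now = int(time.time()) if now is None else now
    query = ("SELECT key, body FROM messages"
             " WHERE ttl > ? AND hide <= ? ORDER BY rowid")
    return db.execute(query, (now, now)).fetchall()


if __name__ == "__main__":
    store = open_store()
    insert_message(store, b"k" * 16, b"x" * 100)
    print(len(visible_messages(store)))
```

Because only the path changes between `:memory:` and a file, the same code covers both the fast unreliable queue type and a disk-backed one.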
> > >> >
> > >> >      For Erlang, ets is really the only efficient in-memory option (mnesia
> > >> >      is built on ets if you're thinking of that), and also has a disk
> > >> >      counterpart called dets. The overhead was definitely more than I was
> > >> >      expecting and is less memory efficient than both Python options.
> > >> >
> > >> >      As we start looking at other stores to use, there are certainly more
> > >> >      DB drivers available for Python than Erlang (due to the fact that
> > >> >      Python is more popular). We'll want to push most of the heavy lifting
> > >> >      to the pluggable databases, which makes the binding language less of
> > >> >      a concern as well.
> > >> >
> > >> >      So, in conclusion, and going against my previous opinion, I'm starting
> > >> >      to feel that the performance gains of Erlang are really not that
> > >> >      significant compared to Python for this style of application. If
> > >> >      we're talking about a factor of three (and possibly less if we can
> > >> >      optimize the wsgi driver or not use wsgi), and consider the database
> > >> >      driver options for queue storage, Python doesn't look so bad. We'll
> > >> >      certainly have more of a developer community too.
> > >> >
> > >> >      We may still need to write parts in C/C++ if limits can't be overcome,
> > >> >      but that would probably be the case for Erlang or Python.
> > >> >
> > >> >      What do folks think?
> > >> >
> > >> >      -Eric
> > >> >
> > >> >      _______________________________________________
> > >> >      Mailing list: https://launchpad.net/~openstack
> > >> >      Post to : openstack@xxxxxxxxxxxxxxxxxxx
> > >> >      Unsubscribe : https://launchpad.net/~openstack
> > >> >      More help : https://help.launchpad.net/ListHelp
> > >
> > >> import socket
> > >> import sys
> > >>
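> > >> # Benchmark client: one long-lived connection, 50k sequential requests.
> > >> # Takes the server port as its only argument; run 10 in parallel when testing.
> > >> connection = socket.socket()
> > >> connection.connect(('localhost', int(sys.argv[1])))
> > >> for x in xrange(50000):
> > >>     connection.sendall("GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
> > >>     # Wait for the reply before sending the next request.
> > >>     connection.recv(1024)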
> > >
> > >> import os
> > >> import sys
> > >>
> > >> import eventlet
> > >> import eventlet.wsgi
> > >> import webob.dec
> > >> import webob.exc
> > >>
> > >> COUNT = 0
> > >>
> > >> def handle_echo(fd):
> > >>   global COUNT
> > >>   while True:
> > >>     c = fd.recv(1024)
> > >>     if not c:
> > >>       break
> > >>     fd.sendall(c)
> > >>     COUNT += 1
> > >>     if COUNT % 1000 == 0:
> > >>       sys.stderr.write('%d\n' % COUNT)
> > >>       eventlet.sleep(0)
> > >>
> > >> def handle_wsgi(environ, start_response):
> > >>   global COUNT
> > >>   COUNT += 1
> > >>   if COUNT % 1000 == 0:
> > >>     sys.stderr.write('%d\n' % COUNT)
> > >>     eventlet.sleep(0)
> > >>   start_response('200 Ok', [('Content-Type', 'text/plain')])
> > >>   return "test"
> > >>
> > >> @webob.dec.wsgify
> > >> def handle_webob(req):
> > >>   global COUNT
> > >>   COUNT += 1
> > >>   if COUNT % 1000 == 0:
> > >>     sys.stderr.write('%d\n' % COUNT)
> > >>     eventlet.sleep(0)
> > >>   return webob.exc.HTTPOk(body="test")
> > >>
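> > >> # Bind the listening socket before forking so both processes share it.
> > >> server = eventlet.listen(('localhost', int(sys.argv[2])))
> > >> # Fork once: parent and child each run their own accept loop (two processes).
> > >> os.fork()
> > >> eventlet.hubs.use_hub('poll')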
> > >>
> > >> if sys.argv[1] == 'echo':
> > >>   while True:
> > >>     new_sock, address = server.accept()
> > >>     eventlet.spawn_n(handle_echo, new_sock)
> > >>     # Add a slight delay between accepts so they balance between processes.
> > >>     eventlet.sleep(0.010)
> > >> elif sys.argv[1] == 'wsgi':
> > >>   eventlet.wsgi.server(server, handle_wsgi, log=sys.stdout)
> > >> elif sys.argv[1] == 'webob':
> > >>   eventlet.wsgi.server(server, handle_webob, log=sys.stdout)
> > >> else:
> > >>   print 'Usage: %s echo|wsgi|webob <port>' % sys.argv[0]
> > >
> > >
> > >
> 
References

Re: Queue Service Implementation Thoughts
From: ksankar, 2011-03-06
Re: Queue Service Implementation Thoughts
From: Eric Day, 2011-03-07
Re: Queue Service Implementation Thoughts
From: Eric Day, 2011-03-08
Re: Queue Service Implementation Thoughts
From: Todd Willey, 2011-03-08
Re: Queue Service Implementation Thoughts
From: Eric Day, 2011-03-08
Re: Queue Service Implementation Thoughts
From: Sandy Walsh, 2011-03-08