openstack team mailing list archive
Message #01319
Re: Queue Service Implementation Thoughts
With this switch to python, does it make sense to revisit the concept
of openstack-common for things like logging, flag parsing, etc? What
components would you like to just be able to drop in from nova,
glance, or swift?
-todd[1]
On Tue, Mar 8, 2011 at 4:05 PM, Eric Day <eday@xxxxxxxxxxxx> wrote:
> Hi everyone,
>
> I added a sqlite backend to the prototype and ran some tests. Initially
> things were very slow, but after some further testing I was able
> to figure out where the time was being spent. In order to do this I
> added a very simple binary protocol interface to insert only. These
> tests are with a single server process and multiple client processes
> (so don't compare to the numbers in my previous email, which used two processes). The
> numbers given are requests/second.
>
> echo (no parsing) - 17k
>
> binary - 13k
> binary+insert msg into dict - 11k
> binary+insert msg into sqlite - 8.7k
>
> wsgi - 4.9k
> wsgi+webob - 3.5k
> wsgi+webob+insert msg into dict - 3.4k
> wsgi+webob+insert msg into sqlite - 2.8k
>
> wsgi+webob+routes - 1.9k
> wsgi+webob+routes+insert msg into dict - 1.8k
> wsgi+webob+routes+insert msg into sqlite - 1.5k
>
> This shows that without wsgi/webob/routes, the performance is pretty
> decent. This would be the case when using an efficient binary protocol
> or perhaps a more efficient HTTP parser.
>
> Next, it shows WSGI adds significant overhead. The webob and routes
> modules also add a fair amount.
>
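One way to read the numbers above is to turn each rate into an average time per request and subtract. This is only a rough sketch: it assumes the single server process is the bottleneck, so 1/rate approximates server-side cost per request, and the helper names are made up for illustration.

```python
# Requests/second copied from the single-server results quoted above.
RATES = {
    "echo": 17000,
    "binary": 13000,
    "wsgi": 4900,
    "wsgi+webob": 3500,
    "wsgi+webob+routes": 1900,
}


def per_request_us(rate):
    """Average microseconds per request at a given requests/second rate."""
    return 1e6 / rate


def added_cost_us(base, layered):
    """Extra microseconds per request when moving from `base` to `layered`."""
    return per_request_us(RATES[layered]) - per_request_us(RATES[base])


if __name__ == "__main__":
    steps = [
        ("echo", "binary"),
        ("echo", "wsgi"),
        ("wsgi", "wsgi+webob"),
        ("wsgi+webob", "wsgi+webob+routes"),
    ]
    for base, layered in steps:
        print("%s -> %s: +%.0f us/request"
              % (base, layered, added_cost_us(base, layered)))
```

Read this way, each added layer costs more per request than the whole echo loop does.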
> I'm going to rework the current code in the prototype into a proper
> project in the burrow trunk with modular front-ends and back-ends so
> we can easily test new options. I'll stick with the current wsgi code
> for now just to get things functioning and we can look at optimizing
> later. For the proxy-server communication, we'll definitely need to
> use something more efficient than stock wsgi modules in the long run.
>
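For anyone wondering what an insert-only binary front-end with a sqlite back-end might look like, here is a minimal sketch. The frame layout, the table schema, and all function names are assumptions for illustration, not the prototype's actual protocol. The field sizes (16-byte key, 8-byte ttl, 8-byte hide) are borrowed from the raw message example further down the thread.

```python
import sqlite3
import struct

# Hypothetical frame: 16-byte key, 8-byte ttl, 8-byte hide, 4-byte body length,
# all in network byte order, followed by the body itself.
HEADER = struct.Struct("!16sQQI")


def encode_insert(key, ttl, hide, body):
    """Build one insert frame from bytes key/body and integer ttl/hide."""
    return HEADER.pack(key, ttl, hide, len(body)) + body


def decode_insert(frame):
    """Split a frame back into (key, ttl, hide, body)."""
    key, ttl, hide, length = HEADER.unpack_from(frame)
    body = frame[HEADER.size:HEADER.size + length]
    return key, ttl, hide, body


def open_store():
    """Create an in-memory sqlite table to hold messages."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE messages ("
        "key BLOB PRIMARY KEY, ttl INTEGER, hide INTEGER, body BLOB)"
    )
    return db


def insert_message(db, frame):
    """Decode one frame and store it."""
    db.execute("INSERT INTO messages VALUES (?, ?, ?, ?)", decode_insert(frame))
```

Parsing a fixed-size header like this is one `unpack_from` call, which is where the savings over HTTP parsing would come from.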
> No red flags yet with Python, and we're in the ballpark for efficiency
> with a binary protocol. A quick test with other servers showed
> rabbitmq at about 9k messages/sec (binary protocol, Erlang server)
> and Gearman at about 13k messages/sec (binary protocol, C server).
>
> -Eric
>
> On Mon, Mar 07, 2011 at 01:32:55PM -0800, Eric Day wrote:
>> I ran the tests again to verify:
>>
>> 500k requests - 10 processes each running 50k requests.
>>
>>                 time(s)  req/s      cs  us  sy  id
>> 2 thread/proc
>> echo c++           7.19  69541  142182  23  77   0
>> echo erlang        9.53  52465  105871  39  61   0
>> echo python        9.58  52192  108420  42  58   0
>> 2 thread/proc
>> wsgi python       58.74   8512   18132  86  14   0
>> webob python      78.75   6349   13678  89  11   0
>> webmachine*       63.50   7874   11834  89  11   0
>> openstack         20.48  24414   49897  68  32   0
>>
>> cs/us/sy/id are from vmstat while running the tests.
>>
>> * webmachine degrades over time with long-lived, multi-request
>> connections. This number was estimated with 1000 requests per
>> connection. At 50k requests per connection, the rate dropped to
>> 2582 req/s.
>>
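The req/s column in the table above appears to be the 500k total divided by the elapsed time, rounded down. A small sketch to check that reading (the function name is made up for illustration):

```python
# 10 client processes, each running 50k requests, as described above.
TOTAL_REQUESTS = 10 * 50000


def requests_per_second(elapsed_seconds, total=TOTAL_REQUESTS):
    """Whole requests completed per second over the full run."""
    return int(total / elapsed_seconds)


if __name__ == "__main__":
    for name, elapsed in [("echo c++", 7.19), ("wsgi python", 58.74),
                          ("openstack", 20.48)]:
        print("%-12s %6d req/s" % (name, requests_per_second(elapsed)))
```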
>> As you can see I was able to reproduce the same numbers. If
>> anyone would like to do the same, you can grab lp:burrow for the
>> "openstack" Erlang application (compile and run ./start), webmachine
>> is at https://github.com/basho/webmachine (you need to create a demo
>> app and make sure you set nodelay for the socket options), and I've
>> attached the python server and client (start 10 client processes when
>> testing). Find me on irc (eday in #openstack) if you have questions.
>>
>> If we hit performance issues with this type of application, we'll
>> probably hit them around the same time with both Erlang and Python
>> (then we'll look to C/C++). Since most OpenStack developers are a lot
>> more comfortable with Python, I suggest we make the switch. Please
>> respond with thoughts on this. I'll add a sqlite backend to the
>> Python prototype and run some performance tests against that to see
>> if any red flags come up.
>>
>> -Eric
>>
>> On Sat, Mar 05, 2011 at 10:39:18PM -0700, ksankar@xxxxxxxxxxxxxx wrote:
>> > Eric,
>> > Thanks for your experimentation and analysis. It somewhat illustrates the
>> > point about premature optimization. As a first cut, I have to agree with
>> > you and conclude that the Python implementation is effective overall. As
>> > you said, if we find performance bottlenecks, especially as the payload
>> > size increases (or if we require any complex processing at the HTTP
>> > server layer), we can optimize specific areas.
>> > Just to be sure, it might be better for someone else to recheck. That way
>> > we will have done our due diligence.
>> > Cheers
>> > <k/>
>> >
>> > -------- Original Message --------
>> > Subject: [Openstack] Queue Service Implementation Thoughts
>> > From: Eric Day <eday@xxxxxxxxxxxx>
>> > Date: Sat, March 05, 2011 4:07 pm
>> > To: openstack@xxxxxxxxxxxxxxxxxxx
>> >
>> > Hi everyone,
>> >
>> > When deciding to move forward with Erlang, I first tried out the Erlang
>> > REST framework webmachine (it is built on top of mochiweb and used
>> > by projects like Riak). After some performance testing, I decided to
>> > write a simple wrapper over the HTTP packet parsing built into Erlang
>> > (also used by mochiweb/webmachine) to see if I could make things a
>> > bit more efficient. Here are the results:
>> >
>> > Erlang (2 threads)
>> > echo - 58823 reqs/sec
>> > webmachine - 7782 reqs/sec
>> > openstack - 24154 reqs/sec
>> >
>> > The test consists of four concurrent connections focused on packet
>> > parsing speed and framework overhead. A simple echo test was also
>> > done for a baseline (no parsing, just a simple recv/send loop). As
>> > you can see, the simple request/response wrapper I wrote did get some
>> > gains, although it's a little more hands-on to use (looks more like
>> > wsgi+webob in python).
>> >
>> > I decided to run the same tests against Python just for comparison. I
>> > ran echo, wsgi, and wsgi+webob decorators all using eventlet. I ran
>> > both single process and two process in order to compare with Erlang
>> > which was running with two threads.
>> >
>> > Python (eventlet)
>> > echo (1 proc) - 17857 reqs/sec
>> > echo (2 proc) - 52631 reqs/sec
>> > wsgi (1 proc) - 4859 reqs/sec
>> > wsgi (2 proc) - 8695 reqs/sec
>> > wsgi webob (1 proc) - 3430 reqs/sec
>> > wsgi webob (2 proc) - 6142 reqs/sec
>> >
>> > As you can see, the two process Python echo server was not too far
>> > behind the two thread Erlang echo server. The wsgi overhead was
>> > significant, especially with the webob decorators/objects. It was
>> > still on par with webmachine, but a factor of three less than my
>> > simple request/response wrapper.
>> >
>> > A multi-process python server does have the drawback of not being
>> > able to share resources between processes unless incurring the
>> > overhead of IPC. When thinking about a horizontally scalable service,
>> > where scaling-out is much more important than scaling-up, I think
>> > this becomes much less of a factor. Regardless of language choice,
>> > we will need a proxy to efficiently hash to a set of queue servers in
>> > any large deployment (or the clients will hash), but if that set is a
>> > larger number of single-process python servers (some running on the
>> > same machine) instead of a smaller number of multi-threaded Erlang
>> > servers, I don't think it will make too much of a difference (each
>> > proxy server will need to maintain more connections). In previous
>> > queue service threads I was much more concerned about this and was
>> > leaning away from Python, but I think I may be coming around.
>> >
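The proxy (or client) hashing mentioned above could be as simple as the sketch below. It uses plain modulo hashing, which is an assumption on my part; the thread does not say which scheme is intended. The server addresses and the function name are hypothetical.

```python
import hashlib


def pick_server(queue_key, servers):
    """Map a queue key to one server from a fixed, ordered list.

    MD5 is used only as a stable, well-distributed hash, not for security.
    """
    digest = hashlib.md5(queue_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


if __name__ == "__main__":
    # Hypothetical set of single-process queue servers, several per machine.
    servers = ["10.0.0.1:8001", "10.0.0.1:8002", "10.0.0.2:8001"]
    print(pick_server("account/queue-a", servers))
```

With modulo hashing, changing the number of servers remaps most keys. A consistent hashing ring would reduce that, at the cost of more code.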
>> > Another aspect I took a look at is options for message storage. For
>> > the fast, in-memory, unreliable queue type, here are some numbers
>> > for options in Python and Erlang:
>> >
>> > Raw message = key(16) + ttl(8) + hide(8) + body(100) = 132 bytes
>> > Python list/dict - 248 bytes/msg (88% overhead)
>> > Python sqlite3 - 168 bytes/msg (27% overhead)
>> > Erlang ets - 300 bytes/msg (127% overhead)
>> >
>> > The example raw message has no surrounding data structure, so it is
>> > obviously never possible to get down to 132 bytes. As the body grows,
>> > the overhead becomes less significant since they all grow the same
>> > amount. The best Python option is probably an in-memory sqlite table,
>> > which is also an option for disk-based storage as well.
>> >
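The overhead percentages above follow from comparing each stored size to the 132-byte raw message. A short sketch of that arithmetic, with made-up helper names, and assuming (as the paragraph suggests) that the per-message overhead stays constant as the body grows:

```python
# Fixed-size fields of the raw message, in bytes, from the example above.
RAW_FIELDS = {"key": 16, "ttl": 8, "hide": 8}


def raw_size(body_len):
    """Bytes in a raw message with the given body length."""
    return sum(RAW_FIELDS.values()) + body_len


def overhead_percent(stored_bytes, body_len=100):
    """Storage overhead relative to the raw message, as a percentage."""
    raw = raw_size(body_len)
    return 100.0 * (stored_bytes - raw) / raw


if __name__ == "__main__":
    for name, stored in [("Python list/dict", 248), ("Python sqlite3", 168),
                         ("Erlang ets", 300)]:
        print("%-17s %3.0f%% overhead" % (name, overhead_percent(stored)))
    # Same 116 bytes of list/dict overhead, but with a 1000-byte body.
    fixed = 248 - raw_size(100)
    print("%.0f%%" % overhead_percent(raw_size(1000) + fixed, body_len=1000))
```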
>> > For Erlang, ets is really the only efficient in-memory option (mnesia
>> > is built on ets if you're thinking of that), and also has a disk
>> > counterpart called dets. The overhead was definitely more than I was
>> > expecting and is less memory efficient than both Python options.
>> >
>> > As we start looking at other stores to use, there are certainly more
>> > DB drivers available for Python than Erlang (due to the fact that
>> > Python is more popular). We'll want to push most of the heavy lifting
>> > to the pluggable databases, which makes the binding language less of
>> > a concern as well.
>> >
>> > So, in conclusion, and going against my previous opinion, I'm starting
>> > to feel that the performance gains of Erlang are really not that
>> > significant compared to Python for this style of application. If
>> > we're talking about a factor of three (and possibly less if we can
>> > optimize the wsgi driver or not use wsgi), and consider the database
>> > driver options for queue storage, Python doesn't look so bad. We'll
>> > certainly have more of a developer community too.
>> >
>> > We may still need to write parts in C/C++ if limits can't be overcome,
>> > but that would probably be the case for Erlang or Python.
>> >
>> > What do folks think?
>> >
>> > -Eric
>> >
>> > _______________________________________________
>> > Mailing list: https://launchpad.net/~openstack
>> > Post to : openstack@xxxxxxxxxxxxxxxxxxx
>> > Unsubscribe : https://launchpad.net/~openstack
>> > More help : https://help.launchpad.net/ListHelp
>
>> import socket
>> import sys
>>
>> connection = socket.socket()
>> connection.connect(('localhost', int(sys.argv[1])))
>> for x in xrange(50000):
>>     connection.sendall("GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
>>     connection.recv(1024)
>
>> import os
>> import sys
>>
>> import eventlet
>> import eventlet.wsgi
>> import webob.dec
>> import webob.exc
>>
>> COUNT = 0
>>
>> def handle_echo(fd):
>>     global COUNT
>>     while True:
>>         c = fd.recv(1024)
>>         if not c:
>>             break
>>         fd.sendall(c)
>>         COUNT += 1
>>         if COUNT % 1000 == 0:
>>             sys.stderr.write('%d\n' % COUNT)
>>             eventlet.sleep(0)
>>
>> def handle_wsgi(environ, start_response):
>>     global COUNT
>>     COUNT += 1
>>     if COUNT % 1000 == 0:
>>         sys.stderr.write('%d\n' % COUNT)
>>         eventlet.sleep(0)
>>     start_response('200 Ok', [('Content-Type', 'text/plain')])
>>     return "test"
>>
>> @webob.dec.wsgify
>> def handle_webob(req):
>>     global COUNT
>>     COUNT += 1
>>     if COUNT % 1000 == 0:
>>         sys.stderr.write('%d\n' % COUNT)
>>         eventlet.sleep(0)
>>     return webob.exc.HTTPOk(body="test")
>>
>> server = eventlet.listen(('localhost', int(sys.argv[2])))
>> os.fork()
>> eventlet.hubs.use_hub('poll')
>>
>> if sys.argv[1] == 'echo':
>>     while True:
>>         new_sock, address = server.accept()
>>         eventlet.spawn_n(handle_echo, new_sock)
>>         # Add a slight delay between accepts so they balance between processes.
>>         eventlet.sleep(0.010)
>> elif sys.argv[1] == 'wsgi':
>>     eventlet.wsgi.server(server, handle_wsgi, log=sys.stdout)
>> elif sys.argv[1] == 'webob':
>>     eventlet.wsgi.server(server, handle_webob, log=sys.stdout)
>> else:
>>     print 'Usage: %s echo|wsgi|webob <port>' % sys.argv[0]
>