launchpad-dev team mailing list archive

performance tuesday - single threaded appservers

So, single-threaded appservers - this has been discussed here and there
before; it's recorded as RT 41361.

In an internal thread about this general topic, I did a bit of
analysis a while back. It may be interesting here (I've tweaked it
slightly) - but first, here's what I did today for performance :)

I've put together a config file generator for the LP production
configs. This will let us generate new and updated configs much more
easily (until we hit 26 servers - the initial version's limit). I filed
https://bugs.launchpad.net/launchpad-foundations/+bug/680375 about the
one part of this that isn't straightforward to create algorithmically.
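
To make the idea concrete, here's a minimal sketch of such a
generator - the server names, ports and config keys are entirely
hypothetical, not the real LP production format:

  SERVER_LIMIT = 26  # the initial version's limit, as above

  def generate_configs(count, base_port=8080):
      # Yield a (name, config stanza) pair per appserver.
      if count > SERVER_LIMIT:
          raise ValueError("initial version handles at most %d servers"
                           % SERVER_LIMIT)
      for i in range(1, count + 1):
          name = "lpnet%d" % i
          stanza = "[%s]\nport: %d\nthreads: 1\n" % (name, base_port + i)
          yield name, stanza

  for name, stanza in generate_configs(3):
      print(stanza)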

I then used this to update the configuration on wampee, the machine
that the sysadmins have been kind enough to put more memory in (12GB).

This should get deployed tonight-ish, I hope, and we should see fewer
oopses from lpnet14->18, as well as slightly faster response times.

We should do some serious stats on this once it's up and running - run
the PPR against all but those servers, and against those servers only,
and compare the results. If the results are promising - and I expect
they will be - we should be able to get significantly more headroom by
doing this across the board. OTOH this will put more pressure on our
DB servers.

-Rob

*****
A gedanken on this: if your app is split 50% DB time / 50% Python
time, and you run 4 threads, you need 50% of your CPU work to happen
outside the GIL - in one wallclock second you'd expect the following
totals:
 2 seconds of DB work (4 * 0.5)
 2 seconds of Python work (4 * 0.5)
 1 second of GIL-held time (only one thread can hold the GIL, so at
most one second per wallclock second)
 1 second of non-GIL-held C API time (4 seconds total work - 2 seconds
DB - 1 second GIL)

If the amount of GIL-released time is less than 50% of your work,
you'll be bottlenecked on bytecode interpretation.

If we think that 75% of our request time is DB work, then we get:
 3 seconds of DB work
 1 second of Python work
 1 second of GIL-held time
 0 seconds of non-GIL time

And if we think that 25% of our request time is DB work, then we get:
 1 second of DB work
 3 seconds of Python work
 1 second of GIL-held time
 2 seconds of non-GIL-held time (4 seconds work - 1 second DB - 1
second GIL)

If more than 25% of your app's total work holds the GIL - with 4
threads only 1 of every 4 thread-seconds can be GIL-held, i.e. 1/N -
you'll be bottlenecked on bytecode interpretation (and you'll see long
request times, which will *skew* the perceived DB-time/Python-time
ratio).

So the greater the fraction of request work spent on DB work, the more
GIL-held work can be tolerated in a multithreaded appserver. The less
DB work - the faster the queries - the less GIL-held Python work can
be tolerated before you bottleneck on the GIL.

In this simple model of the interactions, more threads dramatically
lower the point at which you start to bottleneck; fewer threads raise
it.
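
Concretely, the GIL-held fraction of total work that N threads can
sustain is 1/N:

  # N threads demand N thread-seconds of work per wallclock second,
  # but only 1 second of GIL-held execution is available.
  for n in (1, 2, 4, 8, 16):
      print("%2d threads: bottlenecked once >%.0f%% of work holds the GIL"
            % (n, 100.0 / n))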

Of course, if our DB time is also Python time in the same process, the
model is incomplete.

That said, benchmarks done a while back for LP show the service time
per request increasing as soon as the concurrent request rate
increases - a strong indicator to me that this model is sane.
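
As a toy illustration of that effect (not the LP benchmark - just
pure-Python CPU-bound work standing in for the GIL-held part of a
request):

  import threading, time

  def request():
      start = time.time()
      sum(i * i for i in range(10**6))  # bytecode work, GIL held
      return time.time() - start

  def run(concurrency):
      times = []
      def worker():
          times.append(request())
      threads = [threading.Thread(target=worker)
                 for _ in range(concurrency)]
      for t in threads: t.start()
      for t in threads: t.join()
      return sum(times) / len(times)

  for c in (1, 2, 4):
      print("%d concurrent: mean service time %.2fs" % (c, run(c)))

The mean per-request time grows almost linearly with concurrency,
because the bytecode portions serialise on the GIL.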

In the experiment we're about to do, we're going to be running
single-threaded, because it should give the clearest result and avoid
conflating other variables.

We may do a 2-threaded one at some point as well, but there aren't any
specific plans to do so yet.
*****
