mysql-proxy-discuss team mailing list archive
Message #00033
Re: funnel - a multiplexer plugin for mysql-proxy
Hi Nick!
On Feb 5, 2009, at 6:23 PM, nick loeve wrote:
> Hi,
> I have created an experimental branch of the mysql-proxy code on
> launchpad, in order to show what I have done to implement a connection
> multiplexer with backlog for mysql-proxy, and hopefully get some
> feedback on our implementation/design. We called the plugin 'funnel'.
[...]
> Our existing solution works, but we are looking at how to get more
> performance using a plugin to mysql-proxy. The plugin in the branch I
> posted accomplishes the main tasks I described above, but there are
> more features that we would like to implement, mainly support for more
> statistics via the admin plugin. I had to make a few changes to the
> core state-machine that handles the front-end/back-end connection
> state in order to achieve the backlog.
Very interesting! I will take a closer look at what the actual
differences to the proxy plugin are later (Launchpad sadly doesn't
make it easy to diff two files...).
> Some assumptions:
> We have some hardcoded 'assumptions' in the code base, such as only
> ever using one backend (as we always have the funnel sitting in front
> of a mysqld on the same host) and we have a single user/database for
> most of our slave architectures, so multi-user and/or complex
> permissions may not work correctly. I would like to eventually remove
> these limitations/assumptions.
Fair enough for a first version, I'd say.
> We are currently testing our plugin in a live environment, and our
> benchmarks are proving that the mysql-proxy design is giving us better
> capacity and lower average query times at peak traffic times.
I'd be interested in some of the boundary conditions of your setup:
- how many queries/sec do you have?
- what is the average/stddev of the execution time of those queries?
- how large are the resultsets (approx in bytes or something)?
- how many clients are generally active at the same time (IOW what's
the concurrency)?
The reason I'm asking is because I've seen situations where the
relative timing of query arrival, the average size of the resultsets
and the execution times were favorable to the current single thread
implementation in that you would not really notice a slowdown compared
to going to mysqld directly.
In general, I think it is safe to say that for high concurrency with
very small execution times and tiny resultsets, the current
single-threaded Proxy has the most trouble, because the added latency
is much larger than the time spent on the query itself.
It would be nice to see if this theory holds true for what you are
seeing, as well.
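
To make the latency argument concrete, here is a small worked calculation. The numbers are hypothetical and chosen only for illustration (they are not measurements from either setup); `relative_slowdown` is a made-up helper name.

```python
def relative_slowdown(query_ms, added_latency_ms):
    """Ratio of total time through the proxy to time spent on the query itself."""
    return (query_ms + added_latency_ms) / query_ms


# Hypothetical figures: the proxy adds a fixed 0.5 ms per query.
tiny_query = relative_slowdown(query_ms=0.5, added_latency_ms=0.5)    # 2.0
large_query = relative_slowdown(query_ms=50.0, added_latency_ms=0.5)  # 1.01
```

With the same fixed overhead, the tiny query takes twice as long through the proxy, while the large one is only about 1% slower, which is why short queries with small resultsets would show the effect most clearly.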
> I'm particularly interested in the blueprint on launchpad about
> threaded I/O. We did have an attempt at adding a thread pool to our
> plugin in order to handle some backlog clearing and some I/O, but
> without large changes to the main proxy engine we did not succeed in
> getting stable enough to really test out in our high traffic
> environment.
In fact, Jan and I have met today and talked on this very topic.
Soon we will pick up our efforts in adding multithreading (mostly
revitalizing old patches).
The current plan is the following (and we need to add these to the
blueprints after our team meeting next week):
Step 1:
- accept connections on one thread
- have multiple worker threads that the accepted file descriptor gets
handed off to (via a queue)
- all subsequent events on this file descriptor will be handled by the
thread it was handed to
This essentially means that all network traffic will be handled by
multiple threads.
Since we still have a global lock around the Lua state, everything
that needs to go into Lua will effectively run single-threaded, as
before.
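
A minimal sketch of the Step 1 handoff, in Python rather than the proxy's C code, and only under stated assumptions: connections are simulated as `(conn_id, events)` tuples instead of real sockets, and `script_lock` stands in for the global lock around the Lua state. All names here (`run_script`, `worker`, `accept_loop`, `demo`) are hypothetical and not taken from the mysql-proxy sources.

```python
import queue
import threading

NUM_WORKERS = 3                   # hypothetical pool size
script_lock = threading.Lock()    # stands in for the global Lua lock


def run_script(conn_id, event, log):
    """Anything that touches the shared script state runs under one lock."""
    with script_lock:
        log.append((conn_id, event, threading.current_thread().name))


def worker(handoff, log):
    """Pull accepted connections; the puller owns each one for its lifetime."""
    while True:
        item = handoff.get()
        if item is None:          # shutdown sentinel
            break
        conn_id, events = item
        for event in events:      # all events stay on this thread
            run_script(conn_id, event, log)


def accept_loop(connections, handoff):
    """Single acceptor: push every new connection onto the handoff queue."""
    for conn in connections:
        handoff.put(conn)


def demo():
    log = []
    handoff = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(handoff, log), name=f"worker-{n}")
        for n in range(NUM_WORKERS)
    ]
    for t in threads:
        t.start()
    accept_loop([(cid, ["query", "result", "close"]) for cid in range(6)], handoff)
    for _ in threads:
        handoff.put(None)         # one sentinel per worker
    for t in threads:
        t.join()
    return log
```

The queue is touched once per connection rather than once per event, so it sees far less traffic than a per-event dispatch would generate. The script work is still serialized by the single lock.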
Step 2:
- give each thread its own local Lua state (still sharing the script)
- remove the global mutex
- access to global structures (backends, usually) will need some kind
of synchronization
We would like to use a shared-nothing approach, basically making
copies of global structures and versioning them (checks can be done
with atomic ints, for example).
LuaLanes is another alternative.
Step 1 is relatively easy compared to Step 2.
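
The "copy and version" idea from Step 2 could look roughly like the following. This is a sketch under assumptions, not the planned implementation: `SharedBackends` and `WorkerView` are hypothetical names, the backend table is reduced to a name-to-weight dict, and a plain Python integer stands in for the atomic int that a C implementation would use.

```python
import copy
import threading


class SharedBackends:
    """Global backend table plus a version counter bumped on every change."""

    def __init__(self, backends):
        self._lock = threading.Lock()   # assumption: updates are rare
        self._backends = dict(backends)
        self.version = 0                # stands in for an atomic int

    def update(self, name, weight):
        with self._lock:
            self._backends[name] = weight
            self.version += 1

    def snapshot(self):
        """Return a consistent (version, copy) pair."""
        with self._lock:
            return self.version, copy.deepcopy(self._backends)


class WorkerView:
    """Per-thread copy, refreshed only when the global version has moved."""

    def __init__(self, shared):
        self._shared = shared
        self.version, self.backends = shared.snapshot()
        self.refreshes = 0

    def get(self):
        if self._shared.version != self.version:   # cheap check, no lock taken
            self.version, self.backends = self._shared.snapshot()
            self.refreshes += 1
        return self.backends
```

Each worker would hold its own `WorkerView` and call `get()` at the points where it is acceptable to pick up changes. Between those points the worker sees a copy that may be slightly stale.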
There are a few things to take into consideration, of course, even
with step 1.
My initial prototype picked a worker thread on every event, which
proved to be extremely heavyweight under high load, mostly because the
queue used was under high contention. (Reading data from a socket
tended to be not much slower than the overhead of putting the event
into a queue and letting a worker thread pull it out again. It was one
hot queue...)
The danger with making connections stick to one thread for their
entire lifetime is that one thread might end up getting all the
active connections, and leave the other threads idling, thus turning
the entire thing into a more or less heavyweight single thread
implementation. I'm not yet sure how to solve this efficiently, but I
guess we will try different approaches before we pick a winner.
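
One candidate approach, shown purely as an illustration and not something proposed in this thread, is to hand each new connection to the worker that currently owns the fewest connections. `LeastLoadedPicker` is a hypothetical name.

```python
class LeastLoadedPicker:
    """Track connections per worker and assign new ones to the least loaded."""

    def __init__(self, num_workers):
        self.active = [0] * num_workers

    def assign(self):
        """Return the index of the worker with the fewest connections."""
        idx = min(range(len(self.active)), key=self.active.__getitem__)
        self.active[idx] += 1
        return idx

    def release(self, idx):
        """Call when a connection owned by worker idx is closed."""
        self.active[idx] -= 1
```

This only balances connection counts at accept time. It does not help if the connections that happen to sit on one thread are the busy ones, so it addresses only part of the problem described above.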
Step 2 is not without complications either. Since a copied global
state would only be "mostly up to date", it's fairly important to pick
the places where we update it. In most cases the global state is
relatively static, but in some applications it might not be, e.g. in
load balancing situations where backend weights are a function of
backend system load, number of queries executed or something along
those lines.
In those situations, it might actually be cheaper to use a mutex to
access global state rather than copying a lot, but that can lead to
high lock contention, too. Maybe a non-locking alternative would be
better, using atomic operations where they are available (currently
glib is lacking them on HP-UX for PA-RISC iirc, maybe some AIX, too).
Atomic ops are not without traps either, of course.
In either case, we need to have an implementation to make these kinds
of decisions, otherwise the effects are pure speculation.
(As an aside: We cannot get away with defining the lua_lock/lua_unlock
macros to acquire/release a mutex because those only make the Lua
interpreter itself threadsafe, not what we built on top of it...sadly)
Thanks for sharing!
cheers,
-k
--
Kay Roepke
Software Engineer, MySQL Enterprise Tools
Sun Microsystems GmbH Sonnenallee 1, DE-85551 Kirchheim-Heimstetten
Geschaeftsfuehrer: Thomas Schroeder, Wolfang Engels, Dr. Roland Boemer
Vorsitz d. Aufs.rat.: Martin Haering HRB MUC 161028