
mysql-proxy-discuss team mailing list archive

Fwd: funnel - a multiplexer plugin for mysql-proxy

 

Whoops! Missed the list.


---------- Forwarded message ----------
From: nick loeve <trickie@xxxxxxxxx>
Date: Mon, Feb 9, 2009 at 1:19 PM
Subject: Re: [Mysql-proxy-discuss] funnel - a multiplexer plugin for
mysql-proxy
To: Kay Röpke <Kay.Roepke@xxxxxxx>


On Fri, Feb 6, 2009 at 2:11 PM, Kay Röpke <Kay.Roepke@xxxxxxx> wrote:
> Hi!
>
> On Feb 6, 2009, at 9:35 AM, nick loeve wrote:
>
>> On Thu, Feb 5, 2009 at 8:54 PM, Kay Röpke <Kay.Roepke@xxxxxxx> wrote:
>>>
>>> I'd be interested in some of the boundary conditions of your setup:
>>> - how many queries/sec do you have?
>>> - what is the average/stddev of the execution time of those queries?
>>> - how large are the resultsets (approx in bytes or something)?
>>> - how many clients are generally active at the same time (IOW what's the
>>> concurrency)?
>>
>> [...]
>> Clients connected for the architecture above is around 500 per slave,
>> and can increase slightly at peak times. Those 500 client connections
>> are doing an average of 1K-1.5K queries per second per slave (at peak
>> times). Depending on slave hardware, sometimes up to 20% of queries
>> reach the backlog. We use persistent connections on that arch, so
>> average new connections per second is pretty low.
>
> Persistent connections are definitely the way to go here, yeah, since every
> event on the sockets will further limit the throughput capacity right now.
> Based on the numbers, I would say that having the network i/o multithreaded
> should show a tremendous performance boost, especially with some tuning of
> the number of worker threads.
>
>> We have around 10 slave architectures similar in ratio of
>> slaves/clients/queries/timings to the one mentioned above, and quite a
>> few more that have different replication setups, and are tuned for a
>> particular purpose.
>>
>>>
>>> The reason I'm asking is because I've seen situations where the relative
>>> timing of query arrival, the average size of the resultsets and the
>>> execution times were favorable to the current single thread
>>> implementation
>>> in that you would not really notice a slowdown compared to going to
>>> mysqld
>>> directly.
>>> In general, I think it would be a safe statement to say that for high
>>> concurrency with very small execution times and tiny resultsets the
>>> current
>>> single threaded Proxy has the most trouble (all because the added latency
>>> is
>>> much larger than the time spent for the query itself).
>>> It would be nice to see if this theory holds true for what you are
>>> seeing,
>>> as well.
>>
>> Yes, that is exactly what we are seeing in our main slave
>> architectures. We have some beefy hardware for our database slaves,
>> but we struggle to push the queries in and out quickly enough to
>> really make the database work hard and take advantage of the number
>> of cores and memory available. Across all our arches our biggest
>> bottleneck is connection handling and network I/O. We do not see this
>> problem so much with the architectures tuned for returning larger
>> result sets.
>
> Good to know that the theory holds :)
> It all comes down to the single thread, here's hoping we can quickly remedy
> that.
>
>>>> [...]
>>
>> Step one sounds similar to what we tried to do within our plugin, but
>> we increasingly had to re-implement parts of the core engine within
>> our plugin to accommodate multiple threads using libevent. I would be
>> interested in helping out where possible to achieve what you described
>> above.
>
> I just briefly talked to Jan and it seems he still has a somewhat clean
> patch
> for our multithreading efforts. IIUC he's pushing that to Launchpad today,
> so maybe
> that helps to see what's involved. Of course it's totally experimental and
> might
> prove disastrous if used in production etc. *disclaim* ;)

Excellent! We are going to give it a run in our benchmarks.

>
> Implementing a backlog mostly in a plugin is probably rather painful in the
> long run,
> so I'd like to see this functionality go into the core proper.

Yes, I agree. If you do not need a backlog, most of the implementation
can be fairly transparent IMHO.

>
> Last night I've had a brief diff-look on the funnel plugin, here are some
> observations in no particular order:
> * in funnel_read_auth I would refrain from doing `do { ... } while(0);` just
> because there's a break statement in there (I guess it was just the fastest
> way to implement it...)
> * we are generally using G_STRLOC instead of __FILE__, __LINE__ now; this is
> not consistent in the proxy plugin either ATM, just noting (and it's less to
> type :))
> * the diff is showing what I've suspected for a long time:
>  we need to refactor the proxy plugin into some more manageable parts to
> avoid copying around this much code. I've noticed this before in a different
> plugin (for a completely different purpose, though) but somehow time was
> always too short to actually do it.
> * that you removed all Lua code gave me an interesting idea:
>  it looks like we can factor that out completely as well, to make it
> optional, and of course pluggable, for plugin writers. I have some nifty
> ideas on this already.
>  in general the amount of necessary code duplication right now bothers me
> greatly
> * the limitations you mentioned in your initial mail regarding the number of
> backends and different users look relatively easy to lift.
>  backends: since you always get the first backend, replacing the constant
> with a function call that picks a backend based on some algorithm should be
> all that's necessary
>  users: if I'm not mistaken (and there's no bug in it) the connection pool
> code should provide this already.
> * moving to a multithreaded network I/O implementation, the code is obviously
> not 100% safe because of the backends. For now I'd take a global lock when
> actually handling them (as long as they aren't modified at runtime that
> should be "safe", barring the fact that UP/DOWN information has a race
> condition). This is something we have to properly fix in the core for sure,
> but clients should be prepared to handle failed queries by reexecuting them
> anyway.
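The `do { ... } while(0);` point above is about a common C idiom: wrapping a body in a dummy loop purely so `break` can act as a forward jump. A minimal before/after sketch of that style question, with hypothetical function names (this is not the actual funnel code, just the pattern):

```c
/* The pattern being questioned: a do { ... } while (0) whose only
 * purpose is to let "break" jump past the rest of the body. */
static int check_auth_breaky(int packet_len) {
    int rc = -1;
    do {
        if (packet_len <= 0) break;   /* bail out */
        if (packet_len > 1024) break; /* bail out */
        rc = 0;                       /* packet accepted */
    } while (0);
    return rc;
}

/* The same logic with plain early returns reads more directly,
 * which is roughly the alternative being suggested. */
static int check_auth(int packet_len) {
    if (packet_len <= 0) return -1;
    if (packet_len > 1024) return -1;
    return 0;
}
```

Both variants behave identically; the second just avoids the fake loop.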
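The backend suggestions above (replace the hard-coded first backend with a picking function, and take a global lock while touching the backend list) could be sketched roughly as follows. All names here are hypothetical, and plain pthreads stand in for mysql-proxy's own glib-based threading, purely to keep the illustration self-contained:

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical backend table; in mysql-proxy this would be the
 * proxy's configured backend list. */
typedef struct { const char *addr; int up; } backend_t;

static backend_t backends[] = {
    { "10.0.0.1:3306", 1 },
    { "10.0.0.2:3306", 1 },
    { "10.0.0.3:3306", 0 },   /* marked DOWN */
};
static size_t n_backends = sizeof(backends) / sizeof(backends[0]);

/* One global lock around all backend handling, as suggested above:
 * "safe" as long as the list itself is not modified at runtime. */
static pthread_mutex_t backend_lock = PTHREAD_MUTEX_INITIALIZER;
static size_t next_idx = 0;

/* Replaces "always take backend 0" with a round-robin pick that
 * skips DOWN backends; returns NULL if none are usable. */
static backend_t *pick_backend(void) {
    backend_t *chosen = NULL;
    pthread_mutex_lock(&backend_lock);
    for (size_t i = 0; i < n_backends; i++) {
        backend_t *b = &backends[(next_idx + i) % n_backends];
        if (b->up) {
            chosen = b;
            next_idx = (next_idx + i + 1) % n_backends;
            break;
        }
    }
    pthread_mutex_unlock(&backend_lock);
    return chosen;
}
```

As noted in the mail, the UP/DOWN flag itself still races with whatever updates it; the lock only makes each individual pick consistent, which is why clients should still be prepared to re-execute failed queries.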

OK, great. Depending on time etc., I hope to get a refresh of the branch
soon, and I'll take the code style comments and locking advice you
mentioned into account.

>
> Other than that, I'm happy to see it took so little code to add this!
>
>>> Thanks for sharing!
>>>
>>
>> Np, I look forward to more :)
>
> yay :)
>
> Please note we are leaving for a team meeting in the US tomorrow and will be
> there the next week.
> I'll try to follow up on stuff, but responses will likely be delayed a bit.

OK, np, we are pressed for time now as well. Thanks for your comments,
and I'll let you know when I have more.

>
> cheers,
> -k
> --
> Kay Roepke
> Software Engineer, MySQL Enterprise Tools
>
> Sun Microsystems GmbH    Sonnenallee 1, DE-85551 Kirchheim-Heimstetten
> Geschaeftsfuehrer: Thomas Schroeder, Wolfang Engels, Dr. Roland Boemer
> Vorsitz d. Aufs.rat.: Martin Haering                    HRB MUC 161028
>
>



--
Nick Loeve


