
maria-developers team mailing list archive

Re: Ideas for improving MariaDB/MySQL replication

 

On Mon, 25 Jan 2010 13:55:44 +0100, Kristian Nielsen
<knielsen@xxxxxxxxxxxxxxx> wrote:
> 
> I think it would be useful if you explained what the problems are with that
> interface, in your opinion.
> 
Let me start by saying that I am not very familiar with the current MySQL
replication code and may not be qualified to judge how much this new MySQL
Replication Interface improves on what we have in 5.1.x. Perhaps it does.
But as a replication developer whose goal is a generic redundancy service
API that handles a broad range of tasks, I can say that it is a step
toward a dead end.

This interface does not seem to improve anything about how redundancy is
achieved in MySQL. Moreover, it seems to cement all the bad decisions made
over the years into an explicit interface:
- It makes an explicit distinction between binlogging and replication.
- It does not introduce a concept of a global transaction ID.
- It exposes to the SQL server what should be internal implementation
details of the redundancy service.
I do understand that there are perfectly good reasons why the MySQL
replication API ended up being such a mess. But if we want to move forward,
we must recognize that it is a mess beyond repair. It cannot be inherited.
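
To make the global transaction ID point concrete, here is a minimal sketch
in Python (chosen only for brevity). It assumes an ID made of a history
identifier plus a sequence number; the names GlobalTrxId and is_behind are
hypothetical and not taken from MySQL or any existing API.

```python
from dataclasses import dataclass
from uuid import UUID, uuid4


@dataclass(frozen=True)
class GlobalTrxId:
    """Hypothetical global transaction ID, identical on every node."""
    history: UUID  # identifies one replicated history (e.g. one cluster)
    seqno: int     # position of the transaction within that history


def is_behind(local: GlobalTrxId, remote: GlobalTrxId) -> bool:
    """Return True if `local` has seen fewer transactions than `remote`."""
    if local.history != remote.history:
        # IDs from unrelated histories say nothing about each other.
        raise ValueError("IDs belong to different histories")
    return local.seqno < remote.seqno
```

With an ID like this, any node can tell how far it is behind any other node
without knowing that node's binlog file names or offsets.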

I think the main problem here is that many people, by force of habit, regard
things that should be internal implementation details of the redundancy
service as integral parts of an SQL server: binlog storage or the relay
service, for example. We won't get a clean, flexible, generic API until we
clearly sort out what belongs where. And for that we'll need to look at the
redundancy service unencumbered by existing code. This is not a call for
revolution. It is a suggestion to create a completely new, parallel
redundancy service API and _gradually_ reimplement the required
functionality under that API.

Please understand that I'm not questioning the current replication
implementation. It may well be reusable. I'm questioning where and how the
redundancy API line is drawn. Exposing a concrete binlog storage
implementation to the SQL server is not only pointless, it is harmful. There
is one more reason to design the redundancy API from scratch rather than
start from this one: whenever you want to change anything inside, you will
inevitably have to change this API, simply because it exposes so much of the
internals.
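
As a rough illustration of where I would draw the line, here is a minimal
sketch in Python (again only for brevity). Every name in it
(RedundancyService, replicate, catch_up, InMemoryService) is hypothetical;
it is not the wsrep API or any MySQL interface. The server side sees only
opaque change sets and global sequence numbers, while storage stays private
to the service.

```python
from abc import ABC, abstractmethod
from typing import Callable, List

# Callback the server supplies: (global seqno, opaque change set).
ApplyFn = Callable[[int, bytes], None]


class RedundancyService(ABC):
    """Hypothetical boundary: all the SQL server gets to see."""

    @abstractmethod
    def replicate(self, changes: bytes) -> int:
        """Make a change set redundant; return its global seqno."""

    @abstractmethod
    def catch_up(self, after_seqno: int, apply: ApplyFn) -> int:
        """Deliver change sets newer than after_seqno; return last seqno."""


class InMemoryService(RedundancyService):
    """Toy implementation; how it stores changes is its own business."""

    def __init__(self) -> None:
        self._log: List[bytes] = []  # could be a file or remote nodes

    def replicate(self, changes: bytes) -> int:
        self._log.append(changes)
        return len(self._log)  # seqnos start at 1

    def catch_up(self, after_seqno: int, apply: ApplyFn) -> int:
        for seqno in range(after_seqno + 1, len(self._log) + 1):
            apply(seqno, self._log[seqno - 1])
        return max(after_seqno, len(self._log))
```

Replacing the list with a binlog file, a ring buffer, or a group of remote
nodes would not change a single line of the calling code, which is the
property the exposed-internals design cannot offer.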

To illustrate this, on page 18 of the replication slides from UC 2009
(http://forge.mysql.com/wiki/MySQL_Replication:_Walk-through_of_the_new_5.1_and_6.0_features)
we can see logging and replication functionality unified behind
something called the "Logging Kernel", but that does not seem to be
reflected either in the aforementioned Replication Interface spec or on
page 23 of the slides. Apparently, the intended plugin points are the various
observer interfaces shown as diamonds below the delegate boxes. Well, we can
do much better than that and raise the redundancy plugin boundary much
higher. Specifically, everything but "SQL execution" and "Slave IO thread" on
that slide must be moved behind the redundancy service plugin interface and
become an implementation detail. (This is not to say that there can't be
plugins to the redundancy service plugin.)
> 
> I am thinking that this is mainly a refactoring to expose mostly already
> present functionality in a clean way to new plugins (semisync. replication
> in particular), but I will have to look deeper to know for sure.

It sure looks so. But notice that more than half of the APIs there are not
even used by semi-sync. In fact, the existence of semi-sync in no way
justifies this interface. Semi-sync can be implemented much more easily with
the wsrep API.

Thanks,
Alex
-- 
Alexey Yurchenko,
Codership Oy, www.codership.com
Skype: alexey.yurchenko, Phone: +358-400-516-011


