maria-developers team mailing list archive

Re: Ideas for improving MariaDB/MySQL replication


Alex Yurchenko <alexey.yurchenko@xxxxxxxxxxxxx> writes:

> On Mon, 25 Jan 2010 13:55:44 +0100, Kristian Nielsen
> <knielsen@xxxxxxxxxxxxxxx> wrote:
>> I think it would be useful if you explained what the problems are with
>> that interface, in your opinion.

> This interface does not seem to improve anything about how redundancy is
> achieved in MySQL. Moreover, it seems to cement all the bad decisions that
> were made in years into an explicit interface:

> - It exposes what should be redundancy service internal implementation
> details to SQL server.

> We won't get a clean flexible generic API before we clearly sort out what
> belongs where. And for that we'll need to look at redundancy service
> unencumbered by existing code. This is not a call to revolution. It is a
> suggestion to create a completely new parallel redundancy service API and
> _gradually_ reimplement required functionality under that API.


So if I understand you correctly, by "internal implementation details" we do
not just mean that the APIs expose internals of the SQL server which we want
to shield plugins from. Rather, the interface is designed in a way that makes
assumptions about how the plugin that uses it will be implemented, thus
making it unsuitable for other plugins that have other ideas about what to
do.

So more concretely, what we want is an API that does not make assumptions
about the format of the binlog file, or even that there is a binlog stored in
a file. And an API that does not assume that events to be applied will be read
from a specific mysql connection to a master server, returning data in a
particular binlog format. Like you wrote:

> I think the main problem here is that many people by force of habit regard
> things that should be internal implementation details of redundancy service
> as integral parts of an SQL server. Like binlog storage or relay service.

I had a brief but inspiring discussion with Serg about this at our meeting two
weeks ago. So basically, what we could aim for is to make the entire current
MySQL replication into a set of plugins. These plugins would be made against a
new plugin interface that would support not only the existing MySQL
replication for backwards compatibility, but also things like Galera and
Tungsten, and other ideas. So while the compatibility *plugins* would contain
the legacy MySQL binlog storage and relay service, the plugin *interface*
would not.

I think this is what you had in mind?

So the basis for such an interface would be the ability to install hooks to be
called with row data for every handler::write_row(), handler::update_row(),
and handler::delete_row() invocation, just like the current row-based
binlogging does. And similarly for SQL statement execution, like
statement-based logging does now. That should be clear enough.
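
To make this a bit more concrete, here is a rough sketch of what such hooks
could look like. It is written in Python only for brevity (the real interface
would be C++), and every name in it (RowEvent, ChangeHooks, install_row_hook,
and so on) is made up for illustration, not an existing server API:

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical names throughout; this is not an existing server or plugin API.


@dataclass(frozen=True)
class RowEvent:
    kind: str                 # "write", "update" or "delete"
    table: str
    before: Optional[dict]    # row image before the change (None for "write")
    after: Optional[dict]     # row image after the change (None for "delete")


class ChangeHooks:
    """Registry that the server side would call into for every change."""

    def __init__(self) -> None:
        self._row_hooks: List[Callable[[RowEvent], None]] = []
        self._statement_hooks: List[Callable[[str], None]] = []

    def install_row_hook(self, hook: Callable[[RowEvent], None]) -> None:
        self._row_hooks.append(hook)

    def install_statement_hook(self, hook: Callable[[str], None]) -> None:
        self._statement_hooks.append(hook)

    # The next three methods stand in for the places where the server
    # invokes handler::write_row(), update_row() and delete_row().
    def write_row(self, table: str, row: dict) -> None:
        self._emit(RowEvent("write", table, None, row))

    def update_row(self, table: str, old: dict, new: dict) -> None:
        self._emit(RowEvent("update", table, old, new))

    def delete_row(self, table: str, row: dict) -> None:
        self._emit(RowEvent("delete", table, row, None))

    # Stands in for the point where an SQL statement is executed.
    def statement(self, sql: str) -> None:
        for hook in self._statement_hooks:
            hook(sql)

    def _emit(self, event: RowEvent) -> None:
        for hook in self._row_hooks:
            hook(event)


# A trivial "plugin" that just collects whatever it is handed.
hooks = ChangeHooks()
rows_seen: List[RowEvent] = []
statements_seen: List[str] = []
hooks.install_row_hook(rows_seen.append)
hooks.install_statement_hook(statements_seen.append)

hooks.write_row("t1", {"id": 1, "v": "a"})
hooks.update_row("t1", {"id": 1, "v": "a"}, {"id": 1, "v": "b"})
hooks.delete_row("t1", {"id": 1, "v": "b"})
hooks.statement("DELETE FROM t1 WHERE id = 1")
```

Note that nothing here says where the events go afterwards; whether they end
up in a binlog file, on a network socket, or somewhere else is left entirely
to whoever installed the hook.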

Then comes the need to hook into transaction start and commit and opening
tables. At this point, more of the internals of the MySQL server start to
appear, and some careful thought will be needed to get an interface that
exposes enough that plugins can do what they need, without exposing too many
internal details of how MySQL query execution is implemented.
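
As a rough illustration of the kind of narrow interface I have in mind, here
is a sketch (Python for brevity, all callback names hypothetical) of a plugin
that only ever sees an opaque transaction number, table names, and changes,
and that ships the changes of a transaction when it commits:

```python
from typing import Dict, List, Tuple

# Hypothetical callback names; a sketch of the shape only, not a proposal
# for the actual signatures.


class BufferingPlugin:
    """Collects the changes of each transaction and ships them on commit."""

    def __init__(self) -> None:
        self.pending: Dict[int, List[str]] = {}          # trx -> buffered changes
        self.tables: Dict[int, List[str]] = {}           # trx -> tables opened
        self.shipped: List[Tuple[int, List[str]]] = []   # committed, in commit order

    def on_begin(self, trx: int) -> None:
        self.pending[trx] = []
        self.tables[trx] = []

    def on_open_table(self, trx: int, table: str) -> None:
        self.tables[trx].append(table)

    def on_change(self, trx: int, change: str) -> None:
        self.pending[trx].append(change)

    def on_commit(self, trx: int) -> None:
        self.tables.pop(trx)
        self.shipped.append((trx, self.pending.pop(trx)))

    def on_rollback(self, trx: int) -> None:
        # Nothing of a rolled-back transaction ever leaves the plugin.
        self.tables.pop(trx)
        self.pending.pop(trx)


# Two interleaved transactions; the second one is rolled back.
plugin = BufferingPlugin()
plugin.on_begin(1)
plugin.on_open_table(1, "t1")
plugin.on_change(1, "write t1 id=1")
plugin.on_begin(2)
plugin.on_open_table(2, "t1")
plugin.on_change(2, "write t1 id=2")
plugin.on_rollback(2)
plugin.on_commit(1)
```

The hard part, which this sketch deliberately skips, is deciding what the
server has to pass along with each callback without leaking how query
execution works internally.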

(But note that these are two different issues regarding "internal
implementations". One is how the *query execution* is implemented. The other
is how the *plugins* are implemented. If I understood you correctly, the
interface used for semisync in MySQL fails on the latter point.)

One example of how a lot of details from query execution pop up is with regard
to the mixed-mode binlogging. This is where queries are logged as statements
when this is safe, and as row events when this is not safe (nondeterministic
queries). The concept of "mixed mode binlogging" certainly seems like
something that should be an implementation detail of the plugin, not part of
the interface. On the other hand, determining whether a query is safe for
statement-based logging is highly complex, and exposing enough of the server
for the plugin to be able to determine this by itself may be too much. (Maybe
just exposing an is_safe_for_statement() function to plugins would be enough.)
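
A sketch of that split, again in Python and with hypothetical names: the
server answers the "is it safe?" question, and the choice between statement
and row logging stays inside the plugin. The safety check below is a toy
placeholder (it only looks for UUID() and SYSDATE() as examples); the real
determination is far more involved and would live in the server:

```python
from typing import List, Tuple


class ServerServices:
    """Stand-in for the small set of services the server would expose."""

    # Toy placeholder list, for illustration only.
    _UNSAFE_MARKERS = ("UUID(", "SYSDATE(")

    def is_safe_for_statement(self, sql: str) -> bool:
        upper = sql.upper()
        return not any(marker in upper for marker in self._UNSAFE_MARKERS)


class MixedModePlugin:
    """Mixed-mode logging as a plugin-internal policy, not an API concept."""

    def __init__(self, server: ServerServices) -> None:
        self.server = server
        self.log: List[Tuple[str, object]] = []

    def on_statement_done(self, sql: str, row_changes: List[dict]) -> None:
        if self.server.is_safe_for_statement(sql):
            self.log.append(("statement", sql))
        else:
            # Fall back to logging the rows the statement actually changed.
            self.log.append(("rows", list(row_changes)))


plugin = MixedModePlugin(ServerServices())
plugin.on_statement_done("UPDATE t1 SET v = 'x' WHERE id = 1",
                         [{"id": 1, "v": "x"}])
plugin.on_statement_done("INSERT INTO t1 VALUES (2, UUID())",
                         [{"id": 2, "v": "some-generated-value"}])
```

A purely row-based or purely statement-based plugin would simply never call
is_safe_for_statement(), so the interface itself carries no notion of "mixed
mode".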

Another example of hairy details is all the extra information that can go with
an SQL statement into the binary log. Things like current timestamp, random
seed, user-set @variables, etc. To support a statement-based replication
plugin, we probably have to expose all of this on the interface in a clean
way.
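
One possible shape for this, sketched in Python with made-up names, is a
single context object handed to the plugin together with the statement text.
The field names and the single integer seed are assumptions for the sake of
the example, not how the server represents these things today:

```python
import random
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class StatementContext:
    """Everything a statement-based plugin needs to replay a statement."""
    sql: str
    timestamp: int                      # "current time" seen by the statement
    rand_seed: int                      # seed for the random number generator
    user_vars: Dict[str, object] = field(default_factory=dict)  # @variables


def replay(ctx: StatementContext) -> dict:
    """Toy stand-in for executing ctx.sql on a slave.

    It only shows that the time, the random values and the @variables come
    from the context rather than from the local environment.
    """
    rng = random.Random(ctx.rand_seed)
    return {
        "now": ctx.timestamp,
        "rand": rng.random(),
        "vars": dict(ctx.user_vars),
    }


ctx = StatementContext(
    sql="INSERT INTO t1 VALUES (NOW(), RAND(), @a)",
    timestamp=1264424144,
    rand_seed=12345,
    user_vars={"a": 42},
)
first = replay(ctx)
second = replay(ctx)
```

With the same context, the two replays give the same result, which is the
property a statement-based plugin depends on.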

> - It does not care to introduce a concept of global transaction ID.

Right. As I wrote earlier, this seems to be central to many of the ideas
involved in this project.

What I am wondering at the moment is if the concept of global transaction ID
should be a part of the new API, or if it is really an implementation detail
of the redundancy service.

On the one hand, if we make it part of the API, can we make it general enough
to support everything we want? For example, some plugins will need the ID to
be generated at the start of a transaction. Some will need it to be generated
at the end of the transaction.

On the other hand, if we make it _not_ part of the API, we run the risk of
making the API overly general and just pushing the problem down for each
plugin to try to solve individually.
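
One possible middle ground, sketched below in Python with hypothetical names,
would be for the API to define only what an ID looks like and a slot for it on
the transaction, while the plugin decides at which hook to fill the slot. The
(source, seqno) layout is just an assumption for the example:

```python
import itertools
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class GlobalTrxId:
    source: str   # identifies the originating server or cluster
    seqno: int    # position in that source's sequence


class Trx:
    """Server-side transaction handle; the API only provides the gtid slot."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.gtid: Optional[GlobalTrxId] = None


class GtidPlugin:
    """Assigns IDs either when a transaction begins or when it commits."""

    def __init__(self, source: str, assign_at: str) -> None:
        if assign_at not in ("begin", "commit"):
            raise ValueError("assign_at must be 'begin' or 'commit'")
        self.source = source
        self.assign_at = assign_at
        self._seq = itertools.count(1)

    def _assign(self, trx: Trx) -> None:
        trx.gtid = GlobalTrxId(self.source, next(self._seq))

    def on_begin(self, trx: Trx) -> None:
        if self.assign_at == "begin":
            self._assign(trx)

    def on_commit(self, trx: Trx) -> None:
        if self.assign_at == "commit":
            self._assign(trx)


def run_interleaved(plugin: GtidPlugin) -> Dict[str, int]:
    """t1 begins first, but t2 commits first."""
    t1, t2 = Trx("t1"), Trx("t2")
    plugin.on_begin(t1)
    plugin.on_begin(t2)
    plugin.on_commit(t2)
    plugin.on_commit(t1)
    return {t.name: t.gtid.seqno for t in (t1, t2)}


at_begin = run_interleaved(GtidPlugin("A", "begin"))
at_commit = run_interleaved(GtidPlugin("A", "commit"))
```

The two policies order the same pair of transactions differently (t1 before
t2 when assigning at begin, t2 before t1 when assigning at commit), which
shows why the point of generation cannot simply be fixed by the API without
ruling out some plugins.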

I'll start digging deeper into these issues of the new API and the global
transaction ID.

 - Kristian.
