maria-developers team mailing list archive

Re: Ideas for improving MariaDB/MySQL replication

 

Hi, Alex!

Continuing the old discussion...

On Jan 22, Alex Yurchenko wrote:
> 
> 1) It is time to drop MASTER/SLAVE mentality. This has nothing to do
> with replication per se. For example multi-master Galera cluster is
> turned into master-slave simply by directing all writing transactions
> to a single node.  Without a single change in nodes' configuration,
> let alone our replication API. So master-slave is purely a load
> balancer thing - the node that receives writes IS the master even if
> the slaves think otherwise.

I may still use the words "master" and "slave" below, in the sense that
the part of the code that takes the changes generated by local clients
and sends them out can be called "master" and the part of the code that
receives and applies them can be called "slave". Both can be active on
the same node though.
 
> 2) It is time to drop SYNCHRONOUS/ASYNCHRONOUS mentality. Although
> Galera cluster currently supports only synchronous operation, it can
> be turned into asynchronous with rather small changes in the code -
> again without any changes to API. This is merely a quality of
> replication engine.

Agree.
 
> So when refactoring replication code and API we suggest to think of
> replication as of redundancy service and establish a general API for
> such service that can be utilized by different implementations with
> different qualities of service. In other words - make a whole
> replication system a plugin (like storage engines are), not only some
> measly filters.

Ok, here I describe a possible model of what it can look like in the
server:

 * there are replication _events_ - they represent changes to the data,
   like creation of a table, or updating of a row.

 * there are event _generators_ or _producers_ - facilities that
   generate events, for example "SBR producer" generates a stream of
   events with the SQL statements - for a statement-based replication.
   There can also be "RBR producer", or, for example, "MyISAM physical
   producer" - that generates events in terms of pwrite() calls.

 * there are event _consumers_ - they connect to producers and consume
   the generated events. For example, a filter, such as one that only allows
   changes to a certain table to be replicated, is both a consumer and a
   producer of events.

 * when events are sent to slaves - it's again just a pair of
   producer/consumer - events on the master disappear in the consumer,
   events on the slave come out from a producer.

 * events can be _nested_ - one INSERT ... SELECT statement is one SBR
   event, but it corresponds to many RBR events, and every RBR event may
   correspond to many "MyISAM pwrite()" events.

 * not everything can be replicated at every level, for example table
   creation cannot be replicated row-based, InnoDB changes cannot be
   replicated with "MyISAM pwrite()" events.

 * it is up to the event generation facility to make sure its stream of
   events is complete. It is implemented by fetching events from the
   upper level: for example, RBR producer connects - as a consumer - to
   the SBR producer, and when there are SBR events without nested RBR
   events it simply reads the corresponding SBR events and sends them out.

 * a consumer may know the event format and look at the data fields, or
   it may not. For example, a filter that adds checksums to events or
   a consumer that sends events to slaves does not need to care about the
   event format. But a "final consumer" - the one that ultimately applies
   the event on the slave side - evidently needs to know how to parse it.

 * there's no explicit global transaction ID here, but I presume
   there can be a filter that adds it to events. That would work, as
   long as replication decides on the commit order (which it does, even
   now in MySQL/MariaDB).
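
To make the wiring concrete, here is a rough sketch of the model in
Python. It is an illustration only, not proposed server code: the names
Event, Producer, TableFilter and RbrProducer, and the fields level,
table, payload and nested, are all made up for this example. It shows a
filter that is both a consumer and a producer, and an RBR producer that
falls back to the SBR event when there are no nested row events.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Event:
    """A replication event. All fields are hypothetical, for illustration."""
    level: str                      # "SBR" or "RBR"
    table: str                      # table the change applies to
    payload: str                    # opaque description of the change
    nested: List["Event"] = field(default_factory=list)


class Producer:
    """Generates events and hands them to every connected consumer."""

    def __init__(self) -> None:
        self._consumers: List[Callable[[Event], None]] = []

    def connect(self, consumer: Callable[[Event], None]) -> None:
        self._consumers.append(consumer)

    def emit(self, event: Event) -> None:
        for consumer in self._consumers:
            consumer(event)


class TableFilter(Producer):
    """Both a consumer and a producer: forwards events for one table only."""

    def __init__(self, table: str) -> None:
        super().__init__()
        self.table = table

    def consume(self, event: Event) -> None:
        if event.table == self.table:
            self.emit(event)


class RbrProducer(Producer):
    """Consumes SBR events and emits their nested RBR events.

    When an SBR event has no nested RBR events (e.g. table creation),
    the SBR event itself is passed through, so the stream stays complete.
    """

    def consume(self, event: Event) -> None:
        if event.nested:
            for row_event in event.nested:
                self.emit(row_event)
        else:
            self.emit(event)


# Wire the chain: SBR producer -> RBR producer -> table filter -> applier.
sbr = Producer()
rbr = RbrProducer()
only_t1 = TableFilter("t1")
applied: List[Event] = []           # stands in for the "final consumer"

sbr.connect(rbr.consume)
rbr.connect(only_t1.consume)
only_t1.connect(applied.append)

sbr.emit(Event("SBR", "t1", "CREATE TABLE t1 (a INT)"))
sbr.emit(Event("SBR", "t1", "INSERT INTO t1 SELECT a FROM t2", nested=[
    Event("RBR", "t1", "write row (1)"),
    Event("RBR", "t1", "write row (2)"),
]))
sbr.emit(Event("SBR", "t3", "DROP TABLE t3"))
```

With this wiring the applier ends up with the CREATE TABLE as an SBR
event, followed by the two row events; the change to t3 is dropped by
the filter.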

This model seems to allow both native MySQL replication - sbr, rbr, and
mixed, with exactly the same protocol on the wire - and different
extensions, like semisync or fully synchronous replication,
heterogeneous replication, arbitrary transport protocols, and so on.
It looks like it can be completely compatible with MySQL replication if
necessary or use something absolutely different - depending on what
plugins are loaded and how they are connected. But the model itself has
no notion of "master node" or "slave node", synchronous or asynchronous,
binlog, MySQL protocol, relay log, or even SBR/RBR/MIXED modes.
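
As a rough illustration of "how they are connected", here is a small
Python sketch of two format-agnostic filters chained together. Everything
in it is an assumption made for the example: the Stage, SequenceFilter
and ChecksumFilter names, the 8-byte counter that stands in for a global
transaction ID, and the CRC32 trailer. Neither filter looks inside the
event; only the receiving end interprets the added fields.

```python
import zlib
from itertools import count
from typing import Callable, List, Tuple

Consumer = Callable[[bytes], None]


class Stage:
    """Minimal producer side: passes opaque event bytes to its consumers."""

    def __init__(self) -> None:
        self._next: List[Consumer] = []

    def connect(self, consumer: Consumer) -> None:
        self._next.append(consumer)

    def emit(self, data: bytes) -> None:
        for consumer in self._next:
            consumer(data)


class SequenceFilter(Stage):
    """Prefixes each event with an 8-byte counter (a stand-in for a
    global transaction ID; relies on events arriving in commit order)."""

    def __init__(self) -> None:
        super().__init__()
        self._ids = count(1)

    def consume(self, data: bytes) -> None:
        self.emit(next(self._ids).to_bytes(8, "big") + data)


class ChecksumFilter(Stage):
    """Appends a CRC32 of the event without parsing its contents."""

    def consume(self, data: bytes) -> None:
        self.emit(data + zlib.crc32(data).to_bytes(4, "big"))


def verify_and_strip(data: bytes) -> Tuple[int, bytes]:
    """Receiving side: check the CRC, then split off the sequence number."""
    body, crc = data[:-4], int.from_bytes(data[-4:], "big")
    if zlib.crc32(body) != crc:
        raise ValueError("checksum mismatch")
    return int.from_bytes(body[:8], "big"), body[8:]


# Wire the chain: source -> sequence filter -> checksum filter -> "wire".
source = Stage()
seq = SequenceFilter()
crc = ChecksumFilter()
wire: List[bytes] = []              # stands in for the transport consumer

source.connect(seq.consume)
seq.connect(crc.consume)
crc.connect(wire.append)

source.emit(b"UPDATE t1 SET a = a + 1")
source.emit(b"DELETE FROM t1 WHERE a > 10")

received = [verify_and_strip(packet) for packet in wire]
```

Swapping the order of the two filters, or dropping one, only changes the
connect() calls; the stages themselves stay the same.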

Regards,
Sergei



