
maria-developers team mailing list archive

Documentation for Percona patch for row-based replication of pk-less tables

 

Hi Daniel,

I merged the Percona patch

    row_based_replication_without_primary_key.patch

into 5.2-percona. This will later be merged into MariaDB 5.3.

Here is some documentation. I was not sure where to put it, so hopefully you
can help.

The patch adds a new feature (performance improvement), and also fixes a bug.

Incidentally, do we have a list of all bugs fixed in MariaDB over MySQL? I
think that would be quite useful, even if it might get a bit out of date as
MySQL also fixes some bugs. I suppose it could mostly be compiled from your
release notes. This would also provide motivation for developers to actually
make a proper bug report on issues fixed and proper commit messages etc :-)
Anyway, just a suggestion.

 - Kristian.

-----------------------------------------------------------------------
Row-based replication of tables with no primary key.

MariaDB improves on row-based replication of tables which have no primary key
but do have some other index. This is based in part on the original Percona
patch "row_based_replication_without_primary_key.patch", with some additional
fixes and enhancements.

When row-based replication is used with UPDATE or DELETE, the slave needs to
locate each replicated row based on the values in its columns. If the table
contains at least one index, an index lookup will be used (otherwise a table
scan is needed for each row, which is extremely inefficient for all but the
smallest tables and generally to be avoided).

From MariaDB 5.3, the slave will try to choose a good index among any
available:

 - The primary key is used, if there is one.

 - Else, the first unique index without NULL-able columns is used, if there is
   one.

 - Else, a choice is made among any normal indexes on the table (a FULLTEXT
   index, for example, is not considered).

The choice among several non-unique indexes is based on index cardinality;
the most selective index (the one with the smallest average number of rows
per distinct tuple of column values) is preferred. Note that for this choice
to be effective, most storage engines (like MyISAM and InnoDB) require that
ANALYZE TABLE has been run on the slave; otherwise statistics about index
cardinality will not be available. In the absence of index cardinality, the
first unique index will be chosen, if any, else the first non-unique index.
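
The selection order described above can be sketched in a few lines of Python.
This is only an illustration of the rules as documented here, not the server
code: the Index class, its fields, and choose_index() are hypothetical names,
and the sketch assumes that cardinality is treated as unavailable as soon as
any candidate index lacks statistics.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Index:
    """Hypothetical description of one index on the slave's table."""
    name: str
    kind: str = "BTREE"            # "BTREE" or "FULLTEXT" (illustrative values)
    primary: bool = False
    unique: bool = False
    nullable: bool = False         # True if any indexed column is NULL-able
    rows_per_key: Optional[float] = None  # avg rows per distinct key; None = no stats


def choose_index(indexes: List[Index]) -> Optional[Index]:
    """Pick an index for locating rows, following the documented order."""
    # FULLTEXT indexes cannot be used for row lookup, so drop them first.
    usable = [ix for ix in indexes if ix.kind != "FULLTEXT"]

    # 1. The primary key, if there is one.
    for ix in usable:
        if ix.primary:
            return ix

    # 2. The first unique index without NULL-able columns.
    for ix in usable:
        if ix.unique and not ix.nullable:
            return ix

    if not usable:
        return None  # no usable index: the slave has to scan the table

    # 3a. No cardinality statistics (assumption: any missing value counts):
    #     first unique index if any, else the first remaining index.
    if any(ix.rows_per_key is None for ix in usable):
        for ix in usable:
            if ix.unique:
                return ix
        return usable[0]

    # 3b. Otherwise the most selective index, i.e. the fewest rows per key.
    return min(usable, key=lambda ix: ix.rows_per_key)
```

For example, given two non-unique indexes with 50 and 2.5 rows per distinct
key, the sketch picks the second one; if neither has statistics (ANALYZE
TABLE was never run), it falls back to the first.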

Prior to MariaDB 5.3, the slave would always choose the first index without
considering cardinality. The slave could even choose an unusable index (like
FULLTEXT) if no other index was available (Bug #58997), causing row-based
replication to break in this case; this is also fixed in MariaDB 5.3.


