
maria-developers team mailing list archive

Re: Idea for parallel replication with statement-based binlogging

 

Hi, Kristian,

  I was on vacation until today; sorry for the delayed response.

  Yes, it would indeed be nice to have parallel replication when using statement-based binlogging.

  In fact, the first version of my patch was table-based. The function "mysql_test_parse_for_slave" (or a simplified version of it) can be used to get the table names used in a query; it costs a little more CPU, which is acceptable.

  What should be pointed out is that the improvement is limited if we want to keep transaction isolation on the slave.
  The first version of my patch ran into this situation, described as follows:
  Assume that there are 11 tables in one DB:
   A, B0, B1, B2, B3, B4, B5, B6, B7, B8, B9
  Table A is a statistics table, so every transaction needs to update one row in it. The transaction list looks like:
  tx_0: update B0, B1, B2, A
  tx_1: update B3, B4, B5, A
  tx_2: update B6, A
  tx_3: update B7, B8, A

 Because every transaction touches table A, all of them conflict and have to run serially.
 Of course, this is the worst case, but unfortunately it is the case in our production, so I had to change the patch to be row-based.
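
 To make the worst case concrete, here is a minimal sketch (not code from the patch) of table-level conflict detection applied to the four transactions above. The function name table_conflict_groups and the greedy in-order grouping are assumptions made for illustration only.

```python
from typing import Dict, List, Set


def table_conflict_groups(transactions: Dict[str, Set[str]]) -> List[List[str]]:
    """Pack transactions, in binlog order, into groups that may run in parallel.

    A transaction joins the current group only if it shares no table with
    any transaction already in that group; otherwise a new group is started.
    """
    groups: List[List[str]] = []
    current: List[str] = []
    busy: Set[str] = set()
    for name, tables in transactions.items():
        if busy & tables:
            # Conflict on at least one table: close the current group.
            groups.append(current)
            current, busy = [], set()
        current.append(name)
        busy |= tables
    if current:
        groups.append(current)
    return groups


workload = {
    "tx_0": {"B0", "B1", "B2", "A"},
    "tx_1": {"B3", "B4", "B5", "A"},
    "tx_2": {"B6", "A"},
    "tx_3": {"B7", "B8", "A"},
}

# With the shared statistics table A, every transaction conflicts.
with_a = table_conflict_groups(workload)

# Hypothetical comparison: the same workload without the update to A.
without_a = table_conflict_groups({n: t - {"A"} for n, t in workload.items()})

print(with_a)
print(without_a)
```

 With table A included, each transaction ends up in its own group, so nothing runs in parallel; without A, all four fall into a single group.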

 On the other hand, triggers complicate matters. With statement-based binlogging, when a statement updates table t1, the binlog records only the raw statement. If a trigger on t1 also updates a row in table t2, we cannot detect that from the statement alone. Perhaps some new events are needed here.

  However, it seems that a table-based approach is the best solution for statement-based binlogging.
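
 The trigger problem can be illustrated with a small sketch. It assumes the slave somehow has a map from each table to the tables its triggers write to; that map (trigger_targets) and the function effective_tables are hypothetical, and how such information would reach the slave is exactly the open question.

```python
from typing import Dict, Iterable, Set


def effective_tables(statement_tables: Iterable[str],
                     trigger_targets: Dict[str, Set[str]]) -> Set[str]:
    """Return the tables named in the statement plus every table reachable
    through triggers (a trigger's target may itself have triggers)."""
    seen: Set[str] = set()
    stack = list(statement_tables)
    while stack:
        table = stack.pop()
        if table in seen:
            continue
        seen.add(table)
        stack.extend(trigger_targets.get(table, ()))
    return seen


# Hypothetical metadata: a trigger on t1 updates a row in t2.
trigger_targets = {"t1": {"t2"}}

tx_a = {"t1"}  # tables named in the raw statement of transaction a
tx_b = {"t2"}  # tables named in the raw statement of transaction b

# Judged by the raw statements only, the two look independent.
apparent_conflict = not tx_a.isdisjoint(tx_b)

# Once trigger targets are included, they conflict on t2.
real_conflict = not effective_tables(tx_a, trigger_targets).isdisjoint(
    effective_tables(tx_b, trigger_targets))

print(apparent_conflict, real_conflict)
```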

Best Regards,
Xiaobin
________________________________________
From: Kristian Nielsen [knielsen@xxxxxxxxxxxxxxx]
Sent: February 15, 2013 16:53
To: 丁奇
Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
Subject: Idea for parallel replication with statement-based binlogging

Hi Xiaobin,

I hope you are enjoying the Spring Festival! All the best wishes for you and
your family for the coming year.

I had an idea for your parallel replication work, inspired by my work on
in-order commit. I wanted to hear your opinion on it, if you are interested.

The idea is to make your patch work with statement-based replication.

For row-based InnoDB events, you hash all the unique keys to check for
conflicts, and if there is no conflict you can run transactions in parallel
even against the same table.
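
As a rough sketch of the row-level check described above (not the actual
patch code), each modified row can contribute one hash per unique key, and
two transactions conflict when their hash sets intersect. The tuple layout
(table, key name, key values), the sample table and key names, and the use of
CRC32 are assumptions made for illustration.

```python
import zlib
from typing import Iterable, Set, Tuple

# (table name, unique key name, key column values) - hypothetical layout
RowKey = Tuple[str, str, tuple]


def row_key_hashes(keys: Iterable[RowKey]) -> Set[int]:
    """Hash every unique key value touched by a transaction."""
    return {zlib.crc32(repr(key).encode("utf-8")) for key in keys}


def conflicts(a: Set[int], b: Set[int]) -> bool:
    """Two transactions conflict if they touch at least one common key hash."""
    return not a.isdisjoint(b)


tx_a = row_key_hashes([("orders", "PRIMARY", (1,))])
tx_b = row_key_hashes([("orders", "PRIMARY", (2,))])
tx_c = row_key_hashes([("orders", "PRIMARY", (1,)),
                       ("orders", "uk_ref", ("x",))])

# Same table, different rows: can run in parallel.
print(conflicts(tx_a, tx_b))
# Same table, same primary key value: must be serialized.
print(conflicts(tx_a, tx_c))
```

A hash collision between two different keys would only produce a false
conflict, which costs parallelism but not correctness.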

For MyISAM, I get deadlock issues in my in-order-commit patch if I try to
replicate in parallel against the same table. So I hash just the table names,
so that two transactions on the same table always conflict. Then I can still
replicate events against different tables in parallel.

Now suppose that for statement-based events, we obtain the names of all tables
used in the queries. We could add a new event to the binlog with the table
names, or perhaps parse the statement on the slave, if that is not too
expensive. Then we could hash just the names to check for conflicts, just like
I did for MyISAM with row-based events.
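
A minimal sketch of that table-name check, assuming the table names are
already available (from a new binlog event or from parsing on the slave).
The class InFlightTables, the function table_hashes and the choice of CRC32
are hypothetical names used only for illustration.

```python
import zlib
from typing import Iterable, Set


def table_hashes(db: str, tables: Iterable[str]) -> Set[int]:
    """Hash the fully qualified name of every table a transaction uses."""
    return {zlib.crc32(f"{db}.{t}".encode("utf-8")) for t in tables}


class InFlightTables:
    """Tracks the table hashes held by transactions currently being applied."""

    def __init__(self) -> None:
        self._busy: Set[int] = set()

    def try_start(self, hashes: Set[int]) -> bool:
        """Claim the tables if none is in use; return False on a conflict."""
        if self._busy & hashes:
            return False
        self._busy |= hashes
        return True

    def finish(self, hashes: Set[int]) -> None:
        """Release the tables once the transaction has committed."""
        self._busy -= hashes


tracker = InFlightTables()
tx_a = table_hashes("db", ["t1", "t2"])
tx_b = table_hashes("db", ["t3"])
tx_c = table_hashes("db", ["t2", "t4"])

started_a = tracker.try_start(tx_a)        # no tables in use yet
started_b = tracker.try_start(tx_b)        # different table, runs in parallel
started_c = tracker.try_start(tx_c)        # shares t2 with tx_a, must wait
tracker.finish(tx_a)
started_c_retry = tracker.try_start(tx_c)  # tx_a is done, tx_c can start
print(started_a, started_b, started_c, started_c_retry)
```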

It seems to me that we could then safely replicate statement-based events in
parallel on the slave if they are to different tables. This would be great for
people who do not want to use row-based replication; it would surely be
better than the MySQL 5.6 multi-threaded slave, which can only run queries
against different databases in parallel.

What do you think?

 - Kristian.

